# The Limits of Statistical Methodology: Why A “Statistically Significant” Number of Published Scientific Research Findings are False, #4.

# By Joseph Wayne Smith

**1.** Introduction

**2.** Troubles in Statistical Paradise

**4.** The Limits of Probability Theory

**5.** Conclusion

The essay that follows below will be published in four installments; this is the fourth and final installment.

But you can also download and read or share a .pdf of the complete text of this essay, including the REFERENCES, by scrolling down to the bottom of this post and clicking on the **Download** tab.

**4. The Limits of Probability Theory**

There are many unsolved logical problems facing probability theory, especially involving infinite events (Hild, 2000; Shackel, 2007; Hájek, 1997, 2003, 2007). For example, what is the probability of an infinite sequence of heads tossed with an unbiased coin (Williamson, 2007)? Assume that the coin is “fair” by hypothesis. Multiplying the probabilities of the conjoined tosses yields a sequence of probabilities that converges to 0. Yet an infinite sequence of heads is one logical possibility. Williamson argues that the use of infinitesimal probabilities does not resolve the contradiction:

*Cantor showed that some natural, apparently compelling forms of reasoning fail for infinite sets. This moral applies to forms of probabilistic and decision-theoretic reasoning in a more radical way than may have been realised. Infinitesimals do not solve the problem.* (Williamson, 2007: p. 179)
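The limiting argument behind the coin example can be written out explicitly. The following is only a sketch, assuming independent tosses with a probability of 1/2 for heads on each toss; the symbol $H_i$ (“toss $i$ lands heads”) is introduced here for illustration and does not appear in the sources cited:

$$\Pr(H_1 \,\&\, H_2 \,\&\, \dots \,\&\, H_n) = \prod_{i=1}^{n} \Pr(H_i) = \left(\tfrac{1}{2}\right)^{n}, \qquad \lim_{n \to \infty} \left(\tfrac{1}{2}\right)^{n} = 0.$$

Every finite run of heads thus has a positive probability, while the limit assigned to the unending run is 0, even though that run remains a logically possible outcome.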

Another relevant problem is that of the definition of conditional probability as a ratio of unconditional probabilities (Hájek, 2003):

$$\Pr(A \mid B) = \frac{\Pr(A \,\&\, B)}{\Pr(B)}, \qquad \Pr(B) > 0.$$

Hájek notes that zero probability events are not necessarily impossible and can be of real scientific interest. He points out that Kolmogorov deals with this problem by analyzing conditional probability as a random variable. But even here there are problems, because conditional probabilities can be defined in situations where the ratio is undefined because Pr(A&B) and Pr(B) are themselves undefined. For example, if there is an urn with 90 red balls and 10 white balls, well mixed, the probability of drawing a red ball given that a ball is drawn at random is 0.9. However, the ratio analysis gives:

$$\frac{\Pr(X \text{ draws a red ball} \,\&\, X \text{ draws a ball at random from the urn})}{\Pr(X \text{ draws a ball at random from the urn})}$$

which has neither a defined numerator nor a defined denominator (Hájek, 2007).
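The contrast between the ratio analysis and the direct conditional judgment can be made concrete with a short Python sketch. This is an illustration under stated assumptions, not Hájek's own formalism: the function name `ratio_conditional`, the use of `None` to stand for “undefined,” and the values in the first case are all hypothetical.

```python
from fractions import Fraction
from typing import Optional


def ratio_conditional(p_a_and_b: Optional[Fraction],
                      p_b: Optional[Fraction]) -> Optional[Fraction]:
    """Conditional probability by the ratio formula Pr(A&B) / Pr(B).

    Returns None when the ratio is undefined: either unconditional
    probability is missing, or Pr(B) is zero.
    """
    if p_a_and_b is None or p_b is None or p_b == 0:
        return None
    return p_a_and_b / p_b


# Case 1: both unconditional probabilities are stipulated (hypothetical values).
print(ratio_conditional(Fraction(9, 20), Fraction(1, 2)))  # 9/10

# Case 2: Pr(B) = 0, so the ratio is undefined even if B is not impossible.
print(ratio_conditional(Fraction(0), Fraction(0)))  # None

# Case 3: the urn example. Nothing fixes the unconditional probability that
# X draws at random at all, so both inputs are missing; yet the conditional
# probability of red, given a random draw, is still 90 out of 100.
print(ratio_conditional(None, None))  # None
urn_direct = Fraction(90, 100)
print(urn_direct)  # 9/10
```

The third case is the one that matters for the argument: the conditional probability is perfectly well defined by the composition of the urn, while the ratio formula has nothing to compute with.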

Apart from these logical problems facing probability, one of the most important unsolved philosophical/methodological problems involving probabilities is the reference class problem: any sentence, event, or proposition can be classified in various ways; hence the probability of the sentence, event, or proposition is dependent upon the classification (Colyvan et al., 2001; Kaye, 2004; Pardo, 2007; Colyvan & Regan, 2007; Rhee, 2007; Allen & Pardo, 2007a). The reference class problem is not merely a problem for probabilistic evidence but, as Roberts explains, is more general:

*Every factual generalisation implies a reference class, and this in turn entails that the reference class problem is an inescapable concomitant of inferential reasoning and fact-finding in legal proceedings.* (Roberts, 2007: p. 245)

Nevertheless, the problem has frequently been discussed in the narrower context of probability problems by leading theorists such as John Venn (Venn, 1876) and Hans Reichenbach (Reichenbach, 1949: p. 374). Although the problem has been regarded by many inductive logicians as providing a decisive refutation of the frequentist interpretation of probability, the reference class problem also arises for the classical, logical, propensity, and subjectivist Bayesian interpretations (Hájek, 2007). The reference class problem has also been discussed in a legal context, and if the problem turns out to be insuperable for one area of human cognitive activity, then this establishes a general problem.

The reference class problem has been discussed in the jurisprudential literature in connection with the case of *United States v Shonubi* (1992, 1995, 1997). A Nigerian citizen, Charles Shonubi, was convicted of smuggling heroin into New York through Kennedy airport. Shonubi had made seven previous drug-smuggling trips. Since sentencing is based on the total quantity of drugs smuggled, the prosecution estimated the quantity of heroin smuggled on those prior trips. On appeal, the US Second Circuit Court of Appeals did not allow the statistical evidence. Consequently, Shonubi was sentenced on the basis of the actual quantity of drugs in his possession at the time he was arrested. The statistical data were based upon estimates using the reference class of other Nigerians smuggling heroin into Kennedy airport using Shonubi’s method of ingesting balloons containing heroin paste. But if use were made of a different reference class to which Shonubi also belonged, a conflicting probability would have been obtained.
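How a single individual can receive conflicting probabilities from different reference classes can be shown with a toy computation. The sketch below uses invented records with abstract attributes (`age_group`, `region`); none of it is drawn from the Shonubi case or from any real data set, and the helper `relative_frequency` is hypothetical.

```python
from fractions import Fraction

# Hypothetical records: (age_group, region, outcome). Invented for illustration.
records = [
    ("young", "north", True),
    ("young", "north", True),
    ("young", "south", False),
    ("young", "south", True),
    ("old", "north", False),
    ("old", "north", True),
    ("old", "south", False),
    ("old", "south", False),
]


def relative_frequency(rows):
    """Fraction of rows whose outcome is True; None for an empty class."""
    rows = list(rows)
    if not rows:
        return None
    return Fraction(sum(1 for r in rows if r[2]), len(rows))


# The individual of interest is both "young" and from the "south",
# so each of the following classes is a legitimate reference class.
by_age = relative_frequency(r for r in records if r[0] == "young")
by_region = relative_frequency(r for r in records if r[1] == "south")
by_both = relative_frequency(
    r for r in records if r[0] == "young" and r[1] == "south"
)

print(by_age, by_region, by_both)  # 3/4 1/4 1/2
```

Each of the three figures is a correct relative frequency for a class to which the individual belongs, and the data alone do not say which of them is the probability for that individual.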

Ronald J. Allen and Michael S. Pardo, in their paper “The Problematic Value of Mathematical Models of Evidence” (Allen & Pardo, 2007a), have concluded that the reference class problem shows the epistemological limits of mathematical models of evidence for, at least, law:

*The reference-class problem demonstrates that objective probabilities based on a particular class of which an item of evidence is a member cannot typically (and maybe never) capture the probative value of that evidence for establishing facts relating to a specific event. The only class that would accurately capture the ‘objective’ value would be the event itself, which would have a probability of one or zero, respectively.* (Allen & Pardo, 2007a: p. 114).

There may be “practical” solutions to the reference class problem, because people make statistical inferences regularly in daily life (Cheng, 2009, 2089). Nevertheless, the theoretical issue, like that of making inductive inferences, is to show that such inferences are *justified*. Thus, Mike Redmayne concludes that the reference class problem is not intractable, but merely shows that probability judgments are relative to our evidence pool (Redmayne, 2008: p. 288). Agreed: but the issue in the debate is whether or not a rationally justified choice can be made between *prima facie* plausible, but conflicting, probabilities generated from different reference classes. Saying that our probability judgments are relative to our evidence pool is true, but in fact it only restates the problem: what is the “correct” evidence pool?

**5. Conclusion**

In this essay, I have examined the question raised by John Ioannidis, of why most published research findings, primarily in the social and biomedical sciences, are false. There are many reasons for this, such as small sample sizes, and even fraud, which, when exposed, leads to substantial numbers of papers being retracted. There is also the quality control issue, whereby journals are reluctant to publish refutations of papers, so that there is a build-up of intellectual “rubbish,” just as a creek might get clogged up with weeds. However, as I discussed above, the crisis of statistical methodology is also genuinely important, for if the foundational methodologies are flawed, then we cannot have reasoned faith in the conclusions reached. And that is precisely the situation in disciplines like psychology, for example, as far as much or even most empirical scientific research in those disciplines goes. Therefore, a constructive or healthy skepticism about empirical science is strongly recommended. At the same time, however, since constructive or healthy skepticism is itself a product of human rationality, a cautious optimism about human rationality is also strongly recommended.

