How to avoid pernicious fictional statistics Mathematics is a limited component of solutions to real-world problems, as it expresses only what is expected to be true if all our assumptions are correct, including implicit assumptions that are omnipresent and often incorrect. Statistical methods are rife with implicit assumptions whose violation can be life-threatening when results from them are used to set policy. Among them are that there is human equipoise or unbiasedness in data generation, management, analysis, and reporting. These assumptions correspond to levels of cooperation, competence, neutrality, and integrity that are absent more often than we would like to believe. Given this harsh reality, we should ask what meaning, if any, we can assign
Topics:
Lars Pålsson Syll considers the following as important: Statistics & Econometrics
This could be interesting, too:
Lars Pålsson Syll writes What statistics teachers get wrong!
Lars Pålsson Syll writes Statistical uncertainty
Lars Pålsson Syll writes The dangers of using pernicious fictions in statistics
Lars Pålsson Syll writes Interpreting confidence intervals
How to avoid pernicious fictional statistics
Mathematics is a limited component of solutions to real-world problems, as it expresses only what is expected to be true if all our assumptions are correct, including implicit assumptions that are omnipresent and often incorrect. Statistical methods are rife with implicit assumptions whose violation can be life-threatening when results from them are used to set policy. Among them are that there is human equipoise or unbiasedness in data generation, management, analysis, and reporting. These assumptions correspond to levels of cooperation, competence, neutrality, and integrity that are absent more often than we would like to believe.
Given this harsh reality, we should ask what meaning, if any, we can assign to the P-values, “statistical significance” declarations, “confidence” intervals, and posterior probabilities that are used to decide what and how to present (or spin) discussions of analyzed data. By themselves, P-values and CI do not test any hypothesis, nor do they measure the significance of results or the confidence we should have in them. The sense otherwise is an ongoing cultural error perpetuated by large segments of the statistical and research community via misleading terminology.
So-called “inferential” statistics can only become contextually interpretable when derived explicitly from causal stories about the real data generator (such as randomization), and can only become reliable when those stories are based on valid and public documentation of the physical mechanisms that generated the data. Absent these assurances, traditional interpretations of statistical results become pernicious fictions that need to be replaced by far more circumspect descriptions of data and model relations.
All science entails human judgment, and using statistical models does not relieve us of that necessity. Working with misspecified models, the scientific value of ‘significance testing’ is actually zero — even though you’re making valid statistical ‘inferences’! Statistical models and concomitant significance tests are no substitutes for doing real science.
In its standard form, a significance test is not the kind of ‘severe test’ that we are looking for in our search for being able to confirm or disconfirm empirical scientific hypotheses. This is problematic for many reasons, one being that there is a strong tendency to accept the ‘null hypothesis’ since it can’t be rejected at the standard 5% significance level. And as shown over and over again when it is applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’
The excessive reliance on significance testing in science is disturbing and should be fought. But it is also important to put significance testing abuse in perspective. The real problem in today’s social sciences is not significance testing per se. No, the real problem has to do with the often unqualified and mechanistic application of statistical methods to real-world phenomena without having even the slightest idea of how the assumptions behind the statistical models condition and severely limit the value of the inferences made.