Monday , November 4 2024
Home / Lars P. Syll / What statistics teachers get wrong!

What statistics teachers get wrong!

Summary:
What statistics teachers get wrong! .[embedded content] This insightful video confirms what I always like to emphasize to my doctoral students: Statistics is no substitute for thinking. A non-trivial part of teaching statistics is made up of learning students to perform significance testing. A problem I have noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them. Giving a statistics course for the Swedish National Research School in History, I asked the students at the exam to explain how one should correctly interpret p-values. Although the correct definition is p(data|null

Topics:
Lars Pålsson Syll considers the following as important:

This could be interesting, too:

Lars Pålsson Syll writes Statistical uncertainty

Lars Pålsson Syll writes The dangers of using pernicious fictions in statistics

Lars Pålsson Syll writes Interpreting confidence intervals

Lars Pålsson Syll writes What kind of ‘rigour’ do RCTs provide?

What statistics teachers get wrong!

.

This insightful video confirms what I always like to emphasize to my doctoral students:

Statistics is no substitute for thinking.

A non-trivial part of teaching statistics is made up of learning students to perform significance testing. A problem I have noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them.

Giving a statistics course for the Swedish National Research School in History, I asked the students at the exam to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), a majority of the students either misinterpreted the p-value as being the likelihood of a sampling error (which of course is wrong, since the very computation of the p-value is based on the assumption that sampling errors are what causes the sample statistics not coinciding with the null hypothesis) or that the p-value is the probability of the null hypothesis being true, given the data (which of course also is wrong, since that is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).

All science entails human judgment, and using statistical models doesn’t relieve us of that necessity. Working with misspecified models, the scientific value of significance testing is actually zero — even though you’re making valid statistical inferences! Statistical models and concomitant significance tests are no substitutes for doing real science.

In its standard form, a significance test is not the kind of ‘severe test’ that we are looking for in our search for being able to confirm or disconfirm empirical scientific hypotheses. This is problematic for many reasons, one being that there is a strong tendency to accept the null hypothesis since it can’t be rejected at the standard 5% significance level. In their standard form, significance tests bias against new hypotheses by making it hard to disconfirm the null hypothesis.

We should never forget that the underlying parameters we use when performing significance tests are model constructions. Our p-values mean next to nothing if the model is wrong. Statistical​ significance tests do not validate models!

What statistics teachers get wrong!In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p-1 degrees of freedom in the numerator and n-p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this: if the model is right and the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:
i) An unlikely event occurred.
ii) Or the model is right and some of the coefficients differ from 0.
iii) Or the model is wrong.
So?

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.

Leave a Reply

Your email address will not be published. Required fields are marked *