The biggest problem in science

from Lars Syll

There’s a huge debate going on in social science right now. The question is simple, and strikes near the heart of all research: What counts as solid evidence? …

Prominent statisticians, psychologists, economists, sociologists, political scientists, biomedical researchers, and others … argue that results should only be deemed “statistically significant” if they pass a higher threshold.

“We propose a change to P < 0.005,” the authors write. “This simple step would immediately improve the reproducibility of scientific research in many fields” …

There’s a critique of the proposal that the authors I spoke to agree with completely: Changing the definition of statistical significance doesn’t address the real problem. And the real problem is the culture of science.

In 2016, Vox sent out a survey to more than 200 scientists, asking, “If you could change one thing about how science works today, what would it be and why?” One of the clear themes in the responses: The institutions of science need to get better at rewarding failure.

One young scientist told us, “I feel torn between asking questions that I know will lead to statistical significance and asking questions that matter.”

The biggest problem in science isn’t statistical significance. It’s the culture.

She felt torn because young scientists need publications to get jobs. Under the status quo, in order to get publications, you need statistically significant results. Statistical significance alone didn’t lead to the replication crisis. The institutions of science incentivized the behaviors that allowed it to fester.

Brian Resnick 
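As a back-of-the-envelope illustration of what the proposed p < 0.005 threshold does mechanically (my sketch, not part of the quoted article), consider repeating a two-sample t-test many times on pure noise: roughly five per cent of the tests clear p < 0.05 by chance alone, while only about half a per cent clear p < 0.005.

```python
# Sketch: false-positive rates at the two thresholds when the null is true.
# 100,000 two-sample t-tests on pure noise; any 'significant' result is a fluke.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n_per_group = 100_000, 30

a = rng.normal(size=(n_tests, n_per_group))   # group A: pure noise
b = rng.normal(size=(n_tests, n_per_group))   # group B: drawn from the same distribution
_, pvals = stats.ttest_ind(a, b, axis=1)

print(f"flukes passing p < 0.05 : {(pvals < 0.05).mean():.3%}")    # about 5%
print(f"flukes passing p < 0.005: {(pvals < 0.005).mean():.3%}")   # about 0.5%
```

Tightening the threshold cuts the rate of such flukes, which is the whole of the proposal's mechanical effect; it does nothing, by itself, about the incentives described above.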

As shown over and over again when significance tests are applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’ Standard scientific methodology tells us that when there is only, say, a 10% probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more ‘reasonable’ to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give about the same 10% result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.
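To make the last point concrete, here is a minimal sketch (my illustration, not part of Syll's text) that combines several independent, individually weak results with Fisher's method: five tests each hovering around p ≈ 0.10 jointly point far more strongly against the null than any single one of them. The p-values below are invented for illustration only.

```python
# Sketch: combining several independent, individually weak test results with
# Fisher's method. The p-values are made up purely for illustration.
from scipy import stats

p_values = [0.10, 0.09, 0.11, 0.10, 0.12]     # five independent tests, all around 10%
stat, combined_p = stats.combine_pvalues(p_values, method="fisher")

print(f"Fisher chi-square statistic: {stat:.2f}")
print(f"combined p-value:            {combined_p:.4f}")   # roughly 0.01, far below 0.10
```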

We should never forget that the underlying parameters we use when performing significance tests are model constructions. Our p-values mean nothing if the model is wrong. And most importantly — statistical significance tests DO NOT validate models!
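The point that p-values mean nothing under a wrong model can be made concrete with a textbook toy example (my sketch, not Syll's): regress one independent random walk on another. The i.i.d.-error assumptions behind the usual t-test fail here, and the regression routinely reports a minuscule p-value even though the two series are unrelated by construction.

```python
# Sketch: a meaningless p-value from a misspecified model. Two independent
# random walks violate the error assumptions behind the usual t-test, yet the
# regression typically reports an extremely small p-value for the slope.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 500
x = np.cumsum(rng.normal(size=n))   # random walk no. 1
y = np.cumsum(rng.normal(size=n))   # random walk no. 2, independent of x by construction

result = stats.linregress(x, y)
print(f"slope: {result.slope:.3f}, p-value: {result.pvalue:.3g}")
# Typically far below any conventional threshold, yet the 'relationship' is spurious.
```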

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p − 1 degrees of freedom in the numerator and n − p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this: if the model is right and the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:
i) An unlikely event occurred.
ii) Or the model is right and some of the coefficients differ from 0.
iii) Or the model is wrong.
So?
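To illustrate the third possibility, here is a small toy sketch (my own, not Freedman's): data generated from an exponential curve, fitted with a straight line. The overall F-test, computed with p − 1 and n − p degrees of freedom as described above, comes out hugely significant, and yet the linear specification is wrong by construction.

```python
# Sketch: a 'significant' F-test for a model that is wrong by construction.
# Data come from an exponential curve; the fitted model is a straight line,
# so p = 2 coefficients (intercept + slope) and n = 200 observations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 200, 2
x = np.linspace(0.0, 3.0, n)
y = np.exp(x) + rng.normal(scale=1.0, size=n)   # true relation is nonlinear

X = np.column_stack([np.ones(n), x])            # design matrix of the linear model
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

ess = np.sum((fitted - y.mean()) ** 2)          # explained sum of squares
rss = np.sum((y - fitted) ** 2)                 # residual sum of squares
F = (ess / (p - 1)) / (rss / (n - p))           # p - 1 and n - p degrees of freedom
p_value = stats.f.sf(F, p - 1, n - p)

print(f"F = {F:.1f}, p-value = {p_value:.3g}")  # astronomically 'significant'
```

The F-statistic answers only the narrow question of whether, given the linear model, all the slopes could plausibly be zero. It cannot tell us whether the linear model was a sensible description of the data in the first place.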

Lars Pålsson Syll
Professor at Malmö University. Primary research interest: the philosophy, history and methodology of economics.
