Why p-values cannot be taken at face value A researcher is interested in differences between Democrats and Republicans in how they perform in a short mathematics test when it is expressed in two different contexts, either involving health care or the military. The research hypothesis is that context matters, and one would expect Democrats to do better in the health- care context and Republicans in the military context … At this point there is a huge number of possible comparisons that can be performed—all consistent with the data. For example, the pattern could be found (with statistical significance) among men and not among women— explicable under the theory that men are more ideological than women. Or the pattern could be found among women but not among men—explicable under the theory that women are more sensitive to context, compared to men … A single overarching research hypothesis—in this case, the idea that issue context interacts with political partisanship to affect mathematical problem-solving skills—corresponds to many different possible choices of the decision variable. At one level, these multiplicities are obvious.
Topics:
Lars Pålsson Syll considers the following as important: Economics, Statistics & Econometrics
This could be interesting, too:
Lars Pålsson Syll writes En statsbudget för Sveriges bästa
Lars Pålsson Syll writes MMT — debunking the deficit myth
Lars Pålsson Syll writes Daniel Waldenströms rappakalja om ojämlikheten
Peter Radford writes AJR, Nobel, and prompt engineering
Why p-values cannot be taken at face value
A researcher is interested in differences between Democrats and Republicans in how they perform in a short mathematics test when it is expressed in two different contexts, either involving health care or the military. The research hypothesis is that context matters, and one would expect Democrats to do better in the health- care context and Republicans in the military context … At this point there is a huge number of possible comparisons that can be performed—all consistent with the data. For example, the pattern could be found (with statistical significance) among men and not among women— explicable under the theory that men are more ideological than women. Or the pattern could be found among women but not among men—explicable under the theory that women are more sensitive to context, compared to men … A single overarching research hypothesis—in this case, the idea that issue context interacts with political partisanship to affect mathematical problem-solving skills—corresponds to many different possible choices of the decision variable.
At one level, these multiplicities are obvious. And it would take a highly unscrupulous researcher to perform test after test in a search for statistical significance … Given a particular data set, it is not so difficult to look at the data and construct completely reasonable rules for data exclusion, coding, and data analysis that can lead to statistical significance—thus, the researcher needs only perform one test, but that test is conditional on the data … A researcher when faced with multiple reasonable measures can reason (perhaps correctly) that the one that produces a significant result is more likely to be the least noisy measure, but then decide (incorrectly) to draw inferences based on that one only.