The one place that preregistration is really needed … is if you want clean p-values. A p-value is very explicitly a statement about how you would’ve analyzed the data, had they come out differently. Sometimes when I’ve criticized published p-values on the grounds of forking paths, the original authors have fought back angrily, saying how unfair it is for me to first make an assumption about what they would’ve done under different conditions, and then make conclusions based on these assumptions. But they’re getting things backward: By stating a p-value at all, they’re the ones who are making a very strong assumption about their hypothetical behavior—an assumption that, in general, I have no reason to believe. Preregistration is in fact the only way to ensure that p-values can be taken at their nominal values. In that way, preregistration is like random sampling which, strictly speaking, is the only way that sampling probabilities, estimates, standard errors, etc., can be taken at their nominal values … Yes, you can do surveys and get estimates and standard errors without ever taking a random sample … but to do this we need to make assumptions.
Topics:
Lars Pålsson Syll considers the following as important: Statistics & Econometrics
This could be interesting, too:
Lars Pålsson Syll writes The history of econometrics
Lars Pålsson Syll writes What statistics teachers get wrong!
Lars Pålsson Syll writes Statistical uncertainty
Lars Pålsson Syll writes The dangers of using pernicious fictions in statistics
The one place that preregistration is really needed … is if you want clean p-values. A p-value is very explicitly a statement about how you would’ve analyzed the data, had they come out differently. Sometimes when I’ve criticized published p-values on the grounds of forking paths, the original authors have fought back angrily, saying how unfair it is for me to first make an assumption about what they would’ve done under different conditions, and then make conclusions based on these assumptions. But they’re getting things backward: By stating a p-value at all, they’re the ones who are making a very strong assumption about their hypothetical behavior—an assumption that, in general, I have no reason to believe.
Preregistration is in fact the only way to ensure that p-values can be taken at their nominal values. In that way, preregistration is like random sampling which, strictly speaking, is the only way that sampling probabilities, estimates, standard errors, etc., can be taken at their nominal values …
Yes, you can do surveys and get estimates and standard errors without ever taking a random sample … but to do this we need to make assumptions.
And, yes, you can do causal inference from observational studies—indeed, in many settings this is absolutely necessary—but, again, assumptions are needed …
Just as a serious social science journal—or even Psychological Science or PPNAS—would never accept a paper on sampling without some discussion of the representativeness of the sample, and just as they would never accept a causal inference based on a simple regression with no identification strategy and no discussion of imbalance between treatment and control groups, so should they not take seriously a p-value without a careful assessment of the assumptions underlying it.