The two statistics books every social scientist should read
Mathematical statistician David Freedman’s Statistical Models and Causal Inference (Cambridge University Press, 2010) and Statistical Models: Theory and Practice (Cambridge University Press, 2009) are marvellous books. They ought to be mandatory reading for every serious social scientist — including economists and econometricians — who doesn’t want to succumb to ad hoc assumptions and unsupported statistical conclusions!
How do we calibrate the uncertainty introduced by data collection? Nowadays, this question has become quite salient, and it is routinely answered using well-known methods of statistical inference, with standard errors, t-tests, and P-values … These conventional answers, however, turn out to depend critically on certain rather restrictive assumptions, for instance, random sampling …
Thus, investigators who use conventional statistical techniques turn out to be making, explicitly or implicitly, quite restrictive behavioral assumptions about their data collection process … More typically, perhaps, the data in hand are simply the data most readily available …
The moment that conventional statistical inferences are made from convenience samples, substantive assumptions are made about how the social world operates … When applied to convenience samples, the random sampling assumption is not a mere technicality or a minor revision on the periphery; the assumption becomes an integral part of the theory …
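To see concretely what is at stake, here is a minimal simulation sketch in Python. It is my own illustrative example, not Freedman’s, and all the particulars (a population mean of 50, standard deviation of 10, a logistic selection rule) are assumptions chosen purely for illustration. The textbook 95 per cent confidence interval covers the true mean about 95 per cent of the time when the data really are a random sample, but its coverage collapses when the “sample” is a convenience sample whose chance of inclusion depends on the very quantity being measured:

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_MEAN, SD, N, REPS = 50.0, 10.0, 100, 2000

def conventional_ci(x):
    """Textbook 95% CI for a mean: x_bar +/- 1.96 * s/sqrt(n), valid under random sampling."""
    se = x.std(ddof=1) / np.sqrt(len(x))
    return x.mean() - 1.96 * se, x.mean() + 1.96 * se

def random_sample(n):
    # What the formula presumes: an i.i.d. draw from the population of interest.
    return rng.normal(TRUE_MEAN, SD, size=n)

def convenience_sample(n):
    # "The data most readily available": units with larger values are easier to reach,
    # so inclusion depends on the very quantity being estimated.
    cand = rng.normal(TRUE_MEAN, SD, size=5 * n)
    keep = rng.random(cand.size) < 1.0 / (1.0 + np.exp(-(cand - TRUE_MEAN) / 5.0))
    return cand[keep][:n]

def coverage(sampler):
    hits = sum(lo <= TRUE_MEAN <= hi
               for lo, hi in (conventional_ci(sampler(N)) for _ in range(REPS)))
    return hits / REPS

print(f"CI coverage, random sample:      {coverage(random_sample):.1%}")      # close to 95%
print(f"CI coverage, convenience sample: {coverage(convenience_sample):.1%}")  # far below 95%
```

Note that nothing in the computation itself flags the failure: the same formulas are applied in both cases, which is exactly the sense in which the random sampling assumption becomes an integral part of the theory rather than a harmless technicality.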
In particular, regression and its elaborations … are now standard tools of the trade. Although rarely discussed, statistical assumptions have major impacts on analytic results obtained by such methods.
Invariance assumptions need to be made in order to draw causal conclusions from non-experimental data: parameters are invariant to interventions, and so are errors or their distributions. Exogeneity is another concern. In a real example, as opposed to a hypothetical, real questions would have to be asked about these assumptions. Why are the equations “structural,” in the sense that the required invariance assumptions hold true? Applied papers seldom address such assumptions, or the narrower statistical assumptions: for instance, why are errors IID?
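As a hedged illustration of what a failure of invariance and exogeneity looks like, consider the following toy sketch. It is my own construction, not Freedman’s, and the coefficients 0.8, 1.0 and 2.0 are arbitrary assumptions. An unobserved common cause drives both the regressor and the outcome, so the slope fitted to observational data differs sharply from the slope that would govern an intervention setting the regressor directly; the estimated equation is not “structural” in the required sense:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Toy data-generating process with an unobserved common cause u.
# Observationally, x and y are linked both through the direct effect (1.0)
# and through u, so the regression of y on x picks up the mixture.
u = rng.normal(size=n)
x_obs = 0.8 * u + rng.normal(size=n)
y_obs = 1.0 * x_obs + 2.0 * u + rng.normal(size=n)

# OLS slope from the observational data (intercept included).
X_obs = np.column_stack([np.ones(n), x_obs])
beta_obs = np.linalg.lstsq(X_obs, y_obs, rcond=None)[0][1]

# Now intervene: set x by fiat, independently of u (as a policy would).
x_int = rng.normal(size=n)
y_int = 1.0 * x_int + 2.0 * u + rng.normal(size=n)
X_int = np.column_stack([np.ones(n), x_int])
beta_int = np.linalg.lstsq(X_int, y_int, rcond=None)[0][1]

print(f"slope in observational data: {beta_obs:.2f}")  # about 2.0 -- exogeneity fails
print(f"slope under intervention:    {beta_int:.2f}")  # about 1.0 -- the structural effect
```

In practice only the observational regression is available; the interventional run is shown merely to make explicit what the invariance assumption is claiming.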
The tension here is worth considering. We want to use regression to draw causal inferences from non-experimental data. To do that, we need to know that certain parameters and certain distributions would remain invariant if we were to intervene. Invariance can seldom be demonstrated experimentally. If it could, we probably wouldn’t be discussing invariance assumptions. What then is the source of the knowledge?
“Economic theory” seems like a natural answer, but an incomplete one. Theory has to be anchored in reality. Sooner or later, invariance needs empirical demonstration, which is easier said than done.