When no formal theory is available, as is often the case, then the analyst needs to justify statistical specifications by showing that they fit the data. That means more than just “running things.” It means careful graphical and crosstabular analysis … When I present this argument … one or more scholars say, “But shouldn’t I control for every-thing I can? If not, aren’t my regression coefficients biased due to excluded variables?” But this argument is not as persuasive as it may seem initially. First of all, if what you are doing is mis-specified already, then adding or excluding other variables has no tendency to make things consistently better or worse. The excluded variable argument only works if you are sure your specification is precisely correct with all variables
Topics:
Lars Pålsson Syll considers the following as important: Economics
This could be interesting, too:
Lars Pålsson Syll writes Daniel Waldenströms rappakalja om ojämlikheten
Peter Radford writes AJR, Nobel, and prompt engineering
Lars Pålsson Syll writes MMT explained
Lars Pålsson Syll writes Statens finanser funkar inte som du tror
When no formal theory is available, as is often the case, then the analyst needs to justify statistical specifications by showing that they fit the data. That means more than just “running things.” It means careful graphical and crosstabular analysis …
When I present this argument … one or more scholars say, “But shouldn’t I control for every-thing I can? If not, aren’t my regression coefficients biased due to excluded variables?” But this argument is not as persuasive as it may seem initially.
First of all, if what you are doing is mis-specified already, then adding or excluding other variables has no tendency to make things consistently better or worse. The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables.
Still more importantly, big, mushy regression and probit equations seem to need a great many control variables precisely because they are jamming together all sorts of observations that do not belong together. Countries, wars, religious preferences, education levels, and other variables that change people’s coefficients are “controlled” with dummy variables that are completely inadequate to modeling their effects. The result is a long list of independent variables, a jumbled bag of nearly unrelated observations, and often, a hopelessly bad specification with meaningless (but statistically significant with several asterisks!) results.
This article is one of my absolute favourites. Why? Because it reaffirms yours truly’s view that since there is no absolutely certain knowledge at hand in social sciences — including economics — explicit argumentation and justification ought to play an extremely important role if purported knowledge claims are to be sustainably warranted. As Achen puts it, without careful supporting arguments, “just dropping variables into SPSS, STATA, S or R programs accomplishes nothing.”