From Lars Syll The impossibility of proper specification is true generally in regression analyses across the social sciences, whether we are looking at the factors affecting occupational status, voting behavior, etc. The problem is that as implied by the three conditions for regression analyses to yield accurate, unbiased estimates, you need to investigate a phenomenon that has underlying mathematical regularities – and, moreover, you need to know what they are. Neither seems true. I have no reason to believe that the way in which multiple factors affect earnings, student achievement, and GNP have some underlying mathematical regularity across individuals or countries. More likely, each individual or country has a different function, and one that changes over time. Even if there was
Topics:
Lars Pålsson Syll considers the following as important: Uncategorized
This could be interesting, too:
Dean Baker writes Health insurance killing: Economics does have something to say
Lars Pålsson Syll writes Debunking mathematical economics
John Quiggin writes RBA policy is putting all our futures at risk
Merijn T. Knibbe writes ´Extra Unordinarily Persistent Large Otput Gaps´ (EU-PLOGs)
from Lars Syll
The impossibility of proper specification is true generally in regression analyses across the social sciences, whether we are looking at the factors affecting occupational status, voting behavior, etc. The problem is that as implied by the three conditions for regression analyses to yield accurate, unbiased estimates, you need to investigate a phenomenon that has underlying mathematical regularities – and, moreover, you need to know what they are. Neither seems true. I have no reason to believe that the way in which multiple factors affect earnings, student achievement, and GNP have some underlying mathematical regularity across individuals or countries. More likely, each individual or country has a different function, and one that changes over time. Even if there was some constancy, the processes are so complex that we have no idea of what the function looks like.
Researchers recognize that they do not know the true function and seem to treat, usually implicitly, their results as a good-enough approximation. But there is no basis for the belief that the results of what is run in practice is anything close to the underlying phenomenon, even if there is an underlying phenomenon. This just seems to be wishful thinking. Most regression analysis research doesn’t even pay lip service to theoretical regularities. But you can’t just regress anything you want and expect the results to approximate reality. And even when researchers take somewhat seriously the need to have an underlying theoretical framework – as they have, at least to some extent, in the examples of studies of earnings, educational achievement, and GNP that I have used to illustrate my argument – they are so far from the conditions necessary for proper specification that one can have no confidence in the validity of the results.
Most work in econometrics and regression analysis is done on the assumption that the researcher has a theoretical model that is ‘true.’ Based on this belief of having a correct specification for an econometric model or running a regression, one proceeds as if the only problem remaining to solve has to do with measurement and observation.
The problem is that there is little to support the perfect specification assumption. Looking around in social science and economics we don’t find a single regression or econometric model that lives up to the standards set by the ‘true’ theoretical model — and there is nothing that gives us reason to believe things will be different in the future.
To think that we are being able to construct a model where all relevant variables are included and correctly specify the functional relationships that exist between them is not only a belief with little support but a belief impossible to support.
The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know the correct way to functionally specify the relationships between the variables.
Every regression model constructed is misspecified. There is always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter values is nothing but a dream.
The theoretical conditions that have to be fulfilled for regression analysis and econometrics to really work are nowhere even closely met in reality. Making outlandish statistical assumptions does not provide a solid ground for doing relevant social science and economics. Although regression analysis and econometrics have become the most used quantitative methods in social sciences and economics today, it’s still a fact that the inferences made from them are of strongly questionable validity.
The econometric art as it is practiced at the computer … involves fitting many, perhaps thousands, of statistical models….There can be no doubt that such a specification search invalidates the traditional theories of inference … All the concepts of traditional theory utterly lose their meaning by the time an applied researcher pulls from the bramble of computer output the one thorn of a model he likes best, the one he chooses to portray as a rose.