The inherent epistemological limitation of econometric testing
To understand the relationship between economic data and economic phenomena, it is helpful first to be clear about what we mean by each of these terms. Following Jim Woodward (1989), we can characterize “phenomena” as features of our experience that we take to be “relatively stable” and “which are potential objects of explanation and prediction by general theory.” The phenomena themselves are in general not directly observable, and so in order to investigate claims about them, we require some observable representation. Data play this role. And although it is a crucial role, it is a supporting rather than a starring role. As Woodward suggests, “data are typically not viewed as potential objects of explanation by or derivation from general theory; indeed, they typically are of no theoretical interest except insofar as they constitute evidence” for claims about the phenomena. Data are simply matrices of numbers. Economically speaking, characterizing the internal relations of a matrix of numbers is not of inherent interest. It only becomes so when we claim that the numbers represent in some way actual phenomena of interest.
What is the nature of this representation? Data are, in a sense, meant to be a quantitative crystallization of the phenomena. In order to determine what will count as data for a particular phenomenon or set of phenomena, one must specify particular observable and quantifiable features of the world that can capture the meaning of the phenomena adequately for the purposes of one’s particular inquiry …
Inferences about the data are inferences about model objects and are therefore a part of the model narrative. We can validly interpret such inferences about the data as possible inferences about the underlying social phenomena only to the extent that we have established the plausibility of a homomorphic relationship between the data and the aspects of the underlying phenomena they are meant to represent. This homomorphism requirement, then, is an extension of the essential compatibility requirement: in empirical modeling exercises, the requirement of essential compatibility between model and target includes a requirement of homomorphism between data and target (because the data are a part of the model) …
Econometricians are, of course, well aware of the importance of the relationship between the data and the underlying phenomena of interest. In the literature, this relationship is generally couched in terms of a data-generating process (DGP) … If we were to be able to perceive the true DGP in its entirety, we would essentially know the complete underlying structure whose observable precipitates are the data. Our only evidence of the DGP, however, is the data …
It is important to note, however, that characterizing pieces of the data-generating process is an intra-model activity. It reveals the possible mathematical structure underlying a matrix of numbers, and it is properly judged according to (and only according to) the relevant rules of mathematics. In contrast, the requirement that a relation of homomorphism exist between the data and the underlying phenomena is concerned with the relationship between model and target entities. The extent to which data satisfy this requirement in any given case cannot be determined through econometric analysis, nor does econometric analysis obviate the need to establish that the requirement is met. On the contrary, the results of an econometric analysis of a given data set — i.e. the characterization of a piece of its DGP — can be validly interpreted as providing epistemic access to the target only if it is plausible that a relation of homomorphism holds between the data and the aspects of the target they ostensibly represent.
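The quoted distinction between intra-model inference and model-target homomorphism can be illustrated with a small simulation (a hypothetical sketch, not drawn from the quoted text). In the process below, x and y share a common cause z but x has no causal effect on y at all; regression nevertheless reports a clear "effect". Characterizing the DGP of the data matrix is a statement about the joint distribution of the numbers, not about the target relation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# A hypothetical data-generating process: x and y share a common cause z,
# but x has no causal effect on y whatsoever.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

# OLS slope of y on x -- an intra-model characterization of the numbers.
slope = np.cov(x, y)[0, 1] / np.var(x)
print(f"OLS slope of y on x: {slope:.2f}")  # close to 0.5, and 'significant'

# Intervening on x (setting it independently of z) reveals the causal
# effect the regression cannot see: it is zero.
x_do = rng.normal(size=n)
y_do = z + rng.normal(size=n)  # y is unaffected by the intervention on x
slope_do = np.cov(x_do, y_do)[0, 1] / np.var(x_do)
print(f"Slope under intervention: {slope_do:.2f}")  # close to 0
```

The two analyses apply exactly the same mathematics to the same kind of matrix; only background knowledge about how the numbers relate to the target can tell us which slope answers the question we care about.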
Econometrics is supposed to be able to test economic theories. But to serve as a testing device, you have to make many assumptions, many of which cannot themselves be tested or verified. To make things worse, there are rarely strong and reliable ways of telling which set of assumptions is to be preferred. Trying to test and infer causality from data, you have to rely on assumptions such as disturbance terms being ‘independent and identically distributed’; functions being additive, linear, and with constant coefficients; parameters being ‘invariant under intervention’; variables being ‘exogenous’, ‘identifiable’, ‘structural’, and so on. Unfortunately, we are seldom or never informed of where that kind of ‘knowledge’ comes from, beyond referring to the economic theory that one is supposed to test.
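How much the test outcome owes to the maintained assumptions can be shown with a minimal sketch (an illustrative example, with invented numbers). Here the true relation is strong but quadratic; a test built on the linearity assumption estimates a slope near zero and would conclude 'no effect':

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical DGP: y depends strongly on x, but quadratically.
x = rng.normal(size=n)
y = x**2 + 0.1 * rng.normal(size=n)

# Under the (false) linearity assumption, the estimated slope is near zero,
# so a standard test would report that x has no effect on y.
slope = np.cov(x, y)[0, 1] / np.var(x)
print(f"Linear slope: {slope:.3f}")

# Yet x in fact explains almost all of the variance in y.
r2_quadratic = 1 - np.var(y - x**2) / np.var(y)
print(f"R^2 of the true quadratic relation: {r2_quadratic:.2f}")
```

The verdict 'no effect' here follows from the linearity assumption, not from the data; nothing in the test itself signals that the assumption, rather than the theory, is at fault.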
That leaves us in the awkward position of admitting that if the assumptions made do not hold, the inferences, conclusions, and testing outcomes econometricians come up with simply do not follow from the data and statistics they use.
The central question is ‘How do we learn from empirical data?’ But we have to remember that the value of testing hinges on our ability to validate the — often unarticulated — assumptions on which the testing models are built. If the model is wrong, the test apparatus simply gives us fictional values. There is always a risk of turning a blind eye to some non-fulfilled technical assumption that actually makes the testing results — and the inferences we build on them — unwarranted. Econometric testing builds on the assumption that the hypotheses can be treated as hypotheses about (joint) probability distributions and that economic variables can be treated as if pulled out of an urn as a random sample. Most economic phenomena are nothing of the kind.
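The 'urn' assumption can be stress-tested with a small Monte Carlo sketch (hypothetical parameters, chosen only for illustration). The series below has true mean zero, but each draw depends on the previous one; a standard t-test that assumes i.i.d. sampling then rejects the true hypothesis far more often than its nominal 5 per cent:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, rho = 200, 2_000, 0.9

rejections = 0
for _ in range(reps):
    # AR(1) disturbances: each observation depends on the last, so the
    # sample is nothing like balls drawn independently from an urn.
    e = rng.normal(size=n)
    for t in range(1, n):
        e[t] += rho * e[t - 1]
    # Standard t-statistic for 'mean = 0' that *assumes* i.i.d. data.
    t_stat = e.mean() / (e.std(ddof=1) / np.sqrt(n))
    rejections += abs(t_stat) > 1.96

# Far above the nominal 0.05, even though the null hypothesis is true.
print(f"Actual rejection rate at nominal 5%: {rejections / reps:.2f}")
```

The test machinery runs without complaint; nothing in its output reveals that the sampling assumption, and hence the stated error rate, is fictional.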
Most users of the econometric toolbox seem to have a built-in blindness to the fact that mathematical-statistical modelling in the social sciences is inherently incomplete, since it builds on the presupposition that the model properties — without serious argumentation or warrant — also apply to the intended real-world target systems. Many of the processes and structures that we know play essential roles in the target systems do not show up in the models, often for reasons of mathematical-statistical tractability. The bridge between model and reality fails. Valid and relevant information goes unrecognized and is lost, making the models harmfully misleading and largely irrelevant if our goal is to learn, explain or understand anything about actual economies and societies. Without strong evidence of an essential compatibility between model and reality, the analysis becomes nothing but fictitious storytelling of questionable scientific value.
It is difficult to find any hard evidence that econometric testing has ever succeeded in excluding an economic theory. If econometrics is to be judged by its capacity to eliminate invalid theories, it has not been a very successful business.