An ongoing concern is that excessive focus on formal modeling and statistics can lead to neglect of practical issues and to overconfidence in formal results … Analysis interpretation depends on contextual judgments about how reality is to be mapped onto the model, and how the formal analysis results are to be mapped back into reality. But overconfidence in formal outputs is only to be expected when much labor has gone into deductive reasoning. First, there is a need to feel the labor was justified, and one way to do so is to believe the formal deduction produced important conclusions. Second, there seems to be a pervasive human aversion to uncertainty, and one way to reduce feelings of uncertainty is to invest faith in deduction as a sufficient guide to
Topics:
Lars Pålsson Syll considers the following as important: Statistics & Econometrics
This could be interesting, too:
Lars Pålsson Syll writes What statistics teachers get wrong!
Lars Pålsson Syll writes Statistical uncertainty
Lars Pålsson Syll writes The dangers of using pernicious fictions in statistics
Lars Pålsson Syll writes Interpreting confidence intervals
An ongoing concern is that excessive focus on formal modeling and statistics can lead to neglect of practical issues and to overconfidence in formal results … Analysis interpretation depends on contextual judgments about how reality is to be mapped onto the model, and how the formal analysis results are to be mapped back into reality. But overconfidence in formal outputs is only to be expected when much labor has gone into deductive reasoning. First, there is a need to feel the labor was justified, and one way to do so is to believe the formal deduction produced important conclusions. Second, there seems to be a pervasive human aversion to uncertainty, and one way to reduce feelings of uncertainty is to invest faith in deduction as a sufficient guide to truth. Unfortunately, such faith is as logically unjustified as any religious creed, since a deduction produces certainty about the real world only when its assumptions about the real world are certain …
Unfortunately, assumption uncertainty reduces the status of deductions and statistical computations to exercises in hypothetical reasoning – they provide best-case scenarios of what we could infer from specific data (which are assumed to have only specific, known problems). Even more unfortunate, however, is that this exercise is deceptive to the extent it ignores or misrepresents available information, and makes hidden assumptions that are unsupported by data …
Despite assumption uncertainties, modelers often express only the uncertainties derived within their modeling assumptions, sometimes to disastrous consequences. Econometrics supplies dramatic cautionary examples in which complex modeling has failed miserably in important applications …
Much time should be spent explaining the full details of what statistical models and algorithms actually assume, emphasizing the extremely hypothetical nature of their outputs relative to a complete (and thus nonidentified) causal model for the data-generating mechanisms. Teaching should especially emphasize how formal ‘‘causal inferences’’ are being driven by the assumptions of randomized (‘‘ignorable’’) system inputs and random observational selection that justify the ‘‘causal’’ label.
Yes, indeed, complex modeling when applying statistics theory fails miserably over and over again. One reason why it does — prominent in econometrics — is that the error term in the regression models standardly used is thought of as representing the effect of the variables that were omitted from the models. The error term is somehow thought to be a ‘cover-all’ term representing omitted content in the model and necessary to include to ‘save’ the assumed deterministic relation between the other random variables included in the model. Error terms are usually assumed to be orthogonal (uncorrelated) to the explanatory variables. But since they are unobservable, they are also impossible to empirically test. And without justification of the orthogonality assumption, there is as a rule nothing to ensure identifiability:
With enough math, an author can be confident that most readers will never figure out where a FWUTV (facts with unknown truth value) is buried. A discussant or referee cannot say that an identification assumption is not credible if they cannot figure out what it is and are too embarrassed to ask.
Distributional assumptions about error terms are a good place to bury things because hardly anyone pays attention to them. Moreover, if a critic does see that this is the identifying assumption, how can she win an argument about the true expected value the level of aether? If the author can make up an imaginary variable, “because I say so” seems like a pretty convincing answer to any question about its properties.