Topics:
Lars Pålsson Syll considers the following as important: Statistics & Econometrics
Econometrics — the signal-to-noise problem
When we first encounter the term “noisy data” in econometrics, we are usually told that it refers to the problem of measurement error, or errors-in-variables, especially in the explanatory variables (x). Most textbooks contain a discussion of measurement error bias. In the case of a bivariate regression, y = a + bx + u, measurement error in x means the ordinary least squares (OLS) estimator of b is biased; under the usual textbook assumptions, the bias is toward zero. The magnitude of the bias depends on the ratio of the measurement error variance to the variance of x. If that ratio is very small, the bias is negligible; but if the ratio is large, the measurement error can “drown” the true variation in x, and the bias is large.
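To make the dependence on the variance ratio concrete, here is the standard textbook result, written out as a sketch. It rests on the classical errors-in-variables assumptions, which the text above does not spell out: we observe x = x* + e, where x* is the true regressor and the measurement error e has mean zero and is uncorrelated with both x* and u. The symbols x*, e, and the sigmas are notation introduced here for illustration.

```latex
% Classical errors-in-variables (assumed): x = x^* + e, with
% Cov(x^*, e) = 0 and Cov(e, u) = 0, and true model y = a + b x^* + u.
\[
  \operatorname{plim}\,\hat{b}
  = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
  = b \,\frac{\sigma_{x^*}^{2}}{\sigma_{x^*}^{2} + \sigma_{e}^{2}}
  = \frac{b}{1 + \sigma_{e}^{2} / \sigma_{x^*}^{2}}
\]
% Relative (asymptotic) bias: the share of error variance in observed x.
\[
  \frac{\operatorname{plim}\,\hat{b} - b}{b}
  = -\,\frac{\sigma_{e}^{2}}{\sigma_{x^*}^{2} + \sigma_{e}^{2}}
\]
```

Under these assumptions the estimate is pulled toward zero, and the size of the pull is governed entirely by the error variance relative to the variance of x, not by the absolute size of the error.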
In principle, the extent of the bias can be assessed by a simple formula, but in practice, this is rarely done. This is partly because we need to know the variance of the measurement error and, in most cases, we simply don’t know that. But there is more to it than that. There is a common opinion among many econometricians that, relative to the other problems of econometrics, a little bit of measurement error really doesn’t matter very much. Unfortunately, this misses the point. It is not the absolute size of the measurement error that matters, but its size relative to the variation in x. Nevertheless, many econometricians just ignore the problem …
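A small simulation can illustrate both points: the bias grows with the relative size of the measurement error, and correcting for it requires knowing the error variance. The sketch below is only an illustration under the classical errors-in-variables assumptions (independent, mean-zero error added to the true regressor). The function names, the parameter values a = 1 and b = 2, and the choice to fix the variance of the true x at 1 are all assumptions made here, not part of the original argument.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def sample_variance(x):
    """Unbiased sample variance."""
    n = len(x)
    mx = sum(x) / n
    return sum((xi - mx) ** 2 for xi in x) / (n - 1)


def simulate(noise_ratio, a=1.0, b=2.0, n=50_000, seed=0):
    """Simulate y = a + b*x_true + u while observing x_obs = x_true + e.

    noise_ratio is var(e) / var(x_true); var(x_true) is fixed at 1.
    Returns (naive, corrected): the OLS slope on the noisy regressor,
    and the same slope rescaled using the error variance, which is
    treated as known here (in practice it usually is not).
    """
    rng = random.Random(seed)
    var_e = noise_ratio * 1.0
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_obs = [xt + rng.gauss(0.0, var_e ** 0.5) for xt in x_true]
    y = [a + b * xt + rng.gauss(0.0, 1.0) for xt in x_true]

    naive = ols_slope(x_obs, y)
    var_obs = sample_variance(x_obs)
    # Share of the observed variance that is true signal.
    reliability = (var_obs - var_e) / var_obs
    corrected = naive / reliability
    return naive, corrected


if __name__ == "__main__":
    for ratio in (0.01, 0.25, 1.0, 4.0):
        naive, corrected = simulate(ratio)
        print(f"var(e)/var(x_true) = {ratio:4.2f}  "
              f"naive slope = {naive:.3f}  corrected = {corrected:.3f}")
```

With a small ratio the naive slope should sit close to the true value of 2, while at a ratio of 1 it should be roughly halved. The corrected column recovers the true slope only because the simulation hands it the error variance, which is exactly the information that is missing in most applied work.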
Kalman proposed to “adopt the contemporary—very wide—implications of the word “noise,” as used in physics and engineering: any causal or random factors that should not or cannot be modeled, about which further information is not available, which are not analyzable, which may not recur reproducibly, etc. Thus, “noise” = the “unexplained.” This is a much more comprehensive category.”
This means that “noise” should include not just measurement errors and ambiguities in our economic concepts, but also any idiosyncrasies and peculiarities in individual observations that are not explained by the economic relationship we are interested in and that, indeed, obscure that relationship. Noisy data becomes a problem when it dominates the signal we want to observe. For Kalman, moreover, noisy data cannot be ignored, because noisy data must imply a noisy model. More precisely: “When we have noisy data, the uncertainty in the data will be inherited by the model. This is a fundamental difficulty; it can be camouflaged by adopting some prejudice but it cannot be eliminated.”