

On randomization and regression (wonkish)


Lars Pålsson Syll, February 21, 2020





Randomization does not justify the regression model, so that bias can be expected, and the usual formulas do not give the right variances. Moreover, regression need not improve precision …

What is the source of the bias when regression models are applied to experimental data? In brief, the regression model assumes linear additive effects. Given the assignments, the response is taken to be a linear combination of treatment dummies and covariates, with an additive random error; coefficients are assumed to be constant across subjects. The Neyman [potential outcome] model makes no assumptions about linearity and additivity. If we write the expected response given the assignments as a linear combination of treatment dummies, coefficients will vary across subjects. That is the source of the bias …

To put this more starkly, in the Neyman model, inferences are based on the random assignment to the several treatments. Indeed, the only stochastic element in the model is the randomization. With regression, inferences are made conditional on the assignments. The stochastic element is the error term, and the inferences depend on assumptions about that error term. Those assumptions are not justified by randomization. The breakdown in assumptions explains why regression comes up short when calibrated against the Neyman model …

Variances in the Neyman model are (necessarily) computed across the assignments, for it is the assignments that are the random elements in the model. With regression, variances are computed conditionally on the assignments, from an error term assumed to be IID across subjects, and independent of the assignment variables as well as the covariates. These assumptions do not follow from the randomization, explaining why the usual formulas break down.

David Freedman
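
The contrast Freedman describes can be checked numerically. The sketch below is an illustration only, not Freedman's own calculation: it fixes a made-up finite population of 100 subjects with one covariate `z` and a treatment effect that varies across subjects, so that the only random element is which 20 subjects are assigned to treatment. For each random assignment it records the difference in means, the OLS coefficient on the treatment dummy, and the variance that the usual OLS formula reports for that coefficient. The population, the effect sizes, the group sizes, and the function name `simulate` are all assumptions chosen for the illustration.

```python
import numpy as np


def simulate(n=100, m=20, reps=2000, seed=0):
    """Randomization distribution of two estimators in a made-up finite population."""
    rng = np.random.default_rng(seed)

    # Assumed population: drawn once, then held fixed (Neyman model).
    z = rng.normal(size=n)              # covariate
    y0 = z + rng.normal(size=n)         # potential outcome under control
    y1 = y0 + 2.0 + 3.0 * z**2          # potential outcome under treatment;
                                        # the effect differs across subjects
    ate = np.mean(y1 - y0)              # true average treatment effect

    diff_means = np.empty(reps)
    ols_coef = np.empty(reps)
    nominal_var = np.empty(reps)

    for r in range(reps):
        # The only stochastic step: pick m of the n subjects for treatment.
        treated = np.zeros(n, dtype=bool)
        treated[rng.choice(n, size=m, replace=False)] = True
        y = np.where(treated, y1, y0)   # observed responses

        diff_means[r] = y[treated].mean() - y[~treated].mean()

        # OLS of y on an intercept, the treatment dummy and the covariate.
        X = np.column_stack([np.ones(n), treated.astype(float), z])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        ols_coef[r] = beta[1]
        # Usual formula: sigma^2 times the diagonal entry of (X'X)^{-1}.
        nominal_var[r] = sigma2 * np.linalg.inv(X.T @ X)[1, 1]

    return {
        "ate": ate,
        "mean_diff_in_means": diff_means.mean(),
        "mean_ols_coef": ols_coef.mean(),
        "var_ols_across_assignments": ols_coef.var(ddof=1),
        "mean_nominal_ols_var": nominal_var.mean(),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Comparing `var_ols_across_assignments` with `mean_nominal_ols_var` sets the variance computed across the assignments against the variance computed conditionally on them from the error-term formula. In an unbalanced design with heterogeneous effects like this one, the two need not agree. The gap between `mean_ols_coef` and `ate` gives a rough view of any bias in the regression estimate, up to simulation noise.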


About Lars Pålsson Syll

Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.
