Sunday , November 24 2024
Home / Lars P. Syll / Discrimination and the use of ‘statistical controls’

Discrimination and the use of ‘statistical controls’

Summary:
The gender pay gap is a fact that, sad to say, to a non-negligible extent is the result of discrimination. And even though many women are not deliberately discriminated against, but rather self-select into lower-wage jobs, this in no way magically explains away the discrimination gap. As decades of socialization research has shown, women may be ‘structural’ victims of impersonal social mechanisms that in different ways aggrieve them. Wage discrimination is unacceptable. Wage discrimination is a shame. You see it all the time in studies. “We controlled for…” And then the list starts … The more things you can control for, the stronger your study is — or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision. But sometimes, you can

Topics:
Lars Pålsson Syll considers the following as important:

This could be interesting, too:

Lars Pålsson Syll writes What statistics teachers get wrong!

Lars Pålsson Syll writes Statistical uncertainty

Lars Pålsson Syll writes The dangers of using pernicious fictions in statistics

Lars Pålsson Syll writes Interpreting confidence intervals

The gender pay gap is a fact that, sad to say, to a non-negligible extent is the result of discrimination. And even though many women are not deliberately discriminated against, but rather self-select into lower-wage jobs, this in no way magically explains away the discrimination gap. As decades of socialization research has shown, women may be ‘structural’ victims of impersonal social mechanisms that in different ways aggrieve them. Wage discrimination is unacceptable. Wage discrimination is a shame.

You see it all the time in studies. “We controlled for…” And then the list starts … The more things you can control for, the stronger your study is — or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision. But sometimes, you can control for too much. Sometimes you end up controlling for the thing you’re trying to measure …

Discrimination and the use of ‘statistical controls’An example is research around the gender wage gap, which tries to control for so many things that it ends up controlling for the thing it’s trying to measure …

Take hours worked, which is a standard control in some of the more sophisticated wage gap studies. Women tend to work fewer hours than men. If you control for hours worked, then some of the gender wage gap vanishes. As Yglesias wrote, it’s “silly to act like this is just some crazy coincidence. Women work shorter hours because as a society we hold women to a higher standard of housekeeping, and because they tend to be assigned the bulk of childcare responsibilities.”

Controlling for hours worked, in other words, is at least partly controlling for how gender works in our society. It’s controlling for the thing that you’re trying to isolate.

Ezra Klein

Trying to reduce the risk of having established only ‘spurious relations’ when dealing with observational data, statisticians and econometricians standardly add control variables. The hope is that one thereby will be able to make more reliable causal inferences. But — as Keynes showed already back in the 1930s when criticizing statistical-econometric applications of regression analysis — if you do not manage to get hold of all potential confounding factors, the model risks producing estimates of the variable of interest that are even worse than models without any control variables at all. Conclusion: think twice before you simply include ‘control variables’ in your models!

Discrimination and the use of ‘statistical controls’When I present this argument … one or more scholars say, “But shouldn’t I control for everything I can in my regressions? If not, aren’t my coefficients biased due to excluded variables?” … The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables …

A preferable approach is to separate the observations into meaningful subsets—internally compatible statistical regimes … If this can’t be done, then statistical analysis can’t be done. A researcher claiming that nothing else but the big, messy regression is possible because, after all, some results have to be produced, is like a jury that says, “Well, the evidence was weak, but somebody had to be convicted.”

Christopher H. Achen

Kitchen sink econometric models are often the result of researchers trying to control for confounding. But what they usually haven’t understood is that the confounder problem requires a causal solution and not statistical ‘control.’ Controlling for everything opens up the risk that we control for ‘collider’ variables and thereby create ‘back-door paths’ which gives us confounding that wasn’t there to begin with.

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.

Leave a Reply

Your email address will not be published. Required fields are marked *