Wednesday , May 8 2024
Home / Lars P. Syll / Simpson’s paradox, Trump voters and the limits of econometrics

Simpson’s paradox, Trump voters and the limits of econometrics

Summary:
Simpson’s paradox, Trump voters and the limits of econometrics [embedded content] From a more theoretical perspective, Simpson’s paradox importantly shows that causality can never be reduced to a question of statistics or probabilities, unless you are — miraculously — able to keep constant all other factors that influence the probability of the outcome studied. To understand causality we always have to relate it to a specific causal structure. Statistical correlations are never enough. No structure, no causality. Simpson’s paradox is an interesting paradox in itself, but it can also highlight a deficiency in the traditional econometric approach towards causality. Say you have 1000 observations on men and an equal amount of  observations on women applying for admission to university studies, and that 70% of men are admitted, but only 30% of women. Running a logistic regression to find out the odds ratios (and probabilities) for men and women on admission, females seem to be in a less favourable position (‘discriminated’ against) compared to males (male odds are 2.33, female odds are 0.43, giving an odds ratio of 5.44).

Topics:
Lars Pålsson Syll considers the following as important:

This could be interesting, too:

Lars Pålsson Syll writes Monte Carlo simulation explained (student stuff)

Lars Pålsson Syll writes The importance of ‘causal spread’

Lars Pålsson Syll writes Applied econometrics — a messy business

Lars Pålsson Syll writes Feynman’s trick (student stuff)

Simpson’s paradox, Trump voters and the limits of econometrics


From a more theoretical perspective, Simpson’s paradox importantly shows that causality can never be reduced to a question of statistics or probabilities, unless you are — miraculously — able to keep constant all other factors that influence the probability of the outcome studied.

To understand causality we always have to relate it to a specific causal structure. Statistical correlations are never enough. No structure, no causality.

Simpson’s paradox is an interesting paradox in itself, but it can also highlight a deficiency in the traditional econometric approach towards causality. Say you have 1000 observations on men and an equal amount of  observations on women applying for admission to university studies, and that 70% of men are admitted, but only 30% of women. Running a logistic regression to find out the odds ratios (and probabilities) for men and women on admission, females seem to be in a less favourable position (‘discriminated’ against) compared to males (male odds are 2.33, female odds are 0.43, giving an odds ratio of 5.44). But once we find out that males and females apply to different departments we may well get a Simpson’s paradox result where males turn out to be ‘discriminated’ against (say 800 male apply for economics studies (680 admitted) and 200 for physics studies (20 admitted), and 100 female apply for economics studies (90 admitted) and 900 for physics studies (210 admitted) — giving odds ratios of 0.62 and 0.37).

Econometric patterns should never be seen as anything else than possible clues to follow. From a critical realist perspective it is obvious that behind observable data there are real structures and mechanisms operating, things that are  — if we really want to understand, explain and (possibly) predict things in the real world — more important to get hold of than to simply correlate and regress observable variables.

Math cannot establish the truth value of a fact. Never has. Never will.

Paul Romer

Advertisements
Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.

Leave a Reply

Your email address will not be published. Required fields are marked *