Wednesday , May 1 2024
Home / Lars P. Syll / Observational data and causal inference

Observational data and causal inference

Summary:
Observational data and causal inference Distinguished Professor of social psychology Richard E. Nisbett takes on the idea of intelligence and IQ testing in his Intelligence and How to Get It (Norton 2011). He also has some interesting thoughts on multiple-regression analysis and writes: Researchers often determine the individual’s contemporary IQ or IQ earlier in life, socioeconomic status of the family of origin, living circumstances when the individual was a child, number of siblings, whether the family had a library card, educational attainment of the individual, and other variables, and put all of them into a multiple-regression equation predicting adult socioeconomic status or income or social pathology or whatever. Researchers then report the

Topics:
Lars Pålsson Syll considers the following as important:

This could be interesting, too:

Lars Pålsson Syll writes The importance of ‘causal spread’

Lars Pålsson Syll writes Applied econometrics — a messy business

Lars Pålsson Syll writes Feynman’s trick (student stuff)

Lars Pålsson Syll writes Difference in Differences (student stuff)

Observational data and causal inference

Distinguished Professor of social psychology Richard E. Nisbett takes on the idea of intelligence and IQ testing in his Intelligence and How to Get It (Norton 2011). He also has some interesting thoughts on multiple-regression analysis and writes:

Observational data and causal inferenceResearchers often determine the individual’s contemporary IQ or IQ earlier in life, socioeconomic status of the family of origin, living circumstances when the individual was a child, number of siblings, whether the family had a library card, educational attainment of the individual, and other variables, and put all of them into a multiple-regression equation predicting adult socioeconomic status or income or social pathology or whatever. Researchers then report the magnitude of the contribution of each of the variables in the regression equation, net of all the others (that is, holding constant all the others). It always turns out that IQ, net of all the other variables, is important to outcomes. But … the independent variables pose a tangle of causality – with some causing others in goodness-knows-what ways and some being caused by unknown variables that have not even been measured. Higher socioeconomic status of parents is related to educational attainment of the child, but higher-socioeconomic-status parents have higher IQs, and this affects both the genes that the child has and the emphasis that the parents are likely to place on education and the quality of the parenting with respect to encouragement of intellectual skills and so on. So statements such as “IQ accounts for X percent of the variation in occupational attainment” are built on the shakiest of statistical foundations. What nature hath joined together, multiple regressions cannot put asunder.

Now, I think this is right as far as it goes, although it would certainly have strengthened Nisbett’s argumentation if he had elaborated more on the methodological question around causality, or at least had given some mathematical-statistical-econometric references. Unfortunately, his alternative approach is not more convincing than regression analysis. Like so many other contemporary social scientists today, Nisbett seems to think that randomization may solve the empirical problem. By randomizing we are getting different “populations” that are homogeneous in regards to all variables except the one we think is a genuine cause. In that way, we are supposed being able not having to actually know what all these other factors are.

If you succeed in performing ideal randomization with different treatment groups and control groups that is attainable. But it presupposes that you really have been able to establish – and not just assume – that the probability of all other causes but the putative have the same probability distribution in the treatment and control groups, and that the probability of assignment to treatment or control groups is independent of all other possible causal variables.

Unfortunately, real experiments and real randomizations seldom or never achieve this. So, yes, we may do without knowing all causes, but it takes ideal experiments and ideal randomizations to do that, not real ones.

As yours truly has argued more than once on this blog, that means that in practice we do have to have sufficient background knowledge to deduce causal knowledge. Without old knowledge, we can’t get new knowledge, and — no causes in, no causes out.

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.

Leave a Reply

Your email address will not be published. Required fields are marked *