Monday, December 23, 2024

Evidence-based policy — a façade of precision


The façade of precision … is perhaps the most important in debunking SABER (the World Bank’s Systems Approach for Better Education Results initiative), GEEAP (the World Bank’s and UK Aid’s new Global Education Evidence Advisory Panel), and other attempts to make evidence-based policy. To assess quantitatively the impact of an intervention, there are two ways to rule out confounding variables – statistical controls and experimental controls. Both are fundamentally problematic in theory and in practice. To trust in statistical controls via some form of regression analysis, you cannot just include ad hoc a few control variables but need three conditions: include all variables that affect the dependent variable, measure them correctly, and specify the proper functional form. These conditions never hold, and the result is different studies come to different conclusions …

Experimental controls via RCTs have been touted as a better strategy for impact assessment, indeed as the “gold standard” of research methods … In practice, RCTs very often come to inconsistent and divergent conclusions. What this all comes down to again is that the evidence supporting the impact of policies is cherry-picked and “best practice” and “what works” are in the eye of the beholder …

The promise of the policy sciences — that social science could give us clear facts — is belied in theory and in practice as I have argued here and elsewhere (Klees, 2020).  We need to recognize that and be much more modest in our claims and much more aggressive in ensuring that our policy choices are made with widespread debate and participation.

Steven Klees
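
The first of the three conditions Klees lists, including all variables that affect the dependent variable, can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the data are synthetic, the coefficients (0.8, 1.0 and 2.0) are hypothetical numbers chosen for illustration, and the helper names are made up for this example. It shows how leaving out a single confounder changes the estimated effect, not how any particular study was done.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

def residuals(target, regressor):
    """What is left of target after regressing it on regressor."""
    slope = ols_slope(regressor, target)
    intercept = mean(target) - slope * mean(regressor)
    return [t - intercept - slope * r for t, r in zip(target, regressor)]

rng = random.Random(0)
n = 20_000

# Hypothetical data-generating process (an assumption, not real data):
# z is a confounder, x is the intervention, y is the outcome.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0, 1) for zi in z]
y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

# Regression that omits the confounder z.
naive_slope = ols_slope(x, y)

# Regression that controls for z, done by partialling z out of both x and y.
controlled_slope = ols_slope(residuals(x, z), residuals(y, z))

print(f"true effect of x:           1.0")
print(f"estimate omitting z:        {naive_slope:.2f}")
print(f"estimate controlling for z: {controlled_slope:.2f}")
```

In the simulation we know that z exists and how it enters, so the controlled estimate lands near the true effect. With real data there is no such knowledge, which is exactly why the three conditions are so demanding.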

Klees’ interesting article highlights some of the fundamental problems with the present idolatry of ‘evidence-based’ policies and randomization designs in the field of education. Unfortunately, we face the same problems in economics.

The point of making a randomized experiment is often said to be that it ‘ensures’ that any correlation between a supposed cause and effect indicates a causal relation. This is believed to hold since randomization (allegedly) ensures that a supposed causal variable does not correlate with other variables that may influence the effect.

The problem with that simplistic view of randomization is that the claims made are exaggerated and sometimes even false:

• Even if the assignment to treatment and control groups is ideally random, the sample selection is, except in extremely rare cases, certainly not random. Even with a proper randomized assignment, if the results come from a biased sample, there is always the risk that the experimental findings will not apply elsewhere. What works ‘there’ may not work ‘here.’ Randomization hence does not ‘guarantee’ or ‘ensure’ making the right causal claim. Although randomization may help us rule out certain possible causal claims, randomization per se does not guarantee anything!

• Even if both sampling and assignment are made in an ideal random way, performing standard randomized experiments only gives you averages. The problem here is that although we may get an estimate of the ‘true’ average causal effect, this may ‘mask’ important heterogeneous effects of a causal nature. Although we get the right answer of the average causal effect being 0, half of those ‘treated’ may have causal effects equal to -100, and the other half may have causal effects equal to 100. When contemplating whether to be treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the average effect particularly enlightening.

• There is almost always a trade-off between bias and precision. In real-world settings, a little bias is often an acceptable price for greater precision. And, most importantly, if we have a population with sizeable heterogeneity, the average treatment effect of the sample may differ substantially from the average treatment effect in the population. If so, the value of any inferences extrapolated from trial samples to other populations is highly questionable.

• Since most real-world experiments and trials rely on a single randomization, knowing what would happen if you kept on randomizing forever does not help you to ‘ensure’ or ‘guarantee’ that you do not draw false causal conclusions in the one particular randomized experiment you actually perform. It is indeed difficult to see why thinking about what you know you will never do would make you happy about what you actually do.

• And then there is also the problem that ‘Nature’ may not always supply us with the random experiments we are most interested in. If we are interested in X, why should we study Y only because design dictates that? Method should never be prioritized over substance!
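
The points above about averages and heterogeneity can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: the data are synthetic, every individual causal effect is either +100 or -100 (the hypothetical numbers from the example above), and `estimate_ate` and `share_positive` are names invented for this illustration. The same perfectly randomized trial is run twice, once on a group that mirrors the population and once on a trial sample in which 90 per cent happen to be positive responders.

```python
import random
import statistics

def estimate_ate(share_positive, n=10_000, seed=0):
    """Difference in mean outcomes from an ideally randomized trial.

    share_positive is the (assumed) share of individuals whose causal
    effect is +100; everyone else has a causal effect of -100.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        baseline = rng.gauss(50, 10)  # outcome without treatment
        effect = 100 if rng.random() < share_positive else -100
        if rng.random() < 0.5:        # coin-flip assignment
            treated.append(baseline + effect)
        else:
            control.append(baseline)
    return statistics.mean(treated) - statistics.mean(control)

# Population: half respond with +100, half with -100, so the true average is 0.
population_ate = estimate_ate(share_positive=0.5)

# Non-random trial sample: 90% positive responders, so the true average is 80.
trial_sample_ate = estimate_ate(share_positive=0.9)

print(f"estimated average effect, population-like group: {population_ate:.1f}")
print(f"estimated average effect, selected trial sample: {trial_sample_ate:.1f}")
```

Under these assumptions the first estimate should come out close to zero even though no single individual has an effect anywhere near zero, and the second should come out far from the population average even though the assignment was randomized in exactly the same way.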

Nowadays many mainstream economists maintain that ‘imaginative empirical methods’ — especially ‘as-if-random’ natural experiments and RCTs — can help us to answer questions concerning the external validity of economic models. In their view, they are, more or less, tests of ‘an underlying economic model’ and enable economists to make the right selection from the ever-expanding ‘collection of potentially applicable models.’

It is widely believed among mainstream economists that the scientific value of randomization, contrary to other methods, is more or less uncontroversial and that randomized experiments are free from bias. When looked at carefully, however, there are in fact few real reasons to share this optimism about the alleged ‘experimental turn’ in economics. Strictly speaking, randomization does not guarantee anything.

‘Ideally’ controlled experiments tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural, or quasi) experiments to different settings, populations, or target systems, is not easy. Causes deduced in an experimental setting still have to show that they come with an export warrant to the target population. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ and ‘precise’ methods — and ‘on-average-knowledge’ — is despairingly small.

The almost religious belief with which the propagators of RCTs, including ‘Nobel prize’ winners like Duflo, Banerjee and Kremer, portray them cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warrant for us to believe that it works for us here or that it works generally.

Leaning on an interventionist approach often means that instead of posing interesting questions on a social level, the focus is on individuals. Instead of asking about the structural socio-economic factors behind, e.g., gender or racial discrimination, the focus is on the choices individuals make. Esther Duflo is a typical example of the dangers of this limiting approach. Duflo et consortes want to give up on ‘big ideas’ like political economy and institutional reform and instead go for solving more manageable problems ‘the way plumbers do.’ Yours truly is far from sure that this is the right way to move economics forward and make it a relevant and realist science. A plumber can fix minor leaks in your system, but if the whole system is rotten, something more than good old-fashioned plumbing is needed. The big social and economic problems we face today are not going to be solved by plumbers performing interventions or manipulations in the form of RCTs.

The present RCT idolatry is dangerous. Believing randomization is the only way to achieve scientific validity blinds people to searching for and using other methods that in many contexts are better. Insisting on using only one tool often means using the wrong tool.

Randomization is not a panacea. It is not the best method for all questions and circumstances. Proponents of randomization make claims about its ability to deliver causal knowledge that are simply wrong. There are good reasons to be skeptical of the now popular and ill-informed view that randomization is the only valid method, and the best one, on the market. It is not.

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.
