RCTs — assumptions, biases and limitations Randomised experiments require much more than just randomising an experiment to identify a treatment’s effectiveness. They involve many decisions and complex steps that bring their own assumptions and degree of bias before, during and after randomisation … Some researchers may respond, “are RCTs not still more credible than these other methods even if they may have biases?” For most questions we are interested in, RCTs cannot be more credible because they cannot be applied (as outlined above). Other methods (such as observational studies) are needed for many questions not amendable to randomisation but also at times to help design trials, interpret and validate their results, provide further insight on the
Topics:
Lars Pålsson Syll considers the following as important: Theory of Science & Methodology
This could be interesting, too:
Lars Pålsson Syll writes Kausalitet — en crash course
Lars Pålsson Syll writes Randomization and causal claims
Lars Pålsson Syll writes Race and sex as causes
Lars Pålsson Syll writes Randomization — a philosophical device gone astray
RCTs — assumptions, biases and limitations
Randomised experiments require much more than just randomising an experiment to identify a treatment’s effectiveness. They involve many decisions and complex steps that bring their own assumptions and degree of bias before, during and after randomisation …
Some researchers may respond, “are RCTs not still more credible than these other methods even if they may have biases?” For most questions we are interested in, RCTs cannot be more credible because they cannot be applied (as outlined above). Other methods (such as observational studies) are needed for many questions not amendable to randomisation but also at times to help design trials, interpret and validate their results, provide further insight on the broader conditions under which treatments may work, among other rea- sons discussed earlier. Different methods are thus complements (not rivals) in improving understanding.
Finally, randomisation does not always even out everything well at the baseline and it cannot control for endline imbalances in background influencers. No researcher should thus just generate a single randomisation schedule and then use it to run an experiment. Instead researchers need to run a set of randomisation iterations before conducting a trial and select the one with the most balanced distribution of background influencers between trial groups, and then also control for changes in those background influencers during the trial by collecting endline data. Though if researchers hold onto the belief that flipping a coin brings us closer to scientific rigour and understanding than for example systematically ensuring participants are distributed well at baseline and endline, then scientific understanding will be undermined in the name of computer-based randomisation.
The point of making a randomized experiment is often said to be that it ‘ensures’ that any correlation between a supposed cause and effect indicates a causal relation. This is believed to hold since randomization (allegedly) ensures that a supposed causal variable does not correlate with other variables that may influence the effect.
The problem with that simplistic view on randomization is that the claims made are both exaggerated and false:
• Even if you manage to do the assignment to treatment and control groups ideally random, the sample selection certainly is — except in extremely rare cases — not random. Even if we make a proper randomized assignment, if we apply the results to a biased sample, there is always the risk that the experimental findings will not apply. What works ‘there,’ does not work ‘here.’ Randomization hence does not ‘guarantee ‘ or ‘ensure’ making the right causal claim. Although randomization may help us rule out certain possible causal claims, randomization per se does not guarantee anything!
• Even if both sampling and assignment are made in an ideal random way, performing standard randomized experiments only give you averages. The problem here is that although we may get an estimate of the ‘true’ average causal effect, this may ‘mask’ important heterogeneous effects of a causal nature. Although we get the right answer of the average causal effect being 0, those who are ‘treated’ may have causal effects equal to -100 and those ‘not treated’ may have causal effects equal to 100. Contemplating being treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the average effect particularly enlightening.
• There is almost always a trade-off between bias and precision. In real-world settings, a little bias often does not overtrump greater precision. And — most importantly — in case we have a population with sizeable heterogeneity, the average treatment effect of the sample may differ substantially from the average treatment effect in the population. If so, the value of any extrapolating inferences made from trial samples to other populations is highly questionable.
• Since most real-world experiments and trials build on performing a single randomization, what would happen if you kept on randomizing forever, does not help you to ‘ensure’ or ‘guarantee’ that you do not make false causal conclusions in the one particular randomized experiment you actually do perform. It is indeed difficult to see why thinking about what you know you will never do, would make you happy about what you actually do.
Randomization is not a panacea. It is not the best method for all questions and circumstances. Proponents of randomization make claims about its ability to deliver causal knowledge that are simply wrong. There are good reasons to be sceptical of the now popular — and ill-informed — view that randomization is the only valid and best method on the market. It is not.