While raising worthwhile points, most discussions I see misunderstand randomization in both causal and statistical ways. Notably, randomization can be valuable but does not induce balance in the ordinary English sense of the word, nor does it deal with most problems of real experiments. Furthermore, the use of the word “balance” to describe what randomization actually does invites confusion with the ordinary English meaning of “balance” (as does use of ordinary words like “significance” and “confidence” to describe other technical concepts).
Causally, a controlled experiment is one in which the experimenter causally controls the causes of (inputs to) the treatment (studied cause) or the outcome (studied effect) – preferably both. A randomized experiment is one in which the causes of the treatment are fully determined by a known randomizing device (at least within levels of fully measured covariates), so that nothing unmeasured causes both treatment and outcome. Provided the outcome is shielded from any effect of the randomizing device except that through (mediated by) the treatment, the random assignment variable becomes a perfect instrumental variable (IV), and statistical techniques based on such perfect IVs can be justified without recourse to dodgy empirical tests. A similar view can be found in Pearl’s book (Causality, 2nd ed. 2009).
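To make the instrumental-variable reading concrete, here is a minimal simulation sketch (an illustration only, not taken from the sources cited above). It assumes a constant treatment effect of 2.0, a single unmeasured confounder U, and imperfect adherence so that the treatment received, T, differs from the assignment A; all names and numbers are hypothetical. Because A is independent of U and affects Y only through T, the Wald ratio Cov(Y, A)/Cov(T, A) recovers the effect that a naive treated-versus-untreated comparison misses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Unmeasured confounder affecting both treatment uptake and the outcome
U = rng.normal(size=n)

# Random assignment A: fully determined by the randomizing device,
# hence independent of U (nothing unmeasured causes both A and Y)
A = rng.binomial(1, 0.5, size=n)

# Treatment actually received: influenced by assignment and by the
# confounder (imperfect adherence), so T itself is confounded with Y
T = (0.8 * A + 0.5 * U + rng.normal(scale=0.5, size=n) > 0.4).astype(float)

# Outcome: true (constant) treatment effect is 2.0; U also affects Y directly
Y = 2.0 * T + 1.5 * U + rng.normal(size=n)

# Naive comparison of treated vs untreated is biased by U
naive = Y[T == 1].mean() - Y[T == 0].mean()

# Wald/IV estimator using A as the instrument: Cov(Y, A) / Cov(T, A)
iv = np.cov(Y, A)[0, 1] / np.cov(T, A)[0, 1]

print(f"naive difference:   {naive:.2f}")  # pulled away from 2.0 by U
print(f"IV (Wald) estimate: {iv:.2f}")     # close to 2.0
```

This is the sense in which randomization justifies IV-based analysis by design, rather than by empirical checks on the realized data.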
Statistically, a frequentist can use the randomization distribution of the assignment variable A to construct a reference distribution for test statistics under various models (hypotheses) about the treatment effect (usually only a test of a no-effect model is described in this fashion, but most so-called confidence intervals are summaries of tests of models across which the treatment effect, but nothing else, is varied). This view can be seen in writings by Stephen Senn and James Robins. In parallel, a Bayesian can use the same distribution to provide prior distributions for counterfactual outcomes under the same variety of models (Cornfield, American Journal of Epidemiology 1976).
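The frequentist use of the randomization distribution can be sketched as a simple re-randomization (permutation) test. The toy example below is only illustrative and assumes a constant-shift effect model: it tests the no-effect model and then, by inverting the same test over a grid of hypothesized effects, traces out the P-value function whose 0.05 cut gives an approximate 95% interval.

```python
import numpy as np

rng = np.random.default_rng(1)

def randomization_p_value(y, a, effect=0.0, n_draws=4000):
    """P-value for the sharp model 'treatment adds `effect` to every unit',
    using the randomization distribution of the assignment as the reference."""
    # Subtract the hypothesized effect from treated units; if the model is
    # right, the adjusted outcomes carry no trace of the assignment.
    y_adj = y - effect * a
    observed = y_adj[a == 1].mean() - y_adj[a == 0].mean()
    n_treated = int(a.sum())
    hits = 0
    for _ in range(n_draws):
        a_star = np.zeros_like(a)
        a_star[rng.choice(len(a), size=n_treated, replace=False)] = 1
        stat = y_adj[a_star == 1].mean() - y_adj[a_star == 0].mean()
        if abs(stat) >= abs(observed):
            hits += 1
    return (hits + 1) / (n_draws + 1)

# Toy trial: 40 units, half assigned to treatment, true constant effect 1.0
n = 40
a = np.zeros(n)
a[rng.choice(n, size=n // 2, replace=False)] = 1
y = rng.normal(size=n) + 1.0 * a

# Test of the no-effect model
print("P-value for effect = 0:", randomization_p_value(y, a, effect=0.0))

# P-value function: the same test applied to a grid of constant-effect
# models; the effects not rejected at 0.05 form an approximate 95% interval
grid = np.round(np.linspace(-1.0, 3.0, 41), 2)
kept = [d for d in grid if randomization_p_value(y, a, effect=d, n_draws=1000) > 0.05]
if kept:
    print(f"effects not rejected at 0.05: {min(kept)} to {max(kept)}")
```

The (hits + 1)/(n_draws + 1) form is the usual Monte Carlo convention that counts the observed allocation among the re-randomizations, so the reported P-value can never be exactly zero.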
Note that none of these descriptions uses or needs the term “balance”, nor need they claim that randomization implies no confounding (Greenland and Mansournia, European Journal of Epidemiology 2015). Proper randomization can be said to provide “balance in probability”, but for frequentists this property holds over a purely hypothetical long run, while for Bayesians it is a subjective prior probability induced by knowing that allocation was random (Cornfield 1976 again). Neither use of “balance” applies to the actual state of observed trial populations, which both theories concede may be arbitrarily out of balance on unmeasured covariates due to “bad luck of the draw” (“random confounding” in the sense of Greenland & Mansournia 2015). By properly merging the randomization distribution (information on allocation) with models for the treatment effect, frequentists can deal with this chance element via P-value functions (“confidence distributions”), while Bayesians can deal with it via posterior distributions. Again, neither need invoke “balance” – and I would argue that, to avoid the confusion seen in much of the literature, they shouldn’t.
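The gap between “balance in probability” and the actual state of any one trial can be illustrated with another small, purely hypothetical simulation: over many repetitions the arm difference in an unmeasured covariate averages out to about zero, yet in any single trial, especially a small one, it can be far from zero.

```python
import numpy as np

rng = np.random.default_rng(2)

def unmeasured_covariate_imbalance(n, n_repetitions=5_000):
    """Arm difference in the mean of an unmeasured covariate U across many
    hypothetical repetitions of a trial of size n with 1:1 randomization."""
    U = rng.normal(size=(n_repetitions, n))
    A = rng.binomial(1, 0.5, size=(n_repetitions, n))
    # Drop the (rare) degenerate allocations with an empty arm
    ok = (A.sum(axis=1) > 0) & (A.sum(axis=1) < n)
    U, A = U[ok], A[ok]
    mean_treated = (U * A).sum(axis=1) / A.sum(axis=1)
    mean_control = (U * (1 - A)).sum(axis=1) / (1 - A).sum(axis=1)
    return mean_treated - mean_control

for n in (20, 200, 2000):
    diff = unmeasured_covariate_imbalance(n)
    print(f"n={n:5d}  average imbalance over repetitions: {diff.mean():+.3f}   "
          f"95% of single trials within +/- {np.quantile(np.abs(diff), 0.95):.2f}")
```

The average imbalance is near zero at every sample size (the “in probability” sense), while the spread of single-trial imbalance shrinks only as n grows – which is exactly the chance element (“random confounding”) that P-value functions or posterior distributions are meant to absorb.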
None of this should be taken as sanctifying or criticizing randomization; I am simply pointing out that randomization, if done and described precisely, does something valuable – but that is not balance of actual samples (as opposed to hypothetical infinite samples or repetitions, or bets about actual samples). Real randomized studies of humans and their groupings must deal with many other considerations such as selectivity of study subjects (hence lack of generalizability), blinding (masking) of subjects and evaluators, outcome measurement error, nonadherence, drop-out, competing risks, etc. Randomization can help deflect concerns about confounding in probability, but is no panacea, and can increase other concerns such as selectivity of participation.