Saturday, April 27, 2024

Statistical assumptions and racial bias


Our analysis indicates that existing empirical work in this area is producing a misleading portrait of evidence as to the severity of racial bias in police behavior. Replicating and extending the study of police behavior in New York in Fryer (2019), we show that the consequences of ignoring the selective process that generates police data are severe, leading analysts to dramatically underestimate or conceal entirely the differential police violence faced by civilians of color. For example, while a naïve analysis that assumes no race-based selection into the data suggests only 10,000 black and Hispanic civilians were handcuffed because of racial bias in New York City between 2003 and 2013, we estimate that the true number is approximately 56,000. And while analyses ignoring bias in stopping would conclude that 10% of uses of force against black and Hispanic civilians in these data were discriminatory, after bias-correction, we estimate that the true percentage is 39% …

Traditionally, analysts use data on stopped individuals to study bias by computing the difference in violence rates between stopped minority and white civilians, while controlling for observable differences between these two sets of encounters. We term this the “naïve estimator” … However, without further assumptions, this quantity will have no causal interpretation so long as the treatment affects the mediator (i.e., civilian race affects whether officers detain a civilian). As we show below, this is because treated encounters (with minority civilians) that result in a stop will not be comparable to those with stopped control (majority) civilians. As a simple example, suppose officers exhibited racial bias as follows: they detain white civilians if they observe them committing a serious crime (such as assault, potentially warranting the use of force) but detain nonwhite civilians regardless of observed behavior. When this is true, comparing stopped white and nonwhite civilians amounts to comparing fundamentally different groups. The analyst will observe force used against a greater proportion of stopped white civilians because of the differential physical threat they pose to officers. Under the traditional approach, the analyst would naïvely conclude that anti-white bias exists, yielding an erroneous portrait of racial discrimination in the use of force.

Dean Knox, Will Lowe, Jonathan Mummolo

This study is a must-read for researchers trying to identify causal relations from proprietary administrative data sets!

Looking only at data often gives the wrong causal impression — especially when, as in this case, the data is loaded right from the start and results in sample selection bias due to post-treatment conditioning. Comparing white bank robbers to black civilians committing no crime does not give us the apples-to-apples comparison needed for making causal inferences.
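The stylized example in the quoted passage can be made concrete with a small simulation. What follows is only a rough sketch under assumptions of my own, not the authors' estimator or data: the function name `simulate`, the share of civilians committing a serious crime, the force probabilities, and the size of the built-in bias are all invented for illustration. Officers stop white civilians only when they observe a serious crime, stop nonwhite civilians regardless of behavior, and are (by construction) more likely to use force against nonwhite civilians. The naive comparison among stopped civilians nevertheless points the other way.

```python
import random


def simulate(n=200_000, p_crime=0.10, bias=0.05, seed=0):
    """Simulate police encounters under a biased stopping rule.

    Every number here is a made-up assumption for illustration only:
    p_crime is the share of civilians observed committing a serious crime,
    bias is the extra probability of force applied to nonwhite civilians.
    Returns the force rate among STOPPED civilians, by race.
    """
    rng = random.Random(seed)
    stops = {"white": 0, "nonwhite": 0}
    force = {"white": 0, "nonwhite": 0}

    for _ in range(n):
        race = "nonwhite" if rng.random() < 0.5 else "white"
        serious_crime = rng.random() < p_crime

        # Biased selection into the data: white civilians are stopped only
        # for a serious crime, nonwhite civilians regardless of behavior.
        if race == "white" and not serious_crime:
            continue
        stops[race] += 1

        # Force depends on behavior, plus a built-in bias against
        # nonwhite civilians (the effect we would like to recover).
        p_force = 0.50 if serious_crime else 0.05
        if race == "nonwhite":
            p_force += bias
        if rng.random() < p_force:
            force[race] += 1

    return {race: force[race] / stops[race] for race in stops}


rates = simulate()
# The "naive estimator": difference in force rates among stopped civilians.
naive_estimate = rates["nonwhite"] - rates["white"]

print(f"force rate, stopped white civilians:    {rates['white']:.3f}")
print(f"force rate, stopped nonwhite civilians: {rates['nonwhite']:.3f}")
print(f"naive estimate (nonwhite minus white):  {naive_estimate:.3f}")
```

With these assumed numbers, the expected force rate is 0.50 among stopped white civilians and about 0.145 among stopped nonwhite civilians, so the naive difference is strongly negative even though the simulated officers are biased against nonwhite civilians at every level of behavior. The stopped groups are simply not comparable.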

From a statistical perspective, it could be argued that the basic mistake made in the traditional analyses here is the all too frequent — and highly problematic — assumption that one can simply treat ‘convenience’ samples and real-world processes as statistical random samples. They are not — and ALL the statistical hypotheses tested are of questionable value. Using unwarranted phantasmagorical model assumptions is a risky business if what we aim for is relevant and realist science.

Reducing interesting research questions to ‘manageable’ statistical hypotheses is a difficult art. And data, whether ‘big’ or not, never by itself gives us credible causal inferences.

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.
