Summary:
There seems to be an expectation in science that the people who gather a dataset should also be the ones who analyze it. But often that doesn’t make sense: what it takes to gather relevant data has little to do with what it takes to perform a reasonable analysis. Indeed, the imperatives of analysis can even impede data-gathering, if people have confused ideas of what they can and can’t do with their data. I’d like us to move to a world in which gathering and analysis of data are separated, in which researchers can get full credit for putting together a useful dataset, without the expectation that they perform a serious analysis. I think that could get around some research bottlenecks. It’s my impression that this is already done in many areas of science: for example, there are public ...
Topics:
Mike Norman considers the following as important: data analysis, data collection, statistics
This could be interesting, too:
Joel Eissenberg writes Trusting statistics
Jeff Mosenkis (IPA) writes IPA’s weekly links
James Kwak writes COVID-19: The Statistics of Social Distancing
There seems to be an expectation in science that the people who gather a dataset should also be the ones who analyze it. But often that doesn’t make sense: what it takes to gather relevant data has little to do with what it takes to perform a reasonable analysis. Indeed, the imperatives of analysis can even impede data-gathering, if people have confused ideas of what they can and can’t do with their data.
I’d like us to move to a world in which gathering and analysis of data are separated, in which researchers can get full credit for putting together a useful dataset, without the expectation that they perform a serious analysis. I think that could get around some research bottlenecks.
It’s my impression that this is already done in many areas of science: for example, there are public datasets on genes, and climate, and astronomy, and all sorts of areas in which many teams of researchers are studying common datasets. And in social science we have the NES, GSS, NLSY, etc. Even silly things like the Electoral Integrity Project: I don’t think these data are so great, but I appreciate the open spirit under which these data are shared. ...
Statistical Modeling, Causal Inference, and Social Science
Publish your raw data and your speculations, then let other people do the analysis: track and field edition
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University