Tuesday , April 23 2024
Home / Lars P. Syll / Machine learning — getting results that are completely wrong

Machine learning — getting results that are completely wrong

Summary:
Machine learning — getting results that are completely wrong Machine-learning techniques used by thousands of scientists to analyse data are producing results that are misleading and often completely wrong. Dr Genevera Allen from Rice University in Houston said that the increased use of such systems was contributing to a “crisis in science” … The data sets are very large and expensive. But, according to Dr Allen, the answers they come up with are likely to be inaccurate or wrong because the software is identifying patterns that exist only in that data set and not the real world … Machine learning systems and the use of big data sets has accelerated the crisis, according to Dr Allen. That is because machine learning algorithms have been developed

Topics:
Lars Pålsson Syll considers the following as important:

This could be interesting, too:

Lars Pålsson Syll writes Applied econometrics — a messy business

Lars Pålsson Syll writes Feynman’s trick (student stuff)

Lars Pålsson Syll writes Difference in Differences (student stuff)

Lars Pålsson Syll writes Vad ALLA bör veta om statistik

Machine learning — getting results that are completely wrong

Machine learning — getting results that are completely wrongMachine-learning techniques used by thousands of scientists to analyse data are producing results that are misleading and often completely wrong.

Dr Genevera Allen from Rice University in Houston said that the increased use of such systems was contributing to a “crisis in science” …

The data sets are very large and expensive. But, according to Dr Allen, the answers they come up with are likely to be inaccurate or wrong because the software is identifying patterns that exist only in that data set and not the real world …

Machine learning systems and the use of big data sets has accelerated the crisis, according to Dr Allen. That is because machine learning algorithms have been developed specifically to find interesting things in datasets and so when they search through huge amounts of data they will inevitably find a pattern.

“The challenge is can we really trust those findings?” she told BBC News.

“Are those really true discoveries that really represent science? Are they reproducible? If we had an additional dataset would we see the same scientific discovery or principle on the same dataset? And unfortunately the answer is often probably not.”

BBC News

The central problem with the present ‘machine learning’ and ‘big data’ hype is that so many think that they can get away with analysing real-world phenomena without any (commitment to) theory. But — data never speaks for itself. Without a prior statistical set-up, there actually are no data at all to process. And — using a machine learning algorithm will only produce what you are looking for.

Machine learning algorithms always express a view of what constitutes a pattern or regularity. They are never theory-neutral.

Clever data-mining tricks are not enough to answer important scientific questions. Theory matters.

Lars Pålsson Syll
Professor at Malmö University. Primary research interest - the philosophy, history and methodology of economics.

Leave a Reply

Your email address will not be published. Required fields are marked *