Bias with Big Data

Statistics is an extremely valuable tool in research and in business today. It helps with sales forecasting, market analysis, election predictions, and so on. As we move into the world of big data, everything can be quantified, or so it seems. Well, that is the attempt: to quantify everything.

This zeal to quantify everything can be helpful and provide some benefit, but the quest comes with some huge drawbacks. A large data set that was not collected carefully can lead the analysis to fallacious conclusions. Does this mean statistical analysis should not be used? No. It just means that the analysis is only one aspect of the story.

Key excerpt:

“Among experts it’s well understood that “big data” doesn’t solve problems of bias. But how much should one trust an estimate from a big but possibly biased data set compared to a much smaller random sample? In Statistical paradises and paradoxes in big data, Xiao-Li Meng provides some answers which are shocking, even to experts.”
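
To get a feel for the question in that excerpt, here is a small toy simulation in Python. It is only a sketch under made-up assumptions and not the method from Meng's paper: the population size, the true rate of 0.5, and the 0.6 versus 0.4 inclusion probabilities are all hypothetical numbers chosen for illustration, and `simulate` is a name invented here. It compares an estimate from a large self-selected data set against one from a simple random sample of 1,000.

```python
import random


def simulate(pop_size=200_000, true_rate=0.5, seed=0):
    """Compare a big biased sample with a small random sample (toy example)."""
    rng = random.Random(seed)

    # Hypothetical population of yes/no answers (1 = yes, 0 = no).
    population = [1 if rng.random() < true_rate else 0 for _ in range(pop_size)]
    truth = sum(population) / pop_size

    # "Big data": people who would answer yes are more likely to end up
    # in the data set (assumed inclusion probability 0.6 vs 0.4).
    big = [x for x in population if rng.random() < (0.6 if x == 1 else 0.4)]
    big_estimate = sum(big) / len(big)

    # Small simple random sample: everyone is equally likely to be picked.
    small = rng.sample(population, 1000)
    small_estimate = sum(small) / len(small)

    return {
        "truth": truth,
        "big_n": len(big),
        "big_error": abs(big_estimate - truth),
        "small_n": len(small),
        "small_error": abs(small_estimate - truth),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value}")
```

With these assumed numbers, the big data set holds roughly half the population, yet its estimate sits near 0.6 instead of 0.5, because collecting more rows does nothing to remove the selection bias. The random sample is far smaller, but its error comes only from sampling noise.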

