This is a long post…so grab a coffee and enjoy 🙂

Data science allows the tech industry to make the claim that they know you better than you do.

It’s not that data was never there; it has always been around. The difference is that now, more data is being collected and at a faster rate, and we finally have the ability to do something with it.

The power of data science has led to such a fast-growing industry that data scientists are hard to come by. A 2013 McKinsey report predicted that by 2018, there would be a shortage of 190,000 data scientists in the United States, and a shortage of 1.5 million analysts capable of doing something about the big data flood headed their way.

So why is data science becoming the hot ticket in tech?

Data Science Is About Answers And Decision-Making

In “The Unreasonable Effectiveness of Data”, a 2009 paper co-authored by Peter Norvig, Google’s AI expert, the power of big data was summed up perfectly:

“Simple models and a lot of data trump more elaborate models based on less data.”

In other words, more data is better, if you know what to do with it.

Hilary Mason, a former chief data scientist, would agree. In a recent interview with Gigaom, Mason said that the data fields taken for granted today would have been only “theoretical” 10 or more years ago. So why is data science growing so quickly in the tech industry right now?

Quite simply, data science allows for more accurate answers, decisions based on what is actually happening, and predictions of the next big trend.

  1. Data Science Provides More Accurate Answers

Let’s take a look at a real-world example of Norvig’s theory: the 2014 Academy Awards. That year, data scientists who turned to big data proved to be incredibly accurate in predicting who would win. Motion picture columnists and other acclaimed experts, despite their experience with film and their knowledge of who had won in the past, made far less accurate predictions.

Predictive analytics handily beat the best guesses of the experts.

The tech industry is pulling in massive amounts of data from users on mobile apps and websites, tracking where they go, when they go, what they buy, what they share, what they click, and who their friends are. They are in the perfect position to use this data to “predict the winners”, discovering accurate answers where an educated guess, in the past, would have been as good as they could get.

  2. Data Science Changes How Decisions Are Made

Are data-driven companies any better than those that don’t rely on big data to make their decisions? A study in the Harvard Business Review revealed some surprising results. The more a company characterized itself as data-driven, the better it did on objective financial and operational measures. Companies in the top third of their industry in relying on data-driven decisions were 6% more profitable than their competitors.

Whether it was the data itself, or the confidence these companies felt about their decisions because they were based on data, such companies did do better.

Without access to data, important decisions have traditionally been delegated to those with the most experience, the HiPPO (“highest paid person’s opinion”). They, in turn, rely on the patterns and instincts they’ve developed over the years. Relying on one person’s gut instincts or perceived smarts instead of data is dangerous; just ask retailer JC Penney, which spiraled downward under leadership that eschewed collecting or monitoring data for its decisions.

The tech industry embraces a different model. It’s summed up quite simply: you can’t manage what you can’t measure.

Decisions are no longer left to the person with the most experience. Instead, data scientists dig into the data and make the case for a decision based on what is actually happening.

  3. Data Science Finds The Trends

Data scientists are a bit like artists. They look at big data, interpreting that data in the hopes of spotting trends. Data scientists inside an organization do this interpretation with an eye towards organizational goals.

Data science is what tells you what’s hot before the experts even see it on the radar. This is competitive advantage to the nth degree. Forget copycat trends, corporate espionage, or stealing the competitor’s best workers. Data science taps into the information that’s already out there, the information that’s pointing the way a trend is headed.

The ability to ferret out trends can be applied to more than making business decisions. Trend-spotting can be the actual product. Consider Google’s Flu Trends, which uses an algorithm that crunches data collected through its search engine. It’s been surprisingly accurate, beating the CDC to the predictive punch by about two weeks.

The Tech Industry Is Attractive To Data Scientists

For the serious data scientist — a person who has that rare set of skills that combines math, marketing, science, and analytics — the tech industry offers unique opportunities not found elsewhere.

  1. Faster Returns On Data Experiments

Scott Clark, a data scientist at Yelp, preferred the speed the tech industry offered him. According to Clark, making a small change in the Yelp website would have a bigger impact, stretching over millions of people, as compared to the slower return on an experiment that he might see in an academic setting. That speed can be a double-edged sword, however, as that demand for speed and analyzed results adds a layer of stress to data scientists in tech.

The faster returns aren’t the only reason the tech industry is pulling in data scientists. During the recent recession, opportunities in academia or on Wall Street dried up as research funding was reduced. Data scientists turned towards tech, filling a need, revealing their value, and driving home the promise and importance of data science.

  2. The Data Floodgates Are Already Open

The McKinsey report predicted that by 2020, there will be 40,000 exabytes of data collected. Someone has to do something with that data.

Data scientists in the tech industry are positioned at the leading edge of this data deluge, one that’s already pouring in from mobile apps, internet, social media, ecommerce, and wearable technology.

Big data is growing in importance to the tech industry thanks to the tech industry itself. Cloud computing led to an increase in data collection because it made collection scalable. New approaches to massive pools of data (“data lakes”) are making that data more fluid. Traditionally, data sets are designed first, before any data is collected. Data lakes flip this approach: massive amounts of data can be collected before the model is designed. You can collect the data without knowing beforehand what you’re going to do with it.
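To make the "collect first, design later" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular data lake product works: the events in `RAW_EVENTS` are made up, and `read_with_schema` and `revenue_by_item` are hypothetical helper names. The raw records are stored exactly as they arrived, and a schema is only applied at the moment someone asks a question.

```python
import json

# Hypothetical raw events, stored as-is with no schema agreed in advance.
# Note that the records do not all share the same fields.
RAW_EVENTS = [
    '{"user": "a", "action": "click", "item": "shoes"}',
    '{"user": "b", "action": "purchase", "item": "shoes", "price": 59.0}',
    '{"user": "a", "action": "purchase", "item": "socks", "price": 5.0, "device": "mobile"}',
]


def read_with_schema(raw_lines, fields):
    """Apply a schema at read time: keep only the requested fields.

    A field missing from a record comes back as None instead of
    causing the record to be rejected.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


def revenue_by_item(raw_lines):
    """One question asked after collection: total purchase revenue per item."""
    totals = {}
    for row in read_with_schema(raw_lines, ["action", "item", "price"]):
        if row["action"] == "purchase" and row["price"] is not None:
            totals[row["item"]] = totals.get(row["item"], 0.0) + row["price"]
    return totals


if __name__ == "__main__":
    print(revenue_by_item(RAW_EVENTS))
```

A different question tomorrow (say, activity by device) would only need a different list of fields passed to `read_with_schema`; nothing about the stored data has to change.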

Big Business Needs Data Science Tools

Profitable business is always a driver of technology, and data science is no different. Big data can be combined into data sets that reveal far more about customers than the hunches and past experience of unscientific experts making a best guess.

Instead of guessing at what to recommend to a customer, businesses can use data sets that tell them exactly what a customer wants depending on the season, weather, past purchases, geographic location, and life events. Information pulled from RFID tags is of little use if you don’t know what to do with that data.

Retail giant Target is well known for harnessing the power of big data, identifying female customers who were pregnant based on the products they purchased, and then showering them with ads and coupons for baby-related items. Netflix and Amazon are well known for their powerful recommendation engines that use not only what people buy, but also what they look at. Credit card companies have tapped into the associative power found in big data, learning that people who buy anti-scuff furniture pads are also more likely to make their payments on time.
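One simple way to put a number on this kind of association is "lift": how much more common a behavior is among customers who share some trait than among customers overall. The Python sketch below is only an illustration. The four customer records are invented for the example, the `lift` function is a hypothetical helper, and nothing here is meant to describe how any card issuer or retailer actually does it.

```python
def lift(records, antecedent, consequent):
    """Return P(consequent | antecedent) / P(consequent).

    A value above 1.0 means the consequent shows up more often among
    records that contain the antecedent than it does overall.
    """
    with_antecedent = [r for r in records if antecedent in r]
    if not records or not with_antecedent:
        raise ValueError("need at least one record containing the antecedent")

    p_consequent = sum(consequent in r for r in records) / len(records)
    if p_consequent == 0:
        raise ValueError("consequent never occurs in the records")

    p_given = sum(consequent in r for r in with_antecedent) / len(with_antecedent)
    return p_given / p_consequent


# Made-up customer records: each set lists traits observed for one customer.
CUSTOMERS = [
    {"furniture_pads", "pays_on_time"},
    {"furniture_pads", "pays_on_time"},
    {"pays_on_time"},
    set(),
]

if __name__ == "__main__":
    print(lift(CUSTOMERS, "furniture_pads", "pays_on_time"))
```

Lift measures association only. It says that two things tend to appear together, not that one causes the other.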

Big Business, it turns out, really needs Big Data. And because of that, it needs the tech industry to turn the power of data science into something usable. Someone must create the apps, systems, and algorithms that power these data-driven customer targeting engines.