Editor’s note: In their book, “Data Science: Concepts and Practice,” authors Vijay Kotu and Bala Deshpande explain the core principles and applications of modern data science. Kotu is vice president of analytics at ServiceNow; Deshpande is a data scientist and consultant. This article, which focuses on the key characteristics and features of data science, is adapted with permission.
In the past few decades, a massive accumulation of data has coincided with the advancement of information technology, connected networks, and the businesses they enable. This trend is coupled with a steep decline in data storage and data processing costs.
The applications built on these advancements, such as digital businesses, social networking, and mobile technologies, generate large amounts of complex, heterogeneous data waiting to be analyzed. Traditional analysis techniques like dimensional slicing, hypothesis testing, and descriptive statistics can only go so far in information discovery.
A paradigm is needed to manage the massive volume of data, explore the interrelationships among thousands of variables, and deploy machine learning algorithms to derive useful insights from datasets. A set of frameworks, tools, and techniques is needed to help humans process all this data and extract valuable information. Data science is one such paradigm: it can handle large volumes of data with multiple attributes and deploy complex algorithms to search for patterns in the data.