What is Data Science?

The key word in “Data Science” is not Data, it is Science!!!

In a single sentence, data science is answering concrete questions using data. As Rafa Irizarry, Roger Peng, and Jeff Leek put it in their blog post, the key word in “Data Science” is not Data, it is Science. Data science combines aspects of statistics, computer science, applied mathematics, and visualization to extract new insights and new knowledge from vast amounts of data.

Data science is only useful when the data are used to answer a question. That is the science part of the equation. The problem with this view of data science is that it is much harder than the view that focuses on data size or tools. It is much easier to calculate the size of a data set and say “My data are bigger than yours” or to say, “I can code in Hadoop, can you?” than to say, “I have this really hard question, can I answer it with my data?”

The hype around big data/data science will flame out (it already is) if data science is only about “data” and not about science. The long-term impact of data science will be measured by the scientific questions we can answer with the data.

There are many more terms related to data science. The most common ones are described below.

Data Analysis:

Data analysis refers to the human activities needed to gain new insight from your data. It relies on analytics tools such as R, Tableau, Amazon QuickSight, QlikView, and Qlik Sense to obtain the desired result.

Data Analytics:

Data analytics is about automating the extraction of insights from a dataset, typically through queries and data aggregation procedures. It can describe dependencies between input variables, and it can also use data mining techniques and tools to discover hidden patterns in the dataset under analysis.
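To make "queries and data aggregation" a little more concrete, here is a minimal sketch in plain Python. The sales records, the field names, and the `average_by` helper are all invented for illustration; a real project would more likely use SQL or a dataframe library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records, invented purely for illustration.
records = [
    {"region": "north", "sales": 120},
    {"region": "south", "sales": 80},
    {"region": "north", "sales": 100},
    {"region": "south", "sales": 60},
]


def average_by(rows, key, value):
    """Group rows by `key` and return the mean of `value` for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {name: mean(values) for name, values in groups.items()}


# The "query": what is the average sale per region?
summary = average_by(records, "region", "sales")
print(summary)
```

The point is that the aggregation exists to answer a question (average sales per region), which ties back to the "science" part of data science.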

Machine Learning (ML):

Machine learning is the science of getting computers to act without being explicitly programmed. It focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.

There are two broad categories of machine learning: supervised learning, where the program learns from labeled examples, and unsupervised learning, where it looks for structure in unlabeled data.
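The difference between the two categories can be sketched with a toy example in plain Python. Everything here is an assumption made for illustration: the numbers, the labels, and the two helper functions (a one-nearest-neighbor classifier for the supervised case and a simple two-group clustering for the unsupervised case). Real projects would normally rely on an established machine learning library instead.

```python
def nearest_neighbor_predict(train, x):
    """Supervised: return the label of the labeled example closest to x."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, iterations=10):
    """Unsupervised: split 1-D values into two groups around two centers.

    Assumes the values are not all identical.
    """
    low_center, high_center = min(values), max(values)
    if low_center == high_center:
        raise ValueError("need at least two distinct values")
    for _ in range(iterations):
        # Assign each value to the nearer of the two centers.
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        # Move each center to the mean of its group.
        low_center = sum(low) / len(low)
        high_center = sum(high) / len(high)
    return low, high


# Supervised learning: the training data comes with labels.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]
prediction = nearest_neighbor_predict(labeled, 2.0)

# Unsupervised learning: the same numbers, but with no labels at all.
unlabeled = [1.0, 1.5, 8.0, 9.0]
groups = two_means(unlabeled)

print(prediction)
print(groups)
```

In the supervised case the program is told what "small" and "large" mean through examples. In the unsupervised case it only discovers that the numbers fall into two groups, and a human still has to decide what those groups mean.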

Big Data:

Big data is an evolving term that describes massive amounts of structured, semi-structured, and unstructured data that have the potential to be mined for information.

Data Mining:

Data mining is the computational process of discovering patterns from voluminous data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The main goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.

Business Analytics:

Business analytics is the process of analyzing data to improve business performance through fact-based decision making. It is a subset of Business Intelligence, which can be described as “a set of techniques and tools for the acquisition and transformation of raw data into meaningful and useful information for business analysis purposes.”
