“A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”
What Kind of Skills Will I Need to be a Data Scientist?
As the quote above suggests, a data scientist needs to be able to do statistics. They may have learned it formally, through a degree or a large number of classes, or through online courses. They probably also need to know something about prediction and machine learning, which can be learned from statistics classes, from computer science classes that focus on machine learning, or, again, online.
What does a data scientist typically do on a daily basis?
Typically, a data scientist runs experiments, pulls and cleans data, analyzes it, and then communicates the results to people. So they need a set of skills that allows them to perform all of these activities.
A data scientist also needs to be able to do data analysis, and that means the whole picture: pulling data sets out of a database, cleaning them up, analyzing them, performing the statistical inference or prediction they want to do, and then communicating the results. Data communication skills involve analyzing the data, creating clear visualizations, and communicating those visualizations and results to people in a way that both expresses what’s going on and carefully expresses how uncertain the results are. That way, when you’re making decisions, you know how strongly the data support the decision you’re trying to make.
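To make that pipeline concrete, here is a minimal sketch in Python of the pull, clean, analyze, and communicate steps. Everything in it is an assumption made for illustration: the in-memory SQLite table `measurements`, its toy values, and the helper names `pull_rows`, `clean`, and `mean_with_interval` are hypothetical, and the interval uses a simple normal approximation that would be too rough for a sample this small in real work.

```python
import math
import sqlite3
import statistics


def pull_rows(conn):
    """Pull raw measurements out of a relational database with SQL."""
    return [row[0] for row in conn.execute("SELECT value FROM measurements")]


def clean(values):
    """Drop missing entries and convert the rest to floats."""
    return [float(v) for v in values if v is not None]


def mean_with_interval(values, z=1.96):
    """Return the sample mean and an approximate 95% confidence interval.

    Uses a normal approximation (z = 1.96), which is an assumption made
    for brevity; small samples would normally call for a t-interval.
    """
    mean = statistics.mean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, (mean - z * stderr, mean + z * stderr)


# Hypothetical toy data: an in-memory table with one missing value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?)",
    [(4.0,), (None,), (6.0,), (5.0,), (5.0,)],
)

raw = pull_rows(conn)      # pull
data = clean(raw)          # clean
mean, (low, high) = mean_with_interval(data)  # analyze

# Communicate: report the estimate together with its uncertainty.
print(f"mean = {mean:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

Reporting the interval alongside the mean is the point of the last step: the reader sees not just the estimate but how strongly the data support it.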
In addition to the skills described above, data scientists usually have a few of the following skills. They usually know how to use R or Python, two languages widely used for analyzing data. They know how to do some kind of visualization, often interactive visualization with something like D3.js. And they’ll likely know SQL in order to pull data out of a relational database.
If you are a data enthusiast with good knowledge of the fields below, you can consider applying for a data scientist role.
- Data tools like R, Python, and SAS, and a database query language like SQL
- Machine Learning
- Linear Algebra and Multivariate Analysis
- Predictive Analytics
- Data Munging
- Data Visualization & Communication
- Software Engineering
There is no fixed skill matrix for this job role, and this list is always subject to change.
What Kind of Degree and Background Will I Need?
A common background for data scientists is a statistics or biostatistics department where they learned a lot of applications, so they’ve actually worked on lots of real data sets. They don’t always have to come from a statistics or biostatistics department, but they usually should have solid knowledge of applied statistics.
Another very common route to data science starts from some other quantitative background, for example engineering, followed by a transition into data science. These people either take some classes online, like our data science specialization, or they take some other data science program.
A third route, also very common among data scientists, is training in software engineering. They’re a trained or working software engineer, and they pick up some statistics along the way, again either from online classes or from courses they took in college.
And again, it’ll depend on the mix of what your organization is looking for.
Do I need a PhD degree?
Some people think that a PhD is needed to be a data scientist, but this is really not true. The role is crucial and people’s expectations are very high, so what you do need is a specific skill set and background.
Again, there is no hard rule; it depends on what your organization is focused on. If your organization mostly develops predictive tools, it might need to hire people who are a little stronger in machine learning. On the other hand, if your organization is trying to run experiments and come up with new hypotheses or inferences about what is effective, it might need data scientists who are a little better at inference.
I found some useful information from Data Science London. Here is a picture that broadly describes the types of data tools a data scientist needs to be aware of.
Also read Data Scientist’s Toolkits