20 Most Popular R Packages

R offers many packages for data analysis. Apart from its awesome interface for statistical analysis, the next best thing about R is the steady support it gets from developers and data science maestros all over the world. Currently, the CRAN package repository features 11349 available packages. But the question is which packages are the most popular packages used […]

Making Data Management Decisions | Your Second Program

This example is very similar to the previous one. Here, I will show some basics of data analysis and data engineering. Setting aside missing data, coding valid data and recoding values, creating secondary variables, and grouping values within individual variables help you make and implement even more decisions with data. Statisticians often call this task ‘data management’, while data scientists like the term […]

Data Management and Visualization | Running Your First Program

In this example, I will show some basics of data analysis and data engineering. Setting aside missing data, coding valid data and recoding values, creating secondary variables, and grouping values within individual variables help you make and implement even more decisions with data. Statisticians often call this task ‘data management’, while data scientists like the term ‘data munging’. Download the code Open […]
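
The data management steps named above (setting aside missing data, recoding values, creating a secondary variable, grouping values) can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the field names, the missing-data codes and the income cut-offs are all made up here, and the original post may well use different data and tools.

```python
# Hypothetical records: the field names and values are invented for illustration.
records = [
    {"country": "A", "income": "1200", "life_expectancy": "61.5"},
    {"country": "B", "income": "", "life_expectancy": "74.0"},
    {"country": "C", "income": "35000", "life_expectancy": "9999"},
    {"country": "D", "income": "8000", "life_expectancy": "70.2"},
]

# Assumed placeholders that mean "no data" in this made-up data set.
MISSING_CODES = {"", "9999"}


def to_number(raw):
    """Return a float, or None when the value is a missing-data code."""
    return None if raw.strip() in MISSING_CODES else float(raw)


def income_group(income):
    """Group a continuous income value into a few labelled bins (assumed cut-offs)."""
    if income is None:
        return None
    if income < 5000:
        return "low"
    if income < 20000:
        return "middle"
    return "high"


cleaned = []
for row in records:
    income = to_number(row["income"])            # set aside missing data
    life = to_number(row["life_expectancy"])
    cleaned.append({
        "country": row["country"],
        "income": income,
        "life_expectancy": life,
        "income_group": income_group(income),    # secondary variable
    })

# Frequency of each income group, counting only rows with a valid income.
counts = {}
for row in cleaned:
    group = row["income_group"]
    if group is not None:
        counts[group] = counts.get(group, 0) + 1

print(counts)
```

Keeping missing values as `None` rather than dropping the rows means each variable can be analysed on its own set of valid observations.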

Download and Learn about Gapminder Dataset

For the purposes of the Data Science with Python tutorial, I would like to work with a data set called Gapminder, and I will provide some sample Python code for learning data analysis fundamentals. This portion of the Gapminder data includes one year of numerous country-level indicators of health, wealth and development. Download Gapminder Data Set: gapminder.csv. Visit http://www.gapminder.org for more information. Gapminder Founded in Stockholm by […]

Structure of a Data Science Project | Different Phases in Data Science Project

A typical data science project is structured in a few different phases. There are roughly five phases that we can think about in a data science project. Phase 1: Defining a Question. The first phase is the most important one: it is where you ask the question and specify what it is that you’re interested in […]

How to Set up your Python Environment for Data Analysis

Python is a programming language, and for Learning Data Science with Python, you’ll be writing your Python code in the programming environment called Spyder. The Anaconda distribution simplifies the installation process by including Python, Spyder, and other packages and tools in one installation file. It contains the core Python language, as well as all of the essential libraries including NumPy, Pandas, […]
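
Once the environment is installed, a quick check like the one below can confirm that the libraries are importable. It is a minimal sketch: the import names `numpy` and `pandas` are assumed from the libraries the excerpt mentions, and `check_libraries` is a hypothetical helper, not part of Anaconda or Spyder.

```python
import importlib.util

# Import names assumed for the libraries mentioned in the excerpt.
LIBRARIES = ["numpy", "pandas"]


def check_libraries(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_libraries(LIBRARIES)
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

Run it in the Spyder console or from a terminal. Any library reported as missing would need to be installed before following the later examples.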

Popular Python Libraries for Data Analysis

Python is a general-purpose language and is often used for things other than data analysis and data science. What makes Python extremely useful for working with data? It has libraries that give users the functionality they need when crunching data. Below are the major Python libraries that are used for working with data. You should take some time to familiarize […]