Download and Learn about the Gapminder Dataset

Posted in Data Science with Python

For this Data Science with Python tutorial, I will work with a data set called Gapminder and provide sample Python code for learning data analysis fundamentals. This portion of the Gapminder data includes one year of data for numerous country-level indicators of health, wealth and development.

 

Download the Gapminder data set: gapminder.csv
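
As a minimal sketch of the first step after downloading, the file can be read with pandas. This assumes the file has a country column and that missing values show up as blank cells. The helper name load_gapminder and the two sample rows are made up for illustration (the values are placeholders, not real Gapminder figures).

```python
import io

import pandas as pd


def load_gapminder(source):
    """Read a Gapminder-style CSV and return a DataFrame indexed by country.

    `source` may be a file path or a file-like object.
    """
    df = pd.read_csv(source)
    # Normalise header case so that e.g. "HIVrate" and "hivrate" both work.
    df.columns = [name.strip().lower() for name in df.columns]
    df = df.set_index("country")
    # Blank or whitespace-only cells become NaN; numeric text becomes floats.
    return df.apply(pd.to_numeric, errors="coerce")


# Placeholder rows for illustration only; these are NOT real Gapminder values.
sample_csv = io.StringIO(
    "country,incomeperperson,HIVrate,lifeexpectancy\n"
    "Country A,1000.5, ,60.1\n"
    "Country B,,0.2,75.3\n"
)
sample = load_gapminder(sample_csv)
print(sample.isna().sum())  # number of missing values per column
```

With the downloaded file saved in the working directory, the call would be load_gapminder("gapminder.csv").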

 

Visit http://www.gapminder.org for more information

Gapminder

Founded in Stockholm by Ola Rosling, Anna Rosling Rönnlund and Hans Rosling, Gapminder is a non-profit venture promoting sustainable global development and achievement of the United Nations Millennium Development Goals. It seeks to increase the use and understanding of statistics about social, economic, and environmental development at local, national, and global levels. Since its conception in 2005, Gapminder has grown to include over 200 indicators, including gross domestic product, total employment rate, and estimated HIV prevalence. Gapminder contains data for all 192 UN members, aggregating data for Serbia and Montenegro. Additionally, it includes data for 24 other areas, for a total of 215 areas. Gapminder collects data from a handful of sources, including the Institute for Health Metrics and Evaluation, the US Census Bureau’s International Database, the United Nations Statistics Division, and the World Bank.

 

Gapminder Codebook

Variable Name & Description of Indicator:

1. incomeperperson

2010 Gross Domestic Product per capita in constant 2000 US$. Inflation, but not differences in the cost of living between countries, has been taken into account.

2. alcconsumption

2008 alcohol consumption per adult (age 15+), in litres. Recorded and estimated average alcohol consumption, adult (15+) per capita, in litres of pure alcohol.

3. armedforcesrate

Armed forces personnel (% of total labor force).

4. breastcancerper100TH

2002 new breast cancer cases per 100,000 females. Number of new cases of breast cancer per 100,000 female residents during the given year.

5. co2emissions

2006 cumulative CO2 emissions (metric tons). Total amount of CO2 emitted, in metric tons, since 1751.

6. femaleemployrate

2007 female employees age 15+ (% of population). Percentage of the female population, age 15 and above, that was employed during the given year.

7. employrate

2007 total employees age 15+ (% of population). Percentage of the total population, age 15 and above, that was employed during the given year.

8. HIVrate

2009 estimated HIV prevalence, % (ages 15-49). Estimated number of people living with HIV per 100 population in the 15-49 age group.

9. Internetuserate

2010 Internet users (per 100 people). Internet users are people with access to the worldwide network.

10. lifeexpectancy

2011 life expectancy at birth (years). The average number of years a newborn child would live if current mortality patterns were to stay the same.

11. oilperperson

2010 oil consumption per capita (tonnes per year per person).

12. polityscore

2009 democracy score (Polity). Overall polity score from the Polity IV dataset, calculated by subtracting an autocracy score from a democracy score. It is a summary measure of a country’s democratic and free nature: -10 is the lowest value, 10 the highest.

13. relectricperperson

2008 residential electricity consumption per person (kWh). The amount of residential electricity consumed per person during the given year, in kilowatt-hours (kWh).

14. suicideper100TH

2005 suicide rate, age adjusted, per 100,000. Mortality due to self-inflicted injury per 100,000 standard population, age adjusted.

15. urbanrate

2008 urban population (% of total). Urban population refers to people living in urban areas as defined by national statistical offices (calculated using World Bank population estimates and urban ratios from the United Nations World Urbanization Prospects).
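
Several entries in the codebook above are simple arithmetic: rates per 100,000 residents (breastcancerper100TH, suicideper100TH), rates per 100 people (HIVrate, Internetuserate), and the polity score as a democracy score minus an autocracy score. The sketch below shows that arithmetic only, without the age adjustment mentioned for suicideper100TH. The function names and input numbers are hypothetical, and the 0 to 10 range for the two component scores is an assumption chosen to match the stated -10 to 10 range.

```python
def rate_per(cases, population, per=100_000):
    """Return cases per `per` people, e.g. new cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * per / population


def polity_score(democracy, autocracy):
    """Polity score as described in the codebook: democracy minus autocracy.

    Assumes each component score lies between 0 and 10, which yields the
    stated range of -10 (lowest) to 10 (highest).
    """
    for value in (democracy, autocracy):
        if not 0 <= value <= 10:
            raise ValueError("component scores are assumed to lie in 0..10")
    return democracy - autocracy


# Made-up inputs for illustration; not taken from the Gapminder data.
print(rate_per(50, 200_000))        # 25.0 cases per 100,000 people
print(rate_per(3, 1_000, per=100))  # 0.3 per 100 people (a prevalence in %)
print(polity_score(8, 1))           # 7
```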