# 5 Core Activities of Data Analysis | Epicycles of Data Analysis

Posted in Data Science

“If you torture the data long enough, it will confess.”
-Ronald Coase, Economist

Data analysis is an iterative process. The iteration applies to every step of the analysis, and it can be thought of as an epicycle. So what is an epicycle? An epicycle is a small circle whose center moves around the circumference of a larger circle. Some data analyses appear to be fixed and linear. An example is the algorithms embedded in various software platforms, including apps. However, these algorithms are final data analysis products that have emerged from the very non-linear work of developing and refining a data analysis so that it can be “algorithmized.”

A study includes:

• the development of a hypothesis or question
• the designing of the data collection process (or study protocol)
• the collection of the data
• and the analysis and interpretation of the data.

Because a data analysis presumes that the data have already been collected, it includes development and refinement of a question and the process of analyzing and interpreting the data. It is important to note that although a data analysis is often performed without conducting a study, it may also be performed as a component of a study.

#### There are 5 core activities of data analysis:

1. Stating and refining the question
2. Exploring the data
3. Building formal statistical models
4. Interpreting the results
5. Communicating the results

These 5 activities can play out over very different time scales. For example, you may go through all 5 in a single day, or, when dealing with a very large project, over a couple of months. But it is important to first understand the overall framework used to approach each of these activities.

Although there are many different types of activities that you might engage in while doing data analysis, every aspect of the entire process can be approached through an iterative process that is called the “epicycle of data analysis”. More specifically, for each of the five core activities, it is critical that you engage in the following steps:

##### Step 1: Set expectations

First and foremost, set an expectation. This is the first task in your analysis.

##### Step 2: Test expectations

Then collect information or data and compare it with your expectations. If the data match your expectations, you can move on; if they do not match, follow the 3rd step.
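Setting an expectation and then testing it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `check_expectation`, the expected range, and the `daily_orders` values are all hypothetical and made up for the example, and comparing a mean against a range is just one simple way to state an expectation.

```python
from statistics import mean


def check_expectation(values, low, high):
    """Return (observed_mean, matches); matches is True when the mean lies in [low, high]."""
    observed = mean(values)
    return observed, low <= observed <= high


# Step 1: set an expectation before looking at the data (hypothetical range).
expected_low, expected_high = 20.0, 30.0

# Step 2: collect data (made-up illustrative values) and compare with the expectation.
daily_orders = [22, 25, 31, 27, 24]
observed, matches = check_expectation(daily_orders, expected_low, expected_high)

if matches:
    print(f"Mean {observed:.1f} is within the expected range; move on.")
else:
    print(f"Mean {observed:.1f} is outside the expected range; go to step 3.")
```

Writing the expectation down before looking at the data is the point of the exercise: the comparison only tells you something if the range was chosen in advance.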

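##### Step 3: Revise expectations or fix the data

If the data and your expectations do not match, either revise your expectations or fix the data so that the two agree.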

Iterating through this 3-step process is what we call the “epicycle of data analysis.” As you go through every stage
of an analysis, you will need to go through the epicycle to continuously refine your question, your exploratory data