Observational Studies and Experiments

Observational Studies:

In an observational study, investigators observe subjects and measure variables of interest without assigning treatments to the subjects. The treatment that each subject receives is determined by factors beyond the investigator's control.

  • In an observational study, researchers collect data in a way that does not directly interfere with how the data arise. The data are merely observed.
  • From an observational study, we can establish only a correlation or association between the explanatory and response variables, not causation.

For example, suppose we want to study the effect of smoking on lung capacity in women.

  • Find 100 women aged 30, of whom 50 have been smoking a pack a day for 10 years while the other 50 have been smoke free for 10 years.
  • Measure lung capacity for each of the 100 women.
  • Analyze, interpret, and draw conclusions from data.

Retrospective Study:

If an observational study uses data from the past, it is called a retrospective study.

Prospective Study:

If data are collected throughout the study, it is called a prospective study.

Confounding Variable:

A confounding variable is related both to group membership and to the outcome of interest. It is an extraneous variable that affects both the explanatory and response variables and can make it seem as though there is a direct relationship between them. Its presence makes it hard to establish the outcome as a direct consequence of group membership.

Example:

Suppose we observe that people who eat breakfast regularly tend to be slim, and we are tempted to conclude that eating breakfast makes people slim. There are three possible explanations for this observation.

1. Eating breakfast makes people slim.

2. Being slim causes people to eat breakfast.

3. Some third variable is responsible for both being slim and eating breakfast. For example, people who are very health conscious may tend both to be slim and to start their day with breakfast.

Such a third variable is called a confounding variable.
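
A small simulation can make the third explanation concrete. The sketch below is an illustration under invented assumptions, not data from any study: the probabilities 0.8 and 0.3, the even split of health-conscious people, and the function names `simulate` and `slim_rate` are all hypothetical. Breakfast eating and slimness are generated so that each depends only on health consciousness and neither affects the other.

```python
import random


def simulate(n=10_000, seed=0):
    """Return (health_conscious, eats_breakfast, slim) tuples for n people."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        health_conscious = rng.random() < 0.5  # assumed 50/50 split
        # Hypothetical probabilities: both traits depend ONLY on the confounder.
        p = 0.8 if health_conscious else 0.3
        eats_breakfast = rng.random() < p
        slim = rng.random() < p
        rows.append((health_conscious, eats_breakfast, slim))
    return rows


def slim_rate(rows, eats_breakfast, health_conscious=None):
    """Proportion slim among people with the given breakfast habit.

    If health_conscious is given, restrict to that stratum as well.
    """
    group = [
        r for r in rows
        if r[1] == eats_breakfast
        and (health_conscious is None or r[0] == health_conscious)
    ]
    return sum(r[2] for r in group) / len(group)


rows = simulate()

# Difference in slim rate between breakfast eaters and non-eaters.
overall_gap = slim_rate(rows, True) - slim_rate(rows, False)
gap_health_conscious = slim_rate(rows, True, True) - slim_rate(rows, False, True)
gap_not_health_conscious = slim_rate(rows, True, False) - slim_rate(rows, False, False)

print("overall gap:", round(overall_gap, 3))
print("gap among health conscious:", round(gap_health_conscious, 3))
print("gap among others:", round(gap_not_health_conscious, 3))
```

Under these assumptions, the slim rate is noticeably higher among breakfast eaters overall, yet the gap nearly vanishes when the comparison is made within each health-consciousness group. An association that disappears after holding a third variable fixed is the signature of confounding.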

Experiments:

In an experiment, investigators apply treatments to experimental units (people, animals, plots of land, etc.) and then observe the effect of the treatments on the experimental units.

In a randomized experiment, investigators control the assignment of treatments to experimental units using a chance mechanism.

Because researchers randomly assign subjects to the treatments, a randomized experiment can establish a causal connection between the explanatory and response variables. Returning to the smoking example, an experiment would proceed as follows:

  • Find 100 women aged 20 who do not currently smoke.
  • Randomly assign 50 of the 100 women to the smoking treatment and the other 50 to the no-smoking treatment.
  • Those in the smoking group smoke a pack a day for 10 years while those in the control group remain smoke free for 10 years.
  • Measure lung capacity for each of the 100 women.
  • Analyze, interpret, and draw conclusions from data.
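
The random assignment step in the list above can be sketched in a few lines. This is a minimal illustration, assuming subjects are identified by the numbers 1 to 100; the function name `randomize` and the seed value are hypothetical choices made only so that the example is reproducible.

```python
import random


def randomize(subject_ids, n_treatment, seed=None):
    """Split subjects into a treatment group and a control group by chance."""
    rng = random.Random(seed)
    # Draw n_treatment subjects without replacement; each subject is
    # equally likely to be chosen.
    treatment = set(rng.sample(subject_ids, n_treatment))
    control = [s for s in subject_ids if s not in treatment]
    return sorted(treatment), control


subjects = list(range(1, 101))  # hypothetical IDs for the 100 women
smoking_group, control_group = randomize(subjects, 50, seed=42)

print(len(smoking_group), len(control_group))
```

Because group membership is decided by the chance mechanism rather than by the subjects or the investigator, any confounding variable is expected to be balanced across the two groups on average.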

Principles of Experimental Design:

In the design of experiments, treatments are applied to experimental units in the treatment group(s). In comparative experiments, members of the complementary group, the control group, receive either no treatment or a standard treatment.

From a statistician’s perspective, an experiment is performed to decide

1. whether the observed differences among the treatments (or sets of experimental conditions) included in the experiment are due only to chance, and

2. whether the size of these differences is of practical importance.

Statistical inference reaches these decisions by comparing the variation in response among experimental units exposed to the same treatment (experimental error) with the variation among experimental units exposed to different treatments (treatment effect). Thus, the three principles of experimental design are:

  • Control: Compare the treatment of interest to a control group to reduce experimental error by making the experiment more efficient.
  • Randomize: Randomly assign subjects to treatments to ensure that the estimate of experimental error is statistically valid.
  • Replicate: Collect a sufficiently large sample, or replicate the entire study, to provide an estimate of experimental error.
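
The comparison of variation within and between treatment groups described in this section is commonly summarized by the one-way ANOVA F statistic. The text does not name a specific test, so using F here is an assumption, and the lung capacity values and the function name `f_statistic` are invented for illustration.

```python
from statistics import mean


def f_statistic(groups):
    """Ratio of between-treatment to within-treatment mean squares."""
    k = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of experimental units
    grand_mean = mean([x for g in groups for x in g])

    # Variation among units exposed to different treatments (treatment effect).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation among units exposed to the same treatment (experimental error).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical lung capacities in liters, five subjects per group.
control = [3.1, 3.4, 3.3, 3.6, 3.2]
smoking = [2.7, 2.9, 2.8, 3.0, 2.6]

f_value = f_statistic([control, smoking])
print("F =", round(f_value, 2))
```

An F value close to 1 means the differences between treatments are about as large as the experimental error, which is what chance alone would tend to produce; a value well above 1 suggests a treatment effect. Judging how large is large enough requires the F distribution with (k - 1, n - k) degrees of freedom, which this sketch does not compute.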

Sample and Population