dplyr Package – mutate()
Add new columns with mutate():
The mutate() function helps to compute transformations of variables in a data frame. Sometimes, you want to create new variables that are derived from existing variables and mutate() provides a clean interface for doing that.
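As a quick illustration of that interface, here is a minimal sketch on a small toy data frame. The data frame `df` and its columns `height_cm`, `weight_kg`, `height_m`, and `bmi` are hypothetical names invented for this example; they are not part of the sleep data used in the rest of the section. The sketch also shows that an expression inside mutate() can refer to a column created earlier in the same call.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
df <- data.frame(
  height_cm = c(150, 160, 170),
  weight_kg = c(50, 60, 70)
)

df2 <- mutate(
  df,
  height_m = height_cm / 100,    # new column derived from an existing one
  bmi      = weight_kg / height_m^2  # uses height_m, created just above
)

# The original columns are kept and the new ones are appended on the right
names(df2)
```

Note that mutate() returns a new data frame and leaves `df` itself unchanged, so the result has to be assigned to a name if you want to keep it.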
For the examples in this section we will use sleep, a data set built into R. First load it with the data("sleep") command. To view its help file, type ?sleep. Don't forget to load the dplyr package.
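library(dplyr)
library(datasets)  # attached by default; shown here for clarity
data("sleep")      # load the data set into the workspace
?sleep             # open the help page for the data set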
You can see some basic characteristics of the data set with the dim(), str(), and summary() functions.
dim(sleep)
str(sleep)
summary(sleep)
Output:
> dim(sleep)
[1] 20 3
> str(sleep)
'data.frame': 20 obs. of 3 variables:
$ extra: num 0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0 2 ...
$ group: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ID   : Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
> summary(sleep)
     extra        group        ID
 Min.   :-1.600   1:10   1      :2
 1st Qu.:-0.025   2:10   2      :2
 Median : 0.950          3      :2
 Mean   : 1.540          4      :2
 3rd Qu.: 3.400          5      :2
 Max.   : 5.500          6      :2
                         (Other):8
Example:
Here we create an 'extra_derived' variable by subtracting the mean of 'extra' from each value of 'extra'.
sleep_data <- mutate(sleep, extra_derived = extra - mean(extra, na.rm = TRUE))
str(sleep_data)
head(sleep_data)
Output:
> str(sleep_data)
'data.frame': 20 obs. of 4 variables:
$ extra : num 0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0 2 ...
$ group : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ID : Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ extra_derived: num -0.84 -3.14 -1.74 -2.74 -1.64 1.86 2.16 -0.74 -1.54 0.46 ...
> head(sleep_data)
  extra group ID extra_derived
1   0.7     1  1         -0.84
2  -1.6     1  2         -3.14
3  -0.2     1  3         -1.74
4  -1.2     1  4         -2.74
5  -0.1     1  5         -1.64
6   3.4     1  6          1.86
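The example above subtracts one overall mean from every row. A common variation is to subtract the mean of each row's own group instead. The sketch below is one way to do that, assuming you want to centre 'extra' within the two levels of 'group'; it combines mutate() with dplyr's group_by() and ungroup(), and the names `sleep_grouped`, `extra_by_group`, and `mean_centred` are made up for this illustration.

```r
library(dplyr)

data("sleep")

# Centre `extra` within each level of `group` rather than across all 20 rows
sleep_grouped <- sleep %>%
  group_by(group) %>%
  mutate(extra_by_group = extra - mean(extra, na.rm = TRUE)) %>%
  ungroup()

head(sleep_grouped)

# Sanity check: within each group the centred values should average to
# (numerically) zero
sleep_grouped %>%
  group_by(group) %>%
  summarise(mean_centred = mean(extra_by_group))
```

Calling ungroup() at the end is a design choice rather than a requirement: it prevents later operations on `sleep_grouped` from being silently applied per group.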
Example 2:
There is also the related transmute() function, which does the same thing as mutate() but then drops all non-transformed variables.
s<-transmute(sleep, extra = extra*100)
head(s)
Output:
  extra
1    70
2  -160
3   -20
4  -120
5   -10
6   340
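To make the difference between the two functions concrete, the sketch below compares transmute() with the same transformation done through mutate() followed by select(). The object names `s_transmute`, `s_mutate`, and `s_select` are hypothetical. Recent dplyr documentation also describes transmute() as superseded by mutate(.keep = "none"); treat that as a pointer to check against the version you have installed rather than something this example relies on.

```r
library(dplyr)

data("sleep")

# transmute(): only the columns named in the call are returned
s_transmute <- transmute(sleep, extra = extra * 100)

# mutate(): overwrites `extra` in place and keeps `group` and `ID`
s_mutate <- mutate(sleep, extra = extra * 100)

# mutate() followed by select() gives the same single column as transmute()
s_select <- s_mutate %>% select(extra)

names(s_transmute)
names(s_mutate)
all.equal(s_transmute, s_select)
```

Use transmute() when only the derived columns matter, and mutate() when the new columns should sit alongside the original data.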