dplyr Package – select()
Select columns with select():
The examples in this section use iris, a data set built into R. First load the data set with the data("iris") command. To open the help file for iris, type ?iris. Don't forget to load the dplyr package.
library(dplyr)
library(datasets)  # usually attached by default in R
data("iris")       # load the iris data set
?iris              # open the help file for iris
You can see some basic characteristics of the data set with the dim(), str(), and names() functions.
dim(iris)
str(iris)
names(iris)
Output:
> dim(iris)
[1] 150 5
> str(iris)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
> names(iris)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
Example 1:
x<-select(iris,c(Species,Sepal.Length))
head(x)
Output:
Species Sepal.Length
1 setosa 5.1
2 setosa 4.9
3 setosa 4.7
4 setosa 4.6
5 setosa 5.0
6 setosa 5.4
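As a small sketch beyond the example above: the columns come back in the order they are listed, and the c() wrapper is optional. The object names x2 and x_base are only illustrative, and the last part shows one possible base R version of the same selection.

```r
library(dplyr)

# List the columns directly; c() is not required
x2 <- select(iris, Species, Sepal.Length)

# Columns are returned in the order requested, not their original order
names(x2)

# One base R version of the same selection, using column names as strings
x_base <- iris[, c("Species", "Sepal.Length")]
names(x_base)
```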
Example 2:
Inside the select() function you can use : to specify a range of variable names.
y<-select(iris, Sepal.Length: Petal.Length)
head(y)
Output:
Sepal.Length Sepal.Width Petal.Length
1 5.1 3.5 1.4
2 4.9 3.0 1.4
3 4.7 3.2 1.3
4 4.6 3.1 1.5
5 5.0 3.6 1.4
6 5.4 3.9 1.7
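A range can also be given by column position. The sketch below assumes the default column order of iris, and the names y_pos and y_name are illustrative. Note that any range, by name or by position, depends on the order of the columns in the data frame.

```r
library(dplyr)

# Select the first three columns by position instead of by name
y_pos <- select(iris, 1:3)

# The name-based range from Example 2, for comparison
y_name <- select(iris, Sepal.Length:Petal.Length)

# Both approaches pick the same columns
identical(names(y_pos), names(y_name))
```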
Example 3:
You can also omit variables using the select() function by using the negative sign.
z<-select(iris,-c(Species,Sepal.Length))
head(z)
Output:
Sepal.Width Petal.Length Petal.Width
1 3.5 1.4 0.2
2 3.0 1.4 0.2
3 3.2 1.3 0.2
4 3.1 1.5 0.2
5 3.6 1.4 0.2
6 3.9 1.7 0.4
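The negative sign also works on a single column or on a range. The following is a minimal sketch; z_one and z_range are illustrative names, not part of the examples above.

```r
library(dplyr)

# Drop a single column; c() is not needed
z_one <- select(iris, -Species)
names(z_one)

# Drop a range of adjacent columns; the parentheses are required
z_range <- select(iris, -(Sepal.Length:Sepal.Width))
names(z_range)
```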
If you don't want to use the select() function, the following base R code gives the same result as Example 3.
i <- match("Species", names(iris))       # position of the Species column
j <- match("Sepal.Length", names(iris))  # position of the Sepal.Length column
head(iris[, -c(i, j)])                   # drop both columns by position
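Another base R option is to work with the column names directly instead of their positions. This is only a sketch of one possible approach; drop_cols, keep_cols, and z_base are illustrative names.

```r
# Columns to drop, given by name
drop_cols <- c("Species", "Sepal.Length")

# Keep every column whose name is not in drop_cols
keep_cols <- setdiff(names(iris), drop_cols)
z_base <- iris[, keep_cols]
head(z_base)
```

Selecting by name avoids depending on where the columns sit in the data frame.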
Example 4:
The select() function also supports a special syntax for choosing variables whose names match a pattern, as Examples 4 and 5 show.
iris_subset1 <- select(iris, ends_with("Length"))
head(iris_subset1)
Output:
Sepal.Length Petal.Length
1 5.1 1.4
2 4.9 1.4
3 4.7 1.3
4 4.6 1.5
5 5.0 1.4
6 5.4 1.7
Example 5:
iris_subset2 <- select(iris, starts_with("Sepal"))
head(iris_subset2)
Output:
Sepal.Length Sepal.Width
1 5.1 3.5
2 4.9 3.0
3 4.7 3.2
4 4.6 3.1
5 5.0 3.6
6 5.4 3.9
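Besides ends_with() and starts_with(), the same family of helpers includes contains() and matches(). The sketch below is an addition to the examples above; iris_subset3 and iris_subset4 are illustrative names.

```r
library(dplyr)

# contains(): columns whose name includes the literal string "Width"
iris_subset3 <- select(iris, contains("Width"))
names(iris_subset3)

# matches(): columns whose name matches a regular expression
iris_subset4 <- select(iris, matches("^Petal"))
names(iris_subset4)
```

Here contains() picks Sepal.Width and Petal.Width, while matches("^Petal") picks the two columns whose names begin with Petal.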