Remove Rows With NA in One Column in R

Sheeraz Gul Jan 30, 2023 May 11, 2022
  1. Remove Rows With NA in One Column Using the is.na() Method in R
  2. Remove Rows With NA in One Column Using the complete.cases() Method in R
  3. Remove Rows With NA in One Column Using Tidyr Library drop_na() Method in R

Columns in a data frame can contain missing values, which R represents as NA. This tutorial demonstrates how to remove the rows that contain an NA value in one specific column in R.

Remove Rows With NA in One Column Using the is.na() Method in R

The is.na() function returns TRUE for every NA value in a vector. Negating that result and using it as a row index keeps only the rows where the chosen column is not NA. The process is given below:

  1. First of all, create the data frame.
  2. Choose the column whose NA values decide which rows to delete.
  3. Negate the result of is.na() by passing it dataframe$columnname, that is, !is.na(dataframe$columnname).
  4. Use that logical vector as the row index, dataframe[!is.na(dataframe$columnname), ], so that only the rows where the column is not NA are kept.

Let’s try an example following the steps above.

Example:

Delftstack = data.frame(Name=c('Jack', 'John', 'Mike', 'Michelle', 'Jhonny'),
                  LastName=c('Danials', 'Cena', 'Chandler', 'McCool', 'Nitro'),
                  Id=c(101, 102, NA, 104, NA),
                  Designation=c('CEO', 'Project Manager', 'Senior Dev', 'Junior Dev', 'Intern'))

print('The dataframe before removing the rows:-')
print(Delftstack)

print('The dataframe after removing the rows:-')
Delftstack[!is.na(Delftstack$Id),]

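The code above returns a copy of the data frame without the rows that have NA in the Id column. The original Delftstack data frame is unchanged unless you assign the result back to it.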

Output:

[1] "The dataframe before removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
3     Mike Chandler  NA      Senior Dev
4 Michelle   McCool 104      Junior Dev
5   Jhonny    Nitro  NA          Intern

[1] "The dataframe after removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
4 Michelle   McCool 104      Junior Dev
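To see why the negation works, it can help to look at the logical vector on its own. The following is a minimal sketch, not part of the original example: the names staff, keep, and cleaned are made up for illustration, and resetting the row names is an optional extra step.

```r
# Hypothetical data frame used only for this sketch
staff <- data.frame(
  Name = c("Jack", "John", "Mike", "Michelle", "Jhonny"),
  Id   = c(101, 102, NA, 104, NA)
)

# is.na() is TRUE where Id is missing; ! flips it so TRUE marks the rows to keep
keep <- !is.na(staff$Id)
print(keep)

# Store the subset under a new name; 'staff' itself is left untouched
cleaned <- staff[keep, ]

# Optional: renumber the rows from 1 instead of keeping the original 1, 2, 4
rownames(cleaned) <- NULL
print(cleaned)
```

Assigning the result to a new variable keeps the original data available, which is useful if you later need to count or inspect the rows that were dropped.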

Remove Rows With NA in One Column Using the complete.cases() Method in R

The complete.cases() function works similarly to is.na(). It returns a logical vector that is TRUE for every entry (or row) that contains no NA values, so it can be used directly as a row index.

The process is similar to the steps described above; the only difference is that we don’t negate complete.cases(), because it already marks the rows to keep.

Example:

Delftstack = data.frame(Name=c('Jack', 'John', 'Mike', 'Michelle', 'Jhonny'),
                  LastName=c('Danials', 'Cena', 'Chandler', 'McCool', 'Nitro'),
                  Id=c(101, 102, NA, 104, NA),
                  Designation=c('CEO', 'Project Manager', 'Senior Dev', 'Junior Dev', 'Intern'))

print('The dataframe before removing the rows:-')
print(Delftstack)


print('The dataframe after removing the rows:-')
Delftstack[complete.cases(Delftstack$Id),]

The code above returns the data frame without the rows that have NA in the Id column.

Output:

[1] "The dataframe before removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
3     Mike Chandler  NA      Senior Dev
4 Michelle   McCool 104      Junior Dev
5   Jhonny    Nitro  NA          Intern

[1] "The dataframe after removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
4 Michelle   McCool 104      Junior Dev
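For a single column, complete.cases() and the negated is.na() should select the same rows. The sketch below checks this on a small made-up data frame (staff, mask_cc, mask_na, and both are illustrative names), and also shows that complete.cases() accepts several columns at once, which is where it differs from is.na().

```r
# Hypothetical data frame: Name is missing in row 2, Id in rows 3 and 5
staff <- data.frame(
  Name = c("Jack", NA, "Mike", "Michelle", "Jhonny"),
  Id   = c(101, 102, NA, 104, NA)
)

# For one column, both masks mark the same rows
mask_cc <- complete.cases(staff$Id)
mask_na <- !is.na(staff$Id)
print(all(mask_cc == mask_na))

# With several columns, a row is kept only if none of them is NA
both <- staff[complete.cases(staff[, c("Name", "Id")]), ]
print(both)
```

If only one column matters, either function does the job; complete.cases() becomes more convenient once the check spans two or more columns.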

Remove Rows With NA in One Column Using Tidyr Library drop_na() Method in R

The drop_na() function from the tidyr package drops the rows that have NA in the columns you name. First, you need to install tidyr if it is not already installed.

Run the following code to install the package (installing 'tidyverse' instead also works, since it includes tidyr):

install.packages('tidyr')

The output matches the methods above, but the syntax is slightly different. We pipe the data frame into drop_na() with the dataframe %>% drop_na(column) syntax to delete the rows.

Example:

library(tidyr)
Delftstack = data.frame(Name=c('Jack', 'John', 'Mike', 'Michelle', 'Jhonny'),
                  LastName=c('Danials', 'Cena', 'Chandler', 'McCool', 'Nitro'),
                  Id=c(101, 102, NA, 104, NA),
                  Designation=c('CEO', 'Project Manager', 'Senior Dev', 'Junior Dev', 'Intern'))

print('The dataframe before removing the rows:-')
print(Delftstack)

print('The dataframe after removing the rows:-')
Delftstack %>% drop_na(Id)

The code above gives the same result as the previous methods: the rows with NA in the Id column are dropped.

Output:

[1] "The dataframe before removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
3     Mike Chandler  NA      Senior Dev
4 Michelle   McCool 104      Junior Dev
5   Jhonny    Nitro  NA          Intern

[1] "The dataframe after removing the rows:-"
      Name LastName  Id     Designation
1     Jack  Danials 101             CEO
2     John     Cena 102 Project Manager
4 Michelle   McCool 104      Junior Dev
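The pipe is optional. As a rough sketch, assuming tidyr is installed, drop_na() can also be called directly with the data frame as its first argument; staff, direct, and piped are made-up names for this illustration.

```r
library(tidyr)

# Hypothetical data frame used only for this sketch
staff <- data.frame(
  Name = c("Jack", "John", "Mike", "Michelle", "Jhonny"),
  Id   = c(101, 102, NA, 104, NA)
)

# Direct call: the data frame comes first, then the column(s) to check
direct <- drop_na(staff, Id)

# The same call written with the pipe
piped <- staff %>% drop_na(Id)

# Both forms should produce the same data frame
print(identical(direct, piped))
print(direct)
```

Which form to use is mostly a matter of style; the piped version reads more naturally when drop_na() is one step in a longer chain of transformations.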

There are also functions like na.omit(), which remove a row if NA is found in any of its columns. They check all columns at once, not one chosen column. The filter() function from dplyr can target a single column when combined with is.na(), for example filter(Delftstack, !is.na(Id)).
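To make the difference concrete, the sketch below compares na.omit() with the one-column approach in base R. The data frame is a made-up variant of the earlier example with one extra NA in the Designation column; staff, any_col, and one_col are illustrative names.

```r
# Hypothetical data frame: NA in Id (rows 3 and 5) and in Designation (row 2)
staff <- data.frame(
  Name        = c("Jack", "John", "Mike", "Michelle", "Jhonny"),
  Id          = c(101, 102, NA, 104, NA),
  Designation = c("CEO", NA, "Senior Dev", "Junior Dev", "Intern")
)

# na.omit() drops every row that has NA in any column
any_col <- na.omit(staff)

# The one-column filter only looks at Id, so row 2 survives
one_col <- staff[!is.na(staff$Id), ]

print(nrow(any_col))
print(nrow(one_col))
```

Here na.omit() also discards John's row because of the missing Designation, which is usually not what you want when only the Id column matters.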

Author: Sheeraz Gul

Sheeraz is a Doctorate fellow in Computer Science at Northwestern Polytechnical University, Xian, China. He has 7 years of Software Development experience in AI, Web, Database, and Desktop technologies. He writes tutorials in Java, PHP, Python, GoLang, R, etc., to help beginners learn the field of Computer Science.
