# Replace NA With Zero in R

Gustavo du Mortier Apr 26, 2021 Jan 25, 2021

There is a simple way to replace `NA` with zeroes in a data frame in R. Suppose you have a data frame called `my_data`. To replace all `NA` values with zeroes in that data frame, you can execute this statement.

``````
my_data[is.na(my_data)] <- 0
``````

For example, suppose `my_data` has the following content.

``````
   C1  C2    C3  C4  C5
1   4   3  <NA>   3   7
2   9   8   ABC   5  10
3   1   1   XYZ   3   6
4  NA   4  <NA>   7  10
5   1   2   ZC1  NA   2
``````

When you execute `my_data[is.na(my_data)] <- 0`, the data frame's content changes to this.

``````
   C1  C2    C3  C4  C5
1   4   3     0   3   7
2   9   8   ABC   5  10
3   1   1   XYZ   3   6
4   0   4     0   7  10
5   1   2   ZC1   0   2
``````
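To try this end to end, the sketch below rebuilds the sample data frame and applies the replacement. The construction code is an assumption that mirrors the printed table above (with `stringsAsFactors = FALSE` so that `C3` is a character column); it is not necessarily how the original data was created.

```r
# Sample data mirroring the table above (column types are an assumption)
my_data <- data.frame(
  C1 = c(4, 9, 1, NA, 1),
  C2 = c(3, 8, 1, 4, 2),
  C3 = c(NA, "ABC", "XYZ", NA, "ZC1"),
  C4 = c(3, 5, 3, 7, NA),
  C5 = c(7, 10, 6, 10, 2),
  stringsAsFactors = FALSE
)

# is.na() returns a logical matrix; every TRUE cell is overwritten with 0
my_data[is.na(my_data)] <- 0

print(my_data)
str(my_data)  # shows the type of each column after the replacement
```

Note that in the character column `C3` the replacement is stored as the string `"0"`, because a column can hold only one type. If `C3` were a factor instead, R would reject the new value with an "invalid factor level" warning and leave the `NA` in place.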

## Replace NA With Zero in Bigger R Data Frames

The previous solution uses base R subset assignment, which works fine for relatively small data frames. For bigger data sets, you might want a faster alternative, such as the hybrid evaluation approach implemented in the `dplyr` package.

This approach recognizes entire expressions and evaluates them in C++ code rather than in R. It can make transforms on big data frames up to 30% faster, although the actual gain depends on your data and on the `dplyr` version you have installed.
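Since the speed-up depends on the data and the installed versions, it is worth measuring on your own machine. The following is a rough timing sketch, not a rigorous benchmark: the data frame `big`, its size, and the helper names `base_fill` and `dplyr_fill` are all made up for illustration, and the `dplyr` call it times is the one explained in the next paragraphs.

```r
library(dplyr)

set.seed(42)
n <- 5e5  # arbitrary size; increase it to stress the comparison

# Hypothetical test data: three numeric columns with scattered NAs
big <- data.frame(
  a = sample(c(1, 2, 3, NA), n, replace = TRUE),
  b = sample(c(1, 2, 3, NA), n, replace = TRUE),
  c = sample(c(1, 2, 3, NA), n, replace = TRUE)
)

# Base R subset assignment
base_fill <- function(df) {
  df[is.na(df)] <- 0
  df
}

# dplyr scoped verb with a formula-style lambda
dplyr_fill <- function(df) {
  mutate_all(df, ~replace(., is.na(.), 0))
}

t_base  <- system.time(res_base  <- base_fill(big))
t_dplyr <- system.time(res_dplyr <- dplyr_fill(big))

# Elapsed seconds for each approach; values vary between runs and machines
print(c(base = t_base[["elapsed"]], dplyr = t_dplyr[["elapsed"]]))

# Both approaches should produce the same data
print(all.equal(res_base, res_dplyr))
```

A single `system.time()` call is noisy, so repeat the measurement a few times before drawing conclusions.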

To replace `NA` values with zeroes using the `dplyr` package, you can use the `mutate` function with the `_all` scoped verb (`mutate_all`) and the `replace` function written in `purrr`-style formula notation, as in the example below.

``````
my_data <- mutate_all(my_data, ~replace(., is.na(.), 0))
``````

The `purrr`-style notation (`~` followed by an expression in which `.` stands for the current column) lets us apply the `replace` function to every column of the data frame.
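If the formula notation looks cryptic, it may help to see it next to an ordinary anonymous function. The sketch below uses a small made-up data frame (not the article's sample data) and the illustrative names `with_formula` and `with_function` to show that both spellings do the same work.

```r
library(dplyr)

# Small illustrative data frame (hypothetical values)
my_data <- data.frame(C1 = c(4, NA, 1), C2 = c(NA, 8, 1))

# Formula shorthand: `.` stands for the column currently being processed
with_formula <- mutate_all(my_data, ~replace(., is.na(.), 0))

# The same operation written as a regular anonymous function
with_function <- mutate_all(my_data, function(col) replace(col, is.na(col), 0))

# Compare the two results
print(all.equal(with_formula, with_function))
```

The formula form is shorter, while the explicit function is easier to extend when the replacement logic grows.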

## Replace NA With Zero in a Subset of R Data Frame

Instead of the `_all` scoped verb, you can use the `_at` scoped verb (`mutate_at`) to restrict the replacement to specific columns. To do that, pass a vector with the names of the columns where the replacement should be applied. Using the previous data frame, if you need to replace `NA` values only in columns `C1` and `C4`, you can use the following command:

``````
my_data <- mutate_at(my_data, c("C1", "C4"), ~replace(., is.na(.), 0))
``````

In this way, only the `NA` values in columns `C1` and `C4` get replaced by 0, resulting in a data frame like the one below.

``````
   C1  C2    C3  C4  C5
1   4   3  <NA>   3   7
2   9   8   ABC   5  10
3   1   1   XYZ   3   6
4   0   4  <NA>   7  10
5   1   2   ZC1   0   2
``````
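If you prefer not to load `dplyr` for this, the same column-restricted replacement can be sketched in base R by subsetting the target columns first. The data construction and the variable name `cols` below are assumptions made for illustration.

```r
# Sample data mirroring the table above (column types are an assumption)
my_data <- data.frame(
  C1 = c(4, 9, 1, NA, 1),
  C2 = c(3, 8, 1, 4, 2),
  C3 = c(NA, "ABC", "XYZ", NA, "ZC1"),
  C4 = c(3, 5, 3, 7, NA),
  C5 = c(7, 10, 6, 10, 2),
  stringsAsFactors = FALSE
)

# Columns where NA should become 0
cols <- c("C1", "C4")

# Select those columns, then overwrite only their NA cells
my_data[cols][is.na(my_data[cols])] <- 0

print(my_data)
```

Columns that are not listed in `cols`, such as `C3`, keep their `NA` values.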

In the previous example, you might have wanted to replace `NA` with zeroes only in numeric columns to avoid including zero values in alphanumeric columns such as `C3`. If that is the case, instead of specifying the columns where you want to apply the replacement, you can use the `mutate_if` function with the `is.numeric` condition to tell R to replace `NA` with zeroes only in numeric columns. In the following example, you can find the complete code to try this out, from installing the `dplyr` package and populating the data frame to performing the replacements and displaying the results.

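``````
install.packages("dplyr")  # only needed once; skip if dplyr is already installed
library(dplyr)

# Build the sample data frame; NA marks the missing values
C1 <- c(4, 9, 1, NA, 1)
C2 <- c(3, 8, 1, 4, 2)
C3 <- c(NA, 'ABC', 'XYZ', NA, 'ZC1')
C4 <- c(3, 5, 3, 7, NA)
C5 <- c(7, 10, NA, 10, 2)
my_data <- data.frame(C1, C2, C3, C4, C5)

# Replace NA with 0 in numeric columns only; the non-numeric C3 is left untouched
my_data <- mutate_if(my_data, is.numeric, ~replace(., is.na(.), 0))
my_data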
``````

Output:

``````
   C1  C2    C3  C4  C5
1   4   3  <NA>   3   7
2   9   8   ABC   5  10
3   1   1   XYZ   3   0
4   0   4  <NA>   7  10
5   1   2   ZC1   0   2
``````
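In `dplyr` 1.0.0 and later, the scoped verbs (`mutate_all`, `mutate_at`, `mutate_if`) are superseded by `across()`, although they continue to work. As a sketch of how the three replacements above might look in that style, assuming `dplyr` 1.0.0 or newer is installed (the helper name `fill_zero` and the result names are made up for illustration):

```r
library(dplyr)

# Same sample data as in the complete example above
my_data <- data.frame(
  C1 = c(4, 9, 1, NA, 1),
  C2 = c(3, 8, 1, 4, 2),
  C3 = c(NA, "ABC", "XYZ", NA, "ZC1"),
  C4 = c(3, 5, 3, 7, NA),
  C5 = c(7, 10, NA, 10, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace NA with 0 in a single column
fill_zero <- function(col) replace(col, is.na(col), 0)

# Counterpart of mutate_all(): every column
all_cols <- mutate(my_data, across(everything(), fill_zero))

# Counterpart of mutate_at(): only the named columns
some_cols <- mutate(my_data, across(all_of(c("C1", "C4")), fill_zero))

# Counterpart of mutate_if(): only the numeric columns
num_cols <- mutate(my_data, across(where(is.numeric), fill_zero))

print(num_cols)
```

The column selection lives inside `across()`, so one `mutate()` call can cover all three cases by swapping the selection helper.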

You can find more info on the `mutate()` function and its variants in the R Documentation.