How to Convert Multiple Columns From Integer to Numeric Type in R

Jesse John Feb 02, 2024
  1. Convert Multiple Columns From Integer to Numeric Type in R
  2. Use the lapply() Function to Convert Multiple Columns From Integer to Numeric Type in R
  3. Use the dplyr Package Functions to Convert Multiple Columns From Integer to Numeric Type in R
  4. Convert Multiple Columns From Factor to Numeric Type in R
  5. Conclusion

R provides functions that apply a conversion to several columns at once, so multiple columns can be converted from integer to numeric type with a single line of code and no explicit loop. This article explores two approaches to this task.

In both cases, the actual conversion of each column is done by the as.numeric() function.

Convert Multiple Columns From Integer to Numeric Type in R

First, we will create some sample data.

Example code:

# Create vectors.
n = letters[1:5]
# The colon operator already returns integer vectors.
p = 11:15
q = 51:55

# Create a data frame.
df = data.frame(Names = n, Col1 = p, Col2 = q)
df

# See the structure of the data frame.
# Note that two columns are of integer type.
str(df)

Output:

> df
  Names Col1 Col2
1     a   11   51
2     b   12   52
3     c   13   53
4     d   14   54
5     e   15   55
>
> # See the structure of the data frame.
> # Note that two columns are of integer type.
> str(df)
'data.frame':    5 obs. of  3 variables:
 $ Names: chr  "a" "b" "c" "d" ...
 $ Col1 : int  11 12 13 14 15
 $ Col2 : int  51 52 53 54 55
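
One detail worth keeping in mind when checking column types: is.numeric() returns TRUE for integer vectors as well as for doubles, so it cannot tell the two apart. The following is a minimal sketch, not part of the original example, that rebuilds the sample data under the illustrative name df_check and inspects the storage type with typeof() and is.integer() instead.

```r
# Rebuild the sample data so this block runs on its own.
# (df_check is an illustrative name, not used elsewhere in the article.)
df_check = data.frame(Names = letters[1:5], Col1 = 11:15, Col2 = 51:55)

# is.numeric() is TRUE for both integer and double columns.
sapply(df_check, is.numeric)

# typeof() distinguishes "integer" from "double".
types = sapply(df_check, typeof)
types

# Names of the columns that are stored as integers.
int_cols = names(df_check)[sapply(df_check, is.integer)]
int_cols
```

After a successful conversion with as.numeric(), typeof() should report "double" for the converted columns.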

Use the lapply() Function to Convert Multiple Columns From Integer to Numeric Type in R

Base R’s lapply() function allows us to apply a function to elements of a list. We will apply the as.numeric() function.

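The documentation of the lapply() function notes that it is often safer to call primitive functions through a wrapper function, so that method dispatch occurs correctly. Since as.numeric() is a primitive function, we wrap it in an anonymous function.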

Example code:

# First, we will create a copy of our data frame.
df1 = df

# Columns 2 and 3 are integer type.
# We will convert these to numeric.
# We will use a wrapper function as recommended.
df1[2:3] = lapply(df1[2:3], FUN = function(y){as.numeric(y)})

# Check that the columns are converted to numeric.
str(df1)

Output:

> df1[2:3] = lapply(df1[2:3], FUN = function(y){as.numeric(y)})
>
> # Check that the columns are converted to numeric.
> str(df1)
'data.frame':    5 obs. of  3 variables:
 $ Names: chr  "a" "b" "c" "d" ...
 $ Col1 : num  11 12 13 14 15
 $ Col2 : num  51 52 53 54 55
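
The example above selects the columns by position. When the positions are not known in advance, one possible variation is to find the integer columns programmatically. The sketch below assumes a rebuilt copy of the sample data (df_base and is_int are illustrative names) and uses sapply() with is.integer() to build a logical column selector.

```r
# Rebuild the sample data so this block runs on its own.
df_base = data.frame(Names = letters[1:5], Col1 = 11:15, Col2 = 51:55)

# Logical vector marking which columns are of integer type.
is_int = sapply(df_base, is.integer)
is_int

# Convert only those columns, using a wrapper function as before.
df_base[is_int] = lapply(df_base[is_int], function(y) as.numeric(y))

# Check that the integer columns are now numeric.
str(df_base)
```

This plays the same role as where(is.integer) in the dplyr approach, but needs no additional package.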

Use the dplyr Package Functions to Convert Multiple Columns From Integer to Numeric Type in R

We can use dplyr’s mutate() and across() functions to convert integer columns to numeric. The advantage of this is that the entire family of tidyselect functions is available to select columns.

In the example code, we first select columns by position (2:3) and then with the tidyselect function where().

Example code:

# Load the dplyr package.
library(dplyr)

# USING STANDARD LIST SYNTAX.
# Convert the columns.
df2 = df %>% mutate(across(.cols=2:3, .fns=as.numeric))

# Check that the columns are converted.
str(df2)

# USING TIDYSELECT WHERE FUNCTION.
# Convert ALL integer columns to numeric.
df3 = df %>% mutate(across(.cols=where(is.integer), .fns=as.numeric))

# Check that the columns are converted.
str(df3)

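Output:

> # USING STANDARD LIST SYNTAX.
> # Convert the columns.
> df2 = df %>% mutate(across(.cols=2:3, .fns=as.numeric))
>
> # Check that the columns are converted.
> str(df2)
'data.frame':    5 obs. of  3 variables:
 $ Names: chr  "a" "b" "c" "d" ...
 $ Col1 : num  11 12 13 14 15
 $ Col2 : num  51 52 53 54 55
>
> # USING TIDYSELECT WHERE FUNCTION.
> # Convert ALL integer columns to numeric.
> df3 = df %>% mutate(across(.cols=where(is.integer), .fns=as.numeric))
>
> # Check that the columns are converted.
> str(df3)
'data.frame':    5 obs. of  3 variables:
 $ Names: chr  "a" "b" "c" "d" ...
 $ Col1 : num  11 12 13 14 15
 $ Col2 : num  51 52 53 54 55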
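
Columns can also be selected by name rather than by position or type. The sketch below is an illustration under the assumption that dplyr 1.0.0 or later is installed; df_named, df_a, df_b, and cols are hypothetical names. It shows two options: bare column names inside c(), and a character vector of names passed to the tidyselect helper all_of().

```r
# Load the dplyr package.
library(dplyr)

# Rebuild the sample data so this block runs on its own.
df_named = data.frame(Names = letters[1:5], Col1 = 11:15, Col2 = 51:55)

# Select the columns by their bare names.
df_a = df_named %>% mutate(across(.cols = c(Col1, Col2), .fns = as.numeric))

# Select the columns from a character vector of names.
cols = c("Col1", "Col2")
df_b = df_named %>% mutate(across(.cols = all_of(cols), .fns = as.numeric))

# Check that the columns are converted in both cases.
str(df_a)
str(df_b)
```

Selecting by name tends to be more robust than selecting by position when the column order of the data frame may change.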

Convert Multiple Columns From Factor to Numeric Type in R

Sometimes, factor levels are coded with numbers, mostly integers. We will not want to convert such columns.

However, at other times, columns with integers may be represented as factors in R. Converting such columns to numbers poses a challenge.

The example code shows what happens when a factor column is converted to numeric.

Example code:

# Create a factor vector.
x = factor(c(15,15,20,25,30,30,30))

# See that these are 4 levels of factors.
# They are not numbers.
str(x)

# Convert the factor vector to numeric.
as.numeric(x) # This is not the result we want.

Output:

> # Create a factor vector.
> x = factor(c(15,15,20,25,30,30,30))
>
> # See that these are 4 levels of factors.
> # They are not numbers.
> str(x)
 Factor w/ 4 levels "15","20","25",..: 1 1 2 3 4 4 4
>
> # Convert the factor vector to numeric.
> as.numeric(x) # This is not the result we want.
[1] 1 1 2 3 4 4 4

When an integer column has been wrongly stored as a factor, we need one preliminary step to convert it to numeric correctly.

We must first convert the factors to a character type and then convert the character to a numeric type.

Example code:

# First, convert the factor vector to a character type.
# Then convert the character type to numeric.
# Both the above can be done in a single step, as follows.
y = as.numeric(as.character(x))
y

# Check that y is numeric.
str(y)

Output:

> y = as.numeric(as.character(x))
> y
[1] 15 15 20 25 30 30 30
>
> # Check that y is numeric.
> str(y)
 num [1:7] 15 15 20 25 30 30 30
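
This two-step conversion assumes that every factor label is a valid number. The following minimal sketch shows what happens otherwise, using a hypothetical factor z with one non-numeric label: as.numeric() returns NA for that element and issues a coercion warning, so checking for newly introduced NA values after the conversion is a reasonable safeguard.

```r
# Hypothetical factor with one label that is not a number.
z = factor(c("10", "20", "unknown", "20"))

# Labels that cannot be parsed as numbers become NA.
# R also emits a coercion warning, silenced here with suppressWarnings().
z_num = suppressWarnings(as.numeric(as.character(z)))
z_num

# Positions where the conversion introduced an NA.
bad = which(is.na(z_num) & !is.na(z))
bad
```

If bad is not empty, the labels at those positions need to be cleaned before the column can be treated as numeric.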

Let us see an example with a data frame. We will use the dplyr approach.

Example code:

# Create a factor vector.
f = factor(c(20,20,30,30,30))

# Create a data frame.
df4 = data.frame(Name=n, Col1=p, Col2=q, Fac=f)
df4

# Check the structure.
str(df4)

# We will use the dplyr approach.

# First only convert integer type columns.
df5 = df4 %>% mutate(across(.cols=where(is.integer), .fns=as.numeric))
# Factor column did not get converted.
str(df5)

# Now, we will START AGAIN, and convert the factor column as well.
# To modify an existing column by name, we will give it the SAME name.
df6 = df4 %>% mutate(across(.cols=where(is.integer), .fns=as.numeric), Fac=as.numeric(as.character(Fac)))
df6
# Check that the factor column has also got converted.
str(df6)

Output:

> # Create a factor vector.
> f = factor(c(20,20,30,30,30))
>
> # Create a data frame.
> df4 = data.frame(Name=n, Col1=p, Col2=q, Fac=f)
> df4
  Name Col1 Col2 Fac
1    a   11   51  20
2    b   12   52  20
3    c   13   53  30
4    d   14   54  30
5    e   15   55  30
>
> # Check the structure.
> str(df4)
'data.frame':    5 obs. of  4 variables:
 $ Name: chr  "a" "b" "c" "d" ...
 $ Col1: int  11 12 13 14 15
 $ Col2: int  51 52 53 54 55
 $ Fac : Factor w/ 2 levels "20","30": 1 1 2 2 2
>
> # We will use the dplyr approach.
>
> # First only convert integer type columns.
> df5 = df4 %>% mutate(across(.cols=where(is.integer), .fns=as.numeric))
> # Factor column did not get converted.
> str(df5)
'data.frame':    5 obs. of  4 variables:
 $ Name: chr  "a" "b" "c" "d" ...
 $ Col1: num  11 12 13 14 15
 $ Col2: num  51 52 53 54 55
 $ Fac : Factor w/ 2 levels "20","30": 1 1 2 2 2
>
> # Now, we will START AGAIN, and convert the factor column as well.
> # To modify an existing column by name, we will give it the SAME name.
> df6 = df4 %>% mutate(across(.cols=where(is.integer), .fns=as.numeric), Fac=as.numeric(as.character(Fac)))
> df6
  Name Col1 Col2 Fac
1    a   11   51  20
2    b   12   52  20
3    c   13   53  30
4    d   14   54  30
5    e   15   55  30
> # Check that the factor column has also got converted.
> str(df6)
'data.frame':    5 obs. of  4 variables:
 $ Name: chr  "a" "b" "c" "d" ...
 $ Col1: num  11 12 13 14 15
 $ Col2: num  51 52 53 54 55
 $ Fac : num  20 20 30 30 30
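
The same result can be reached in base R with lapply(). The sketch below is one possible way to do it, under the assumption that every factor column in the data frame holds numeric labels; df_mix, is_int, and is_fac are illustrative names.

```r
# Build a small data frame with integer and factor columns.
df_mix = data.frame(
  Name = letters[1:5],
  Col1 = 11:15,
  Fac = factor(c(20, 20, 30, 30, 30))
)

# Integer columns: convert directly.
is_int = sapply(df_mix, is.integer)
df_mix[is_int] = lapply(df_mix[is_int], function(y) as.numeric(y))

# Factor columns: convert to character first, then to numeric.
is_fac = sapply(df_mix, is.factor)
df_mix[is_fac] = lapply(df_mix[is_fac], function(y) as.numeric(as.character(y)))

# Check that both kinds of columns are now numeric.
str(df_mix)
```

Because this converts every factor column, it is only appropriate when no column is a genuine categorical variable; otherwise, select the factor columns to convert by name.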

The tidyselect functions are documented on the "Selection language" page of the tidyselect package documentation. Refer to the Note section of R's documentation of the lapply() function to understand the need for a wrapper function.

The documentation of the as.numeric() function points to the documentation of factor(), which gives a second approach to convert integers represented as factors to a numeric type: as.numeric(levels(f))[f].
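
The alternative idiom given in R's help page for factor() is as.numeric(levels(f))[f]. A minimal sketch of it follows, with x2 and y2 as illustrative names. It converts the level labels to numbers once and then indexes that short vector by the factor's integer codes.

```r
# Create a factor vector with numeric labels.
x2 = factor(c(15, 15, 20, 25, 30, 30, 30))

# Convert the level labels once, then index them by the factor's codes.
y2 = as.numeric(levels(x2))[x2]
y2

# Compare with the two-step conversion through character.
identical(y2, as.numeric(as.character(x2)))
```

Both approaches give the same values here; the levels-based form only has to convert each distinct label once, which may matter for long vectors with few levels.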

Conclusion

Before converting columns to a numeric type, we need to check whether they are actually stored as integers. If they are stored as factors and we want to convert them to numeric, we need one additional step, converting them to character first, to ensure proper conversion.

The conversion can be done using base R’s lapply() function, or a combination of dplyr’s mutate() and across() functions. The actual conversion is done using the as.numeric() function.

Author: Jesse John

Jesse is passionate about data analysis and visualization. He uses the R statistical programming language for all aspects of his work.