Split Column Into Two Columns in R

Lasha Khintibidze May 26, 2021
  1. Use the separate Function to Split Column Into Two Columns in R
  2. Use the extract Function to Split Column Into Two Columns in R
  3. Use the str_split_fixed Function to Split Column Into Two Columns in R

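This article introduces three ways to split a column into two columns in R: the separate and extract functions from the tidyr package, and the str_split_fixed function from the stringr package.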

Use the separate Function to Split Column Into Two Columns in R

separate is part of the tidyr package. It splits a character column into multiple columns, either at matches of a regular expression or at numeric positions. In this code example, we declare a data frame that contains comma-separated strings of name/surname pairs. The separate function takes the data frame as the first argument and the column to split as the second. The third argument is a character vector with the names of the new columns. Note that we use the %>% pipe to pass the df object to separate, so only the last two arguments appear in the call. Because the default separator matches any run of non-alphanumeric characters, the same call also works on the data frame df1, where names and surnames are delimited with a dot.

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))
df1 <- data.frame(x = c('John. Mae', 'Maude. Lebowski', 'Mia. Amy', 'Andy. James'))

df %>% separate(x, c('Name', 'Surname'))

df1 %>% separate(x, c('Name', 'Surname'))

Output:

> df %>% separate(x, c('Name', 'Surname'))
   Name   Surname
1  John       Mae
2 Maude  Lebowski
3   Mia       Amy
4  Andy     James

> df1 %>% separate(x, c('Name', 'Surname'))
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
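
The default separator can split in unwanted places when a value itself contains punctuation. The following is a minimal sketch, assuming hypothetical data (df2) in which a first name contains a hyphen; it passes the delimiter explicitly through the sep argument of separate, which is interpreted as a regular expression.

```r
library(tidyr)

# Hypothetical data for illustration: the first name contains a hyphen,
# which the default separator would also treat as a split point.
df2 <- data.frame(x = c('Anne-Marie, Smith', 'John, Mae'))

# Split only at a comma followed by a space.
res <- separate(df2, x, into = c('Name', 'Surname'), sep = ', ')
print(res)
```

With the explicit separator, the hyphenated name stays in the Name column instead of being broken into extra pieces.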

Use the extract Function to Split Column Into Two Columns in R

Another useful function for splitting a column into two is extract, which is also part of the tidyr package. extract works with regular expression capture groups: each group is mapped, in order, to one of the column names given in the previous argument, so the number of groups must equal the number of names. If the regular expression does not match a row, the new columns contain NA for that row.

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))

df %>% extract(x, c("Name", "Surname"), "([^,]+), (.+)")

Output:

> df %>% extract(x, c("Name", "Surname"), "([^,]+), (.+)")
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
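
To illustrate the NA behavior, here is a small sketch assuming a hypothetical data frame (df3) whose second row has no comma, so the regular expression cannot match it. The pattern used here is one possible choice, not the only one.

```r
library(tidyr)

# Hypothetical data for illustration: the second row has no ", " delimiter.
df3 <- data.frame(x = c('John, Mae', 'Cher'))

# Group 1: everything before the comma; group 2: everything after ", ".
res <- extract(df3, x, into = c('Name', 'Surname'), regex = '([^,]+), (.+)')
print(res)
```

The first row is split as before, while the row that does not match gets NA in both new columns.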

Use the str_split_fixed Function to Split Column Into Two Columns in R

Alternatively, we can use the str_split_fixed function from the stringr package. It splits each string at matches of the given pattern and returns a character matrix rather than a data frame. The number of columns to return is passed explicitly as the third argument. Strings that yield fewer pieces are padded with empty strings, and any extra delimiters are left in the last column.

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))

str_split_fixed(df$x, ", ", 2)

Output:

> str_split_fixed(df$x, ", ", 2)
     [,1]    [,2]      
[1,] "John"  "Mae"     
[2,] "Maude" "Lebowski"
[3,] "Mia"   "Amy"     
[4,] "Andy"  "James"
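
Since the result is a matrix, one more step is needed to get named columns. The sketch below shows one way to do that by indexing the matrix columns; the names parts and out are chosen only for illustration.

```r
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))

# Split each string into exactly two pieces at ", ".
parts <- str_split_fixed(df$x, ', ', 2)

# Build a data frame from the two matrix columns.
out <- data.frame(Name = parts[, 1], Surname = parts[, 2],
                  stringsAsFactors = FALSE)
print(out)
```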