How to Split Column Into Two Columns in R

Jinku Hu Feb 02, 2024
  1. Use the separate Function to Split Column Into Two Columns in R
  2. Use the extract Function to Split Column Into Two Columns in R
  3. Use the str_split_fixed Function to Split Column Into Two Columns in R

This article introduces how to split a column into two columns in R using the separate, extract, and str_split_fixed functions.

Use the separate Function to Split Column Into Two Columns in R

separate is part of the tidyr package, and it splits a character column into multiple columns using either a regular expression or numeric positions. In this code example, we declare a data frame that contains comma-separated strings of name/surname pairs. The separate function takes the data frame as the first argument and the name of the column to split as the second argument. The third argument is a character vector with the names of the new columns. Note that we use the %>% pipe to pass the df object to separate as its first argument. By default, separate splits at any run of non-alphanumeric characters, so the same call also works on a data frame where names and surnames are delimited with a dot.

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))
df1 <- data.frame(x = c('John. Mae', 'Maude. Lebowski', 'Mia. Amy', 'Andy. James'))

df %>% separate(x, c('Name', 'Surname'))

df1 %>% separate(x, c('Name', 'Surname'))

Output:

> df %>% separate(x, c('Name', 'Surname'))
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James

> df1 %>% separate(x, c('Name', 'Surname'))
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
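
Because the default separator matches any non-alphanumeric characters, it can split in more places than intended. The following is a minimal sketch, assuming a hypothetical data frame df2 (not part of the original example) in which a first name contains a hyphen. Passing the sep argument explicitly restricts the split to the comma followed by a space.

```r
library(tidyr)

# Hypothetical data: the first name in row 1 contains a hyphen,
# which the default separator would also treat as a split point
df2 <- data.frame(x = c("Mary-Jane, Watson", "John, Mae"),
                  stringsAsFactors = FALSE)

# sep is interpreted as a regular expression; here it matches only ", "
res <- separate(df2, x, c("Name", "Surname"), sep = ", ")

res$Name     # "Mary-Jane" "John"
res$Surname  # "Watson"    "Mae"
```

Setting sep explicitly is usually the safer choice when the delimiter is known in advance.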

Use the extract Function to Split Column Into Two Columns in R

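Another useful function to split a column into two separate ones is extract, which is also part of the tidyr package. The extract function works on a column using regular expression capture groups, and each group becomes a new column. The regular expression should define exactly one group for each name given in the previous argument. In the example below, the first group captures the text before the comma, and the second group captures the text after the comma and space. If the regular expression does not match a value, or the value is NA, the corresponding output will be NA.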

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))

df %>% extract(x, c("Name", "Surname"), "([^,]+), (.+)")

Output:

> df %>% extract(x, c("Name", "Surname"), "([^,]+), (.+)")
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
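
To illustrate the NA behavior, here is a small sketch that assumes a hypothetical data frame df3 in which one value has no comma. The pattern and the name df3 are illustrative choices, not part of the original example. The function is called as tidyr::extract to avoid clashes with functions of the same name in other packages.

```r
library(tidyr)

# Hypothetical data: the second value has no ", " delimiter
df3 <- data.frame(x = c("John, Mae", "Cher"),
                  stringsAsFactors = FALSE)

# Two capture groups, one per new column name
res <- tidyr::extract(df3, x, c("Name", "Surname"), "([^,]+), (.+)")

# The row that does not match the pattern is filled with NA
is.na(res$Name)     # FALSE TRUE
is.na(res$Surname)  # FALSE TRUE
```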

Use the str_split_fixed Function to Split Column Into Two Columns in R

Alternatively, we can use the str_split_fixed function from the stringr package. It splits each string at matches of the given pattern and returns a character matrix rather than a data frame. The number of pieces to return is passed as the third argument, and the matrix always has that many columns. If a string yields fewer pieces, the remaining cells are filled with empty strings.

library(dplyr)
library(tidyr)
library(stringr)

df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))

str_split_fixed(df$x, ", ", 2)

Output:

> str_split_fixed(df$x, ", ", 2)
     [,1]    [,2]      
[1,] "John"  "Mae"     
[2,] "Maude" "Lebowski"
[3,] "Mia"   "Amy"     
[4,] "Andy"  "James"
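
Since str_split_fixed returns a matrix, the pieces still have to be attached to the data frame to end up with two new columns. One possible way, sketched here with the column names Name and Surname chosen to match the earlier examples, is to assign the matrix columns directly.

```r
library(stringr)

df <- data.frame(x = c("John, Mae", "Maude, Lebowski", "Mia, Amy", "Andy, James"),
                 stringsAsFactors = FALSE)

# Split each string into exactly two pieces at ", "
parts <- str_split_fixed(df$x, ", ", 2)

# Attach the two matrix columns to the data frame as new columns
df$Name <- parts[, 1]
df$Surname <- parts[, 2]

df$Name     # "John" "Maude" "Mia" "Andy"
df$Surname  # "Mae" "Lebowski" "Amy" "James"
```

Unlike separate and extract, this approach keeps the original column x unless it is removed explicitly.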