How to Remove Last Character From String in R

Manav Narula Feb 02, 2024
  1. Use the substr() Function to Remove the Last Characters in R
  2. Use the str_sub() Function to Remove the Last Characters in R
  3. Use the gsub() Function to Remove the Last Characters in R
  4. Use the substring() Function to Remove the Last Few Characters From a String in R
  5. Conclusion

In data analysis and manipulation, working with strings is a common task. Often, you may find the need to remove the last few characters from a string in R.

In this article, we’ll explore various methods to achieve this, focusing on the use of different functions for substring extraction.

Use the substr() Function to Remove the Last Characters in R

The substr() function in R is designed to extract or replace substrings within a character vector. Its syntax is as follows:

substr(x, start, stop)
  • x: The input character vector.
  • start: The starting position of the substring.
  • stop: The ending position of the substring.

Essentially, the substr() function extracts the substring of a given character vector (x) starting from the specified start position to the stop position.

Note that substr() requires both start and stop; stop has no default value. To remove a specific number of characters from the end of a string, compute stop from the string's length using nchar().
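As a quick illustration of how start and stop select characters, here is a minimal sketch. The sample string is chosen only for demonstration; positions are 1-based and both ends are inclusive.

```r
# Sample string used only for illustration
x <- "Hello World"

# Characters 1 through 5 (both ends inclusive)
first_word <- substr(x, 1, 5)

# Characters 7 through 11
second_word <- substr(x, 7, 11)

print(first_word)   # "Hello"
print(second_word)  # "World"
```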

Let’s consider a practical example where we have the string Hello Worldddd, and we want to remove the last three characters:

original_string <- "Hello Worldddd"

new_string <- substr(original_string, 1, nchar(original_string) - 3)
cat("Original String: ", original_string, "\n")
cat("New String: ", new_string, "\n")

Output:

Original String:  Hello Worldddd 
New String:  Hello World 

In this example, we start by defining the original string, Hello Worldddd. The substr() function is then applied to this string.

The start parameter is set to 1, indicating the beginning of the string, and the stop parameter is calculated as nchar(original_string) - 3. Since the string has 14 characters, stop is 11, which is the position of the last character we keep.

The result is a new string, new_string, which contains the original string with the last three characters removed.
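If you need this operation in several places, it can be wrapped in a small function. The sketch below is one possible way to do it; the name remove_last_n is hypothetical and not part of base R. It also shows what happens when the string is shorter than the number of characters to remove.

```r
# Hypothetical helper (not part of base R): drop the last n characters
remove_last_n <- function(x, n) {
  stopifnot(is.character(x), n >= 0)
  substr(x, 1, nchar(x) - n)
}

three_off <- remove_last_n("Hello Worldddd", 3)  # "Hello World"
one_off   <- remove_last_n("Hello!", 1)          # "Hello"
too_short <- remove_last_n("Hi", 5)              # "" because stop falls below start

print(three_off)
print(one_off)
print(too_short)
```

When stop is smaller than start, substr() returns an empty string rather than raising an error, so very short inputs are silently reduced to "".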

Now, let’s extend the example to demonstrate how you can apply the substr() function to a column in a data frame. Suppose we have a data frame with a column named City containing city names:

data <- data.frame(ID = 1:3, City = c("New Yorkkk", "Los Angelesss", "Chicagooo"))

data$City <- as.character(data$City)

data$ShortenedCity <- substr(data$City, 1, nchar(data$City) - 2)

print(data$ShortenedCity)

Output:

[1] "New York"    "Los Angeles" "Chicago"    

In this example, we create a data frame with an ID column and a City column. We then use the substr() function to remove the last two characters from each city name, creating a new column named ShortenedCity.

Use the str_sub() Function to Remove the Last Characters in R

In addition to the base R substr() function, the stringr package provides the str_sub() function, which simplifies string manipulations. Its syntax is as follows:

str_sub(string, start, end)
  • string: The input character vector.
  • start: The starting position of the substring.
  • end: The ending position of the substring.

The str_sub() function works similarly to substr() but offers a more intuitive and consistent interface for substring extraction.

It extracts the substring of a given character vector (string) starting from the specified start position to the end position. If end is not provided, it extracts characters from the start position to the end of the string.

Note: The str_sub() function is part of the stringr package, which is not included in base R. Install it first (for example, with install.packages("stringr")) and load it with library(stringr) before use.

Let’s consider a practical example where we have the string Hello Worldddd, and we want to remove the last three characters using the str_sub() function:

library(stringr)

original_string <- "Hello Worldddd"

new_string <- str_sub(original_string, end = -4)

cat("Original String: ", original_string, "\n")
cat("New String: ", new_string, "\n")

In this example, we begin by loading the stringr package, which provides the str_sub() function. We then define the original string as Hello Worldddd.

The str_sub() function is applied to this string with end = -4. Negative positions count backward from the end of the string, so -1 is the last character and -4 is the fourth-from-last. Setting end = -4 therefore keeps everything up to and including that character and drops the last three.

The result is a new string, new_string, which contains the original string with the last three characters removed. In general, removing the last n characters means passing end = -(n + 1), with no need to call nchar().

Output:

Original String:  Hello Worldddd 
New String:  Hello World 
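The link between the number of characters to drop and the value passed to end can be sketched as a small function. The name drop_last is hypothetical and is not provided by stringr; the sketch assumes the stringr package is installed.

```r
library(stringr)

# Hypothetical helper (not part of stringr): drop the last n characters
drop_last <- function(x, n) {
  # -1 is the last character, so -(n + 1) is the last character to keep
  str_sub(x, end = -(n + 1))
}

trimmed   <- drop_last("Hello Worldddd", 3)  # "Hello World"
unchanged <- drop_last("Hello Worldddd", 0)  # end = -1 keeps the whole string

print(trimmed)
print(unchanged)
```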

Now, let’s extend the example to demonstrate how you can apply the str_sub() function to a column in a data frame. We will be using the same example.

library(stringr)

data <- data.frame(ID = 1:3, City = c("New Yorkkk", "Los Angelesss", "Chicagooo"))

data$ShortenedCity <- str_sub(data$City, end = -3)

print(data$ShortenedCity)

Output:

[1] "New York"    "Los Angeles" "Chicago"    

In this example, str_sub() with end = -3 keeps everything up to the third-from-last character, which removes the last two characters from each city name. The result is stored in a new column named ShortenedCity.

Use the gsub() Function to Remove the Last Characters in R

In addition to the substr() and str_sub() functions, another powerful tool in R for string manipulation is the gsub() function. This function is designed for global substitution of patterns within a string.

Its syntax is as follows:

gsub(pattern, replacement, x)
  • pattern: The pattern to be replaced.
  • replacement: The replacement string.
  • x: The input character vector.

The gsub() function searches for occurrences of the specified pattern within the input character vector (x) and replaces them with the specified replacement string.

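The substitution is global: gsub() replaces every match in each string, whereas its sibling sub() replaces only the first match. The patterns used in this section are anchored to the end of the string with $, so they match at most once and sub() would produce the same result.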
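To see what global substitution means in practice, the following minimal sketch compares gsub() with sub() on a sample string chosen only for illustration:

```r
# Sample string used only for illustration
x <- "Hello World"

# gsub() replaces every match of the pattern
all_replaced <- gsub("o", "0", x)

# sub() replaces only the first match
first_replaced <- sub("o", "0", x)

print(all_replaced)    # "Hell0 W0rld"
print(first_replaced)  # "Hell0 World"
```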

Let’s consider a practical example where we have the string Hello Worldddd, and we want to remove the last three characters:

original_string <- "Hello Worldddd"

new_string <- gsub(".{3}$", "", original_string)
cat("Original String: ", original_string, "\n")
cat("New String: ", new_string, "\n")

In this example, we start with the original string "Hello Worldddd". The gsub() function is then applied to this string with the pattern ".{3}$" and an empty replacement string.

The pattern ".{3}$" is a regular expression that matches the last three characters (.{3}) at the end of the string ($). This effectively identifies and selects the portion of the string that we want to remove.

The empty replacement string indicates that we want to replace the matched pattern with nothing, effectively removing it.

The result is a new string, new_string, which contains the original string with the last three characters removed. By utilizing the flexibility of regular expressions, the gsub() function offers a powerful way to perform global substitutions and modifications within strings in R.

Output:

Original String:  Hello Worldddd 
New String:  Hello World 
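When the number of characters to remove is stored in a variable, the pattern can be built with sprintf(). The sketch below is one way to do this; the function name strip_last_n is hypothetical. It uses sub() because a pattern anchored with $ can match only once.

```r
# Hypothetical helper (not part of base R): remove the last n characters via regex
strip_last_n <- function(x, n) {
  pattern <- sprintf(".{%d}$", as.integer(n))  # n = 3 gives ".{3}$"
  sub(pattern, "", x)
}

trimmed <- strip_last_n("Hello Worldddd", 3)  # "Hello World"
short   <- strip_last_n("Hi", 3)              # "Hi": fewer than 3 characters, so no match

print(trimmed)
print(short)
```

Note that a string shorter than n characters is returned unchanged, because the pattern needs exactly n characters before the end of the string in order to match.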

Now, let’s extend the example to demonstrate how you can apply the gsub() function to a column in a data frame. Suppose we have a data frame with a column named City containing city names:

data <- data.frame(ID = 1:3, City = c("New Yorkkk", "Los Angelesss", "Chicagooo"))

data$ShortenedCity <- gsub(".{2}$", "", data$City)
print(data$ShortenedCity)

In this extended example, we create the same data frame named data with two columns: ID and City. We then use the gsub() function to remove the last two characters from each city name by applying the pattern ".{2}$" and an empty replacement string.

The pattern ".{2}$" is a regular expression that matches the last two characters (.{2}) at the end of each city name ($). The gsub() function iterates through all the values in the City column, identifies the matched pattern, and replaces it with nothing.

The result is a modified data frame, data, with a new column named ShortenedCity containing city names with the last two characters removed.

Output:

[1] "New York"    "Los Angeles" "Chicago"    

Use the substring() Function to Remove the Last Few Characters From a String in R

R also provides the substring() function as another option for manipulating strings. The substring() function is versatile and allows for extracting substrings based on specified positions.

Its syntax is as follows:

substring(text, first, last)
  • text: The input character vector.
  • first: The starting position of the substring.
  • last: The ending position of the substring.

The substring() function extracts the substring of a given character vector (text) starting from the specified first position to the last position. If last is not provided, it extracts characters from the first position to the end of the string.
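The following minimal sketch, using a sample string chosen only for illustration, shows substring() with last omitted. It also shows that first and last can be vectors, which yields several substrings from a single string.

```r
# Sample string used only for illustration
x <- "Hello World"

# With last omitted, substring() runs to the end of the string
tail_part <- substring(x, 7)

# The substr() equivalent needs an explicit stop position
same_part <- substr(x, 7, nchar(x))

# Vector positions give one substring per (first, last) pair
pieces <- substring("Hello", 1:3, 3:5)

print(tail_part)  # "World"
print(same_part)  # "World"
print(pieces)     # "Hel" "ell" "llo"
```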

Let’s look at an example where we remove the last three characters from the string Hello Worldddd:

original_string <- "Hello Worldddd"

new_string <- substring(original_string, 1, nchar(original_string) - 3)

cat("Original String: ", original_string, "\n")
cat("New String: ", new_string, "\n")

The substring() function is applied to the string Hello Worldddd, specifying first = 1 and last = nchar(original_string) - 3.

Setting first to 1 starts at the beginning of the string, and last evaluates to 11, the position of the last character we keep (the string has 14 characters). This selects the portion of the string that we want to keep.

The result is a new string, new_string, which contains the original string with the last three characters removed.

Output:

Original String:  Hello Worldddd 
New String:  Hello World 

Now, let’s extend the example to demonstrate how you can apply the substring() function to a column in a data frame. Suppose we have a data frame with a column named City containing city names:

data <- data.frame(ID = 1:3, City = c("New Yorkkk", "Los Angelesss", "Chicagooo"))

data$City <- as.character(data$City)

data$ShortenedCity <- substring(data$City, 1, nchar(data$City) - 2)
print(data$ShortenedCity)

Output:

[1] "New York"    "Los Angeles" "Chicago"    

Using the same data as the previous sections, we use the substring() function to remove the last two characters from each city name by applying first = 1 and last = nchar(data$City) - 2.

The result is a modified data frame, data, with a new column named ShortenedCity containing city names with the last two characters removed. Because substring() is vectorized, it works directly on a whole data frame column without an explicit loop.

Conclusion

When it comes to removing the last few characters from a string in R, you have multiple tools at your disposal. The choice of method depends on your specific requirements and the nature of the data.

Whether you opt for base R's substr() or substring(), the negative indexing of stringr's str_sub(), or the pattern-based approach of gsub(), each method provides a workable solution for removing trailing characters in R.
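As a quick sanity check, the base R approaches covered in this article can be run on the same input to confirm that they agree. This is only a sketch; the variable names are chosen for illustration, and sub() is used in place of gsub() because the anchored pattern matches at most once.

```r
# City names from the data frame examples, with two extra trailing characters each
cities <- c("New Yorkkk", "Los Angelesss", "Chicagooo")
n <- 2L  # number of trailing characters to remove

via_substr    <- substr(cities, 1, nchar(cities) - n)
via_substring <- substring(cities, 1, nchar(cities) - n)
via_regex     <- sub(sprintf(".{%d}$", n), "", cities)

print(via_substr)
print(identical(via_substr, via_substring))
print(identical(via_substr, via_regex))
```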

Author: Manav Narula

Manav is an IT professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.
