How to Use grepl to Find Matches for Any Character String in the R Character Vector

Jinku Hu Feb 02, 2024
  1. Use grep or grepl Functions to Search for Pattern Matches in R
  2. Use grepl to Match Any Character Strings in the R Character Vector
How to Use grepl to Find Matches for Any Character String in the R Character Vector

This article will discuss several methods of using grepl to find matches for any character string in the R character vector.

Use grep or grepl Functions to Search for Pattern Matches in R

grep is used for pattern matching in a character vector. It takes pattern argument as a regular expression which gets matched by the function accordingly. grep returns vectors of indices for matched elements by default, but it can also return a character vector of matched elements if the user assigns TRUE to the value parameter. On the other hand, grepl returns a vector of boolean values indicating if the corresponding element matches the pattern or not. The following example demonstrates the alphabet letters matching with any alphabet letter, and we get the all element match result as expected.

grep("[a-z]", letters)
grep("[a-z]", letters, value = TRUE)
grepl("[a-z]", letters)
> grep("[a-z]", letters)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

> grep("[a-z]", letters, value = TRUE)
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v"
[23] "w" "x" "y" "z"

> grepl("[a-z]", letters)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

Use grepl to Match Any Character Strings in the R Character Vector

The grepl function can match any logical permutations of strings provided with the corresponding pattern. Note that grepl does not match different case letters by default. The following code snippet shows the first function that matches every string where The is found. On the other hand, the next call to grepl matches words containing either The or the.

words <- c("The", "licenses", "for", "most", "software", "are",
         "to", "share", "and", "change", "it.",
         "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
         "free", "for", "all", "its", "users")

i <- grepl("The", words)
i
i <- grepl("The|the", words)
i
[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Alternatively, we can utilize the if...else function to print string values for each matched or unmatched element in the character vector, as shown in the following code sample. Notice that the second call uses matching with any string containing the character from the e-h range.

words <- c("The", "Them", "for", "most", "software", "are",
         "to", "share", "and", "change", "it.",
         "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
         "free", "for", "all", "its", "users")

i <- ifelse(grepl("The|the", words), "Tr", "Fa")
i <- ifelse(grepl("[e-h]", words), "Tr", "Fa")
i
[1] "Tr" "Tr" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Tr" "Fa" "Fa"
[18] "Fa" "Fa" "Fa" "Fa" "Fa" "Fa" "Fa"

[1] "Tr" "Tr" "Tr" "Fa" "Tr" "Tr" "Fa" "Tr" "Fa" "Tr" "Fa" "Fa" "Fa" "Fa" "Tr" "Fa" "Tr"
[18] "Fa" "Tr" "Tr" "Tr" "Fa" "Fa" "Tr"
Author: Jinku Hu
Jinku Hu avatar Jinku Hu avatar

Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.

LinkedIn Facebook

Related Article - R Regex