How to Find All Substring Occurrences in Python String

Vaibhhav Khetarpal Feb 02, 2024
  1. Use the string.count() Function to Find All Occurrences of a Substring in a String
  2. Use List Comprehension and the startswith() Function to Find All Occurrences of a Substring in a String
  3. Use the re.finditer() Function to Find All Occurrences of a Substring in a String
  4. Use the str.find() Method to Find All Occurrences of a Substring in a String
  5. Conclusion

A substring in Python is a contiguous sequence of characters that occurs within another string. A common task when working with substrings is finding every occurrence of one within a given string.

This tutorial will discuss different methods to find all occurrences of a substring within a string in Python.

Use the string.count() Function to Find All Occurrences of a Substring in a String

The string.count() method is a built-in string method that returns the number of non-overlapping occurrences of a substring in a given string. It also accepts the optional parameters start and end, which restrict the search to the part of the string between those indices.

The count() method traverses the string and returns the number of times a specific substring has occurred in the string.

Basic syntax:

count = input_string.count(substring)

Parameters:

  • input_string: This is the original string in which you want to count occurrences.
  • substring: This is the substring you want to count.

The count variable will hold the number of times the substring appears in the input_string.

Code example:

str1 = "This shirt looks good; you have good taste in clothes."
substr = "good"

count1 = str1.count(substr)
print(count1)

count2 = str1.count(substr, 0, 25)
print(count2)

Output:

2
1

In this code, we start with the string str1, which contains the sentence "This shirt looks good; you have good taste in clothes.". We are interested in counting the occurrences of the substring "good" within this string.

The first count1 line uses the str.count() method to count all occurrences of "good" in the entire string str1. It returns 2, indicating that "good" appears twice in the full sentence.

The second count2 line counts occurrences of "good" only within the slice that starts at index 0 and stops before index 25, because the end argument is exclusive. That slice covers indices 0 through 24, which is the text "This shirt looks good; yo". It returns 1 because only the first "good" falls inside that range.

So, the code outputs 2 for count1 and 1 for count2, providing us with the counts of "good" in the entire string and a specific substring, respectively.
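The effect of the start and end arguments can be checked against ordinary slicing. The snippet below is a small sketch using the same sentence; the variable names text, sub, and window are only illustrative.

```python
text = "This shirt looks good; you have good taste in clothes."
sub = "good"

# end is exclusive, so count(sub, 0, 25) examines the same characters as text[0:25].
window = text[0:25]
print(repr(window))            # 'This shirt looks good; yo'
print(text.count(sub, 0, 25))  # 1
print(window.count(sub))       # 1

# Passing only start counts from that index to the end of the string.
print(text.count(sub, 25))     # 1
```

For a non-empty substring and indices inside the string, counting with start and end gives the same result as slicing first and counting afterwards, without creating the intermediate slice.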

This method is simple, but it counts only non-overlapping occurrences, and it does not return the indices at which the substring occurs in the string.
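Because count() skips past each match before searching again, overlapping occurrences are not counted. The sketch below contrasts it with a hypothetical helper, count_overlapping, which is not part of the standard library and is written here only for illustration.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing matches to overlap."""
    if not sub:
        # Assumption: an empty substring is rejected rather than counted.
        raise ValueError("sub must be a non-empty string")
    # Test every index at which sub could still fit into text.
    last_start = len(text) - len(sub)
    return sum(1 for i in range(last_start + 1) if text.startswith(sub, i))


print("aaaa".count("aa"))               # 2
print(count_overlapping("aaaa", "aa"))  # 3
```

In "aaaa", the substring "aa" begins at indices 0, 1, and 2, but count() reports only the two matches that do not share characters.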

Use List Comprehension and the startswith() Function to Find All Occurrences of a Substring in a String

This method needs two things: list comprehension and the startswith() method.

The startswith() method checks whether the substring begins at a given index, and list comprehension is used to run that check at every index of the target string.

Basic syntax:

res = [i for i in range(len(input_string)) if input_string.startswith(substring, i)]

Parameters:

  • input_string: This is the original string you want to search within.
  • substring: This is the substring you want to find.
  • i: This is the starting index at which you want to check for the substring.

The list comprehension iterates through the indices of the input_string and checks if the substring starts at each index using str.startswith(). If a match is found, the index is added to the res list.

Code example:

str1 = "This shirt looks good; you have good taste in clothes."
substr = "good"

print("The original string is : " + str1)
print("The substring to find : " + substr)

res = [i for i in range(len(str1)) if str1.startswith(substr, i)]

print("The start indices of the substrings are : " + str(res))

Output:

The original string is : This shirt looks good; you have good taste in clothes.
The substring to find : good
The start indices of the substrings are : [17, 32]

In this code, we start with the string str1, which contains the sentence "This shirt looks good; you have good taste in clothes". Our goal is to find the start indices of the substring "good" within this string.

We first print the original string and the substring we want to find. Then, we use list comprehension to iterate through the indices of the string, and for each index i, we check if the substring "good" starts at that position using the str.startswith() method. If it does, we include that index in the result list res.

The output shows the original string, the substring we are looking for, and the result list [17, 32]. These are the indices at which "good" begins in the string, one for each occurrence.
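The same idea can be wrapped in a reusable function. This is a minimal sketch; the name find_all_startswith and the decision to return an empty list for an empty substring are assumptions made for this example. Since every index is tested independently, overlapping occurrences are reported as well.

```python
def find_all_startswith(text, sub):
    """Return every index at which sub begins in text (overlaps included)."""
    if not sub:
        return []  # assumption: treat an empty substring as "no matches"
    # Indices past last_start cannot hold a full copy of sub, so skip them.
    last_start = len(text) - len(sub)
    return [i for i in range(last_start + 1) if text.startswith(sub, i)]


sentence = "This shirt looks good; you have good taste in clothes."
print(find_all_startswith(sentence, "good"))  # [17, 32]
print(find_all_startswith("banana", "ana"))   # [1, 3]
print(find_all_startswith("banana", "xyz"))   # []
```

Stopping the range at the last index where the substring can still fit avoids a few checks that could never succeed, but it does not change the result.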

Use the re.finditer() Function to Find All Occurrences of a Substring in a String

re.finditer() is a function of the regex library that Python provides for programmers to use in their code. It helps in performing the task of finding the occurrence of a particular pattern in a string. To use this function, we need to import the regex library re first.

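The re.finditer() function takes a pattern and a string as parameters. In this case, the pattern is the substring itself. This works as long as the substring contains no regex special characters such as . or $; otherwise, it should be passed through re.escape() first so that it is matched literally.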

Basic syntax:

import re

matches = [match.start() for match in re.finditer(pattern, input_string)]

Parameters:

  • pattern: The regex pattern you want to search for.
  • input_string: The string in which you want to find the matches.
  • matches: The resulting list of start indices. re.finditer() yields match objects, and match.start() extracts the start position of each one.

Code example:

import re

str1 = "This shirt looks good; you have good taste in clothes."
substr = "good"

print("The original string is: " + str1)
print("The substring to find: " + substr)

result = [_.start() for _ in re.finditer(substr, str1)]

print("The start indices of the substrings are: " + str(result))

Output:

The original string is: This shirt looks good; you have good taste in clothes.
The substring to find: good
The start indices of the substrings are: [17, 32]

In this code, we begin with the string str1, which contains the sentence "This shirt looks good; you have good taste in clothes". Our objective is to find the start indices of the substring "good" within this string using regular expressions.

First, we print the original string and specify the substring we are searching for. We then utilize the re.finditer() function from the re module to locate all non-overlapping occurrences of the "good" substring in str1.

Next, we store the start positions of these matches in the result list, extracting them using _.start() for each match. The output reveals the original string and the substring we are looking for.

The start indices of the substrings "good" are [17, 32], which indicates where "good" begins in the string. These indices represent the positions of its occurrences as identified by the regular expression search.
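Two details of the regex approach are worth illustrating: the substring is interpreted as a pattern unless it is escaped, and matches never overlap. The sketch below uses re.escape() for literal matching and a lookahead group for overlapping matches. The helper names find_literal and find_literal_overlapping are hypothetical and exist only for this example.

```python
import re


def find_literal(text, sub):
    """Start indices of non-overlapping literal matches of sub in text."""
    return [m.start() for m in re.finditer(re.escape(sub), text)]


def find_literal_overlapping(text, sub):
    """Start indices of all literal matches, overlapping ones included."""
    # A lookahead matches without consuming characters, so the search
    # can report a match at every position where sub begins.
    pattern = "(?=" + re.escape(sub) + ")"
    return [m.start() for m in re.finditer(pattern, text)]


sample = "a.b axb a.b"
# Unescaped, "." matches any character, so "axb" is reported too.
print([m.start() for m in re.finditer("a.b", sample)])  # [0, 4, 8]
print(find_literal(sample, "a.b"))                      # [0, 8]

print(find_literal("banana", "ana"))                    # [1]
print(find_literal_overlapping("banana", "ana"))        # [1, 3]
```

For a plain word such as "good", escaping makes no difference, but it keeps the search correct when the substring comes from user input or contains punctuation.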

Use the str.find() Method to Find All Occurrences of a Substring in a String

In Python, to use the str.find() method to find all occurrences of a substring in a string, you can create a loop that iterates through the string and uses str.find() to locate the substring.

Basic syntax:

res = []
start = 0

while start < len(input_string):
    index = input_string.find(substring, start)
    if index == -1:
        break
    res.append(index)
    start = index + 1

Parameters:

  • input_string: This is the original string in which you want to find occurrences.
  • substring: This is the substring you want to find.
  • start: This is the position where the search starts, initially set to 0.

The while loop continues until the find() method returns -1, meaning there are no more occurrences of the substring. If an occurrence is found, the index is added to the res list, and the start position moves to one character past the beginning of the match, so overlapping occurrences are found as well.

Code example:

str1 = "This shirt looks good; you have good taste in clothes."
substr = "good"

print("The original string is: " + str1)
print("The substring to find: " + substr)

res = []
start = 0

while start < len(str1):
    index = str1.find(substr, start)
    if index == -1:
        break
    res.append(index)
    start = index + 1

print("The start indices of the substrings are: " + str(res))

Output:

The original string is: This shirt looks good; you have good taste in clothes.
The substring to find: good
The start indices of the substrings are: [17, 32]

In this code, we start with the string str1, which contains the sentence "This shirt looks good; you have good taste in clothes". Our task is to find the starting positions of the substring "good" within this string.

First, we print the original string and specify the substring we are searching for. We then use a while loop that begins at index 0 and continues as long as the start position is within the length of the string.

Within the loop, we use the str.find() method to search for the substring "good" starting from the current start position. If a match is found (the index is not -1), we add that index to the res list and update the start position to continue searching.

The output shows the original string, the target substring, and the result list [17, 32]. These are the indices at which "good" begins in the string, one for each occurrence.
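The loop can also be written as a generator, which yields indices one at a time instead of building the whole list up front. This is a sketch under stated assumptions: the name iter_find, the overlapping flag, and the handling of an empty substring are all choices made for this example, not part of str.find() itself.

```python
def iter_find(text, sub, overlapping=True):
    """Yield the start index of each occurrence of sub in text."""
    if not sub:
        return  # assumption: an empty substring yields nothing
    # Advance by one character to allow overlaps, or by the full
    # substring length to skip past each match.
    step = 1 if overlapping else len(sub)
    start = 0
    while True:
        index = text.find(sub, start)
        if index == -1:
            return
        yield index
        start = index + step


sentence = "This shirt looks good; you have good taste in clothes."
print(list(iter_find(sentence, "good")))                    # [17, 32]
print(list(iter_find("banana", "ana")))                     # [1, 3]
print(list(iter_find("banana", "ana", overlapping=False)))  # [1]
```

With overlapping=False, the results match what re.finditer() and count() consider an occurrence; with the default, they match the startswith() approach.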

Conclusion

In summary, this article covers several techniques for finding occurrences of a substring within a string in Python: counting them with the string.count() function, collecting start indices with list comprehension and startswith(), running pattern-based searches with re.finditer(), and using a while loop in combination with str.find() to locate and record every occurrence.

Each method has its advantages, so we can choose the approach that suits our specific needs. Whether counting occurrences, capturing start positions, or applying regular expressions, these methods give developers a range of tools for handling substring-related tasks.


Vaibhhav is an IT professional who has a strong-hold in Python programming and various projects under his belt. He has an eagerness to discover new things and is a quick learner.
