GREP in Python

Manav Narula Oct 10, 2023
GREP is a command-line utility that searches plain text files for lines that match a regular expression.

Regular expressions are widely used in Python to check whether a string matches a pattern.

The re module in Python allows us to deal with regular expressions. In the following code, we will try to implement GREP in Python and search a file for some specific pattern.

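import re

pattern = r"error"  # example pattern; replace with the expression to search for

with open("sample.txt", "r") as file:
    for line in file:
        if re.search(pattern, line):
            print(line, end="")  # the line already ends with a newline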

We open the file in read mode and iterate over it line by line. The re.search() function looks for the pattern anywhere in each line, and if it finds a match, the line is printed.
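The same loop can be wrapped in a small function so that it is easier to reuse and test. The sketch below is one possible arrangement rather than a fixed recipe: the name grep_lines and the sample text are made up for illustration, and io.StringIO stands in for a real file. It compiles the pattern once with re.compile() and yields each matching line.

```python
import io
import re


def grep_lines(pattern, lines):
    """Yield every line in `lines` that contains a match for `pattern`."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    for line in lines:
        if regex.search(line):
            yield line


# Hypothetical sample text standing in for the contents of sample.txt.
sample = io.StringIO("apple pie\nbanana split\ncherry tart\n")

for match in grep_lines(r"an+a", sample):
    print(match, end="")  # the line already ends with a newline
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the StringIO object.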

A neater approach is to pass the regular expression and the file name as command-line arguments when running the script in the terminal. This mirrors how GREP itself is invoked.

The following code implements this.

import re
import sys

with open(sys.argv[2], "r") as file:
    for line in file:
        if re.search(sys.argv[1], line):
            print(line, end="")  # the line already ends with a newline

The sys module provides argv, a list of the command-line arguments passed to the script. sys.argv[0] is the name of the script itself, so sys.argv[1] holds the regular expression and sys.argv[2] holds the file name.
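Since the script indexes into the argument list directly, it fails with an IndexError when an argument is missing. A minimal sketch of one way to check the arguments first is shown here; the helper name parse_args and the usage text are assumptions for illustration, and a hand-written list stands in for sys.argv so the example runs anywhere.

```python
import sys


def parse_args(argv):
    """Return (pattern, filename) from an argv-style list, or exit with a usage message."""
    # argv[0] is the script name, so two user arguments give a length of 3.
    if len(argv) != 3:
        raise SystemExit(f"usage: python {argv[0]} PATTERN FILE")
    return argv[1], argv[2]


# A hypothetical argument list; in a real script, pass sys.argv instead.
pattern, filename = parse_args(["grep.py", "err(or)?", "sample.txt"])
print(pattern, filename)
```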

We can save this file as grep.py, run it from the terminal, and specify the necessary arguments in the following way.

python grep.py 'RE' 'file-name'

If we want to search several files, or use wildcard patterns such as *.txt, then we can use the glob module.

The glob module allows us to find the paths of files that match a pattern in a directory.

Its use in replicating GREP in Python can be seen below.

import re
import sys
import glob

for arg in sys.argv[2:]:
    for path in glob.iglob(arg):
        with open(path, "r") as file:  # closes each file after reading
            for line in file:
                if re.search(sys.argv[1], line):
                    print(line, end="")

The iglob() function returns an iterator that yields the paths matching the pattern passed to it.
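To see what iglob() yields on its own, the following sketch creates a few empty files in a temporary directory and lists the ones that match a wildcard. The file names are invented for this example, and the result is sorted only because the order in which iglob() yields paths is not guaranteed.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create three empty files; the names are made up for this example.
    for name in ("a.txt", "b.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()

    # iglob() returns a lazy iterator over the paths matching the wildcard.
    found = glob.iglob(os.path.join(tmp, "*.txt"))
    matches = sorted(os.path.basename(path) for path in found)
    print(matches)
```

On most Unix shells an unquoted wildcard is expanded by the shell before Python starts, so glob matters mainly when the wildcard is quoted or when the shell does not expand it.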

Another concise way of implementing GREP in just a few lines is shown below.

import re
import sys

# Write every line from standard input that matches the pattern.
sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1], l))

This way is more concise and memory efficient, because lines are read from standard input one at a time, and we can run it directly in the terminal.

python -c "import re,sys;sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1],l))" "RE" < 'file-name'
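
The one-liner reads from standard input instead of a named file, so it is typically used with input redirection or at the end of a pipe. The streaming idea can be sketched as a function that takes the input and output streams as parameters; the name filter_stream and the sample text are assumptions for illustration, and io.StringIO stands in for sys.stdin.

```python
import io
import re
import sys


def filter_stream(pattern, source, sink):
    """Copy lines from `source` to `sink`, keeping only those matching `pattern`."""
    # The generator yields one line at a time, so the whole input
    # is never held in memory at once.
    sink.writelines(line for line in source if re.search(pattern, line))


# io.StringIO stands in for sys.stdin; the text is invented for this example.
fake_stdin = io.StringIO("error: disk full\nok\nerror: timeout\n")
filter_stream(r"^error", fake_stdin, sys.stdout)
```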