GREP in Python

GREP is a command-line utility that searches plain text files for lines matching a regular expression.

Regular expressions are widely used in Python to check whether a string matches a pattern.
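As a quick illustration of checking a string against a pattern, the sketch below uses re.search(), which returns a match object when the pattern occurs anywhere in the string and None otherwise. The helper name contains and the sample strings are made up for this example.

```python
import re

def contains(pattern, text):
    """Return True if the regular expression occurs anywhere in text."""
    # re.search scans the whole string and returns None when nothing matches.
    return re.search(pattern, text) is not None

print(contains(r"\d+", "order 66"))     # True: a run of digits is present
print(contains(r"^order", "my order"))  # False: "order" is not at the start
```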

The re module in Python provides support for regular expressions. The following code implements a basic GREP in Python and searches a file for a specific pattern.

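import re

with open("sample.txt", "r") as file:
    for line in file:
        if re.search(r"pattern", line):  # "pattern" is a placeholder regex
            print(line, end="")  # each line already ends with a newline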

We open the file in read mode and iterate over it line by line. For each line, the re.search() function looks for the pattern; if the pattern is found, the line is printed.
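The same loop can be wrapped in a small function so that it is easier to reuse and test. This is a minimal sketch rather than part of the original code: the name grep_lines is hypothetical, and compiling the pattern once with re.compile() is an optional choice.

```python
import re

def grep_lines(pattern, lines):
    """Yield (line_number, line) for every line that matches pattern."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            yield number, line.rstrip("\n")

# Any iterable of strings works, including an open file object.
sample = ["first line\n", "second line\n", "third one\n"]
for number, line in grep_lines(r"line", sample):
    print(f"{number}:{line}")
```

Because the function accepts any iterable of strings, it can be fed a list in a test and an open file in a real script.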

We can also take the regular expression and the file to be searched from the command line when the script is run in the terminal. This replicates the way GREP itself is used more closely.

The following code implements this.

import re
import sys

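# sys.argv[1] is the regular expression, sys.argv[2] is the file to search
with open(sys.argv[2], "r") as file:
    for line in file:
        if re.search(sys.argv[1], line):
            print(line, end="")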

The sys module provides argv, a list of all the arguments given on the command line. Its first element, sys.argv[0], is the name of the script itself, so the user's arguments start at index 1.
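Because sys.argv is an ordinary list, indexing an argument that was not supplied raises an IndexError. One way to guard against that is to check the length first, as in the sketch below. The helper name parse_args and the sample argument values are assumptions for illustration only.

```python
def parse_args(argv):
    """Split argv into (pattern, filename), or return None if either is missing."""
    # argv[0] is the script name, so two user arguments means a length of 3.
    if len(argv) < 3:
        return None
    return argv[1], argv[2]

# Hypothetical argument lists standing in for sys.argv.
print(parse_args(["grep.py", "err", "log.txt"]))
print(parse_args(["grep.py"]))
```

In a real script, sys.argv would be passed in and a usage message printed when the result is None.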

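We can save this code as a Python script and run it from the terminal, passing the regular expression and the file name as arguments in the following way. Here, script-name.py stands for whatever name the file was saved under.

python script-name.py 'RE' 'file-name'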

If we want to search multiple files, or file names given as wildcard patterns, then we can use the glob module.

The glob module allows us to find the paths of files that match a pattern in a directory.

Its use in replicating GREP in Python can be seen below.

import re
import sys
import glob

for arg in sys.argv[2:]:
    for file in glob.iglob(arg):
        with open(file, "r") as f:  # close each file when done
            for line in f:
                if re.search(sys.argv[1], line):
                    print(line, end="")

The iglob() function returns an iterator that yields, one at a time, the paths of the files matching the pattern passed to it.
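To see what iglob() yields in isolation, the following sketch builds a temporary directory with a few files and lists those matching a wildcard. The file names are invented for the example, and sorted() is used only to make the output order predictable.

```python
import glob
import os
import tempfile

# Build a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "notes.md"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("hello\n")

    # iglob yields matching paths lazily instead of building a full list.
    matches = sorted(glob.iglob(os.path.join(tmp, "*.txt")))
    names = [os.path.basename(path) for path in matches]
    print(names)
```

On most Unix shells an unquoted wildcard is expanded by the shell before Python sees it. Quoting the pattern leaves the expansion to iglob() instead, and the loop over sys.argv[2:] handles both cases.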

Another concise way of implementing GREP in just a few lines is shown below.

import re, sys

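# writelines() consumes the generator; in Python 3 a bare map() is lazy and would write nothing
sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1], l))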

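This way is more concise, and because it uses a generator expression, it processes the input one line at a time instead of loading it all into memory. It reads from standard input, so we can run it directly in the terminal.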

python -c "import re,sys;sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1],l))" "RE"
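The stream-filtering idea can also be tried without a terminal. In Python 3, map() returns a lazy iterator, so a call whose result is never consumed writes nothing; the sketch below uses writelines() instead, which consumes the generator. The function name filter_stream is hypothetical, and in-memory streams stand in for sys.stdin and sys.stdout.

```python
import io
import re

def filter_stream(pattern, src, dst):
    """Copy lines from src to dst, keeping only those that match pattern."""
    regex = re.compile(pattern)
    # writelines() pulls every item from the generator and writes it out.
    dst.writelines(line for line in src if regex.search(line))

# In-memory streams stand in for sys.stdin and sys.stdout here.
source = io.StringIO("alpha\nbeta\ngamma\n")
target = io.StringIO()
filter_stream(r"^[ab]", source, target)
print(target.getvalue(), end="")
```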