GREP is a command-line utility that searches plain text files for lines matching a regular expression.
Regular expressions are very heavily used in Python and can be used to check if a string matches a pattern or not.
The re module in Python allows us to work with regular expressions. In the following code, we will implement a basic version of GREP in Python and search a file for a specific pattern.
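import re

pattern = "RE"  # placeholder; replace with the regular expression to search for

with open("sample.txt", "r") as file:
    for line in file:
        if re.search(pattern, line):
            print(line, end="")  # the line already ends with a newline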
We open the required file in read mode and iterate through it line by line. Then we use the re.search() function to look for the pattern anywhere in each line. If the pattern is found, the line is printed.
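The matching step can also be wrapped in a small helper, which makes its behaviour easier to try out. This is a minimal sketch rather than part of the original script: the name grep_lines and the sample lines are hypothetical. It relies on the fact that re.search() matches anywhere in a line, so a pattern must use ^ to anchor it to the start.

```python
import re

def grep_lines(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    return [line for line in lines if regex.search(line)]

# Hypothetical sample data standing in for the contents of a file.
sample = ["apple pie\n", "banana split\n", "pineapple juice\n"]

# "apple" matches anywhere in the line, so two lines are selected.
print("".join(grep_lines(r"apple", sample)), end="")

# "^apple" is anchored to the start, so only the first line is selected.
print("".join(grep_lines(r"^apple", sample)), end="")
```

Because the helper accepts any iterable of strings, an open file object can be passed in place of the sample list.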
There is another neat way to implement this with Python. In this method, the regular expression and the file to be searched are passed as command-line arguments when running the script in the terminal. This brings the script closer to the way GREP itself is used.
The following code implements this.
import re
import sys

# sys.argv[1] is the regular expression, sys.argv[2] is the file to search
with open(sys.argv[2], "r") as file:
    for line in file:
        if re.search(sys.argv[1], line):
            print(line, end="")
The sys module provides argv, a list (not a function) that holds the arguments provided on the command line. The first element, sys.argv[0], is the name of the script, so the arguments we supply start at index 1.
We can save this file as grep.py, run it from the terminal, and specify the necessary arguments in the following way.
python grep.py 'RE' 'file-name'
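A script that reads its arguments this way fails with a traceback if an argument is missing or the file cannot be opened. The following is a hedged sketch of one way to handle that, not the original implementation: the function name main and the error messages are hypothetical, and the return values (0 for a match, 1 for no match, 2 for an error) are an assumption chosen to mirror the exit status convention grep uses.

```python
import re
import sys

def main(argv):
    """Hypothetical entry point; argv is expected to look like sys.argv."""
    if len(argv) != 3:
        print("usage: python grep.py 'RE' 'file-name'", file=sys.stderr)
        return 2  # wrong number of arguments
    pattern, filename = argv[1], argv[2]
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        print(f"invalid regular expression: {exc}", file=sys.stderr)
        return 2
    found = False
    try:
        with open(filename, "r") as file:
            for line in file:
                if regex.search(line):
                    print(line, end="")
                    found = True
    except OSError as exc:
        print(f"cannot read {filename}: {exc}", file=sys.stderr)
        return 2
    return 0 if found else 1
```

To use this as a script, the file would end with a call such as sys.exit(main(sys.argv)), so that the shell receives the status code.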
If we want to search multiple files, we can use the glob module, which allows us to find the paths of files that match a pattern in a directory.
Its use in replicating GREP in Python can be seen below.
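import re
import sys
import glob

# sys.argv[1] is the regular expression; the remaining arguments are file patterns
for arg in sys.argv[2:]:
    for path in glob.iglob(arg):
        with open(path, "r") as file:
            for line in file:
                if re.search(sys.argv[1], line):
                    print(line, end="")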
The iglob() function returns an iterator that yields, one at a time, the paths matching the pattern passed to it.
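When several files are searched, it helps to know which file each match came from. The sketch below illustrates one way to do that under stated assumptions: grep_files is a hypothetical helper, directories are skipped because they cannot be read line by line, and the paths are sorted only to make the output order predictable.

```python
import glob
import os
import re

def grep_files(pattern, path_patterns):
    """Yield (path, line) pairs for each matching line in each matching file."""
    regex = re.compile(pattern)
    for path_pattern in path_patterns:
        # sorted() consumes the iglob() iterator; this trades laziness
        # for a predictable order.
        for path in sorted(glob.iglob(path_pattern)):
            if not os.path.isfile(path):
                continue  # skip directories and other non-file matches
            with open(path, "r") as file:
                for line in file:
                    if regex.search(line):
                        yield path, line
```

A caller could print each result as f"{path}:{line}" with end="" to label every match with its file name, for example by looping over grep_files(sys.argv[1], sys.argv[2:]).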
Another concise way of implementing GREP in just a few lines is shown below.
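import re, sys
# writelines() consumes the generator; in Python 3 a bare map() call would be lazy and write nothing
sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1], l))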
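This version is more concise and memory efficient, since the generator expression handles one line at a time. It reads its input from standard input, so the text to search must be piped or redirected to it, and we can run these lines directly in the terminal.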
python -c "import re,sys;sys.stdout.writelines(l for l in sys.stdin if re.search(sys.argv[1],l))" "RE"
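The same stream-filtering idea can be written as a function that works on any file-like object, which makes it possible to try it without a terminal. This is an illustrative sketch only: filter_stream is a hypothetical name, and the in-memory streams and log lines below are made-up stand-ins for sys.stdin and sys.stdout.

```python
import io
import re

def filter_stream(pattern, source, target):
    """Copy the lines of source that match pattern to target."""
    regex = re.compile(pattern)
    # The generator yields one line at a time, so the whole input
    # is never held in memory at once.
    target.writelines(line for line in source if regex.search(line))

# In-memory streams stand in for sys.stdin and sys.stdout here.
source = io.StringIO("error: disk full\ninfo: ok\nerror: timeout\n")
target = io.StringIO()
filter_stream(r"^error", source, target)
print(target.getvalue(), end="")
```

Calling filter_stream(sys.argv[1], sys.stdin, sys.stdout) after importing sys would give the same behaviour as the one-line version.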