How to Read a File Line by Line Into a List in Python

Suppose we have a file with the following content:

Line One: 1
Line Two: 2
Line Three: 3
Line Four: 4
Line Five: 5

We need to read the file content line by line into a list: ["Line One: 1", "Line Two: 2", "Line Three: 3", "Line Four: 4", "Line Five: 5"].

The sections below introduce several ways to read a file line by line into a list.
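
To follow along, the sample file has to exist first. The snippet below is a minimal sketch that writes the five lines shown above into a temporary directory and reads the raw text back. The file name sample.txt and the helper write_sample_file are assumptions made for illustration, not part of the original setup.

```python
import os
import tempfile

# The five sample lines from the text; the last line has no trailing newline.
SAMPLE_LINES = [
    "Line One: 1",
    "Line Two: 2",
    "Line Three: 3",
    "Line Four: 4",
    "Line Five: 5",
]


def write_sample_file(directory):
    """Write the sample content to <directory>/sample.txt and return its path."""
    path = os.path.join(directory, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(SAMPLE_LINES))
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = write_sample_file(tmp_dir)
    with open(file_path, "r", encoding="utf-8") as f:
        sample_text = f.read()

print(repr(sample_text))
```

Because the lines are joined with "\n", only the first four lines end with a newline, which matches the outputs shown in the rest of the article.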

readlines to read the file line by line in Python

readlines returns a list of lines from the stream.

>>> filePath = r"/your/file/path"
>>> with open(filePath, 'r', encoding='utf-8') as f:
...     f.readlines()
...
['Line One: 1\n', 'Line Two: 2\n', 'Line Three: 3\n', 'Line Four: 4\n', 'Line Five: 5']

Each line that ends with a newline keeps its trailing \n character in the returned string. It can be removed with str.rstrip('\n'):

>>> with open(filePath, 'r', encoding='utf-8') as f:
...     [_.rstrip('\n') for _ in f.readlines()]
...
['Line One: 1', 'Line Two: 2', 'Line Three: 3', 'Line Four: 4', 'Line Five: 5']
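
The same idea can be wrapped in a small reusable function. This is only a sketch: the function name read_lines_with_readlines and the temporary file are hypothetical. It also illustrates that rstrip('\n') removes only the newline, so other trailing whitespace on a line is preserved.

```python
import os
import tempfile


def read_lines_with_readlines(path):
    """Return the file's lines as a list, without their trailing newlines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f.readlines()]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        # The first line deliberately ends with two spaces before the newline.
        f.write("Line One: 1  \nLine Two: 2\n")
    lines = read_lines_with_readlines(path)

print(lines)
```

If trailing spaces and tabs should be dropped as well, calling rstrip() without an argument would strip all trailing whitespace instead.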

Iterating over the file to read it line by line in Python

We could iterate over the file to read it line by line, rather than using readlines.

>>> with open(filePath, 'r', encoding='utf-8') as f:
...     [_.rstrip('\n') for _ in f]
...
['Line One: 1', 'Line Two: 2', 'Line Three: 3', 'Line Four: 4', 'Line Five: 5']

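This method is better than the previous one from the perspective of memory usage. The readlines method loads every line of the file into a list before any of them can be processed, whereas iteration reads the file one line at a time. If each line is processed and then discarded, only the current line has to stay in memory, so iteration is preferred for very large files and helps avoid MemoryError. Note that collecting every line into a list, as in the example above, still keeps the whole content in memory; the saving comes from not building the intermediate list of raw lines first.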
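
The memory benefit is easiest to see when the lines are consumed lazily rather than stored. The sketch below uses a hypothetical generator, iter_stripped_lines, and a generated 1000-line test file (both are assumptions for illustration) to count matching lines without ever building a list.

```python
import os
import tempfile


def iter_stripped_lines(path):
    """Yield the file's lines one at a time, without the trailing newline."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "big.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        for i in range(1000):
            f.write(f"Line {i}\n")

    # Lines are consumed one by one; no list of all lines is created here.
    matching = sum(1 for line in iter_stripped_lines(path) if line.endswith("7"))

    # Wrapping the generator in list() materializes every line in memory again.
    all_lines = list(iter_stripped_lines(path))

print(matching, len(all_lines))
```

The first use keeps memory roughly constant regardless of file size, while the second one gives the same list as the list comprehension above.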

file.read method to read the file line by line in Python

file.read(size=-1, /) reads the whole file up to EOF when size is omitted or negative. The resulting string can then be split into lines with the str.splitlines method.

>>> with open(filePath, 'r', encoding='utf-8') as f:
...     f.read().splitlines()
...
['Line One: 1', 'Line Two: 2', 'Line Three: 3', 'Line Four: 4', 'Line Five: 5']

By default, str.splitlines drops the line-ending characters from the result. To keep them, set the keepends parameter to True.

>>> with open(filePath, 'r', encoding='utf-8') as f:
...     f.read().splitlines(keepends=True)
...
['Line One: 1\n', 'Line Two: 2\n', 'Line Three: 3\n', 'Line Four: 4\n', 'Line Five: 5']
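
It may help to see how str.splitlines differs from str.split('\n'). The sketch below, which uses pathlib and a hypothetical temporary file, shows that splitlines does not produce an empty last element when the text ends with a newline, and that it also splits on other line boundaries such as the form feed character \x0c.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "sample.txt"  # hypothetical file name
    path.write_text("Line One: 1\nLine Two: 2\n", encoding="utf-8")
    text = path.read_text(encoding="utf-8")

# splitlines ignores the final newline; split("\n") leaves an empty string.
with_splitlines = text.splitlines()
with_split = text.split("\n")

# splitlines also treats characters such as form feed as line boundaries.
form_feed_split = "page one\x0cpage two".splitlines()

print(with_splitlines)
print(with_split)
print(form_feed_split)
```

Because of the extra line boundaries, read().splitlines() could return more items than readlines() for files that contain such characters, which is worth keeping in mind when the two methods are treated as interchangeable.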

Comparison of different methods of reading a file line by line in Python

We will compare the performance of the methods introduced in this article. To make the differences easier to see, the test file is enlarged to 8000 lines.

>>> import timeit
>>> timeit.timeit('''with open(filePath, 'r', encoding='utf-8') as f:
...     f.readlines()''',
...               setup=r'filePath=r"C:\Test\Test.txt"',
...               number=10000)
16.36330720000001
>>> timeit.timeit('''with open(filePath, 'r', encoding='utf-8') as f:
...     [_ for _ in f]''',
...               setup=r'filePath=r"C:\Test\Test.txt"',
...               number=10000)
18.37279060000003
>>> timeit.timeit('''with open(filePath, 'r', encoding='utf-8') as f:
...     f.read().splitlines()''',
...               setup=r'filePath=r"C:\Test\Test.txt"',
...               number=10000)
12.122660100000019

In this test, readlines() is slightly faster than iterating over the file (about 11% less time), and file.read().splitlines() is the fastest of the three, taking more than 25% less time than either of the other two methods.

However, when memory is the constraint, as in big-data applications, iterating over the file is the better choice, as explained above.
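
The comparison can be reproduced without a fixed file path. The following is a rough sketch, not the exact setup used above: it generates a hypothetical 8000-line file in a temporary directory, and the function names and the repeat count of 50 are assumptions chosen to keep the run short. Absolute timings depend on the machine, the Python version, and the file content.

```python
import os
import tempfile
import timeit


def use_readlines(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.readlines()


def use_iteration(path):
    with open(path, "r", encoding="utf-8") as f:
        return [line for line in f]


def use_read_splitlines(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bench.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(f"Line {i}\n" for i in range(8000))

    # Time each approach; timeit accepts a zero-argument callable.
    timings = {
        func.__name__: timeit.timeit(lambda: func(path), number=50)
        for func in (use_readlines, use_iteration, use_read_splitlines)
    }
    line_count = len(use_readlines(path))

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s for 50 runs over {line_count} lines")
```

Passing callables instead of code strings avoids the need for a setup string, at the cost of a small function-call overhead that is the same for all three candidates.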
