How to get all the files of a directory

There are at least three ways to list all the files in a directory in Python: os.listdir, os.walk, and glob.glob.

This tutorial assumes the following:

  1. Python version - Python 3
  2. The directory path is stored in dirPath and the directory exists on the system, so we do not need to check for its existence.

os.listdir

os.listdir lists both the files and the folders in the directory, so extra code is needed to keep only the files.

import os
dirPath = r"C:\git\DelftStack\content"
result = [f for f in os.listdir(dirPath) if os.path.isfile(os.path.join(dirPath, f))]
print(result)

os.listdir returns only the names of the files and folders, relative to dirPath, whereas os.path.isfile needs a path it can resolve to check whether an entry is a file. Hence we use os.path.join to combine dirPath with each name returned by os.listdir and get the full path of each file or folder.
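The filtering step above can be wrapped in a small helper. The following is a minimal sketch, not part of the original tutorial: the function name list_files and the temporary demo directory are assumptions made so that the example runs on any machine.

```python
import os
import tempfile


def list_files(dir_path):
    """Return the sorted names of the regular files directly inside dir_path.

    list_files is a hypothetical helper name used only for this illustration.
    """
    return sorted(
        name
        for name in os.listdir(dir_path)
        # isfile needs the joined path; the bare name would be resolved
        # against the current working directory instead of dir_path.
        if os.path.isfile(os.path.join(dir_path, name))
    )


# Build a throwaway directory so the example does not depend on real paths.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "about.md"), "w").close()
    open(os.path.join(demo_dir, "notes.txt"), "w").close()
    os.mkdir(os.path.join(demo_dir, "images"))

    print(sorted(os.listdir(demo_dir)))  # ['about.md', 'images', 'notes.txt']
    print(list_files(demo_dir))          # ['about.md', 'notes.txt']
```

The first print shows that os.listdir reports the images folder alongside the files, while the helper drops it.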

os.walk

os.walk generates the file names in a directory tree by walking the tree either top-down (the default) or bottom-up. For each directory in the tree, including the top directory itself, it yields a 3-tuple (dirpath, dirnames, filenames).

The first tuple yielded by os.walk describes dirPath itself, so its third element lists all the files directly inside dirPath. A Pythonic way to get them is therefore

import os
dirPath = r"C:\git\DelftStack\content"
result = next(os.walk(dirPath))[2]
print(result)
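To make the shape of the yielded tuples more concrete, here is a small sketch under stated assumptions: the helper names top_level_files and all_files_recursive and the temporary demo directory are illustrative and do not come from the original tutorial. It contrasts taking only the first tuple with looping over the whole walk.

```python
import os
import tempfile


def top_level_files(dir_path):
    """Files directly inside dir_path (hypothetical helper name)."""
    # The first tuple yielded by os.walk describes dir_path itself.
    _dirpath, _dirnames, filenames = next(os.walk(dir_path))
    return sorted(filenames)


def all_files_recursive(dir_path):
    """Files anywhere under dir_path, as paths relative to dir_path."""
    found = []
    for current_dir, _dirnames, filenames in os.walk(dir_path):
        for name in filenames:
            full_path = os.path.join(current_dir, name)
            found.append(os.path.relpath(full_path, dir_path))
    return sorted(found)


# Throwaway directory: one file at the top level, one in a subfolder.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "posts"))
    open(os.path.join(demo_dir, "about.md"), "w").close()
    open(os.path.join(demo_dir, "posts", "first.md"), "w").close()

    print(top_level_files(demo_dir))      # ['about.md']
    print(all_files_recursive(demo_dir))  # also includes the file under posts
```

Taking only the first tuple stops the walk before any subfolder is visited, which is why it returns just the top-level files.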

glob.glob

The glob module finds all the path names that match a given pattern according to the rules used by the Unix shell. glob.glob returns the list of path names that match the pattern. Here the file name part of the pattern is *.*, which matches any name containing a dot; the full pattern, directory included, is what gets passed to glob.glob as the input argument.

import glob
dirPathPattern = r"C:\git\DelftStack\content\*.*"
result = glob.glob(dirPathPattern)
print(result)

glob.glob returns the full path of the matched files, like C:\git\DelftStack\content\about.md.

Warning

The result of glob.glob as shown here is not guaranteed to contain files only, because glob.glob checks only whether the path name matches the pattern, not whether the path is a file or a directory. For example, if a directory has a name such as test.test, that directory is also included in the result.
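One way to address this is to filter the glob results with os.path.isfile. The sketch below is an illustration rather than the tutorial's own method: the helper name list_files_glob and the demo directory are assumptions. It also uses the pattern * instead of *.*, because *.* skips files whose names contain no dot; note that * still does not match names that begin with a dot.

```python
import glob
import os
import tempfile


def list_files_glob(dir_path):
    """Return full paths of the regular files directly inside dir_path.

    list_files_glob is a hypothetical helper name for this illustration.
    """
    # glob.escape protects any special characters in the directory part.
    pattern = os.path.join(glob.escape(dir_path), "*")
    return sorted(path for path in glob.glob(pattern) if os.path.isfile(path))


# Throwaway directory: two files and a folder whose name looks like a file.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "about.md"), "w").close()
    open(os.path.join(demo_dir, "README"), "w").close()
    os.mkdir(os.path.join(demo_dir, "test.test"))

    dotted = glob.glob(os.path.join(glob.escape(demo_dir), "*.*"))
    # *.* picks up the test.test folder and misses README.
    print(sorted(os.path.basename(p) for p in dotted))
    # The filtered helper returns README and about.md only.
    print([os.path.basename(p) for p in list_files_glob(demo_dir)])
```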