This tutorial will introduce how to get the file extension from the filename in Python.
os.path Module to Extract Extension From File in Python
Python's os.path module provides utility functions for manipulating file paths, such as joining, splitting, and normalizing them, and for getting information about the files they point to.
We will use this module to get the file extension in Python.
os.path has a function, splitext(), that splits a file path into its root and its extension. The function returns a tuple containing the root string and the extension string.
Let's provide an example file path with a .docx extension. The expected output is the extension .docx. Declare two separate variables, root and extension, to catch the result of splitext().
import os

path = '/Users/user/Documents/sampledoc.docx'
# splitext() returns a (root, extension) tuple
root, extension = os.path.splitext(path)
print('Root:', root)
print('Extension:', extension)
Root: /Users/user/Documents/sampledoc
Extension: .docx
The extension has now been successfully separated from the root of the file path.
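As a quick check of how splitext() behaves on less tidy names, the sketch below runs it on a few made-up file names (the names are illustrative assumptions, not taken from a real project). Only the last suffix is split off, and a leading dot in a name such as .bashrc is not treated as the start of an extension.

```python
import os

# Hypothetical file names chosen to show edge cases of os.path.splitext().
samples = ["sampledoc.docx", "app_sample.tar.gz", ".bashrc", "README"]

# Map each name to the (root, extension) tuple returned by splitext().
results = {name: os.path.splitext(name) for name in samples}

for name, (root, extension) in results.items():
    print(f"{name!r}: root={root!r}, extension={extension!r}")
```

Note that a name with no extension yields an empty string rather than None, so the result can be tested with a plain truthiness check.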
pathlib Module to Extract Extension From File in Python
pathlib is a Python module that contains classes representing file paths and implements utility functions and constants for these classes.
pathlib.Path() accepts a path string as an argument and returns a new Path object. A pathlib.Path object has the attribute suffix, which holds the file extension.
import pathlib

path = pathlib.Path('/Users/user/Documents/sampledoc.docx')
print('Parent:', path.parent)
print('Filename:', path.name)
print('Extension:', path.suffix)
Other than the extension, we can also get the parent directory and the actual file name of the given file path by reading the attributes parent and name of the Path object.
Parent: /Users/user/Documents
Filename: sampledoc.docx
Extension: .docx
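The following is a minimal sketch of what suffix returns when a file has no extension or is a dotfile. It uses PurePosixPath so that the result does not depend on the operating system, and it also reads the stem attribute (the file name without its last suffix), which the text above does not cover. The helper name describe() and the extra paths are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def describe(path_string):
    """Hypothetical helper: return (stem, suffix) for a path string."""
    path = PurePosixPath(path_string)
    return path.stem, path.suffix


# The first path is the example used above; the other two are made up.
print(describe("/Users/user/Documents/sampledoc.docx"))
print(describe("/Users/user/Documents/README"))
print(describe("/Users/user/.bashrc"))
```

As with os.path.splitext(), a missing extension shows up as an empty string, and a leading dot alone does not count as a suffix.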
What if we have a file with more than one extension, such as .tar.gz?
pathlib also provides an attribute for files whose extension consists of multiple suffixes. The suffixes attribute of the Path object is a list containing all of the suffixes of the given file. Let's take the earlier .docx example and print out its suffixes attribute:
import pathlib

path = pathlib.Path('/Users/user/Documents/sampledoc.docx')
print('Suffix(es):', path.suffixes)
Suffix(es): ['.docx']
So even if there is only one suffix, the result is a single-element list.
Now try an example with a .tar.gz extension. To convert the list into a single string, call join() on an empty string and pass the suffixes attribute as the argument.
import pathlib

path = pathlib.Path('/Users/user/Documents/app_sample.tar.gz')
print('Parent:', path.parent)
print('Filename:', path.name)
# Join the list of suffixes into one extension string
print('Extension:', ''.join(path.suffixes))
Parent: /Users/user/Documents
Filename: app_sample.tar.gz
Extension: .tar.gz
Now the actual extension is displayed instead of a list.
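One caveat with joining every suffix is that any dot in the file name counts, so a name such as report.v2.final.tar.gz produces more than the archive extension. The sketch below shows one possible workaround, assuming you only care about a small set of known compound extensions. The KNOWN_COMPOUND set, the full_extension() helper, and the file names are all hypothetical and would need adjusting for real use.

```python
from pathlib import PurePosixPath

# Assumed set of compound extensions to recognise; extend it as needed.
KNOWN_COMPOUND = {".tar.gz", ".tar.bz2", ".tar.xz"}


def full_extension(path_string):
    """Hypothetical helper: return a known compound extension, else the last suffix."""
    path = PurePosixPath(path_string)
    # Join only the last two suffixes and check them against the known set.
    last_two = "".join(path.suffixes[-2:])
    if last_two in KNOWN_COMPOUND:
        return last_two
    return path.suffix


# Joining all suffixes picks up the version markers as well.
joined = "".join(PurePosixPath("report.v2.final.tar.gz").suffixes)
print(joined)
print(full_extension("report.v2.final.tar.gz"))
print(full_extension("notes.v2.txt"))
```

The comparison here is case-sensitive; whether to normalise case first is a design choice that depends on the files being handled.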
In summary, the two modules os and pathlib provide convenient ways to get the file extension from a file path in Python.
The os module has the function os.path.splitext() to split the root of the path from the file extension. pathlib creates a Path object and exposes the extension through the attribute suffix.
If you're anticipating more than one extension in a file name, it would be best to use pathlib, as it provides easy support for multiple extensions through the attribute suffixes.