Read an Excel File Using Python

Read an Excel File Using Python

  1. Read an Excel File Using pandas Package in Python
  2. Read an Excel File Using the xlrd Package in Python
  3. Examples of Tasks Performed Over Excel Files in Python

Python programming language is well known for its use in the field of data science. Data science generally involves dealing with data and analyzing it with the help of graphs and plots such as line plots, violin plots, histograms, and heat maps, and mathematical computations such as mean, median, mode, probability, variance, etc. What makes Python even more suitable is the fact that it makes file reading and manipulation very seamless. Since data is generally represented in popular file formats such as xls, xlsx, csv, txt, etc., handling them with Python is a piece of cake.

This article will introduce how to read excel files using Python with the help of some examples. For instance, we will consider a sample excel file that you can download from here so that we all are on the same page. Just rename it to sample.xls for the following code snippets to work, or change the file name in the following code snippets itself.

Read an Excel File Using pandas Package in Python

In Python, we can use the pandas library to read an excel file. The pandas module is a robust, powerful, fast, and flexible open-source data analysis and manipulation library written in Python. If you don’t have it installed on your machine or virtual environment, use the following command.

  • To install pandas: pip install pandas or pip3 install pandas

Refer to the following code for reading an excel file using the pandas module.

import xlrd
import pandas

df = pandas.read_excel("sample.xls")
print("Columns")
print(df.columns)

Output:

Columns
Index(['Segment', 'Country', 'Product', 'Discount Band', 'Units Sold',
       'Manufacturing Price', 'Sale Price', 'Gross Sales', 'Discounts',
       ' Sales', 'COGS', 'Profit', 'Date', 'Month Number', 'Month Name',
       'Year'],
      dtype='object')

Read an Excel File Using the xlrd Package in Python

In Python, we can use the xlrd package to read excel files. The xlrd module is a Python package used for reading and formatting excel files. In case you don’t have it installed on your machine or virtual environment, use the following command.

  • To install xlrd, use the following command.
pip install xlrd

Or,

pip3 install xlrd

Refer to the following code for reading excel files using xlrd.

from xlrd import open_workbook

wb = open_workbook('sample.xls')
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
columns = []
print("Columns")
 
for i in range(sheet.ncols):
    columns.append(sheet.cell_value(0, i))
    
print(columns)

Output:

Columns
['Segment', 'Country', 'Product', 'Discount Band', 'Units Sold', 'Manufacturing Price', 'Sale Price', 'Gross Sales', 'Discounts', ' Sales', 'COGS', 'Profit', 'Date', 'Month Number', 'Month Name', 'Year']

Here’s a brief explanation of what the above code does. It first creates a file descriptor for the excel file with the help of the open_workbook() function. Then it resets the file pointer to the (0,0) position or the top-left cell. Next, it iterates over the first row and stores all the column names in a variable. Generally, columns name are present in the first row; that is why the code considers that location. In case the column names are on some different row, one can change the 0 value in the statement sheet.cell_value(0, i) to whatever row number they wish to. Essentially, (0, i) represents y and x coordinates, where y is 0, and x is i, considering origin (0, 0) to be present at the top-left corner of the file.

Examples of Tasks Performed Over Excel Files in Python

Let’s look at some simple tasks that we can perform over excel files to understand these two libraries better.

Printing the First 3 Rows of an Excel File

Using the pandas package

import pandas

df = pandas.read_excel("sample.xls")
count = 3

for index, row in df.iterrows():
    print(row, end = "\n\n")
    
    if index == count - 1:
        break

Output:

Segment                         Government
Country                             Canada
Product                          Carretera
Discount Band                         None
Units Sold                          1618.5
Manufacturing Price                      3
Sale Price                              20
Gross Sales                        32370.0
Discounts                              0.0
 Sales                             32370.0
COGS                               16185.0
Profit                             16185.0
Date                   2014-01-01 00:00:00
Month Number                             1
Month Name                         January
Year                                  2014
Name: 0, dtype: object

Segment                         Government
Country                            Germany
Product                          Carretera
Discount Band                         None
Units Sold                          1321.0
Manufacturing Price                      3
Sale Price                              20
Gross Sales                        26420.0
Discounts                              0.0
 Sales                             26420.0
COGS                               13210.0
Profit                             13210.0
Date                   2014-01-01 00:00:00
Month Number                             1
Month Name                         January
Year                                  2014
Name: 1, dtype: object

Segment                          Midmarket
Country                             France
Product                          Carretera
Discount Band                         None
Units Sold                          2178.0
Manufacturing Price                      3
Sale Price                              15
Gross Sales                        32670.0
Discounts                              0.0
 Sales                             32670.0
COGS                               21780.0
Profit                             10890.0
Date                   2014-06-01 00:00:00
Month Number                             6
Month Name                            June
Year                                  2014
Name: 2, dtype: object

Using the xlrd package

from xlrd import open_workbook

wb = open_workbook('sample.xls')
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
count = 3

for i in range(1, count + 1):
    for j in range(sheet.ncols):
        print(sheet.cell_value(i, j), end = ", ")
        
    print()

Output:

Government, Canada, Carretera, None, 1618.5, 3.0, 20.0, 32370.0, 0.0, 32370.0, 16185.0, 16185.0, 41640.0, 1.0, January, 2014, 
Government, Germany, Carretera, None, 1321.0, 3.0, 20.0, 26420.0, 0.0, 26420.0, 13210.0, 13210.0, 41640.0, 1.0, January, 2014, 
Midmarket, France, Carretera, None, 2178.0, 3.0, 15.0, 32670.0, 0.0, 32670.0, 21780.0, 10890.0, 41791.0, 6.0, June, 2014, 

Printing Values of a Specific Column

Using the pandas package

import pandas

df = pandas.read_excel("sample.xls")
column = df.columns[4]
print(column)
print("-" * len(column))

for index, row in df.iterrows():
    print(row[column])

Output:

Units Sold
----------
1618.5
1321.0
2178.0
888.0
2470.0
1513.0
921.0
2518.0
1899.0
1545.0
2470.0
2665.5
958.0
2146.0
345.0
615.0
292.0
974.0
2518.0
1006.0
367.0
883.0
549.0
788.0
2472.0
1143.0
1725.0
912.0
2152.0
1817.0
1513.0
1493.0
1804.0
2161.0
1006.0
1545.0
2821.0
345.0
2001.0
2838.0
2178.0
888.0
...

Using the xlrd package

from xlrd import open_workbook

wb = open_workbook('sample.xls')
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
column_index = 4
column = sheet.cell_value(0, column_index)
print(column)
print("-" * len(column))

for row in range(1, sheet.nrows):
    print(sheet.cell_value(row, column_index))

Output:

Units Sold
----------
1618.5
1321.0
2178.0
888.0
2470.0
1513.0
921.0
2518.0
1899.0
1545.0
2470.0
2665.5
958.0
2146.0
345.0
615.0
292.0
974.0
2518.0
1006.0
367.0
883.0
549.0
788.0
2472.0
1143.0
1725.0
912.0
2152.0
1817.0
1513.0
1493.0
1804.0
2161.0
1006.0
1545.0
2821.0
345.0
2001.0
2838.0
2178.0
888.0
...
Vaibhav Vaibhav avatar Vaibhav Vaibhav avatar

Vaibhav is an artificial intelligence and cloud computing stan. He likes to build end-to-end full-stack web and mobile applications. Besides computer science and technology, he loves playing cricket and badminton, going on bike rides, and doodling.

LinkedIn GitHub

Related Article - Python Excel

  • Python Export to Excel
  • Read XLSX Files Using Python and Pandas