How to Read SAS in Pandas

Abid Ullah Feb 02, 2024
  1. What are SAS files
  2. Open .SAS Files in Python
  3. Read SAS File Using Pandas in Python
  4. Read Specific Columns From the SAS File in Python
  5. Save the SAS Files to CSV in Python
Python is a programming language widely used for data analysis, data modeling, and visualization. However, data often arrives in formats produced by other tools, and loading it into Python can be tedious at times.

In this article, we focus on how to open and use SAS files in Python. We discuss how to read data from SAS files with pyreadstat and with Pandas, how to read only specific columns, and how to save the loaded data to CSV.

What are SAS files

SAS stands for Statistical Analysis System, a software suite for statistics and data management. The files it produces hold data sets and are common in data analytics, business intelligence, predictive analysis, computational analysis, and data management.

In most cases, SAS files have the extension .sas7bdat (a data set) or .sas7bcat (a catalog, which typically holds supporting items such as value labels).
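As a rough illustration of these two extensions, the sketch below maps a file name to the kind of SAS file it probably is. The helper name sas_file_kind and the example file names are made up for this illustration, and the check looks only at the extension, not at the file contents.

```python
from pathlib import Path

# Extensions named above, mapped to a short description.
SAS_KINDS = {
    ".sas7bdat": "dataset",
    ".sas7bcat": "catalog",
}


def sas_file_kind(filename):
    """Guess the kind of SAS file from its extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    return SAS_KINDS.get(suffix, "unknown")


# Example file names are assumptions, not files shipped with this article.
for name in ["gold.sas7bdat", "formats.sas7bcat", "notes.txt"]:
    print(name, "->", sas_file_kind(name))
```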

Open .SAS Files in Python

To open a SAS file in Python, we have two different methods. In the first method, we use pyreadstat, a package that reads SAS files into data frames.

The second method uses Pandas directly. Its read_sas function loads a SAS file into a Pandas data frame in our Python notebook.

First, we need to install Pyreadstat by running the following command.

pip install pyreadstat

Once the package is installed, we can load SAS files into our Python notebook.

In the next step, we import the pyreadstat package we just installed.

import pyreadstat

The pyreadstat package is now imported and ready to use.

Now, to open the SAS file with the .sas7bdat extension, we need to use read_sas7bdat.

Suppose we want to read a file named gold.sas7bdat. The following code loads it. The read_sas7bdat function returns two values: a Pandas data frame with the data and an object holding the file's metadata.

df, meta = pyreadstat.read_sas7bdat("/gold.sas7bdat")

Now that the file is loaded with pyreadstat, its contents are available in the df variable. To check the type of df, we run the following line.

type(df)

Output:

The output of df type

Now that we know it is a Pandas data frame, we can use all the methods available for data frame objects. Suppose we want to print the first five entries in the file.

The following code displays them.

df.head()

Output:

The output of the DF head
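The meta value returned alongside df carries information about the file itself. As a hedged sketch, the snippet below collects a few of its attributes into a dictionary. The helper name describe_sas and the relative file name are assumptions made for this illustration, and the function returns None when the file is missing so the snippet can run without the data.

```python
from pathlib import Path


def describe_sas(path):
    """Load a .sas7bdat file and summarize its metadata (hypothetical helper)."""
    if not Path(path).exists():
        # Nothing to read; the file name used below is only an assumption.
        return None

    # Imported here so the missing-file check works even without the package.
    import pyreadstat

    df, meta = pyreadstat.read_sas7bdat(path)
    return {
        "shape": df.shape,                # (rows, columns) of the data frame
        "rows": meta.number_rows,         # row count recorded in the file
        "columns": meta.column_names,     # column names, in file order
        "labels": meta.column_labels,     # descriptive labels, if the file has any
        "encoding": meta.file_encoding,   # text encoding recorded in the file
    }


summary = describe_sas("gold.sas7bdat")
print(summary if summary is not None else "gold.sas7bdat not found")
```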

Read SAS File Using Pandas in Python

This section will help us understand how to load the same file used above using Pandas.

In the first step, we shall import pandas. This can be done with the following code.

import pandas as pd

This code imports the Pandas library into our notebook.

Next, we load the file into our notebook using the Pandas read_sas function.

geturl = "/gold.sas7bdat"
df = pd.read_sas(geturl)

This code loads the file into our notebook. Now, let’s print the first five records of the file, as we did with pyreadstat.

df.head()

Output:

The output of the Pandas DF head
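One thing to watch for with read_sas: when no encoding is given, text columns come back as raw bytes, so values display as b'...'. Passing the encoding argument with whichever encoding the file uses, for example pd.read_sas(geturl, encoding="utf-8"), is the usual fix. The sketch below shows an after-the-fact fix instead, on a small stand-in data frame. The column names, the values, and the helper decode_bytes_columns are invented for this illustration.

```python
import pandas as pd

# Stand-in for a frame loaded by pd.read_sas without an encoding:
# text values arrive as bytes. All values here are made up.
raw = pd.DataFrame({"METAL": [b"gold", b"silver"], "YEAR": [1948.0, 1949.0]})


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes values decoded to str (hypothetical helper)."""
    out = df.copy()
    for col in out.columns:
        # Decode only bytes values; numbers and missing values pass through.
        out[col] = out[col].map(
            lambda v: v.decode(encoding) if isinstance(v, bytes) else v
        )
    return out


clean = decode_bytes_columns(raw)
print(clean)
```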

Read Specific Columns From the SAS File in Python

If we only need specific columns from a file, we can pass the usecols argument to pyreadstat's read_sas7bdat function. The following code reads only the YEAR column from a file named airline.sas7bdat.

columns = ["YEAR"]
df, meta = pyreadstat.read_sas7bdat("/airline.sas7bdat", usecols=columns)
df.head()

Output:

The output of code using usecols
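The Pandas read_sas function has no usecols argument, so with Pandas the columns are selected after loading. For large files, read_sas accepts a chunksize argument and then yields the data piece by piece, which lets us keep only the needed columns from each piece. The following is a minimal sketch of that idea. The helper collect_columns is hypothetical, and the chunks are small stand-in frames with made-up values so the snippet runs without the data file.

```python
import pandas as pd


def collect_columns(chunks, columns):
    """Keep only `columns` from each chunk and join the pieces (hypothetical helper)."""
    return pd.concat([chunk[columns] for chunk in chunks], ignore_index=True)


# Stand-in chunks with made-up values. With a real file they could come from
#   pd.read_sas("/airline.sas7bdat", chunksize=1000)
chunks = [
    pd.DataFrame({"YEAR": [1948.0, 1949.0], "VALUE": [1.2, 1.4]}),
    pd.DataFrame({"YEAR": [1950.0], "VALUE": [1.5]}),
]

df_year = collect_columns(chunks, ["YEAR"])
print(df_year)
```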

Save the SAS Files to CSV in Python

To save the data from a .sas7bdat file as CSV, we call the to_csv method of the data frame. The following code writes the data frame created above to a CSV file; index=False leaves the row index out of the output.

df.to_csv("ourdatafile.csv", index=False)

The above code saves the data loaded from the SAS file in CSV format.
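To see what index=False changes, the short sketch below writes the same data frame both ways and prints the header line of each result. The frame is a stand-in with made-up values. Calling to_csv without a file name returns the CSV text as a string, which keeps the example free of temporary files.

```python
import pandas as pd

# Stand-in data frame; the column names and values are made up.
df_small = pd.DataFrame({"YEAR": [1948.0, 1949.0], "VALUE": [1.2, 1.4]})

# With no path given, to_csv returns the CSV text instead of writing a file.
with_index = df_small.to_csv()
without_index = df_small.to_csv(index=False)

# Compare the header lines of the two versions.
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```

With the index included, the header starts with an empty field for the unnamed index column; with index=False, the header lists only the data columns.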

We hope you find this article helpful in learning how to use SAS files using Python.

Author: Abid Ullah
