How to Download CSV From URL in Python

Manav Narula Feb 12, 2024
  1. Use the pandas.read_csv() Function to Download a CSV File From a URL in Python
  2. Use the urllib and csv Modules to Download a CSV File From a URL in Python
  3. Use the requests and csv Modules to Download a CSV File From a URL in Python
  4. Conclusion

CSV, or Comma-Separated Values, is a widely used file format for storing tabular data. Its simplicity and compatibility make it a standard choice for data interchange between applications and platforms.

Data scientists and analysts often need to pull datasets straight from online sources. This article covers three ways to download a CSV file from a URL: pandas.read_csv(), urllib with the csv module, and requests with the csv module.

Use the pandas.read_csv() Function to Download a CSV File From a URL in Python

The read_csv() function from the pandas library can read CSV data from several kinds of sources, including local paths and URLs, and returns the result as a pandas DataFrame.

We can use this function to download CSV files from a URL in Python by providing the URL within the function directly.

To illustrate this concept, let’s consider a scenario where we have a CSV file hosted online, and we want to bring it into our Python environment for analysis.

import pandas as pd

# Define the URL of the CSV file
url = "https://support.staffbase.com/hc/en-us/article_attachments/360009197031/username.csv"

# Use pandas.read_csv() to directly read the CSV file into a DataFrame
df = pd.read_csv(url)

# Specify the destination filename for the locally saved CSV file
destination_filename = "data.csv"

# Save the DataFrame to a new CSV file using to_csv() method
df.to_csv(destination_filename, index=False)

# Print a message indicating a successful download
print("CSV file successfully downloaded and saved as 'data.csv'.")

In this example, we start by importing the pandas library under its conventional alias, pd.

The key line in this code is df = pd.read_csv(url). Here, the read_csv() function fetches the CSV data from the specified URL and parses it into a DataFrame named df.

A DataFrame is a tabular data structure that facilitates various operations on the data.

Moving forward, we set the destination_filename variable to the desired name for our locally stored CSV file, which will contain the downloaded data.

The df.to_csv(destination_filename, index=False) line takes care of saving our DataFrame to a new CSV file in the local directory. The index=False parameter ensures that the DataFrame’s index is not included in the saved CSV file.
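To see what index=False changes, the following minimal sketch builds a small DataFrame from an inline string instead of a URL and compares the two results of to_csv(). The sample data and variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a downloaded CSV file
sample_csv = "username,id\nbooker12,9012\ngrey07,2070\n"

# read_csv() accepts file-like objects as well as URLs and paths
df = pd.read_csv(io.StringIO(sample_csv))

# With no path argument, to_csv() returns the CSV text as a string
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# The first version gains an unnamed leading column holding the row index
print(with_index.splitlines()[0])
# The second version keeps the header exactly as it was in the source
print(without_index.splitlines()[0])
```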

Finally, we print the message CSV file successfully downloaded and saved as 'data.csv'. so the user can see that the script ran to completion.

Output:

CSV file successfully downloaded and saved as 'data.csv'.
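Keep in mind that this approach parses the data and then writes it back out, so the saved file is not necessarily identical to the original. The sketch below, which uses made-up inline data in place of a URL, shows how type inference can change values and how passing dtype=str to read_csv() keeps every field as text.

```python
import io

import pandas as pd

# Hypothetical data: the id column has leading zeros
sample_csv = "id,name\n007,Bond\n042,Adams\n"

# Default behaviour: pandas infers that id is an integer column
inferred = pd.read_csv(io.StringIO(sample_csv))

# dtype=str keeps every column as text, so leading zeros are preserved
as_text = pd.read_csv(io.StringIO(sample_csv), dtype=str)

print(inferred.to_csv(index=False))
print(as_text.to_csv(index=False))
```

If the file only needs to be stored and not analyzed, one of the approaches in the following sections may be a better fit.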

Use the urllib and csv Modules to Download a CSV File From a URL in Python

The urllib package is used to fetch and work with URLs over different protocols in Python. In Python 3, we can use the urllib.request.urlopen() function to open a connection to a URL and read its contents.

This response can be processed using the csv module. The csv module works with CSV files in Python.

It can parse the response using the csv.reader() function. We can then display the parsed result at once or traverse through the content one row at a time.
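As a quick illustration of csv.reader() on its own, the sketch below parses a small hard-coded string instead of a real response. The sample data is invented for this example.

```python
import csv

# Hypothetical CSV text standing in for a decoded HTTP response
content = "name,city\nAda,London\nGrace,New York\n"

# csv.reader() accepts any iterable that yields one line at a time
reader = csv.reader(content.splitlines())

# Either traverse the rows one by one...
for row in reader:
    print(row)

# ...or build a new reader and collect everything at once
all_rows = list(csv.reader(content.splitlines()))
print(all_rows)
```

A reader can be consumed only once, which is why the second half of the sketch creates a fresh one.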

Let’s explore a practical example of how to download a CSV file from a URL using the urllib and csv modules.

import urllib.request
import csv

# Define the URL of the CSV file
url = "https://support.staffbase.com/hc/en-us/article_attachments/360009197031/username.csv"

# Open the URL and create a response object
response = urllib.request.urlopen(url)

# Decode the content of the response
content = response.read().decode("utf-8")

# Create a CSV reader object from the decoded content
csv_reader = csv.reader(content.splitlines())

# Extract the header and rows from the CSV reader
header = next(csv_reader)
rows = list(csv_reader)

# Specify the destination filename for the locally saved CSV file
destination_filename = "data.csv"

# Write the CSV data to a local file
with open(destination_filename, "w", newline="") as csvfile:
    csv_writer = csv.writer(csvfile)

    # Write the header
    csv_writer.writerow(header)

    # Write the rows
    csv_writer.writerows(rows)

# Print a message indicating a successful download
print("CSV file successfully downloaded and saved as 'data.csv'.")

In this example, we begin by importing the urllib.request module for handling URLs and the csv module for CSV file operations.

We store the URL of the CSV file in the url variable. This is the location from which we want to fetch our data.

Next, we use urllib.request.urlopen(url) to open the URL and create a response object. The content of this response is then decoded using response.read().decode('utf-8') to obtain a string representation of the CSV data.
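The example hard-codes UTF-8, which may not match every server. As a sketch of one alternative, the helper below asks the response headers for a declared charset and falls back to UTF-8 when none is given. The name fetch_text and its default_encoding parameter are hypothetical and not part of urllib.

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8"):
    """Return the body of url as text (hypothetical helper for illustration)."""
    with urllib.request.urlopen(url) as response:
        # The headers object can report the charset from the Content-Type header
        charset = response.headers.get_content_charset() or default_encoding
        return response.read().decode(charset)
```

Calling fetch_text(url) would then take the place of the separate urlopen(), read() and decode() steps.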

To work with the CSV data, we create a csv.reader object by splitting the decoded content into lines and passing it to the csv.reader constructor. This reader allows us to iterate over the CSV data row by row.

We extract the header and rows from the CSV reader. The next(csv_reader) call retrieves the header, and list(csv_reader) creates a list of rows.
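Splitting on lines works for simple files, but a quoted field may itself contain a line break. As a hedged alternative, the sketch below (with made-up sample data) wraps the text in io.StringIO with newline="" so that csv.reader sees the original line endings.

```python
import csv
import io

# Hypothetical content: the second field of the data row spans two lines
content = 'name,note\r\nAda,"first line\nsecond line"\r\n'

# newline="" stops StringIO from translating line endings before csv sees them
rows = list(csv.reader(io.StringIO(content, newline="")))

# splitlines() drops the line endings, so the embedded break is lost
rows_split = list(csv.reader(content.splitlines()))

print(rows[1][1] == "first line\nsecond line")
print(rows_split[1][1] == "first line\nsecond line")
```

For files without multi-line fields, both variants produce the same rows.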

Moving forward, we specify the desired local filename as destination_filename = "data.csv".

We then open a local file with open() and use a csv.writer object to write both the header and the rows. The newline='' argument lets the csv module control line endings itself, which avoids extra blank lines between rows on platforms such as Windows.

To conclude the process, we print a message to the console: CSV file successfully downloaded and saved as 'data.csv'.

Output:

CSV file successfully downloaded and saved as 'data.csv'.
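If the goal is only to save the file, parsing it is optional. A minimal sketch, assuming the file should be stored exactly as the server sends it, copies the raw bytes of the response to disk with shutil.copyfileobj(). The helper name download_file is hypothetical.

```python
import shutil
import urllib.request


def download_file(url, destination_filename):
    """Save the resource at url to disk unchanged (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        # Open in binary mode so bytes are written exactly as received
        with open(destination_filename, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
```

A call such as download_file(url, "data.csv") would write the file without decoding it, so the encoding and the line endings of the original are left untouched.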

Use the requests and csv Modules to Download a CSV File From a URL in Python

Another approach uses the third-party requests library for the HTTP request and the built-in csv module for parsing. Unlike urllib, requests is not part of the standard library and has to be installed first, for example with pip install requests.

Let’s explore a practical example of how to download a CSV file from a URL using the requests and csv modules.

Code:

import requests
import csv

# Define the URL of the CSV file
url = "https://support.staffbase.com/hc/en-us/article_attachments/360009197031/username.csv"

# Send an HTTP GET request to the specified URL
response = requests.get(url)

# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Decode the content of the response
    content = response.text

    # Create a CSV reader object from the decoded content
    csv_reader = csv.reader(content.splitlines())

    # Extract the header and rows from the CSV reader
    header = next(csv_reader)
    rows = list(csv_reader)

    # Specify the destination filename for the locally saved CSV file
    destination_filename = "data.csv"

    # Write the CSV data to a local file
    with open(destination_filename, "w", newline="") as csvfile:
        csv_writer = csv.writer(csvfile)

        # Write the header
        csv_writer.writerow(header)

        # Write the rows
        csv_writer.writerows(rows)

    # Print a message indicating a successful download
    print("CSV file successfully downloaded and saved as 'data.csv'.")

We begin by importing the necessary modules: requests for making HTTP requests and csv for handling CSV files.

Next, we define the URL of the CSV file we want to download.

Using requests.get(url), we send an HTTP GET request to the specified URL and obtain a response object, which contains information about the response, including the status code.

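We check whether the request was successful by verifying that the status code is 200, which indicates a successful HTTP response. For any other status code, the body of the if block is skipped, so no file is written and no message is printed.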

Assuming a successful request, we read the body as text through response.text, which requests decodes using the encoding it determines for the response. This text is then split into lines and passed to csv.reader, allowing us to iterate over the CSV data row by row.

Next, we extract the header and rows from the CSV reader. The next(csv_reader) call retrieves the header, and list(csv_reader) creates a list of rows.

We specify the desired local filename for the saved CSV file as destination_filename = "data.csv".

We then open a local file with open() and use a csv.writer object to write both the header and the rows. The newline='' argument lets the csv module control line endings itself, which avoids extra blank lines between rows on platforms such as Windows.

Finally, we print a message to the console indicating a successful download.

Output:

CSV file successfully downloaded and saved as 'data.csv'.
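The example above does nothing visible when the request fails. As a sketch of a stricter variant, the helper below raises an exception on HTTP errors, sets a timeout, and streams the body to disk in chunks without parsing it. The name download_csv, the 10-second timeout and the 8192-byte chunk size are assumptions chosen for illustration.

```python
import requests


def download_csv(url, destination_filename, timeout=10):
    """Stream a CSV file to disk, raising on HTTP errors (hypothetical helper)."""
    # timeout is in seconds; stream=True defers downloading the body
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Raises requests.HTTPError for 4xx and 5xx status codes
        response.raise_for_status()
        with open(destination_filename, "wb") as out_file:
            # Write the body in chunks instead of loading it all into memory
            for chunk in response.iter_content(chunk_size=8192):
                out_file.write(chunk)
```

A caller could wrap download_csv(url, "data.csv") in a try block that catches requests.RequestException to report network or HTTP failures in one place.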

Conclusion

Python offers several ways to download CSV files from URLs. The pandas.read_csv() function is the most convenient option when the data is headed for analysis in a DataFrame anyway.

The urllib and requests approaches, combined with the csv module, are lighter-weight alternatives: urllib ships with the standard library, while requests adds a single third-party dependency. Which one to choose depends on whether you need the analysis features of pandas or only want to fetch and save the file.

Author: Manav Narula

Manav is an IT professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.
