How to Fix CSV.Error: Line Contains Null Byte in Python

Zeeshan Afridi Feb 02, 2024
  1. The CSV File Format
  2. Create a CSV File in Python
  3. The _csv.Error: line contains NULL byte Error in Python
  4. Fix the _csv.Error: line contains NULL byte Error in Python
  5. Conclusion
A CSV file is a text file that contains comma-separated values. Each line in the file represents a row of data, and a comma separates each value.

CSV files are often used to store data from spreadsheets or databases. They can be opened in text editors or spreadsheet programs and easily parsed and processed with programming languages.

The CSV File Format

A CSV file is a text file that stores data in a tabular format. Each row of the table is called a record, and each value within a record is called a field. Fields in the same position across records form a column.

CSV files typically use a comma to separate each field, but other characters, such as tabs or spaces, can also be used.
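As a minimal sketch of reading a file that uses a different separator, the csv module's delimiter argument can be set to a tab. The sample data here is hypothetical and held in memory purely for illustration.

```python
import csv
import io

# Hypothetical tab-separated sample data, held in memory for illustration.
tsv_text = "First Name\tLast Name\tAge\nZeeshan\tAfridi\t24\n"

# The delimiter argument tells csv.reader which character separates fields.
rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(rows)
```

The same argument works for csv.writer, so a file can be written and read back with the same separator.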

CSV files are often used to store data from databases or spreadsheets. They can be opened in a text editor, such as Microsoft Notepad, or a spreadsheet program, such as Microsoft Excel.

Create a CSV File in Python

CSV stands for comma-separated values: the data is stored as plain text in a tabular format, with commas between the values. Each row in the file represents a record, and each column represents one attribute of the data. The example below uses Python's built-in csv module to write a header row and one data row, then reads the file back.

import csv

meta_data = ["First Name", "Last Name", "Course", "Age"]
student_data = ["Zeeshan", "Afridi", "Computer programming", "24"]

# newline="" lets the csv module control line endings itself
with open("countries.csv", "w", encoding="UTF8", newline="") as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(meta_data)

    # write the data
    writer.writerow(student_data)

# the with block closes the file automatically, so no explicit close() is needed
with open("countries.csv", "r", encoding="UTF8") as a:
    print(a.read())

Output:

First Name,Last Name,Course,Age
Zeeshan,Afridi,Computer programming,24

The _csv.Error: line contains NULL byte Error in Python

If you get _csv.Error: line contains NULL byte when trying to read a CSV file, the file most likely contains one or more NULL bytes (\x00). Python's csv reader has no option for skipping them, so the bytes must be removed, or the file must be decoded with the correct encoding, before the data is parsed.

A NULL byte is not the same thing as an empty or missing value. When a line of the file contains a NULL byte, you will encounter an error like the one below:

file my.csv, line 1: line contains NULL byte
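Before choosing a fix, it can help to confirm which lines contain NULL bytes. The following is a minimal sketch; the helper name find_null_byte_lines is hypothetical and not part of the csv module.

```python
def find_null_byte_lines(path):
    """Return the 1-based numbers of the lines that contain a NULL byte."""
    bad_lines = []
    # Binary mode skips decoding, so raw b"\x00" bytes stay visible.
    with open(path, "rb") as f:
        for line_number, line in enumerate(f, start=1):
            if b"\x00" in line:
                bad_lines.append(line_number)
    return bad_lines
```

If nearly every line is reported, the file is probably stored in a multi-byte encoding such as UTF-16 rather than containing a few stray bytes.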

Fix the _csv.Error: line contains NULL byte Error in Python

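A common cause of _csv.Error: line contains NULL byte is reading a CSV file with the wrong encoding. For example, a file saved as UTF-16 stores a \x00 byte alongside every ASCII character, so reading it as if it used a single-byte encoding exposes those bytes to the CSV reader. To fix this, specify the file's actual encoding when opening it.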

For example, if the file is encoded in UTF-8, you would use the following code:

import csv

with open("filename.csv", "r", encoding="utf-8", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
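The same pattern applies to other encodings. As a sketch, the example below writes a small UTF-16 file (hypothetical sample data in a temporary directory), shows that its raw bytes contain NULL bytes, and then reads it back by passing the matching encoding.

```python
import csv
import os
import tempfile

# Write a small sample file in UTF-16 (hypothetical data for illustration).
path = os.path.join(tempfile.mkdtemp(), "utf16_sample.csv")
with open(path, "w", encoding="utf-16", newline="") as f:
    csv.writer(f).writerows([["name", "age"], ["Zeeshan", "24"]])

# UTF-16 uses two bytes per ASCII character, so the raw bytes include NULL bytes.
with open(path, "rb") as f:
    print(b"\x00" in f.read())

# Opening with the matching encoding decodes those bytes correctly.
with open(path, "r", encoding="utf-16", newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```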

If the encoding is correct and you still encounter _csv.Error: line contains NULL byte, the file likely contains an invalid character: in this case a NULL byte (\x00), which is non-printable and therefore invisible in most text editors.

To fix this error, you must identify and remove the invalid character from the file. This can be done using a text editor or a hex editor.

Once the invalid character is removed, the file can be read without issue.
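The removal can also be scripted instead of done by hand. The following is a minimal sketch that writes a cleaned copy of the file; the helper name strip_null_bytes is hypothetical, and the function assumes the file is small enough to read into memory at once.

```python
def strip_null_bytes(src_path, dst_path):
    """Copy src_path to dst_path with every NULL byte removed.

    Returns the number of NULL bytes that were removed.
    """
    # Read raw bytes so that no decoding happens before the cleanup.
    with open(src_path, "rb") as src:
        data = src.read()
    removed = data.count(b"\x00")
    with open(dst_path, "wb") as dst:
        dst.write(data.replace(b"\x00", b""))
    return removed
```

Writing to a separate file keeps the original intact. This approach suits files with stray NULL bytes; for a file that is actually encoded in UTF-16, opening it with the right encoding is the safer choice.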

The NULL bytes can also be stripped in code while the file is being read. This approach has three steps:

  1. Converting the file contents to an in-memory byte stream
  2. Replacing each NULL byte with an empty byte string
  3. Passing the fixed lines to the CSV reader

Convert the File Object to an In-Memory Byte Stream

The first step is to read the file's contents and wrap them in an in-memory byte stream. The code below performs the conversion.

import io

# csv_file is assumed to be a file object opened in binary mode ("rb")
content = csv_file.read()
# after conversion into an in-memory byte stream
csv_stream = io.BytesIO(content)

Replace NULL Bytes With Empty Strings

Next, iterate through the lines of the stream and replace each NULL byte with an empty byte string. The generator expression below does this lazily, one line at a time:

# iterate through the lines, replacing each null byte with an empty byte string
fixed_lines = (line.replace(b"\x00", b"") for line in csv_stream)

Pass fixed_lines Instead of csv_stream

Finally, decode the cleaned lines and pass them to the CSV reader in place of the original stream. The code for that is the following:

import codecs
import csv

# unchanged apart from passing in fixed_lines instead of csv_stream
stream = codecs.iterdecode(fixed_lines, "utf-8-sig", errors="strict")
dict_reader = csv.DictReader(stream, skipinitialspace=True, restkey="INVALID")
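Put together, the three steps might look like the sketch below. The function name read_csv_without_null_bytes and the sample bytes are hypothetical; the calls themselves are the ones used in the steps above.

```python
import codecs
import csv
import io


def read_csv_without_null_bytes(raw_bytes):
    """Parse CSV bytes into a list of dict rows after dropping NULL bytes."""
    # Step 1: wrap the raw bytes in an in-memory byte stream.
    csv_stream = io.BytesIO(raw_bytes)
    # Step 2: lazily strip NULL bytes from each line.
    fixed_lines = (line.replace(b"\x00", b"") for line in csv_stream)
    # Step 3: decode the cleaned lines and hand them to the CSV reader.
    stream = codecs.iterdecode(fixed_lines, "utf-8-sig", errors="strict")
    reader = csv.DictReader(stream, skipinitialspace=True, restkey="INVALID")
    return list(reader)


# Hypothetical sample: one stray NULL byte inside the first data value.
sample = b"name,age\nZee\x00shan,24\n"
print(read_csv_without_null_bytes(sample))
```

For a file on disk, the bytes can be obtained by opening it in binary mode ("rb") and calling read().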

Conclusion

The _csv.Error: line contains NULL byte error means that the CSV reader found a NULL byte (\x00) in a line of your CSV file. This typically happens when the file was saved in an encoding such as UTF-16 but is read with a different one, or when stray NULL bytes have ended up in the data.

To fix this error, open the file with its actual encoding, or find the NULL bytes and remove them, either by hand in a text or hex editor or in code before the lines reach the CSV reader.

Zeeshan is a detail-oriented software engineer who helps companies and individuals make their lives easier with software solutions.