How to Convert JSON to a Pandas DataFrame

Manav Narula Feb 02, 2024
  1. JSON to Pandas DataFrame Using json_normalize()
  2. JSON to Pandas DataFrame Using read_json()
How to Convert JSON to a Pandas DataFrame

This article will introduce how to convert JSON to a Pandas DataFrame.

JSON stands for JavaScript Object Notation. It is based on the format of objects in JavaScript and is an encoding technique for representing structured data. It is widely used these days, especially for sharing data between servers and web applications.

Due to its simplicity and influence from programming language data structures, JSON is becoming immensely popular. It’s relatively easy to understand, and the following is a simple example of a JSON response from an API.

{
"Results":
         [
         { "id": "01", "Name": "Jay" },
         { "id": "02", "Name": "Mark" },
         { "id": "03", "Name": "Jack" }
         ],
"status": ["ok"]
}

As you can see in our example, JSON appears to be somewhat a combination of nested lists and dictionaries; therefore, it is relatively easy to extract data from JSON files and even store it as a Pandas DataFrame.

Pandas and JSON libraries in Python can help in achieving this. We have two functions read_json() and json_normalize() which can help in converting JSON string to a DataFrame.

JSON to Pandas DataFrame Using json_normalize()

The json_normalize() function is very widely used to read the nested JSON string and return a DataFrame. To use this function, we need first to read the JSON string using json.loads() function in the JSON library in Python. Then we pass this JSON object to the json_normalize(), which will return a Pandas DataFrame containing the required data.

import pandas as pd
import json
from pandas import json_normalize

data = """
{
"Results":
         [
         { "id": "1", "Name": "Jay" },
         { "id": "2", "Name": "Mark" },
         { "id": "3", "Name": "Jack" }
         ],
"status": ["ok"]
}
"""

info = json.loads(data)

df = json_normalize(info["Results"])  # Results contain the required data
print(df)

Output:

   id  Name
0  1   Jay
1  2  Mark
2  3  Jack

JSON to Pandas DataFrame Using read_json()

Another Pandas function to convert JSON to a DataFrame is read_json() for simpler JSON strings. We can directly pass the path of a JSON file or the JSON string to the function for storing data in a Pandas DataFrame. read_json() has many parameters, among which orient specifies the format of the JSON string.

The downside is that it is difficult to use with nested JSON strings. So for using read_json(), we will use a much simpler example as shown below:

import pandas as pd

data = """
          {
              "0":{
                  "Name": "Jay",
                  "Age": "17"
              },
              "1":{
                  "Name": "Mark",
                  "Age": "15"
              },
              "2":{
                  "Name": "Jack",
                  "Age":"16"
              }
          }
    """

df = pd.read_json(data, orient="index")
print(df)

Output:

   Name  Age
0   Jay   17
1  Mark   15
2  Jack   16

We set orient to be 'index' because the JSON string fromat matchs the pattern as {index : {column: value}}.

Author: Manav Narula
Manav Narula avatar Manav Narula avatar

Manav is a IT Professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.

LinkedIn

Related Article - Pandas DataFrame

Related Article - Pandas JSON