How to Build Pandas DataFrame Row by Row

Salman Mehmood Feb 02, 2024
  1. Create Rows in Pandas DataFrame
  2. Using the loc Property to Create Rows in Pandas DataFrame
  3. Using pandas.concat() Function to Create Rows in Pandas DataFrame

This article demonstrates how to build a Pandas DataFrame row by row rather than column by column, which is the more common convention.

Create Rows in Pandas DataFrame

Pandas DataFrame is a structure that stores data with two dimensions and the labels corresponding to those dimensions. DataFrames are comparable to SQL tables and spreadsheets that can be manipulated in applications such as Excel and Calc.

Because DataFrames are built on NumPy and integrate with the wider Python ecosystem, they are often faster, more convenient, and more powerful than tables and spreadsheets for many applications.

Depending on how the data arrives, it may be necessary to fill a DataFrame row by row instead of column by column.

Consider the following code.

import pandas

df = pandas.DataFrame(
    columns=["a", "b", "c", "d", "e"], index=["v", "w", "x", "y", "z"]
)
y = {"a": 1, "b": 5, "c": 2, "d": 3, "e": 7}

print("Attempt 1")
# df['y'] = y
# print(df)

print("Attempt 2")
# df.join(y)

The output of each attempt, obtained by uncommenting its lines, is shown separately below.

Output (Attempt 1):

Attempt 1
     a    b    c    d    e   y
v  NaN  NaN  NaN  NaN  NaN NaN
w  NaN  NaN  NaN  NaN  NaN NaN
x  NaN  NaN  NaN  NaN  NaN NaN
y  NaN  NaN  NaN  NaN  NaN NaN
z  NaN  NaN  NaN  NaN  NaN NaN

Output (Attempt 2):

Traceback (most recent call last):
  File "d:\Test\test.py", line 13, in <module>
    df.join(y)
  File "C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py", line 9969, in join
    return self._join_compat(
  File "C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py", line 10036, in _join_compat
    can_concat = all(df.index.is_unique for df in frames)
  File "C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py", line 10036, in <genexpr>
    can_concat = all(df.index.is_unique for df in frames)

AttributeError: 'builtin_function_or_method' object has no attribute 'is_unique'

In the code above, a DataFrame is first initialized with the columns ['a', 'b', 'c', 'd', 'e'] and the row labels (index) ['v', 'w', 'x', 'y', 'z']. The objective is to fill in a single row, which in our case is the row labeled y.

The data for that row is stored in a dictionary that maps each column to its value: {'a': 1, 'b': 5, 'c': 2, 'd': 3, 'e': 7}.

In the first attempt, the dictionary is assigned with df['y'] = y. Square-bracket indexing on a DataFrame selects a column, not a row, so this creates a new column named y. As the output shows, every value in that column is NaN, because the dictionary keys (a to e) do not match any of the row labels (v to z).

In the second attempt, the join() method is called with the dictionary, which raises the error "'builtin_function_or_method' object has no attribute 'is_unique'". The join() method expects a DataFrame, a Series, or a list of them; as the traceback suggests, the dictionary is iterated as if it were a list of frames, so each string key is asked for an index. This problem can be approached with the techniques below.

  • Using the loc Property.
  • Using pandas.concat() Function.
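
The NaN column from the first attempt can be reproduced in isolation. The sketch below is an illustration rather than part of the original code: it wraps the dictionary in a Series explicitly (the names as_series and aligned are made up for this example) and shows that aligning it against the row labels leaves nothing but NaN.

```python
import pandas

df = pandas.DataFrame(
    columns=["a", "b", "c", "d", "e"], index=["v", "w", "x", "y", "z"]
)
y = {"a": 1, "b": 5, "c": 2, "d": 3, "e": 7}

# A Series built from the dictionary is labeled a..e.
as_series = pandas.Series(y)

# Aligning it against the row labels v..z finds no matching label,
# so every position is filled with NaN.
aligned = as_series.reindex(df.index)
print(aligned)

# Assigning the Series as a column performs the same alignment.
df["y"] = as_series
print(df["y"].isna().all())
```

The values are not lost because of a bug; they are simply aligned along the wrong axis, which is why a row-oriented accessor is needed.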

Using the loc Property to Create Rows in Pandas DataFrame

Consider the following code:

import pandas

df = pandas.DataFrame(
    columns=["a", "b", "c", "d", "e"], index=["v", "w", "x", "y", "z"]
)
print("Current Shape:\n" + str(df))

y = {"a": 1, "b": 5, "c": 2, "d": 3, "e": 7}
df.loc["y"] = pandas.Series(y)

print("DataFrame:\n" + str(df))

Output:

Current Shape:

     a    b    c    d    e
v  NaN  NaN  NaN  NaN  NaN
w  NaN  NaN  NaN  NaN  NaN
x  NaN  NaN  NaN  NaN  NaN
y  NaN  NaN  NaN  NaN  NaN
z  NaN  NaN  NaN  NaN  NaN

DataFrame:

     a    b    c    d    e
v  NaN  NaN  NaN  NaN  NaN
w  NaN  NaN  NaN  NaN  NaN
x  NaN  NaN  NaN  NaN  NaN
y    1    5    2    3    7
z  NaN  NaN  NaN  NaN  NaN

The loc property of the DataFrame class provides label-based access to a DataFrame. It accepts a single label, a list or slice of labels, or a Boolean array, and can select rows, columns, or both.
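
As a quick illustration of those access patterns, here is a minimal sketch on a small DataFrame with made-up values (the frame and the variable names are assumptions for this example only).

```python
import pandas

df = pandas.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"]
)

# A single row label returns that row as a Series.
row = df.loc["y"]

# A row label and a column label return a single value.
cell = df.loc["y", "b"]

# Lists of labels return a smaller DataFrame.
subset = df.loc[["x", "z"], ["a"]]

# A Boolean Series keeps only the rows where it is True.
mask = df["a"] > 1
filtered = df.loc[mask]

print(row)
print(cell)
print(subset)
print(filtered)
```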

In our code, we used loc because it is label based: we passed the label (or index) of the row we wanted to fill, y in our case.

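Note that wrapping the dictionary in pandas.Series() lets Pandas align the values with the column labels, so you don't have to specify every element; any column missing from the dictionary is filled with NaN.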
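
The alignment can be seen by passing a dictionary that covers only some of the columns. This is a small sketch using the same DataFrame layout as above; the partial dictionary is an assumption made for the example.

```python
import pandas

df = pandas.DataFrame(
    columns=["a", "b", "c", "d", "e"], index=["v", "w", "x", "y", "z"]
)

# Only columns "a" and "c" are provided for row "y".
partial = {"a": 1, "c": 2}
df.loc["y"] = pandas.Series(partial)

# Columns "b", "d" and "e" stay NaN in row "y".
print(df.loc["y"])
```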

Using pandas.concat() Function to Create Rows in Pandas DataFrame

Consider the following code:

import pandas

df = pandas.DataFrame(columns=["a", "b", "c", "d", "e"], index=[])
print("Current Shape:\n" + str(df))

entry = pandas.DataFrame.from_dict(
    {
        "a": [1, 6, 11, 16],
        "b": [2, 7, 12, 17],
        "c": [3, 8, 13, 18],
        "d": [4, 9, 14, 19],
        "e": [5, 10, 15, 20],
    }
)

df = pandas.concat([df, entry])
print("DataFrame:\n" + str(df))

Output:

Current Shape:
Empty DataFrame
Columns: [a, b, c, d, e]
Index: []
DataFrame:
    a   b   c   d   e
0   1   2   3   4   5
1   6   7   8   9  10
2  11  12  13  14  15
3  16  17  18  19  20

The from_dict() method takes a dictionary that maps column names to lists of values and builds a new DataFrame from it. This DataFrame is stored in the variable named entry and holds the rows we want to add to our original DataFrame.
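
When the data is already organized by row, from_dict() can also read the dictionary keys as row labels through its orient="index" argument. The following is a minimal sketch with made-up labels and values, not the approach used in the code above.

```python
import pandas

# Each key is a row label and each list holds that row's values.
rows = {
    "v": [1, 2, 3],
    "w": [4, 5, 6],
}

# orient="index" treats the keys as rows; columns names the values.
by_row = pandas.DataFrame.from_dict(
    rows, orient="index", columns=["a", "b", "c"]
)
print(by_row)
```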

With both DataFrame instances available, we need a way to join them. The pandas.concat() function concatenates a list of DataFrame instances, along the row axis by default, and returns a new DataFrame, which is assigned back to df here. Since entry was created without an explicit index, its rows keep the default labels 0 to 3.
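
If rows arrive one at a time, one common pattern is to collect them in a plain list and call pandas.concat() once at the end instead of once per row. The sketch below assumes a small existing DataFrame and generated rows, both made up for illustration; ignore_index=True renumbers the combined rows from 0.

```python
import pandas

existing = pandas.DataFrame({"a": [1], "b": [2]})

# Collect each new row as a dictionary of column -> value.
new_rows = []
for i in range(3):
    new_rows.append({"a": i * 10, "b": i * 10 + 1})

# Build one DataFrame from the collected rows, then concatenate once.
entry = pandas.DataFrame(new_rows)
result = pandas.concat([existing, entry], ignore_index=True)
print(result)
```

Without ignore_index=True, the result would keep the original labels of both inputs, so the label 0 would appear twice.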
