Get a List From a Pandas DataFrame Series

Luqman Khan May 16, 2022

Python is a well-known language for data analysis, largely thanks to its packages. Pandas is one of the packages that make analyzing data easier.

The Pandas tolist() method converts a Series into a built-in Python list. A DataFrame column is a pandas.core.series.Series object by default, and calling tolist() on it returns its values as a list.
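As a minimal sketch of that conversion, the snippet below builds a small Series by hand and checks the types before and after calling tolist(). The name prices and its values are made up for illustration and are not part of the article's data file.

```python
import pandas as pd

# A small hypothetical Series used only for illustration
prices = pd.Series([10000, 12000, 13000], name="Home price")

print(type(prices))           # <class 'pandas.core.series.Series'>

price_list = prices.tolist()  # convert the Series to a built-in list
print(type(price_list))       # <class 'list'>
print(price_list)             # [10000, 12000, 13000]
print(type(price_list[0]))    # <class 'int'>
```

The elements of the returned list are plain Python scalars, so the list can be used without Pandas or NumPy.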

Use the tolist() Method to Get a List From a Pandas DataFrame Series

This article discusses how to get a list from a Pandas DataFrame column. We first read a CSV file into a Pandas DataFrame.

import pandas as pd

# read the CSV file
df = pd.read_csv("home_price.csv")
# keep only the first 3 rows and display them
df = df.head(3)
print(df)

Output:

   Area  Home price
0  1000       10000
1  1200       12000
2  1300       13000

Next, we extract the values from the column and convert them to a list with tolist().

list1 = df["Home price"].values.tolist()
print("extract the value of series and converting into the list")
print(list1)

Output:

extract the value of series and converting into the list
[10000, 12000, 13000]
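The same list can be obtained without going through .values. The sketch below is self-contained: it replaces home_price.csv with an in-memory string whose contents are an assumption based on the three rows displayed earlier, then compares three ways of producing the list.

```python
import io

import pandas as pd

# Stand-in for home_price.csv (assumed contents, for illustration only)
csv_text = "Area,Home price\n1000,10000\n1200,12000\n1300,13000\n"
df = pd.read_csv(io.StringIO(csv_text))

via_values = df["Home price"].values.tolist()  # the approach used in the article
via_series = df["Home price"].tolist()         # tolist() called on the Series itself
via_builtin = list(df["Home price"])           # the built-in list() constructor

print(via_series)                               # [10000, 12000, 13000]
print(via_values == via_series == via_builtin)  # True
```

Calling tolist() directly on the column is the shortest form and avoids the intermediate NumPy array.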

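A list is an ordered, mutable Python container and one of the most common data structures in Python. A list is created by placing elements inside square brackets [], separated by commas. Lists can hold duplicate values, which makes them a natural fit for column data taken from a dataset.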
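The list properties mentioned above (ordered, mutable, duplicates allowed) can be seen in a few lines. The values here are hypothetical and chosen only to make each property visible.

```python
# Hypothetical values chosen only to illustrate list behaviour
prices = [10000, 12000, 12000, 13000]  # duplicates are allowed

prices.append(15000)  # lists are mutable: they can grow in place
prices[0] = 11000     # elements can be reassigned by position

print(prices)               # [11000, 12000, 12000, 13000, 15000]
print(prices.count(12000))  # 2
```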

import numpy as np
import pandas as pd

# read the CSV file
df = pd.read_csv("home_price.csv")
# extract the column values and convert them to a list
list1 = df["Home price"].values.tolist()
# wrap the list in a NumPy array so it can be multiplied element-wise
list1 = np.array(list1)
# multiply every price by 1.5 and convert the result back to a plain list
updated = (list1 * 1.5).tolist()
print("after include 1.5 % tax\n")
print(updated, "new home price")
df["Home price"] = updated
# write a new CSV file; the row index is written as well
df.to_csv("home prices after 1 year.csv")
df2 = pd.read_csv("home prices after 1 year.csv")
print(df2)

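Here every current price is multiplied by 1.5, that is, raised by 50% (the printed message describes this as a 1.5 % tax). We store the result in a list named updated, use it to overwrite the existing column, and then write a new CSV file with the to_csv() method.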

Output:

after include 1.5 % tax

[15000.0, 18000.0, 19500.0, 21000.0, 22500.0] new home price
   Unnamed: 0  Area  Home price
0           0  1000     15000.0
1           1  1200     18000.0
2           2  1300     19500.0
3           3  1400     21000.0
4           4  1500     22500.0
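The Unnamed: 0 column in this output is the row index that to_csv() wrote into the file. A possible variation, sketched below, multiplies the column directly and passes index=False when writing. To stay self-contained, it uses in-memory buffers instead of files, and the CSV contents are an assumption reconstructed from the output above.

```python
import io

import pandas as pd

# Stand-in for home_price.csv (assumed contents based on the output above)
csv_text = (
    "Area,Home price\n"
    "1000,10000\n1200,12000\n1300,13000\n1400,14000\n1500,15000\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Multiply the column directly; no intermediate list or NumPy array is needed
df["Home price"] = df["Home price"] * 1.5
updated = df["Home price"].tolist()
print(updated)  # [15000.0, 18000.0, 19500.0, 21000.0, 22500.0]

# index=False leaves the row index out of the file, so reading it back
# does not produce an "Unnamed: 0" column
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df2 = pd.read_csv(buffer)
print(df2.columns.tolist())  # ['Area', 'Home price']
```

With a real file, the same index=False argument can be passed to to_csv() together with the file name.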

Let's consider another simple example:

import pandas as pd

df = pd.DataFrame(
    {
        "Country": ["Pakistan", "India", "America", "Russia", "China"],
        "Immigrants": ["2000", "2500", "6000", "4000", "1000"],
        "Years": ["2010", "2008", "2011", "2018", "2016"],
    }
)
print(df, "\n")
# use a name other than "list" so the built-in list type is not shadowed
columns = df.columns.tolist()
print(type(df.columns))
print("\n", columns, "\n")
print("After type cast into the list")
print(type(columns))

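Note that df.columns is an Index object rather than a Series. Calling tolist() on it returns a plain list containing all the column names of the DataFrame.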

Output:

    Country Immigrants Years
0  Pakistan       2000  2010
1     India       2500  2008
2   America       6000  2011
3    Russia       4000  2018
4     China       1000  2016 

<class 'pandas.core.indexes.base.Index'>

 ['Country', 'Immigrants', 'Years'] 

After type cast into the list
<class 'list'>
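The same method works on the data columns of this DataFrame. Because the Immigrants values were entered as strings, tolist() returns strings unless the column is cast first. The sketch below reuses part of the DataFrame above; the variable names as_text and as_numbers are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Country": ["Pakistan", "India", "America", "Russia", "China"],
        "Immigrants": ["2000", "2500", "6000", "4000", "1000"],
    }
)

# Iterating over a DataFrame yields its column labels
print(list(df) == df.columns.tolist())  # True

# The values were entered as strings, so tolist() returns strings
as_text = df["Immigrants"].tolist()
print(as_text)  # ['2000', '2500', '6000', '4000', '1000']

# Cast the column to integers first to get a list of numbers
as_numbers = df["Immigrants"].astype(int).tolist()
print(as_numbers)  # [2000, 2500, 6000, 4000, 1000]
```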

Here is the complete code.

import numpy as np
import pandas as pd

# read csv file
df = pd.read_csv("home_price.csv")
# keep only the first 3 rows and display them
df = df.head(3)
print(df)

list1 = df["Home price"].values.tolist()
print("extract the value of series and converting into the list")
print(list1)

# another example
# read csv file
df = pd.read_csv("home_price.csv")
# extract the value of series and converting into the list
list1 = df["Home price"].values.tolist()
list1 = np.array(list1)
# multiply every price by 1.5 and convert the result back to a plain list
updated = (list1 * 1.5).tolist()
print("after include 1.5 % tax\n")
print(updated, "new home price")
df["Home price"] = updated
# create new csv
df.to_csv("home prices after 1 year.csv")
df2 = pd.read_csv("home prices after 1 year.csv")
print(df2)

# another example
df = pd.DataFrame(
    {
        "Country": ["Pakistan", "India", "America", "Russia", "China"],
        "Immigrants": ["2000", "2500", "6000", "4000", "1000"],
        "Years": ["2010", "2008", "2011", "2018", "2016"],
    }
)
print(df, "\n")
# use a name other than "list" so the built-in list type is not shadowed
columns = df.columns.tolist()
print(type(df.columns))
print("\n", columns, "\n")
print("After type cast into the list")
print(type(columns))

Output:

   Area  Home price
0  1000       10000
1  1200       12000
2  1300       13000
extract the value of series and converting into the list
[10000, 12000, 13000]
after include 1.5 % tax

[15000.0, 18000.0, 19500.0, 21000.0, 22500.0] new home price
   Unnamed: 0  Area  Home price
0           0  1000     15000.0
1           1  1200     18000.0
2           2  1300     19500.0
3           3  1400     21000.0
4           4  1500     22500.0
    Country Immigrants Years
0  Pakistan       2000  2010
1     India       2500  2008
2   America       6000  2011
3    Russia       4000  2018
4     China       1000  2016 

<class 'pandas.core.indexes.base.Index'>

 ['Country', 'Immigrants', 'Years'] 

After type cast into the list
<class 'list'>
