Iterating over a df - Python (1) - Mango Docs


📅 Last modified: 2023-12-03 15:28:28.380000 🧑 Author: Mango

Iterating over a df - Python

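The DataFrame from the Pandas library comes up constantly in Python data analysis, and sooner or later you need to iterate over one. This article covers the ways to iterate over a DataFrame and how to operate on each of its rows or columns.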

Iterating over a DataFrame

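The most common ways to iterate over the rows of a DataFrame are the iterrows() and itertuples() methods. iterrows() yields each row as an (index, Series) tuple: the first element is the row's index label and the second is a Series holding all of the row's data. Example: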

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

for index, row in df.iterrows():
    print(index, row['col1'], row['col2'])

Output:

0 1 3
1 2 4
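One consequence of getting each row as a Series is that a row has a single dtype, so values can be upcast when the columns have different types. The sketch below illustrates this with a made-up frame; the names mixed, count and price are assumptions for illustration and do not come from the example above.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
mixed = pd.DataFrame({'count': [1, 2], 'price': [1.5, 2.5]})

# Take the first (index, Series) pair produced by iterrows().
first_index, first_row = next(mixed.iterrows())

# The row Series needs one dtype for both values, so the integer is upcast.
row_dtype = first_row.dtype          # a floating-point dtype
column_dtype = mixed['count'].dtype  # the column itself is still integer

print(row_dtype, column_dtype)
```

If the original types matter, itertuples() is usually the safer choice, because it does not convert each row into a Series.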

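Note that iterrows() is relatively slow, because it builds a new Series object for every row. For larger DataFrames, itertuples() is the recommended choice. itertuples() yields each row as a named tuple whose field names are the column names and whose values are the row's values in those columns. Passing index=False leaves the index out of each tuple. Example: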

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

for row in df.itertuples(index=False):
    print(row.col1, row.col2)

Output:

1 3
2 4
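As a small sketch of the other itertuples() options: by default the index is included as the first field, named Index, and passing name=None yields plain tuples instead of named tuples. The variable names with_index and plain are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Default call: the index comes first and is available as row.Index.
with_index = [(row.Index, row.col1, row.col2) for row in df.itertuples()]

# name=None returns regular tuples, so fields are accessed by position.
plain = list(df.itertuples(index=False, name=None))

print(with_index)
print(plain)
```

Plain tuples are handy when the rows are passed straight on to code that expects ordinary tuples, while named tuples keep the loop body readable.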

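You can also loop over a DataFrame directly with a for loop. Doing so yields the column labels, which you can then use to access each column in turn. Example: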

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

for col in df:
    print(df[col])

Output:

0    1
1    2
Name: col1, dtype: int64
0    3
1    4
Name: col2, dtype: int64
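If you want the label and the column together, DataFrame.items() yields (label, Series) pairs, which avoids the extra df[col] lookup. A minimal sketch, where column_sums is a name introduced only for this example:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

column_sums = {}
for label, column in df.items():
    # label is the column name, column is that column as a Series.
    column_sums[label] = int(column.sum())

print(column_sums)
```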

Operating on each row or column of a DataFrame

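Beyond visiting each row or column, you usually want to compute something from it, such as the mean of every row or the maximum of every column. Common aggregations like these do not need an explicit loop: Pandas provides built-in methods that work across a whole axis at once. Example: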

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

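# Mean of each row (axis=1 aggregates across the columns)
row_means = df.mean(axis=1)
print(row_means)

# Maximum of each column (the default axis=0 aggregates down the rows)
col_max = df.max()
print(col_max)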

Output:

0    2.0
1    3.0
dtype: float64
col1    2
col2    4
dtype: int64
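For per-row computations that have no ready-made method, there are a few options. The sketch below compares an explicit itertuples() loop, a vectorized column expression, and apply() with axis=1. The product of the two columns is only an illustrative calculation, and the names looped, vectorized and applied are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Explicit loop: build the result one row at a time.
looped = [row.col1 * row.col2 for row in df.itertuples(index=False)]

# Vectorized: operate on whole columns at once, with no Python-level loop.
vectorized = df['col1'] * df['col2']

# apply with axis=1: calls the function once per row, passing a Series.
applied = df.apply(lambda row: row['col1'] * row['col2'], axis=1)

print(looped)
print(vectorized.tolist())
print(applied.tolist())
```

All three give the same values here. When a calculation can be written in terms of whole columns, the vectorized form is generally the fastest and the easiest to read, and the loop-based forms are best kept for logic that cannot be expressed that way.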

That covers the basic ways to iterate over a DataFrame and operate on its rows and columns. Hopefully it is helpful.