Python | Pandas dataframe.reindex()

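Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.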

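The Pandas dataframe.reindex() function conforms a DataFrame to a new index with optional filling logic, placing NA/NaN in locations that had no value in the previous index. A new object is produced unless the new index is equivalent to the current one and copy=False.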

Syntax: DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None)

Parameters :
labels : New labels/index to conform the axis specified by ‘axis’ to.
index, columns : New labels / index to conform to. Preferably an Index object to avoid duplicating data
axis : Axis to target. Can be either the axis name (‘index’, ‘columns’) or number (0, 1).
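method : {None, ‘backfill’/‘bfill’, ‘pad’/‘ffill’, ‘nearest’}, optional. Method used to fill holes in the reindexed DataFrame. Only applicable when the index is monotonically increasing or decreasing.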
copy : Return a new object, even if the passed indexes are the same
level : Broadcast across a level, matching Index values on the passed MultiIndex level
fill_value : Value to use for the missing values that reindexing introduces. Defaults to NaN, but can be any compatible value.
limit : Maximum number of consecutive elements to forward or backward fill
tolerance : Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.

Returns : reindexed : DataFrame
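The method and tolerance parameters are easier to follow on a small numeric index. The sketch below is only an illustration: the readings frame, its values, and the new_index labels are made up for this example and are not part of the original article.

```python
import pandas as pd

# Hypothetical data: one reading at each of the positions 0, 10, 20 and 30.
# The index is monotonically increasing, which the fill methods require.
readings = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=[0, 10, 20, 30])

new_index = [0, 4, 10, 17, 40]

# No method: labels that are absent from the old index become NaN.
plain = readings.reindex(new_index)

# Forward fill: each new label takes the value of the last earlier label.
padded = readings.reindex(new_index, method="ffill")

# Nearest match, accepted only when the old label lies within 5 units.
nearest = readings.reindex(new_index, method="nearest", tolerance=5)

print(plain)
print(padded)
print(nearest)
```

In this sketch the label 40 stays NaN in the last call, because its nearest old label (30) is 10 units away and exceeds the tolerance.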


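Example #1: Use the reindex() function to reindex a dataframe. By default, labels in the new index that have no corresponding record in the dataframe are assigned NaN.
Note: We can fill in the missing values by passing a value to the keyword fill_value.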

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe 
df = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]},
                   index =["first", "second", "third", "fourth", "fifth"])
  
# Print the dataframe
df

Let's use the dataframe.reindex() function to reindex the dataframe.

# reindexing with new index values
df.reindex(["first", "dues", "trois", "fourth", "fifth"])

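Output: the rows "first", "fourth" and "fifth" keep their original values, while "dues" and "trois" do not exist in df, so every column in those two rows is NaN. Because NaN is a floating-point value, the integer columns are converted to float.

The missing values can be filled by passing the fill_value parameter.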

# filling the missing values by 100
df.reindex(["first", "dues", "trois", "fourth", "fifth"], fill_value = 100)

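Output: the "dues" and "trois" rows now hold 100 in every column instead of NaN, and the rows that already existed are unchanged.
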
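One detail worth checking is that fill_value only applies to labels that reindex() introduces; a NaN that was already in the data is left as it is. The sketch below illustrates this with a made-up scores frame (the name and values are assumptions for this example), and uses fillna() for the values that were already missing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that already contains one missing value.
scores = pd.DataFrame({"A": [1.0, np.nan, 3.0]},
                      index=["first", "second", "third"])

# "fourth" is a new label, so it receives fill_value.
# The NaN that already sits at "second" is not touched.
reindexed = scores.reindex(["first", "second", "third", "fourth"], fill_value=0)

# fillna() replaces the missing values that were there before reindexing.
cleaned = reindexed.fillna(0)

print(reindexed)
print(cleaned)
```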
Example #2: Use the reindex() function to reindex the column axis.

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe
# (named df so that the reindex calls below operate on it)
df = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]})
  
# reindexing the column axis with
# old and new index values
df.reindex(columns =["A", "B", "D", "E"])

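Output: columns "A", "B" and "D" are kept, column "C" is dropped because it is not listed, and the new column "E" contains only NaN.

Notice the NaN values in the new column after reindexing. We can handle these missing values at reindexing time by passing the fill_value parameter to the function.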

# reindex the columns
# fill the missing values by 25
df.reindex(columns =["A", "B", "D", "E"], fill_value = 25)

Output: the new column "E" holds 25 in every row, and columns "A", "B" and "D" are unchanged.
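The parameter list also mentions labels together with axis, and it allows index and columns to be passed in the same call. The following is a minimal sketch of both forms, assuming a small made-up frame called small; it is meant as an illustration rather than the only way to write these calls.

```python
import pandas as pd

# Hypothetical two-row frame used only for this illustration.
small = pd.DataFrame({"A": [1, 5], "B": [3, 2]}, index=["first", "second"])

# Two spellings of the same column reindex.
by_keyword = small.reindex(columns=["A", "B", "E"], fill_value=25)
by_axis = small.reindex(["A", "B", "E"], axis="columns", fill_value=25)

# Rows and columns can be reindexed in a single call.
both = small.reindex(index=["first", "third"], columns=["A", "E"], fill_value=0)

print(by_keyword.equals(by_axis))
print(both)
```

The columns= form tends to read more clearly when only one axis changes, while the index= and columns= pair avoids chaining two reindex() calls.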