Python | Pandas DataFrame.set_index()

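Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.
Pandas set_index() is a method that sets one or more existing columns (or a list, Series, or Index of the same length as the DataFrame) as the index of a DataFrame. The index can also be set while the DataFrame is being created. Sometimes, however, a DataFrame is built from two or more other DataFrames, and in that case the index can be changed afterwards with this method.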
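
The two situations described above can be sketched as follows. This is a minimal illustration, not part of the original article: the frames `first` and `second`, the column names `emp_id` and `name`, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
first = pd.DataFrame({"emp_id": [101, 102], "name": ["Ana", "Ben"]})
second = pd.DataFrame({"emp_id": [103, 104], "name": ["Cy", "Dee"]})

# Option 1: choose the index while the DataFrame is being created.
built = pd.DataFrame({"name": ["Ana", "Ben"]}, index=[101, 102])

# Option 2: combine two frames first, then promote a column to the index.
combined = pd.concat([first, second], ignore_index=True)
indexed = combined.set_index("emp_id")

print(indexed)
```

In the second option the combined frame first gets a fresh integer index, and set_index() then replaces it with the `emp_id` values.
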
Syntax:

DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

Parameters:

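keys: Column label, or a list of column labels, to use as the new index.
drop: Boolean; if True (the default), the columns used as the new index are removed from the DataFrame.
append: Boolean; if True, the columns are appended to the existing index instead of replacing it.
inplace: Boolean; if True, the DataFrame is modified in place instead of a new one being returned.
verify_integrity: Boolean; if True, the new index is checked for duplicates.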
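
As a rough illustration of how these parameters behave, the sketch below uses a small hypothetical frame (the `team` and `score` columns are made up for this example). It assumes that a failed duplicate check surfaces as a ValueError.

```python
import pandas as pd

# Hypothetical data with a repeated value in the "team" column.
df = pd.DataFrame({"team": ["a", "b", "a"], "score": [1, 2, 3]})

# drop=False keeps "team" as a regular column as well as the index.
kept = df.set_index("team", drop=False)

# Without inplace=True the original frame is left untouched.
unchanged = list(df.columns)

# verify_integrity=True rejects an index that contains duplicates.
try:
    df.set_index("team", verify_integrity=True)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(kept)
print(unchanged)
print(type(duplicate_error).__name__)
```

Leaving verify_integrity at its default of False skips the duplicate check, so duplicate index labels are accepted silently.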

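Code #1 and Code #2 use a CSV file named employees.csv.
Code #1: Changing the index column
In this example, the First Name column is made the index column of the DataFrame.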

Python3

# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("employees.csv")

# setting first name as index column
data.set_index("First Name", inplace=True)

# display the first five rows
data.head()

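Output:
Before the operation, the index is a series of numbers (the default integer index); after the operation, it is replaced by the values of the First Name column.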
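
Because employees.csv is not included here, the before-and-after change can be reproduced with a small in-memory stand-in. This is only a sketch: the rows below are invented, and only the column names First Name and Gender are taken from the example.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv; the values are made up.
data = pd.DataFrame({
    "First Name": ["Douglas", "Thomas", "Maria"],
    "Gender": ["Male", "Male", "Female"],
})

# Before: the default integer index.
index_before = list(data.index)

# Same call as in Code #1.
data.set_index("First Name", inplace=True)

# After: the index holds the First Name values.
index_after = list(data.index)

print(index_before)
print(index_after)
```

With the default drop=True, First Name no longer appears among the regular columns once it becomes the index.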

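Code #2: Multiple index columns
In this example, two columns are made index columns. The drop parameter controls whether those columns are removed from the data, and the append parameter appends the passed columns to the already existing index. Here drop=False keeps both columns in the DataFrame, and append=True keeps the original index as the first level.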

Python3

# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("employees.csv")

# appending First Name and Gender to the existing index
# while keeping both as regular columns
data.set_index(["First Name", "Gender"], inplace=True,
               append=True, drop=False)

# display the first five rows
data.head()

Output:
The data now has three index levels: the original integer index, First Name, and Gender.
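
The three index levels can be checked without the CSV file. The sketch below again uses an invented stand-in frame; the Team column and all values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv; the values are made up.
data = pd.DataFrame({
    "First Name": ["Douglas", "Thomas", "Maria"],
    "Gender": ["Male", "Male", "Female"],
    "Team": ["Marketing", "Finance", "Sales"],
})

# Same arguments as in Code #2, but returning a new frame.
appended = data.set_index(["First Name", "Gender"],
                          append=True, drop=False)

# The original integer index stays as the first (unnamed) level.
level_count = appended.index.nlevels
level_names = list(appended.index.names)

print(level_count)
print(level_names)
```

Because drop=False is passed, First Name and Gender are still available as ordinary columns in `appended`.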

Code #3: Setting a single float column as the index of a Pandas DataFrame

Python3

# importing pandas library
import pandas as pd

# creating and initializing a nested list
students = [['jack', 34, 'Sydney', 'Australia', 85.96],
            ['Riti', 30, 'Delhi', 'India', 95.20],
            ['Vansh', 31, 'Delhi', 'India', 85.25],
            ['Nanyu', 32, 'Tokyo', 'Japan', 74.21],
            ['Maychan', 16, 'New York', 'US', 99.63],
            ['Mike', 17, 'las vegas', 'US', 47.28]]

# create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country', 'Agg_Marks'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])

# set the float column 'Agg_Marks' as the index of the data frame
# using DataFrame.set_index()
df = df.set_index('Agg_Marks')

# display the data frame
df

Output:

In the above example, we set the column 'Agg_Marks' as the index of the DataFrame.
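
Once a float column is the index, rows can be looked up by their float label. The sketch below reuses three rows from Code #3. It is an illustration only, and it relies on the label matching the stored float exactly.

```python
import pandas as pd

# Three rows taken from Code #3.
students = [['jack', 34, 'Sydney', 'Australia', 85.96],
            ['Riti', 30, 'Delhi', 'India', 95.20],
            ['Vansh', 31, 'Delhi', 'India', 85.25]]

df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country', 'Agg_Marks'])
df = df.set_index('Agg_Marks')

# Label-based lookup: the float must match the stored value exactly.
row = df.loc[85.96]
name = row['Name']

print(name)
```

Float labels that come out of a computation may not match exactly, so this kind of lookup is most dependable for values entered as literals.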

Code #4: Setting three columns as a MultiIndex in a Pandas DataFrame

Python3

# importing pandas library
import pandas as pd

# creating and initializing a nested list
students = [['jack', 34, 'Sydney', 'Australia', 85.96, 400],
            ['Riti', 30, 'Delhi', 'India', 95.20, 750],
            ['Vansh', 31, 'Delhi', 'India', 85.25, 101],
            ['Nanyu', 32, 'Tokyo', 'Japan', 74.21, 900],
            ['Maychan', 16, 'New York', 'US', 99.63, 420],
            ['Mike', 17, 'las vegas', 'US', 47.28, 555]]

# create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country', 'Agg_Marks', 'ID'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])

# pass a list of three columns, 'Name', 'City' and 'ID',
# to DataFrame.set_index() to set them as the MultiIndex of the data frame
df = df.set_index(['Name', 'City', 'ID'])

# display the data frame
df

Output:

In the above example, we set the columns 'Name', 'City' and 'ID' as the MultiIndex of the DataFrame.
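
A MultiIndex built this way can be queried by a full key or by a single level. The sketch below reuses three rows from Code #4 and is meant only as an illustration of the resulting structure.

```python
import pandas as pd

# Three rows taken from Code #4.
students = [['jack', 34, 'Sydney', 'Australia', 85.96, 400],
            ['Riti', 30, 'Delhi', 'India', 95.20, 750],
            ['Vansh', 31, 'Delhi', 'India', 85.25, 101]]

df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country', 'Agg_Marks', 'ID'])
df = df.set_index(['Name', 'City', 'ID'])

# Full key: one value per index level, passed as a tuple.
riti_age = df.loc[('Riti', 'Delhi', 750), 'Age']

# Single level: all rows whose 'City' level equals 'Delhi'.
delhi = df.xs('Delhi', level='City')
delhi_names = list(delhi.index.get_level_values('Name'))

print(riti_age)
print(delhi_names)
```

Selecting with xs() removes the matched level from the result, so `delhi` is indexed by Name and ID only.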