📜  Pandas 中的数据透视表

📅  最后修改于: 2022-05-13 01:54:38.349000             🧑  作者: Mango

Pandas 中的数据透视表

在本文中,我们将看到 Pandas 中的数据透视表。让我们讨论一些概念:

Pandas : Pandas 是一个建立在 NumPy 库之上的开源库。它是一个Python包,提供用于处理数值数据和时间序列的各种数据结构和操作。它主要用于更容易地导入和分析数据。 Pandas 速度快,为用户提供高性能和生产力。

数据透视表:数据透视表是汇总更广泛表(例如来自数据库、电子表格或商业智能程序)的数据的统计表。此摘要可能包括总和、平均值或其他统计数据,数据透视表以有意义的方式将它们组合在一起。

需要的步骤

  • 导入库(熊猫)
  • 导入/加载/创建数据。
  • 使用具有不同变体的 Pandas.pivot_table() 方法。

在这里,我们将在如下所示的数据帧上讨论数据透视表的一些变体:

Python3
# import packages
import pandas as pd
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12, 
                          3: 13, 4: 67, 5: 89,
                          6: 90, 7: 56, 8: 34},
                     
                 'Name': {0: 'Ram', 1: 'Deep', 2: 'Yash',
                          3: 'Aman', 4: 'Arjun', 5: 'Aditya',
                          6: 'Akash', 7: 'Chalsea',
                          8: 'Divya'},
                     
                 'Marks': {0: 89, 1: 97, 2: 45,
                           3: 78, 4: 56, 5: 76,
                           6: 81, 7: 87, 8: 100},
                     
                 'Grade': {0: 'B', 1: 'A', 2: 'F',
                           3: 'C', 4: 'E', 5: 'C',
                           6: 'B', 7: 'B', 8: 'A'}})
  
# view data
display(df)


Python3
# import packages
import pandas as pd
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12, 
                          3: 13, 4: 67, 5: 89,
                          6: 90, 7: 56, 8: 34},
                     
                 'Name': {0: 'Ram', 1: 'Deep', 2: 'Yash',
                          3: 'Aman', 4: 'Arjun', 5: 'Aditya',
                          6: 'Akash', 7: 'Chalsea',
                          8: 'Divya'},
                     
                 'Marks': {0: 89, 1: 97, 2: 45,
                           3: 78, 4: 56, 5: 76,
                           6: 81, 7: 87, 8: 100},
                     
                 'Grade': {0: 'B', 1: 'A', 2: 'F',
                           3: 'C', 4: 'E', 5: 'C',
                           6: 'B', 7: 'B', 8: 'A'}})
  
# simple use pivot_table() method
print(pd.pivot_table(df, index = ["ID"]))


Python3
# import packages
import pandas as pd
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12, 
                          3: 13, 4: 67, 5: 89,
                          6: 90, 7: 56, 8: 34},
                     
                 'Name': {0: 'Ram', 1: 'Deep', 2: 'Yash',
                          3: 'Aman', 4: 'Arjun', 5: 'Aditya',
                          6: 'Akash', 7: 'Chalsea',
                          8: 'Divya'},
                     
                 'Marks': {0: 89, 1: 97, 2: 45,
                           3: 78, 4: 56, 5: 76,
                           6: 81, 7: 87, 8: 100},
                     
                 'Grade': {0: 'B', 1: 'A', 2: 'F',
                           3: 'C', 4: 'E', 5: 'C',
                           6: 'B', 7: 'B', 8: 'A'}})
# multiple columns with 
# pivot_table() method
display(pd.pivot_table(df, 
                     index = ["ID", "Name"]))


Python3
# import packages
import pandas as pd
import numpy as np
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12,
                          3: 13, 4: 67, 5: 89, 
                          6: 90, 7: 56, 8: 34},
                     
                   'Name': {0: 'Ram', 1: 'Deep',
                            2: 'Yash', 3: 'Aman',
                            4: 'Arjun', 5: 'Aditya',
                            6: 'Akash',7: 'Chalsea',
                            8: 'Divya'},
  
                   'Marks': {0: 89, 1: 97, 2: 45, 
                             3: 78, 4: 56, 5: 76,
                             6: 81, 7: 87, 8: 100},
  
                   'Grade': {0: 'B', 1: 'A', 2: 'F', 3: 'C',
                             4: 'E', 5: 'C', 6: 'B', 7: 'B',
                             8: 'A'}})
  
# Pivot Table with mean 
# aggregate function on marks
display(pd.pivot_table(df,
                     index = ["Grade"],
                     values = ["Marks"],
                     aggfunc = np.mean))


输出:

示例 1:pivot_table() 方法的简单使用。

蟒蛇3

# import packages
import pandas as pd
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12, 
                          3: 13, 4: 67, 5: 89,
                          6: 90, 7: 56, 8: 34},
                     
                 'Name': {0: 'Ram', 1: 'Deep', 2: 'Yash',
                          3: 'Aman', 4: 'Arjun', 5: 'Aditya',
                          6: 'Akash', 7: 'Chalsea',
                          8: 'Divya'},
                     
                 'Marks': {0: 89, 1: 97, 2: 45,
                           3: 78, 4: 56, 5: 76,
                           6: 81, 7: 87, 8: 100},
                     
                 'Grade': {0: 'B', 1: 'A', 2: 'F',
                           3: 'C', 4: 'E', 5: 'C',
                           6: 'B', 7: 'B', 8: 'A'}})
  
# simple use pivot_table() method
print(pd.pivot_table(df, index = ["ID"]))

输出 :

示例 2:具有多列索引的数据透视表。

蟒蛇3

# import packages
import pandas as pd
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12, 
                          3: 13, 4: 67, 5: 89,
                          6: 90, 7: 56, 8: 34},
                     
                 'Name': {0: 'Ram', 1: 'Deep', 2: 'Yash',
                          3: 'Aman', 4: 'Arjun', 5: 'Aditya',
                          6: 'Akash', 7: 'Chalsea',
                          8: 'Divya'},
                     
                 'Marks': {0: 89, 1: 97, 2: 45,
                           3: 78, 4: 56, 5: 76,
                           6: 81, 7: 87, 8: 100},
                     
                 'Grade': {0: 'B', 1: 'A', 2: 'F',
                           3: 'C', 4: 'E', 5: 'C',
                           6: 'B', 7: 'B', 8: 'A'}})
# multiple columns with 
# pivot_table() method
display(pd.pivot_table(df, 
                     index = ["ID", "Name"]))

输出 :

示例 3:具有聚合函数的数据透视表。

蟒蛇3

# import packages
import pandas as pd
import numpy as np
  
# create data
df = pd.DataFrame({'ID': {0: 23, 1: 43, 2: 12,
                          3: 13, 4: 67, 5: 89, 
                          6: 90, 7: 56, 8: 34},
                     
                   'Name': {0: 'Ram', 1: 'Deep',
                            2: 'Yash', 3: 'Aman',
                            4: 'Arjun', 5: 'Aditya',
                            6: 'Akash',7: 'Chalsea',
                            8: 'Divya'},
  
                   'Marks': {0: 89, 1: 97, 2: 45, 
                             3: 78, 4: 56, 5: 76,
                             6: 81, 7: 87, 8: 100},
  
                   'Grade': {0: 'B', 1: 'A', 2: 'F', 3: 'C',
                             4: 'E', 5: 'C', 6: 'B', 7: 'B',
                             8: 'A'}})
  
# Pivot Table with mean 
# aggregate function on marks
display(pd.pivot_table(df,
                     index = ["Grade"],
                     values = ["Marks"],
                     aggfunc = np.mean))

输出 :