Python | Pandas DataFrame.describe() method

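Python is a great language for data analysis, largely because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.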

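Pandas describe() is used to view basic statistical details of a data frame or a series of numeric values, such as percentiles, mean, and standard deviation. When the method is applied to a series of strings, it returns a different set of statistics, as the examples in this article show.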

Syntax: DataFrame.describe(percentiles=None, include=None, exclude=None)

Parameters:
percentiles: List-like of numbers between 0 and 1, giving the percentiles to return. Default is [.25, .5, .75]
include: List of data types to be included while describing the data frame. Default is None
exclude: List of data types to be excluded while describing the data frame. Default is None

Return type: Series or DataFrame containing the statistical summary.
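
To make the three parameters concrete, here is a minimal sketch on a small made-up data frame. The column names and values are hypothetical and are not taken from the NBA dataset; the text column is given an explicit object dtype as an assumption, so that include=['object'] selects it.

```python
import pandas as pd

# Hypothetical data: one text column (explicit object dtype) and two numeric columns
df = pd.DataFrame({
    "name": pd.Series(["a", "b", "c", "d"], dtype=object),
    "age": [20, 30, 40, 50],
    "weight": [60.0, 70.0, 80.0, 90.0],
})

# With no arguments, only the numeric columns are summarised
default_desc = df.describe()

# Ask for the 10th and 90th percentiles instead of the default quartiles
pct_desc = df.describe(percentiles=[0.1, 0.9])

# Describe only the object (string) columns
obj_desc = df.describe(include=["object"])

# Describe everything except the object columns
num_desc = df.describe(exclude=["object"])

print(default_desc)
print(pct_desc)
print(obj_desc)
print(num_desc)
```

The requested percentiles appear as rows labelled "10%" and "90%", while include and exclude only change which columns are described.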


The examples below use a data frame containing data on some NBA players. The dataset is the nba.csv file read from the URL in the example code.

Example #1: Describing a data frame with object and numeric data types

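In this example, the data frame is described with ['object', 'float', 'int'] passed to the include parameter, so that object columns are described alongside the numeric ones. [.20, .40, .60, .80] is passed to the percentiles parameter to see the corresponding percentiles of the numeric columns.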

# importing pandas module
import pandas as pd

# making data frame
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")

# removing rows with null values
data.dropna(inplace=True)

# percentile list
perc = [.20, .40, .60, .80]

# list of dtypes to include
include = ['object', 'float', 'int']

# calling describe method
desc = data.describe(percentiles=perc, include=include)

# display
desc

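Output:
In the output, the statistical description of the data frame is returned along with the requested percentiles. For columns containing strings, the numeric statistics are returned as NaN.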
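
The NaN pattern described above can be reproduced offline with a minimal sketch. The two columns here are hypothetical stand-ins for the NBA data, and 'number' is used in place of 'float' and 'int' to select all numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for the NBA data: one string column, one numeric column
df = pd.DataFrame({
    "team": pd.Series(["X", "X", "Y"], dtype=object),
    "salary": [1.0, 2.0, 3.0],
})

# Describe object and numeric columns together
desc = df.describe(include=["object", "number"])
print(desc)

# Numeric statistics do not apply to the string column, so they are NaN
print(pd.isna(desc.loc["mean", "team"]))

# String statistics do not apply to the numeric column, so they are NaN too
print(pd.isna(desc.loc["unique", "salary"]))
```

Both checks print True: each column only fills in the rows that make sense for its data type.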

Example #2: Describing a series of strings

In this example, the describe method is called on the Name column to see how it behaves for the object data type.

# importing pandas module
import pandas as pd

# making data frame
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")

# removing rows with null values
data.dropna(inplace=True)

# calling describe method on the Name column
desc = data["Name"].describe()

# display
desc

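Output:
In the output, describe() behaves differently for a series of strings.
In this case it returns a different set of statistics: the count of values, the number of unique values, the top (most frequent) value, and its frequency.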
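
The four statistics listed above can be checked by hand on a small series. This is a minimal sketch with made-up names, not values from the NBA dataset.

```python
import pandas as pd

# Hypothetical names; "Ann" appears twice, the others once
names = pd.Series(["Ann", "Bob", "Ann", "Cy"], dtype=object)

desc = names.describe()
print(desc)

# count: number of non-null values
print(desc["count"])   # 4
# unique: number of distinct values
print(desc["unique"])  # 3
# top: the most frequent value
print(desc["top"])     # Ann
# freq: how many times the top value occurs
print(desc["freq"])    # 2
```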