📜  Pandas DataFrame.describe()

📅  Last modified: 2020-10-29 01:55:47             🧑  Author: Mango


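The describe() method computes summary statistics, such as percentiles, the mean, and the standard deviation, for the numeric values of a Series or DataFrame. It handles numeric and object Series as well as DataFrame column sets of mixed data types.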

Syntax

DataFrame.describe(percentiles=None, include=None, exclude=None)

Parameters

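  • percentiles: An optional list of numbers, each between 0 and 1, giving the percentiles to include in the output. When left as None, it behaves as [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
  • include: An optional list of data types to include when describing a DataFrame. The default is None, in which case only numeric columns are summarized. Passing 'all' includes every column.
  • exclude: An optional list of data types to leave out when describing a DataFrame. The default is None, in which case nothing is excluded.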
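
As a minimal sketch of the percentiles parameter, the snippet below requests the 10th and 90th percentiles instead of the default quartiles. The Series name scores and its values are made up for illustration and are not part of the original examples.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
scores = pd.Series([10, 20, 30, 40, 50])

# Ask for the 10th and 90th percentiles instead of the default quartiles.
summary = scores.describe(percentiles=[0.1, 0.9])

# The requested percentiles appear as rows labelled "10%" and "90%".
print(summary)
print(summary["10%"], summary["90%"])
```

Each percentile is written as a fraction (0.1 rather than 10), which is why the values must lie between 0 and 1.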

Returns

It returns the statistical summary as a Series (when called on a Series) or a DataFrame (when called on a DataFrame).
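
The following sketch illustrates the two return types and how a single statistic might be read out of the result. The frame df and its column names height and weight are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"height": [150, 160, 170], "weight": [50, 60, 70]})

series_summary = df["height"].describe()  # Series in, Series out
frame_summary = df.describe()             # DataFrame in, DataFrame out

# Statistics are row labels, so one value can be looked up with .loc.
mean_height = frame_summary.loc["mean", "height"]

print(type(series_summary).__name__)
print(type(frame_summary).__name__)
print(mean_height)
```

Because the statistics are ordinary row labels, the summary can be sliced or filtered like any other Series or DataFrame.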

Example 1

import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()

Output

count     3.0
mean      2.0
std       1.0
min       1.0
25%       1.5
50%       2.0
75%       2.5
max       3.0
dtype: float64

Example 2

import pandas as pd
import numpy as np
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()

Output

count      4
unique     3
top        q
freq       2
dtype: object

Example 3

import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
info = pd.DataFrame({'categorical': pd.Categorical(['s','t','u']),
'numeric': [1, 2, 3],
'object': ['p', 'q', 'r']
 })
info.describe(include=[np.number])
info.describe(include=[object])  # np.object was removed in NumPy 1.24; use the builtin object
info.describe(include=['category'])

Output

       categorical
count            3
unique           3
top              u
freq             1
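
The output above comes from the last call only. As a rough sketch, the snippet below stores each summary in its own variable so that the effect of include and exclude on the selected columns can be checked directly. The variable names are illustrative, and the strings "number" and "category" are used here as dtype selectors in place of np.number.

```python
import pandas as pd

info = pd.DataFrame({
    "categorical": pd.Categorical(["s", "t", "u"]),
    "numeric": [1, 2, 3],
    "object": ["p", "q", "r"],
})

# Keep each summary separately instead of relying on the last expression.
numeric_summary = info.describe(include=["number"])
category_summary = info.describe(include=["category"])
non_numeric_summary = info.describe(exclude=["number"])

# Show which columns each selection kept.
print(list(numeric_summary.columns))
print(list(category_summary.columns))
print(list(non_numeric_summary.columns))
```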

Example 4

import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
info = pd.DataFrame({'categorical': pd.Categorical(['s','t','u']),
'numeric': [1, 2, 3],
'object': ['p', 'q', 'r']
 })
info.describe()
info.describe(include='all')
info.numeric.describe()
info.describe(include=[np.number])
info.describe(include=[object])  # np.object was removed in NumPy 1.24; use the builtin object
info.describe(include=['category'])
info.describe(exclude=[np.number])
info.describe(exclude=[object])  # only this last call's result is shown in the output below

Output

      categorical  numeric
count     3         3.0
unique    3         NaN
top       u         NaN
freq      1         NaN
mean      NaN       2.0
std       NaN       1.0
min       NaN       1.0
25%       NaN       1.5
50%       NaN       2.0
75%       NaN       2.5
max       NaN       3.0
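
The NaN entries appear because a statistic such as mean has no meaning for a categorical column, and unique is not reported for a numeric one. A small sketch, reusing the same frame with include='all', checks this directly. The variable name all_summary is an assumption made for illustration.

```python
import pandas as pd

info = pd.DataFrame({
    "categorical": pd.Categorical(["s", "t", "u"]),
    "numeric": [1, 2, 3],
    "object": ["p", "q", "r"],
})

# Summarize every column, whatever its dtype.
all_summary = info.describe(include="all")

# Statistics that do not apply to a column are filled with NaN.
mean_is_missing = pd.isna(all_summary.loc["mean", "categorical"])
unique_is_missing = pd.isna(all_summary.loc["unique", "numeric"])

print(mean_is_missing, unique_is_missing)
```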