How to rank Python NumPy arrays with ties?

Last modified: 2022-05-13 01:55:42.604000             Author: Mango

In this article, we will see how to rank a NumPy array in Python when it contains ties, using different tie-breaking methods.

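Ranking is a basic statistical operation used in many fields, such as data science and sociology. A brute-force approach is to sort the indices of the array by their corresponding values. That works well as long as the data contain no repeated values. This article goes a step further and explores the rankdata() function from the Python library SciPy, showing how to use it on data that contain ties.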
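The index-sorting approach described above can be sketched as follows. This is a minimal illustration rather than code from the original article: the helper name `naive_rank` is hypothetical, and `kind="stable"` is an assumption made here so that tied values are numbered in order of appearance.

```python
import numpy as np

def naive_rank(values):
    """Rank by sorting indices (hypothetical helper, 1-based ranks)."""
    order = np.argsort(values, kind="stable")      # indices that would sort the data
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(1, len(values) + 1)   # position in sorted order = rank
    return ranks

distinct = naive_rank(np.array([30, 10, 20]))
tied = naive_rank(np.array([10, 20, 10]))
print(distinct)  # [3 1 2]
print(tied)      # [1 3 2]
```

In the second call the two equal values 10 receive different ranks (1 and 2), which is the limitation that a tie-aware function addresses.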

The rankdata() function

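To compute ranks, we will use the rankdata() function from the scipy.stats module. It offers five tie-handling strategies, selected through its method parameter: 'average' (the default), 'min', 'max', 'dense' and 'ordinal'.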
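As a rough sketch of the call shape, assuming a reasonably recent SciPy release: the function is called as `rankdata(a, method='average', axis=None)`, where `a` is the data, `method` picks the tie strategy and `axis` picks the axis to rank along (`None` flattens the input first). Older releases may not accept `axis`, and newer ones add further keywords, so the installed version's documentation is the authority here.

```python
from scipy.stats import rankdata

data = [0, 2, 3, 2]

# Default method is 'average': the two 2s occupy ranks 2 and 3, so each gets 2.5
default_ranks = rankdata(data)
print(default_ranks.tolist())  # [1.0, 2.5, 4.0, 2.5]

# The same data under each of the five strategies
for method in ("average", "min", "max", "dense", "ordinal"):
    print(method, rankdata(data, method=method))
```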

Example 1: Ranking a 1-D NumPy array

In this example, we apply every tie-breaking strategy to a 1-D NumPy array.

Python3
import numpy as np
from scipy.stats import rankdata

arr = np.array([-20, -10, -10, -10, 10,
                20, 20, 50, 50, 60, 60,
                60, 60, 60])

# Ordinal ranking: every value gets a distinct rank;
# ties are broken by order of appearance
print(f"Ordinal ranking: {rankdata(arr, method='ordinal')}")

# Average ranking: each tied value gets the mean
# of the ordinal ranks covered by its tie group
print(f"Average ranking: {rankdata(arr, method='average')}")

# Max ranking: each tied value gets the maximum
# ordinal rank of its tie group
print(f"Max ranking: {rankdata(arr, method='max')}")

# Min ranking: each tied value gets the minimum
# ordinal rank of its tie group
print(f"Min ranking: {rankdata(arr, method='min')}")

# Dense ranking: like 'min', but ranks are consecutive,
# so the rank after a tie group increases by exactly 1
print(f"Dense ranking: {rankdata(arr, method='dense')}")


Output:
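Because the array in Example 1 is already sorted, the ranks each strategy should give can be worked out by hand. The sketch below is an added check under that assumption, not the article's original output listing; the `expected` dictionary holds the hand-derived values.

```python
import numpy as np
from scipy.stats import rankdata

arr = np.array([-20, -10, -10, -10, 10, 20, 20, 50, 50, 60, 60, 60, 60, 60])

# Ranks worked out by hand for the sorted 14-element array above
expected = {
    "ordinal": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "average": [1, 3, 3, 3, 5, 6.5, 6.5, 8.5, 8.5, 12, 12, 12, 12, 12],
    "max": [1, 4, 4, 4, 5, 7, 7, 9, 9, 14, 14, 14, 14, 14],
    "min": [1, 2, 2, 2, 5, 6, 6, 8, 8, 10, 10, 10, 10, 10],
    "dense": [1, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 6],
}

results = {m: rankdata(arr, method=m).tolist() for m in expected}
for m, ranks in results.items():
    print(f"{m:>8}: {ranks}")
    assert ranks == expected[m]  # fails loudly if a hand-derived rank is wrong
```

For instance, the three values -10 occupy ordinal positions 2, 3 and 4, so they receive 3 under 'average', 4 under 'max', and 2 under both 'min' and 'dense'.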

Example 2: Ranking a 2-D NumPy array along a specific axis with the 'axis' parameter

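In this example, we apply every tie-breaking strategy to a 2-D NumPy array along axis 0, so each column is ranked independently by comparing the values down its rows.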

Python3

import numpy as np
from scipy.stats import rankdata

arr = np.array([[-20, -10, -10, -10, 10, 20, 20],
                [50, 50, 60, -20, 60, 60, 60],
                [-20, 50, -10, -30, 60, 20, 60]])

# axis=0 ranks the three values in each column separately

# Ordinal ranking: every value gets a distinct rank;
# ties are broken by order of appearance
print(f"Ordinal ranking:\n {rankdata(arr, method='ordinal', axis=0)}")

# Average ranking: each tied value gets the mean
# of the ordinal ranks covered by its tie group
print(f"Average ranking:\n {rankdata(arr, method='average', axis=0)}")

# Max ranking: each tied value gets the maximum
# ordinal rank of its tie group
print(f"Max ranking:\n {rankdata(arr, method='max', axis=0)}")

# Min ranking: each tied value gets the minimum
# ordinal rank of its tie group
print(f"Min ranking:\n {rankdata(arr, method='min', axis=0)}")

# Dense ranking: like 'min', but ranks are consecutive,
# so the rank after a tie group increases by exactly 1
print(f"Dense ranking:\n {rankdata(arr, method='dense', axis=0)}")

Output:
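With axis=0, each column holds only three values, so the ranks are easy to derive by hand. The following sketch is an added check, not the article's original output listing; it covers three of the five strategies for brevity, and the `expected` values are hand-derived.

```python
import numpy as np
from scipy.stats import rankdata

arr = np.array([[-20, -10, -10, -10, 10, 20, 20],
                [50, 50, 60, -20, 60, 60, 60],
                [-20, 50, -10, -30, 60, 20, 60]])

# Hand-derived ranks for axis=0 (each column of three values ranked separately)
expected = {
    "average": [[1.5, 1, 1.5, 3, 1, 1.5, 1],
                [3, 2.5, 3, 2, 2.5, 3, 2.5],
                [1.5, 2.5, 1.5, 1, 2.5, 1.5, 2.5]],
    "min": [[1, 1, 1, 3, 1, 1, 1],
            [3, 2, 3, 2, 2, 3, 2],
            [1, 2, 1, 1, 2, 1, 2]],
    "dense": [[1, 1, 1, 3, 1, 1, 1],
              [2, 2, 2, 2, 2, 2, 2],
              [1, 2, 1, 1, 2, 1, 2]],
}

results = {m: rankdata(arr, method=m, axis=0).tolist() for m in expected}
for m, ranks in results.items():
    print(m)
    for row in ranks:
        print("  ", row)
    assert ranks == expected[m]  # fails loudly if a hand-derived rank is wrong
```

Take the first column, [-20, 50, -20]: the two -20 values tie for the lowest place, so they get 1.5 under 'average' and 1 under 'min' and 'dense', while 50 gets 3, 3 and 2 respectively.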

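As we can see, with axis=0 every column of the 2-D array "arr" is ranked on its own: each value's rank comes from comparing it with the other entries in the same column. Passing axis=1 instead would rank each row independently.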
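The effect of the axis argument is easier to see on a small array. The sketch below uses a made-up 2x3 array (not from the article) and the 'min' strategy; the results are cast to int only so the printed lists look the same regardless of the dtype a given SciPy version returns.

```python
import numpy as np
from scipy.stats import rankdata

small = np.array([[10, 20, 10],
                  [30, 20, 5]])

# axis=0: each column is ranked on its own (values compared down the rows)
by_column = rankdata(small, method="min", axis=0).astype(int)
# axis=1: each row is ranked on its own (values compared across the columns)
by_row = rankdata(small, method="min", axis=1).astype(int)

print(by_column.tolist())  # [[1, 1, 2], [2, 1, 1]]
print(by_row.tolist())     # [[1, 3, 1], [3, 2, 1]]
```

The output always has the same shape as the input; only the groups of values that compete for ranks change.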