📜  Compute the histogram of nums against the bins using NumPy

📅  Last modified: 2022-05-13 01:55:24.610000             🧑  Author: Mango

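In this article, we discuss how to compute the histogram of nums against the bins using NumPy. A histogram is a common way to visualize the frequency distribution of a dataset: the range of the data is split into equal-width intervals called bins, and the values falling into each bin are counted. NumPy's histogram() function is similar to the hist() function of the matplotlib library. The only difference is that NumPy's histogram() returns a numerical representation of the dataset, while hist() draws a graphical representation of it.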

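When building a histogram, it can be easier at first not to think in terms of bins at all, and instead to count how many times each value occurs, that is, to build a frequency table. A Python dictionary is well suited to this. Below is a simple histogram implementation in pure Python: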

Python3
# Dataset
a = (1, 3, 7, 7, 2, 3, 4, 7, 6, 6, 3, 5, 2)
  
# Creating empty dictionary
hist = {}
  
# Counting the number of occurrences
for i in a:
    hist[i] = hist.get(i, 0) + 1
  
# Printing the frequency table, i.e. the histogram
print(hist)


Output:

{1: 1, 3: 3, 7: 3, 2: 2, 4: 1, 6: 2, 5: 1}
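
As a point of comparison, the same frequency table can be sketched with NumPy instead of a dictionary loop. This is only an illustration, assuming the same dataset as above; it uses np.unique with return_counts=True, and the name freq is introduced here for the example.

```python
import numpy as np

# Same dataset as in the pure-Python example
a = (1, 3, 7, 7, 2, 3, 4, 7, 6, 6, 3, 5, 2)

# np.unique returns the sorted distinct values; with
# return_counts=True it also returns how often each one occurs
values, counts = np.unique(a, return_counts=True)

# Build a plain dict of Python ints (freq is a name chosen for this sketch)
freq = {int(v): int(c) for v, c in zip(values, counts)}

print(freq)  # {1: 1, 2: 2, 3: 3, 4: 1, 5: 1, 6: 2, 7: 3}
```

Unlike the dictionary loop, the keys come out sorted by value rather than in order of first appearance.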

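NumPy has a built-in numpy.histogram() function that computes the frequency distribution of the data. When that distribution is drawn, rectangles of equal horizontal width correspond to the class intervals, called bins, and their variable heights correspond to the frequencies.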

Implementation:

Python3

# Import libraries
import numpy as np

# Creating dataset: 50 random integers in [0, 100)
a = np.random.randint(100, size=50)

# Creating histogram: counts per bin and the bin edges
hist, bins = np.histogram(a, bins=[0, 10, 20, 30, 40,
                                   50, 60, 70, 80, 90,
                                   100])

# Displaying histogram
print(hist)
print(bins)

Output (the counts vary from run to run because the data is random):

[5 7 4 7 4 5 1 6 4 7]
[  0  10  20  30  40  50  60  70  80  90 100]

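Here, np.histogram() returns two values. The first, hist, is an array with the number of data points that fall into each bin. The second, stored in bins in the code above and called bin_edges in the NumPy documentation, is an array of the bin edges and is one element longer than hist. Every bin includes its left edge and excludes its right edge, except the last bin, which includes both edges.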
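
The return values described above can be checked on a small, hand-picked dataset. The following is a minimal sketch, assuming three bins with edges 0, 10, 20 and 30; the names data and edges are chosen for this example only.

```python
import numpy as np

# Small fixed dataset so the counts can be verified by eye;
# 35 lies outside the bin range and is therefore not counted
data = np.array([0, 5, 10, 19, 20, 30, 35])
edges = [0, 10, 20, 30]

hist, bin_edges = np.histogram(data, bins=edges)

# Bins are [0, 10), [10, 20) and [20, 30]; only the last bin
# includes its right edge, so 30 is counted there
print(hist)       # [2 2 2]
print(bin_edges)  # [ 0 10 20 30]

# There is always one more edge than there are bins
print(len(bin_edges) == len(hist) + 1)  # True
```

Note that the counts add up to 6 rather than 7, because values outside the outermost edges are dropped silently.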

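The numerical representation above can be converted into graphical form. The hist() function in Matplotlib's pyplot submodule takes the dataset array and the bins array as parameters and draws the histogram of the corresponding data values. Below are some examples that plot the histogram of nums against the bins: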
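
To make the link between the two functions concrete, the sketch below computes the counts once with np.histogram() and once while drawing with hist(), then compares them. It is an illustration under a few assumptions: a seeded random generator for repeatability, the non-interactive "Agg" backend so the code runs without a display, and names such as edges and counts that are introduced here.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt
import numpy as np

# Seeded generator so the example is repeatable
rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=50)

# Ten bins of width 10 covering 0 to 100
edges = np.arange(0, 101, 10)

# Numerical representation
hist, bin_edges = np.histogram(a, bins=edges)

# Graphical representation; hist() also returns the counts,
# the bin edges and the drawn bar patches
fig, ax = plt.subplots()
counts, plot_edges, patches = ax.hist(a, bins=edges)

# Both calls agree on the counts per bin
print(np.array_equal(hist, counts))  # True
```

The counts returned by hist() are floating-point values, while np.histogram() returns integers here, so the comparison is by value.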

Example 1:

Python3

# Import libraries
from matplotlib import pyplot as plt
import numpy as np
  
  
# Creating dataset
a = np.random.randint(100, size=(50))
  
# Creating plot
fig = plt.figure(figsize=(10, 7))
  
plt.hist(a, bins=[0, 10, 20, 30,
                  40, 50, 60, 70,
                  80, 90, 100])
  
plt.title("Numpy Histogram")
  
# show plot
plt.show()

Output: (histogram plot titled "Numpy Histogram" with ten bins of width 10)

Example 2:

Python3

# Import libraries
from matplotlib import pyplot as plt
import numpy as np
  
# Creating dataset
l = [i for i in range(50)]
  
# Creating plot
plt.hist(l, bins=[1, 2, 3, 4, 5],
         color='green')
  
# show plot
plt.show()

Output: (green histogram plot over the bin edges 1 to 5)

Example 3:

Python3

# Import libraries
from matplotlib import pyplot as plt
import numpy as np
  
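# Creating dataset: 50 random integers in [0, 150)
l = np.random.randint(150, size=50)
  
# Creating plot; the bin width of 10 is an illustrative choice
plt.hist(l, bins=np.arange(0, 151, 10),
         color='lime')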
  
# show plot
plt.show()

Output: (lime histogram plot)