📜  How to Perform a Chi-Square Goodness-of-Fit Test in Python

📅  Last modified: 2022-05-13 01:55:02.836000             🧑  Author: Mango

In this article, we will see how to perform a chi-square goodness-of-fit test in Python.

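The chi-square goodness-of-fit test is a non-parametric statistical hypothesis test used to determine how far the observed values of an event differ from the expected values. It helps us check whether a variable comes from a given distribution, or whether a sample is representative of a population. The observed frequencies are compared with the frequencies expected under that distribution.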

Example 1: Using the stats.chisquare() function

In this approach we use the chisquare() function from the scipy.stats module, which returns the chi-square goodness-of-fit test statistic and the corresponding p-value.

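The example below also uses stats.chi2.ppf(), which takes a probability (here 1 minus the significance level, that is 1 - 0.05) and the degrees of freedom (here df = 6, the number of categories minus one) and returns the chi-square critical value. If the test statistic is greater than the critical value, the null hypothesis is rejected. If the test statistic is less than or equal to the critical value, we fail to reject the null hypothesis. In this example the test statistic is 5.0127344877344875 and the critical value is 12.591587243743977. Since the statistic does not exceed the critical value, we fail to reject the null hypothesis: the data give no significant evidence that the observed values differ from the expected ones.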

Python3
# import the statistics module from SciPy
import scipy.stats as stats

# observed number of hours a student studies in a week
# versus the expected number of hours
# (both lists must have the same total, here 59)
observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]

# chi-square goodness-of-fit test:
# returns the test statistic and the p-value
chi_square_test_statistic, p_value = stats.chisquare(
    observed_data, expected_data)

# print the test statistic and the p-value
print('chi_square_test_statistic is : ' +
      str(chi_square_test_statistic))
print('p_value : ' + str(p_value))

# chi-square critical value at the 5% significance level,
# with df = number of categories - 1 = 6
print(stats.chi2.ppf(1-0.05, df=6))


Output:

chi_square_test_statistic is : 5.0127344877344875
p_value : 0.542180861413329
12.591587243743977
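
The decision rule described in this example can also be written out in code. The following is a minimal sketch rather than part of the original example: the variable names (alpha, df, critical_value and the two reject_* flags) are chosen here for illustration, and alpha = 0.05 is assumed to match the 1-0.05 used above. It shows that comparing the statistic with the critical value and comparing the p-value with the significance level lead to the same conclusion.

```python
from scipy import stats

observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]

alpha = 0.05                  # significance level (assumed, as in the example)
df = len(observed_data) - 1   # degrees of freedom: categories minus one

statistic, p_value = stats.chisquare(observed_data, f_exp=expected_data)
critical_value = stats.chi2.ppf(1 - alpha, df=df)

# Two equivalent ways of stating the decision rule
reject_by_critical_value = statistic > critical_value
reject_by_p_value = p_value < alpha

if reject_by_critical_value:
    print('Reject the null hypothesis')
else:
    print('Fail to reject the null hypothesis')
```

Deriving df from the data instead of hardcoding 6 keeps the critical value correct if the number of categories changes.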

Example 2: Determining the chi-square test statistic by implementing the formula

In this approach we implement the formula directly. We can see that we get the same chi-square value as in Example 1.
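
For reference, the formula being implemented is the standard Pearson chi-square statistic. The symbols are notation chosen here and do not appear in the original: O_i is the observed value and E_i the expected value for category i, and k is the number of categories (7 in this example).

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

Each term measures the squared deviation of an observed value from its expected value, scaled by the expected value.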

Python3

# import the packages used below
import scipy.stats as stats
import numpy as np

# observed number of hours a student studies in a week
# versus the expected number of hours
observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]

# chi-square statistic from the formula:
# sum of (observed - expected)^2 / expected over all categories
chi_square_test_statistic1 = 0
for i in range(len(observed_data)):
    chi_square_test_statistic1 += \
        (np.square(observed_data[i]-expected_data[i]))/expected_data[i]

print('chi square value determined by formula : ' +
      str(chi_square_test_statistic1))

# chi-square critical value at the 5% significance level,
# with df = number of categories - 1 = 6
print(stats.chi2.ppf(1-0.05, df=6))

Output:

chi square value determined by formula : 5.0127344877344875
12.591587243743977
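
Example 2 stops at the test statistic. As a possible extension, and not part of the original example, the sketch below computes the same statistic with NumPy array operations instead of a loop. It then recovers the p-value from the chi-square survival function stats.chi2.sf(), assuming df is the number of categories minus one. The names observed, expected, statistic and df are illustrative.

```python
import numpy as np
from scipy import stats

observed = np.array([8, 6, 10, 7, 8, 11, 9], dtype=float)
expected = np.array([9, 8, 11, 8, 10, 7, 6], dtype=float)

# Pearson statistic: element-wise (O - E)^2 / E, then summed
statistic = np.sum((observed - expected) ** 2 / expected)

# degrees of freedom: number of categories minus one
df = observed.size - 1

# survival function = 1 - CDF, i.e. the probability of a
# chi-square value at least as large as the one observed
p_value = stats.chi2.sf(statistic, df)

print('chi square value : ' + str(statistic))
print('p_value : ' + str(p_value))
```

The resulting p-value should agree with the one reported by stats.chisquare() in Example 1, up to floating-point rounding.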