ML | Voting Classifier using Sklearn

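A voting classifier is a machine learning model that trains an ensemble of several models and predicts an output class by combining their individual predictions.
It simply aggregates the result of every classifier passed to it and predicts the class that wins the vote. The idea is that, instead of building separate dedicated models and measuring the accuracy of each one, we build a single model that trains all of them and predicts the output from their combined vote for each class.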

The voting classifier supports two types of voting.

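Hard voting: the predicted output class is the one that receives the most votes, that is, the class predicted by the largest number of classifiers. Suppose three classifiers predict the output classes (A, A, B); the majority predicts A, so A is the final prediction.
Soft voting: the output class is chosen from the average of the probabilities that the classifiers assign to each class. Suppose that, for some input, three models predict the probabilities A = (0.30, 0.47, 0.53) and B = (0.20, 0.32, 0.40). The average for class A is 0.4333 and for class B is 0.3067, so A wins because it has the highest probability averaged over the classifiers.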
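The two worked examples above can be reproduced with a few lines of plain Python. This is only a sketch of the arithmetic, not how scikit-learn implements voting internally; the variable names are chosen here for illustration.

```python
from collections import Counter

# Hard voting: each classifier casts one vote for a class label.
votes = ["A", "A", "B"]
hard_winner = Counter(votes).most_common(1)[0][0]

# Soft voting: each classifier reports a probability for each class.
probabilities = {
    "A": [0.30, 0.47, 0.53],
    "B": [0.20, 0.32, 0.40],
}
# Average the per-classifier probabilities of every class.
averages = {label: sum(p) / len(p) for label, p in probabilities.items()}
soft_winner = max(averages, key=averages.get)

print(hard_winner)                                             # A
print({label: round(avg, 4) for label, avg in averages.items()})  # {'A': 0.4333, 'B': 0.3067}
print(soft_winner)                                             # A
```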

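Note: Feed the voting classifier a diverse set of models, so that the errors made by one model can be corrected by the others.
Code: Python code to implement a Voting Classifier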

# importing libraries
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
  
# loading iris dataset
iris = load_iris()
X = iris.data[:, :4]
Y = iris.target
  
# train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, 
                                                    Y, 
                                                    test_size = 0.20, 
                                                    random_state = 42)
  
# group / ensemble of models
estimator = []
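# lbfgs already fits a multinomial model for multiclass targets, so the
# multi_class argument (deprecated in recent scikit-learn releases) is omitted
estimator.append(('LR', 
                  LogisticRegression(solver ='lbfgs', 
                                     max_iter = 200)))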
estimator.append(('SVC', SVC(gamma ='auto', probability = True)))
estimator.append(('DTC', DecisionTreeClassifier()))
  
# Voting Classifier with hard voting
vot_hard = VotingClassifier(estimators = estimator, voting ='hard')
vot_hard.fit(X_train, y_train)
y_pred = vot_hard.predict(X_test)
  
# using accuracy_score to compute the accuracy
# (%.2f keeps the decimals; %d would truncate a score such as 0.97 to 0)
score = accuracy_score(y_test, y_pred)
print("Hard Voting Score %.2f" % score)

# Voting Classifier with soft voting
vot_soft = VotingClassifier(estimators = estimator, voting ='soft')
vot_soft.fit(X_train, y_train)
y_pred = vot_soft.predict(X_test)

# using accuracy_score
score = accuracy_score(y_test, y_pred)
print("Soft Voting Score %.2f" % score)

Output:

Hard Voting Score 1.00
Soft Voting Score 1.00
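To see how each base model performs next to the ensemble, the fitted copies can be read back through the `named_estimators_` attribute. The sketch below rebuilds the same split and estimators so that it runs on its own; the `random_state` on the decision tree and the names `ensemble` and `per_model_accuracy` are additions made here for reproducibility and illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)

ensemble = VotingClassifier(
    estimators=[
        ("LR", LogisticRegression(max_iter=200)),
        ("SVC", SVC(gamma="auto", probability=True)),
        ("DTC", DecisionTreeClassifier(random_state=0)),  # assumption: fixed seed
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)

# named_estimators_ maps each name to the fitted clone of that base model.
# The clones are trained on label-encoded targets; the iris labels are
# already 0, 1, 2, so their predictions compare directly with y_test.
per_model_accuracy = {
    name: accuracy_score(y_test, model.predict(X_test))
    for name, model in ensemble.named_estimators_.items()
}
ensemble_accuracy = accuracy_score(y_test, ensemble.predict(X_test))

for name, acc in per_model_accuracy.items():
    print("%s accuracy %.2f" % (name, acc))
print("Ensemble accuracy %.2f" % ensemble_accuracy)
```

If the base models all reach the same accuracy and make the same errors, the ensemble has little to gain from voting, which is why the note above recommends diverse models.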

Example:

Input  :4.7, 3.2, 1.3, 0.2 
Output :Iris Setosa
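A single-sample prediction like the one above could be obtained roughly as follows. This sketch assumes the ensemble is fitted on the full iris dataset and uses soft voting; the names `clf`, `sample` and `predicted_name` are illustrative and do not come from the original code.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

clf = VotingClassifier(
    estimators=[
        ("LR", LogisticRegression(max_iter=200)),
        ("SVC", SVC(gamma="auto", probability=True)),
        ("DTC", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",
)
# Assumption: train on all 150 samples, since only one new point is predicted.
clf.fit(iris.data, iris.target)

# predict expects a 2-D array: one row per sample, one column per feature
# (sepal length, sepal width, petal length, petal width, in cm).
sample = np.array([[4.7, 3.2, 1.3, 0.2]])
predicted_index = clf.predict(sample)[0]

# Map the numeric class index back to the species name.
predicted_name = iris.target_names[predicted_index]
print(predicted_name)  # setosa
```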

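In practice, soft voting often gives higher accuracy, because it averages the probabilities of all the estimators instead of counting only their top choices. The iris dataset is so small and easy that the base models already classify the test set perfectly, so the two voting modes show no real difference here.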
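A single 30-sample test split is too small to separate the two modes, so one way to compare them more fairly is cross-validation. The sketch below uses `cross_val_score` with 5 folds; the helper `make_ensemble` and the fold count are choices made here, not part of the original article, and the resulting scores will depend on the scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)


def make_ensemble(voting):
    """Build a fresh voting classifier (hypothetical helper for this sketch)."""
    return VotingClassifier(
        estimators=[
            ("LR", LogisticRegression(max_iter=200)),
            ("SVC", SVC(gamma="auto", probability=True)),
            ("DTC", DecisionTreeClassifier(random_state=0)),
        ],
        voting=voting,
    )


# Mean accuracy over 5 stratified folds for each voting mode.
mean_scores = {}
for voting in ("hard", "soft"):
    scores = cross_val_score(make_ensemble(voting), X, y, cv=5)
    mean_scores[voting] = scores.mean()
    print("%s voting: mean accuracy %.3f" % (voting, mean_scores[voting]))
```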