R Logistic Regression - Mango Docs

📅 Last modified: 2021-01-08 10:04:12 🧑 Author: Mango

R Logistic Regression

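Logistic regression fits a regression curve y = f(x) in which y is a categorical variable. The model is used to predict y from a set of predictor variables x. The predictors can be categorical, continuous, or a mix of both.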

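Logistic regression is a classification algorithm. Its fitted curve is nonlinear in the predictors, although the model is linear on the log-odds scale. It predicts a binary outcome (1/0, yes/no, true/false) from a set of independent variables. Categorical or binary values are represented in the model with dummy (0/1) variables.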
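
As a rough illustration of dummy variables, the sketch below shows how R encodes a categorical predictor. It assumes the built-in mtcars dataset and treats its cyl column as a factor; the names cyl_factor and design are chosen here for illustration only.

```r
# Treat the number of cylinders (4, 6, 8) as a categorical predictor.
cyl_factor <- factor(mtcars$cyl)

# model.matrix() shows the design matrix R builds from a formula:
# an intercept column plus one 0/1 dummy column per non-reference level.
design <- model.matrix(~ cyl_factor)
print(colnames(design))
print(head(design))
```

The first level (4 cylinders) acts as the reference category, so it has no dummy column of its own.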

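Logistic regression is a regression model in which the response variable takes categorical values such as true/false or 0/1. The model estimates the probability of the binary response.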

The mathematical equation for logistic regression is:

y = 1 / (1 + e^-(b₀ + b₁x₁ + b₂x₂ + ⋯ + bₙxₙ))

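In the equation above, y is the response (more precisely, the modeled probability that the response equals 1), x₁, x₂, …, xₙ are the predictor variables, and b₀, b₁, b₂, …, bₙ are numeric coefficients. In R, we fit this model with the glm() function.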
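
To make the equation concrete, here is a minimal sketch that evaluates it by hand. The coefficients b0 and b1 and the inputs x1 are hypothetical values picked only for illustration; they do not come from any fitted model.

```r
# Hand-written logistic (sigmoid) function: maps any real number into (0, 1).
logistic <- function(z) 1 / (1 + exp(-z))

# Hypothetical coefficients and predictor values, for illustration only.
b0 <- -1.5
b1 <- 0.8
x1 <- c(-2, 0, 2, 4)

# Linear predictor b0 + b1 * x1, followed by the logistic transform.
p <- logistic(b0 + b1 * x1)
print(p)

# Base R's plogis() computes the same function.
print(all.equal(p, plogis(b0 + b1 * x1)))
```

Larger values of the linear predictor push the probability toward 1, and smaller values push it toward 0.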

The basic syntax of the glm() function is:

glm(formula, data, family)

where:

No.	Parameter	Description
1.	formula	A symbolic description of the relationship between the response and the predictor variables.
2.	data	The dataset that supplies the values of these variables.
3.	family	An R object that specifies the details of the model; for logistic regression its value is binomial.

Building a logistic regression model

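The built-in dataset "mtcars" describes various car models with different engine specifications. In "mtcars", the transmission type is stored in the column "am", which is a binary value (0 = automatic, 1 = manual). We can therefore build a logistic regression model between "am" and three other columns: hp, wt, and cyl.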
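
A minimal sketch of the mtcars model described above might look like the following. The object name am_model is chosen here for illustration; everything else uses base R and the built-in dataset.

```r
# mtcars ships with base R; am is 0 (automatic) or 1 (manual).
am_model <- glm(am ~ hp + wt + cyl, data = mtcars, family = binomial)

# Coefficient table, deviance, and other fit statistics.
print(summary(am_model))

# Fitted probabilities that each car has a manual transmission.
print(head(fitted(am_model)))
```

Each coefficient is on the log-odds scale, so exp(coef(am_model)) can be used to read the estimates as odds ratios.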

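Let's walk through an example to see how to create a logistic regression model with the glm() function and how to obtain an analysis summary with the summary() function.

In this example we use the "BreastCancer" dataset, which is provided by the "mlbench" package. To follow along, first install the "mlbench" and "caret" packages, for instance with install.packages(c("mlbench", "caret")).

Example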

# Load the mlbench package, which ships the BreastCancer dataset
library(mlbench)
# Load the BreastCancer dataset
data(BreastCancer, package = "mlbench")
# Keep only the rows without missing values
breast_canc <- BreastCancer[complete.cases(BreastCancer), ]
# Display the structure of the dataset with str()
str(breast_canc)

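Next, we split the data into a training set and a test set. The training set contains 70% of the rows and the test set contains the remaining 30%.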

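# Load caret, which provides createDataPartition()
library(caret)
# Fix the random seed so the split is reproducible
set.seed(100)
# Sample row indices for a 70% training partition, stratified by Class
Training_Ratio <- createDataPartition(breast_canc$Class, p = 0.7, list = FALSE)
# Create the training data
Training_Data <- breast_canc[Training_Ratio, ]
str(Training_Data)
# Create the test data from the remaining rows
Test_Data <- breast_canc[-Training_Ratio, ]
str(Test_Data)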

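Now we build the regression model with the glm() function. We pass the formula Class ~ Cell.shape as the first argument, set the family argument to "binomial", and supply Training_Data as the data argument. Because Class is a factor, glm() models the probability of its second level ("malignant").

Example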

#Creating Regression Model
glm(Class ~ Cell.shape, family="binomial", data = Training_Data)

Now we store the fitted model and analyze it with the summary() function.

# Create the regression model and store it
model <- glm(Class ~ Cell.shape, family = "binomial", data = Training_Data)
# Print the model summary
print(summary(model))
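
The test set created earlier can be used to check how well the model classifies unseen rows. The following is a self-contained sketch under stated assumptions: it splits the data with base R's sample() instead of caret, so the rows will differ from the partition above, and the names bc, train, test, fit, prob, pred, and confusion, as well as the 0.5 cutoff, are illustrative choices rather than part of the original example.

```r
library(mlbench)

data(BreastCancer, package = "mlbench")
bc <- BreastCancer[complete.cases(BreastCancer), ]

# Assumption: a plain 70/30 random split with sample(), standing in for
# the caret-based partition used in the article.
set.seed(100)
train_idx <- sample(nrow(bc), size = floor(0.7 * nrow(bc)))
train <- bc[train_idx, ]
test <- bc[-train_idx, ]

# Same model formula as in the article.
fit <- glm(Class ~ Cell.shape, family = "binomial", data = train)

# type = "response" returns the probability of the second level of Class
# ("malignant") for each test row.
prob <- predict(fit, newdata = test, type = "response")

# Assumption: classify as malignant when the probability exceeds 0.5.
pred <- factor(ifelse(prob > 0.5, "malignant", "benign"),
               levels = levels(test$Class))

# Cross-tabulate predictions against the true classes.
confusion <- table(Predicted = pred, Actual = test$Class)
print(confusion)
cat("Accuracy:", mean(pred == test$Class), "\n")
```

The cutoff is a design choice: lowering it catches more malignant cases at the cost of more false alarms.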
