Machine Learning in Action study notes: 05. Logistic regression

Date: 2019-05-26 03:55:16

Keywords: Logistic regression, Python, source code walkthrough, testing

Author: 米仓山下

Date: -10-26

Machine Learning in Action (author: Peter Harrington)

Source code download: /books/machine-learning-in-action

git@:pbharrin/machinelearninginaction.git

*************************************************************

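I. Logistic regression

The input to the Sigmoid function is z, where z = w0x0 + w1x1 + w2x2 + … + wnxn, also written z = W^T X.

The Sigmoid function is σ(z) = 1/(1+exp(-z)).

# How Logistic regression classifies: training yields the coefficient vector W. For an unseen feature vector X, compute z = W^T X and pass it through the Sigmoid to get a number between 0 and 1. A value above 0.5 is assigned to class 1, a value below 0.5 to class 0.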

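Gradient ascent: to find the maximum of a function, the best approach is to move along the direction of its gradient. Writing the gradient as ∇, the gradient of f(x, y) is:

∇f(x, y) = (∂f(x, y)/∂x, ∂f(x, y)/∂y)^T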

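Gradient ascent re-estimates the direction of movement at every point it reaches. Starting from P0, it computes the gradient at that point and moves along it to the next point P1. At P1 the gradient is computed again, and the algorithm moves along the new gradient direction to P2. This repeats until a stopping condition is met. Throughout the iteration, the gradient operator always gives the best direction to move in.

The gradient points in the direction of the largest directional derivative, that is, the direction in which the function increases fastest. It is obtained by taking partial derivatives of the function. Moving in the opposite direction decreases the function value at each sufficiently small step, which drives the iteration toward a local minimum (the global minimum when the function is convex).

In vector form, the update rule is w := w + α∇_w f(w), where α is the step size. The update is repeated until a stopping condition is met, such as reaching a specified number of iterations or an error within an acceptable tolerance.

The book uses gradient ascent, while gradient descent is the name heard more often. They are the same algorithm apart from the direction of movement: gradient ascent finds a maximum, gradient descent finds a minimum (its update is w := w − α∇_w f(w)). Anyone who has worked with deep learning knows how important gradient descent is for fitting parameter matrices.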
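
To make the update rule w := w + α∇_w f(w) concrete, here is a minimal sketch on a toy function. The function f(x, y) = -(x - 1)^2 - (y + 2)^2, the step size 0.1, the tolerance and the names `gradient` and `grad_ascent` are all assumptions chosen for illustration; none of them come from the book.

```python
def gradient(x, y):
    """Gradient of the toy function f(x, y) = -(x - 1)^2 - (y + 2)^2.

    f is an assumed example; its maximum is at (1, -2).
    """
    return (-2.0 * (x - 1.0), -2.0 * (y + 2.0))


def grad_ascent(start, alpha=0.1, tol=1e-8, max_iter=10000):
    """Iterate w := w + alpha * grad f(w) until the step is smaller than tol."""
    x, y = start
    for _ in range(max_iter):
        gx, gy = gradient(x, y)
        step_x, step_y = alpha * gx, alpha * gy
        x, y = x + step_x, y + step_y
        # Stopping condition: the last move was negligibly small.
        if abs(step_x) < tol and abs(step_y) < tol:
            break
    return x, y


x_max, y_max = grad_ascent((0.0, 0.0))
print(x_max, y_max)  # values close to the maximizer (1, -2)
```

Running gradient descent on -f with the update w := w − α∇(-f)(w) produces exactly the same sequence of points, which is the sense in which the two methods differ only in direction.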

The two main functions to look at:

    from numpy import exp, mat, ones, shape

    # Logistic function σ(z) = 1/(1+exp(-z))
    def sigmoid(inX):
        return 1.0/(1+exp(-inX))

    # Gradient ascent optimization for Logistic regression
    def gradAscent(dataMatIn, classLabels):
        dataMatrix = mat(dataMatIn)              # convert to NumPy matrix
        labelMat = mat(classLabels).transpose()  # convert to NumPy matrix (column vector)
        m,n = shape(dataMatrix)
        alpha = 0.001                            # step size
        maxCycles = 500                          # number of iterations
        weights = ones((n,1))                    # initialize the coefficient vector
        for k in range(maxCycles):               # heavy on matrix operations
            h = sigmoid(dataMatrix*weights)      # matrix mult
            error = (labelMat - h)               # [Note] vector subtraction
            weights = weights + alpha * dataMatrix.transpose()* error  # [Note] matrix mult
        return weights

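[Note] The book omits the derivation of the gradient. The model for a single sample is P(y|x;θ) = (hθ(x))^y * (1-hθ(x))^(1-y), where h is the Logistic function σ. This is a probability to be maximized, not a loss. Take the likelihood over all samples and then its logarithm, and differentiate the log-likelihood with respect to θ (maximum likelihood estimation) to obtain the update used in the code above. Reference URL ************ or the book *******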
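
The steps named in the note can be written out as follows. This is the standard maximum likelihood derivation and is not taken from the book; the sample index (i), the sample count m, the design matrix X (one sample per row) and the label vector y are notation assumed here.

```latex
\begin{align*}
h_\theta(x) &= \sigma(\theta^T x), \qquad \sigma'(z) = \sigma(z)\,\bigl(1-\sigma(z)\bigr) \\
L(\theta) &= \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}} \bigl(1-h_\theta(x^{(i)})\bigr)^{1-y^{(i)}} \\
\ell(\theta) &= \log L(\theta)
  = \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)}) \log\bigl(1-h_\theta(x^{(i)})\bigr) \Bigr] \\
\frac{\partial \ell}{\partial \theta_j} &= \sum_{i=1}^{m} \bigl( y^{(i)} - h_\theta(x^{(i)}) \bigr)\, x_j^{(i)} \\
\nabla_\theta \ell(\theta) &= X^T \bigl( y - \sigma(X\theta) \bigr) \\
\theta &:= \theta + \alpha\, X^T \bigl( y - \sigma(X\theta) \bigr)
\end{align*}
```

The last line has the same form as `weights = weights + alpha * dataMatrix.transpose() * error` in gradAscent, with `error = labelMat - h`.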

--------------------------------------------------------------

Test:

    >>> import logRegres
    >>> data,lable=logRegres.loadDataSet()
    >>> w=logRegres.gradAscent(data,lable)
    >>> w
    matrix([[ 4.12414349],
            [ 0.48007329],
            [-0.6168482 ]])
    >>> # plot the decision boundary
    >>> logRegres.plotBestFit(w.getA())
    >>>

(Figure: the plotted decision boundary)

--------------------------------------------------------------

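Optimization 1: stochastic gradient ascent. Each iteration updates the regression coefficients using a single sample.

This corresponds to logRegres.stocGradAscent0; the number of iterations equals the number of records in the data, that is, one pass over the data set.

Optimization 2: improved stochastic gradient ascent. alpha is adjusted at every update: alpha = 4/(1.0+j+i)+0.0001, where j is the index of the pass over the data and i is the index of the sample within that pass.

alpha shrinks as the iterations proceed but never reaches zero, because of the constant term 0.0001.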
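
The two optimizations can be sketched together in a few lines. This is a simplified illustration and not the book's stocGradAscent1: it uses plain lists instead of NumPy, visits samples in a shuffled order on each pass, and trains on a tiny made-up data set. The names `predict_prob` and `stoc_grad_ascent`, the seed and the toy data are assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_prob(x, weights):
    """Probability of class 1 for one sample x (a list of features)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def stoc_grad_ascent(data, labels, num_iter=20, seed=0):
    """Stochastic gradient ascent: one sample per update, decaying alpha."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    m, n = len(data), len(data[0])
    weights = [1.0] * n                # same initial value as gradAscent
    for j in range(num_iter):          # j: index of the pass over the data
        order = list(range(m))
        rng.shuffle(order)             # visit samples in random order
        for i, idx in enumerate(order):
            # alpha decays with j and i but never drops below 0.0001
            alpha = 4 / (1.0 + j + i) + 0.0001
            error = labels[idx] - predict_prob(data[idx], weights)
            weights = [w + alpha * error * xi
                       for w, xi in zip(weights, data[idx])]
    return weights


# Made-up, linearly separable data: [x0 = 1.0 (bias term), x1]
data = [[1.0, -4.0], [1.0, -3.0], [1.0, -2.0],
        [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
labels = [0, 0, 0, 1, 1, 1]

weights = stoc_grad_ascent(data, labels)
predictions = [1 if predict_prob(x, weights) > 0.5 else 0 for x in data]
print(weights, predictions)
```

A large alpha early on lets the weights move quickly, and the decay damps the oscillation that single-sample updates would otherwise cause late in training.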

--------------------------------------------------------------

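# Classification function: once the parameters weights have been found, plug them and the test sample inX (a vector) into the formula below to make the two-class decision.

    def classifyVector(inX, weights):
        prob = sigmoid(sum(inX*weights))
        if prob > 0.5: return 1.0
        else: return 0.0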

*************************************************************

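II. Example: predicting horse mortality from colic symptoms

Ways to handle missing values in the data:

# fill missing values with the mean of the available values of that feature;

# fill missing values with a special value such as -1;

# ignore samples that have missing values;

# fill missing values with the mean over similar samples;

# predict the missing values with another machine learning algorithm.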

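This example fills missing values with 0. That choice suits Logistic regression: a feature equal to 0 leaves its weight unchanged in the update, and sigmoid(0) = 0.5 favors neither class. The book describes the original data set as having 28 features and a two-class label; the preprocessed files read by colicTest() keep 21 feature columns plus one label column. horseColicTraining.txt is the training data and horseColicTest.txt is the test data.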
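
Two of the strategies listed above, filling with 0 and filling with the feature mean, can be sketched as follows. The function name `fill_missing`, the use of `None` to mark a missing entry and the sample rows are assumptions for illustration; the horse colic files themselves already have the missing values replaced.

```python
def fill_missing(rows, strategy="zero"):
    """Return a copy of rows with None entries replaced.

    strategy="zero": replace with 0.0 (the choice made in this example)
    strategy="mean": replace with the mean of the available values
                     in the same column
    """
    n_cols = len(rows[0])
    fill = [0.0] * n_cols
    if strategy == "mean":
        for c in range(n_cols):
            present = [r[c] for r in rows if r[c] is not None]
            # Fall back to 0.0 if a column has no observed value at all.
            fill[c] = sum(present) / len(present) if present else 0.0
    elif strategy != "zero":
        raise ValueError("unknown strategy: %r" % strategy)
    return [[fill[c] if v is None else v for c, v in enumerate(r)]
            for r in rows]


# Made-up rows; None marks a missing measurement.
rows = [[1.0, None, 3.0],
        [2.0, 4.0, None],
        [None, 6.0, 5.0]]

zero_filled = fill_missing(rows, "zero")
mean_filled = fill_missing(rows, "mean")
print(zero_filled)
print(mean_filled)
```

Samples whose label is missing are a different case: no fill value makes sense there, so dropping those samples is the usual choice.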

The data is tested with the improved stochastic gradient ascent algorithm stocGradAscent1.

    >>> logRegres.colicTest()  # colicTest() trains for 1000 iterations and then runs the test
    logRegres.py:18: RuntimeWarning: overflow encountered in exp
      return 1.0/(1+exp(-inX))
    the error rate of this test is: 0.373134
    0.373134328358209
    >>>
    >>> logRegres.multiTest()  # average error rate over 10 runs of colicTest()
    the error rate of this test is: 0.343284
    the error rate of this test is: 0.358209
    the error rate of this test is: 0.343284
    the error rate of this test is: 0.343284
    the error rate of this test is: 0.268657
    the error rate of this test is: 0.253731
    the error rate of this test is: 0.343284
    the error rate of this test is: 0.268657
    the error rate of this test is: 0.447761
    the error rate of this test is: 0.283582
    after 10 iterations the average error rate is: 0.325373
    >>>
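
The RuntimeWarning in this session appears when inX is a large negative number, so exp(-inX) overflows. With NumPy the overflow yields inf and 1.0/(1+inf) evaluates to 0.0, so the classification is still correct; the warning is noise rather than a wrong result. If it is unwanted, one common remedy is to branch on the sign of the input so that exp is only ever called on a non-positive argument. The sketch below is a scalar, standard-library version with the assumed name `stable_sigmoid`; it is not part of logRegres.py. For NumPy arrays, `scipy.special.expit` computes the same function.

```python
import math


def stable_sigmoid(z):
    """Logistic function that never calls exp on a positive argument."""
    if z >= 0:
        # exp(-z) is in (0, 1], so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-z))
    # z < 0: exp(z) is in (0, 1); use the algebraically equal form
    # exp(z) / (1 + exp(z)).
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    print(z, stable_sigmoid(z))
```

The naive form `1.0 / (1.0 + math.exp(-z))` raises OverflowError for z = -1000.0 in pure Python, whereas both branches here stay within floating-point range.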

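Other code:

sigmoidPlot.py # compares the shape of s = 1/(1 + exp(-t)) on [-5,5] and on [-60,60]

plotSDerror.py # how the three parameters change over the iterations of stocGradAscent1

plotGD.py # illustration of gradient descent

plot2D.py # decision boundary obtained with stocGradAscent0 (stochastic gradient ascent)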
