1. The sigmoid function

In logistic regression we introduce a function: the sigmoid function.

y=\frac{1}{1+e^{-x}}

This function has several nice properties: over the entire real line, y takes values in (0, 1); it is symmetric about the point (0, 0.5); and it is insensitive to extreme inputs (as x tends to either positive or negative infinity, y saturates and barely changes). Since the sigmoid's range is (0, 1), its output can be interpreted directly as a probability. Let

P(Y=1|X)=\frac{1}{1+e^{-(\theta x+b)}}

where \theta and x are vectors. This expression gives the predicted probability that y = 1 for a given x, so we have the hypothesis function

h_{\theta}(x)=\frac{1}{1+e^{-(\theta x+b)}}

When Y = 1, P(Y=1)=\frac{1}{1+e^{-(\theta x+b)}}=h_{\theta}(x),
and when Y = 0, clearly P(Y=0)=1-P(Y=1)=1-h_{\theta}(x).
Combining the two cases into a single formula:

P(Y=y)=y\,h_{\theta}(x)+(1-y)(1-h_{\theta}(x)),\quad y\in\{0,1\}
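To make the range and saturation properties above concrete, here is a small numerical check (illustrative only, not part of the original derivation):

import numpy as np

def sigmoid(x):
    return 1.0/(1+np.exp(-x))

for v in [-100, -10, 0, 10, 100]:
    print(v, sigmoid(v))
# the outputs stay strictly inside (0, 1) and barely move once |x| is large:
# roughly 3.7e-44, 4.5e-05, 0.5, 0.99995, 1.0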

2. Loss function

1. Log loss function

Here we introduce the log loss (the negative log-likelihood):

L(Y,P(Y|X))=-\log P(Y|X)

When Y = 1, the loss is -\log P(Y=1|X);
when Y = 0, the loss is -\log(1-P(Y=1|X)).
Merging the two formulas gives the loss for a single sample:

L(Y|X)=-y\log P(Y=1|X)-(1-y)\log(1-P(Y=1|X))=-\left(y\log(h_{\theta}(x))+(1-y)\log(1-h_{\theta}(x))\right)

So the loss over m samples is:

J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\log(h_{\theta}(x_{i}))+(1-y_{i})\log(1-h_{\theta}(x_{i}))\right)
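As a quick numerical illustration of this loss (the predicted probabilities below are made up for the example), a confident correct prediction costs little while a confident wrong one is penalized heavily:

import numpy as np

def log_loss(y, h):  # per-sample log loss from the formula above
    return -(y*np.log(h) + (1-y)*np.log(1-h))

print(log_loss(1, 0.9))   # ~0.105 (y=1 predicted with probability 0.9)
print(log_loss(1, 0.1))   # ~2.303 (y=1 predicted with probability 0.1)
print(log_loss(0, 0.1))   # ~0.105 (y=0 predicted with probability 0.1)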

2. Maximum likelihood function

Here we merge the two probability cases for y into a single expression:

P(y_{i}|x_{i})=(h_{\theta}(x_{i}))^{y_{i}}(1-h_{\theta}(x_{i}))^{1-y_{i}}

Since the samples are drawn independently from the same distribution, the likelihood of the m samples is the product of their individual probabilities, giving the likelihood function:

L(\theta \mid (x_{1},y_{1}),(x_{2},y_{2}),\dots,(x_{m},y_{m}))=\prod_{i=1}^{m}(h_{\theta}(x_{i}))^{y_{i}}(1-h_{\theta}(x_{i}))^{1-y_{i}}

Taking the logarithm of the likelihood gives the log-likelihood:

\log L(\theta)=\sum_{i=1}^{m}\left(y_{i}\log(h_{\theta}(x_{i}))+(1-y_{i})\log(1-h_{\theta}(x_{i}))\right)

Maximum likelihood estimation maximizes L(\theta); instead, we introduce the cost function J(\theta)=-\frac{1}{m}\log L(\theta) and minimize it with gradient descent.

J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\log(h_{\theta}(x_{i}))+(1-y_{i})\log(1-h_{\theta}(x_{i}))\right)

3. Partial derivative with respect to θ

\frac{\partial}{\partial\theta}J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{h_{\theta}(x_{i})}\frac{\partial}{\partial\theta}h_{\theta}(x_{i})-(1-y_{i})\frac{1}{1-h_{\theta}(x_{i})}\frac{\partial}{\partial\theta}h_{\theta}(x_{i})\right)

=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{h_{\theta}(x_{i})}-(1-y_{i})\frac{1}{1-h_{\theta}(x_{i})}\right)\frac{\partial}{\partial\theta}h_{\theta}(x_{i})

Derivative of the sigmoid function:

g(z)=\frac{1}{1+e^{-z}}

g'(z)=\frac{e^{-z}}{(1+e^{-z})^{2}}

=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}}

=\frac{1}{1+e^{-z}}\left(\frac{1+e^{-z}}{1+e^{-z}}-\frac{1}{1+e^{-z}}\right)

=g(z)(1-g(z))
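As a quick sanity check of this identity (not part of the original derivation), we can compare it against a central-difference approximation at an arbitrary point:

import numpy as np

def g(z):
    return 1.0/(1+np.exp(-z))

z, eps = 0.7, 1e-6
numeric = (g(z+eps) - g(z-eps)) / (2*eps)   # central difference
analytic = g(z)*(1-g(z))
print(numeric, analytic)   # the two values agree to roughly 1e-10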

Since h_{\theta}(x)=g(\theta x)=\frac{1}{1+e^{-\theta x}},

we have \frac{\partial}{\partial\theta}h_{\theta}(x)=h_{\theta}(x)(1-h_{\theta}(x))\,x.

\frac{\partial}{\partial\theta}J(\theta)

=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{h_{\theta}(x_{i})}-(1-y_{i})\frac{1}{1-h_{\theta}(x_{i})}\right)h_{\theta}(x_{i})(1-h_{\theta}(x_{i}))\,x_{i}

=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}(1-h_{\theta}(x_{i}))-(1-y_{i})h_{\theta}(x_{i})\right)x_{i}

=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}-y_{i}h_{\theta}(x_{i})-h_{\theta}(x_{i})+y_{i}h_{\theta}(x_{i})\right)x_{i}

=\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x_{i})-y_{i}\right)x_{i}

The final expression has the same form as the gradient obtained in linear regression.
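This gradient can also be verified numerically. The sketch below (the toy data and helper names are my own, not from the post) compares the analytic gradient \frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x_{i})-y_{i})x_{i} with a finite-difference approximation of J(\theta):

import numpy as np

def sigmoid(z):
    return 1.0/(1+np.exp(-z))

def J(theta, X, y):  # cross-entropy cost
    h = sigmoid(X @ theta)
    return -np.mean(y*np.log(h) + (1-y)*np.log(1-h))

def analytic_grad(theta, X, y):  # (1/m) * X^T (h - y)
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(0)
X = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 2))])  # toy data with a bias column
y = np.array([0., 1., 1., 0., 1.])
theta = rng.normal(size=3)

eps = 1e-6
numeric = np.array([(J(theta + eps*e, X, y) - J(theta - eps*e, X, y)) / (2*eps)
                    for e in np.eye(3)])
print(np.allclose(numeric, analytic_grad(theta, X, y), atol=1e-6))  # expected: True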

4. Code demonstration

import numpy as np
import matplotlib.pyplot as plt
x=np.array([0.50,0.75,1.00,1.25,1.50,1.75,1.75,2.00,2.25,2.50,
2.75,3.00,3.25,3.50,4.00,4.25,4.50,4.75,5.00,5.50])
x=x.reshape(-1,1)
y=np.array([0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1])
x=np.concatenate((np.ones((x.shape[0],1)),x),axis=1)  # add a bias (intercept) column
y=y.reshape(-1,1)
y.shape
(20, 1)
plt.scatter(x[:,1],y)
plt.xlabel('Hours')
plt.ylabel('pass')
plt.title('exam-data')
plt.show()


(Figure: scatter plot of the exam data, study hours vs. pass/fail)

J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\log(h_{\theta}(x_{i}))+(1-y_{i})\log(1-h_{\theta}(x_{i}))\right)

\frac{\partial}{\partial\theta}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x_{i})-y_{i})x_{i}

def sigmoid(x):  # sigmoid function
    return 1.0/(1+np.exp(-x))

def cost(x,y,theta):  # cross-entropy cost function
    x=np.matrix(x)
    y=np.matrix(y)
    theta=np.matrix(theta)
    first=np.multiply(y,np.log(sigmoid(x*theta)))
    second=np.multiply(1-y,np.log(1-sigmoid(x*theta)))
    return np.sum(first+second)/(-len(x))

def grad(x,y,theta,epochs=1000,lr=0.001):  # batch gradient descent
    x=np.matrix(x)
    y=np.matrix(y)
    theta=np.matrix(theta)
    m=x.shape[0]
    costList=[]
    for i in range(epochs+1):
        h=sigmoid(x*theta)
        delta=x.T*(h-y)/m        # gradient: (1/m) * X^T (h - y)
        theta=theta-lr*delta
        if(i%50==0):
            costList.append(cost(x,y,theta))  # record the loss every 50 epochs
    return theta,costList
theta=np.ones((x.shape[1],1))
theta,costList=grad(x,y,theta,3000,0.3)
a=np.linspace(0,3000,61)  # 61 points: the loss was recorded every 50 of the 3000 epochs
plt.plot(a,costList,c='y')
plt.show()

(Figure: training cost curve)
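As an optional follow-up (not in the original post), the learned parameters can be visualized by drawing the fitted probability curve h_{\theta}(x) over the scatter plot:

hours = np.linspace(0, 6, 100).reshape(-1, 1)
hours_b = np.concatenate((np.ones((hours.shape[0], 1)), hours), axis=1)  # add a bias column
prob = sigmoid(hours_b @ np.asarray(theta))  # predicted P(pass | hours)
plt.scatter(x[:, 1], y)
plt.plot(hours, prob, c='r')
plt.xlabel('Hours')
plt.ylabel('P(pass)')
plt.show()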

from sklearn.linear_model import LogisticRegression
x=np.array([0.50,0.75,1.00,1.25,1.50,1.75,1.75,2.00,2.25,2.50,
2.75,3.00,3.25,3.50,4.00,4.25,4.50,4.75,5.00,5.50])
x=x.reshape(-1,1)
y=np.array([0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1])
y=y.reshape(-1,1)
model=LogisticRegression()
model.fit(x,y.ravel())  # sklearn expects a 1-D target array
b=model.intercept_
a=model.coef_
print(a,b)
print(theta)
[[1.14860386]] [-3.13952411]
[[-4.07770898]
 [ 1.50464392]]
from sklearn.metrics import classification_report

def predict(x,theta):  # classify as 1 when the predicted probability exceeds 0.5
    x=np.matrix(x)
    theta=np.matrix(theta)
    return [1 if i>0.5 else 0 for i in sigmoid(x*theta)]

x2=np.concatenate((np.ones((x.shape[0],1)),x),axis=1)  # add a bias (intercept) column
prediction=predict(x2,theta)
print(classification_report(y,prediction))

              precision    recall  f1-score   support

           0       0.80      0.80      0.80        10
           1       0.80      0.80      0.80        10

    accuracy                           0.80        20
   macro avg       0.80      0.80      0.80        20
weighted avg       0.80      0.80      0.80        20

The accuracy is 80%.
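The sklearn coefficients differ from the hand-rolled result mainly because LogisticRegression applies L2 regularization by default (C=1.0). A simple way to compare the two models (a sketch, not from the original post) is the decision boundary, i.e. the number of study hours at which the predicted probability crosses 0.5:

t = np.asarray(theta)       # theta from the hand-rolled gradient descent above
print(-t[0, 0] / t[1, 0])   # boundary where theta0 + theta1*hours = 0
print(-b[0] / a[0, 0])      # same boundary for the sklearn model
# with the outputs printed above, both land at roughly 2.7 hours of study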

5. (Supplement) The reshape function

① numpy.arange(n).reshape(a, b): generates the integers 0 to n-1 and displays them as an array with a rows and b columns.

② mat (or array).reshape(c, -1): can only be called on a matrix or array; it reshapes the data into c rows, with the number of columns computed automatically.
③ reshape(1,-1) converts to 1 row;
reshape(2,-1) converts to 2 rows;
reshape(-1,1) converts to 1 column;
reshape(-1,2) converts to 2 columns.

>>> import numpy as np
>>> np.arange(16).reshape(2,8)  # 16 numbers, displayed as 2 rows and 8 columns
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])
>>> a=np.arange(16).reshape(2,8)  # 16 numbers, displayed as 2 rows and 8 columns
>>> a.shape
(2, 8)
>>> a.reshape(4,-1)  # reshape to 4 rows; -1 means the column count is computed automatically (16/4)
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> a.reshape(-1,2)  # reshape to 2 columns; -1 means the row count is computed automatically (16/2)
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])
>>> np.arange(1,12,2)  # arange(a, b, c): start at a, step by c, stop before b
array([ 1,  3,  5,  7,  9, 11])
>>> a.reshape(1,-1)  # 1 row
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15]])
>>> a.reshape(2,-1)  # 2 rows
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])
>>> a.reshape(-1,1)  # 1 column
array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15]])
>>> a.reshape(-1,2)  # 2 columns
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])