Logistic Regression - Andrew Ng - Machine Learning in Python

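Introduction to Logistic Regression

Logistic regression is used for classification problems, and its output always lies between 0 and 1. Given an input x, the model computes the hypothesis h(x), interprets it as the probability that y = 1, and uses it to predict the class y.

The Logistic Regression Model

The hypothesis is h(x) = g(θᵀx), where x is the feature vector and g is the sigmoid function, g(z) = 1 / (1 + e^(-z)).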
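As a minimal sketch of the model, the hypothesis can be written in NumPy as follows. The names `sigmoid` and `hypothesis` and the sample values are illustrative assumptions, not part of the original exercise.

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h(x) = g(theta^T x), evaluated for every row of X."""
    return sigmoid(X @ theta)

# Illustrative values only: two samples, an intercept column plus one feature.
X_demo = np.array([[1.0, 2.0],
                   [1.0, -2.0]])
theta_demo = np.array([0.0, 1.0])
h_demo = hypothesis(theta_demo, X_demo)  # the two outputs sum to 1 by symmetry
```

Because sigmoid(z) + sigmoid(-z) = 1, the two demo outputs are mirror images around 0.5, which is the threshold used later for classification.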

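Cost Function

The cost function is motivated as follows:

1. Because the hypothesis passes through the nonlinear sigmoid function, a cost function defined as the sum of squared errors is non-convex, so gradient descent can get stuck in a local minimum. The cost function is therefore redefined.

2. The redefined cost function is the cross-entropy (log) loss, which is convex in θ.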
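For reference, the standard cross-entropy cost from the course can be written as below, with m training examples (x^(i), y^(i)). This is reconstructed from the course material and is consistent with the `cost` function implemented later; it is not copied from the original post.

```latex
\mathrm{Cost}\bigl(h_\theta(x), y\bigr) =
\begin{cases}
-\log\bigl(h_\theta(x)\bigr)     & \text{if } y = 1 \\
-\log\bigl(1 - h_\theta(x)\bigr) & \text{if } y = 0
\end{cases}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Bigl[ y^{(i)} \log h_\theta\bigl(x^{(i)}\bigr)
     + \bigl(1 - y^{(i)}\bigr) \log\Bigl(1 - h_\theta\bigl(x^{(i)}\bigr)\Bigr) \Bigr]
```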

Gradient Descent

Derivation
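A sketch of the derivation, assuming the cross-entropy cost J(θ) and the sigmoid g with g'(z) = g(z)(1 - g(z)); the learning rate α is the usual gradient descent step size:

```latex
\frac{\partial}{\partial \theta_j} \log h_\theta(x)
  = \frac{h_\theta(x)\bigl(1 - h_\theta(x)\bigr)}{h_\theta(x)}\, x_j
  = \bigl(1 - h_\theta(x)\bigr)\, x_j

\frac{\partial}{\partial \theta_j} \log\bigl(1 - h_\theta(x)\bigr)
  = -\,h_\theta(x)\, x_j

\frac{\partial J(\theta)}{\partial \theta_j}
  = -\frac{1}{m} \sum_{i=1}^{m}
    \Bigl[ y^{(i)} \bigl(1 - h_\theta(x^{(i)})\bigr) - \bigl(1 - y^{(i)}\bigr) h_\theta(x^{(i)}) \Bigr] x_j^{(i)}
  = \frac{1}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_j^{(i)}

\theta_j := \theta_j - \alpha\, \frac{1}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_j^{(i)}
```

In vector form the gradient is Xᵀ(g(Xθ) - y) / m, which is what the `gradient` function in Exercise 1 computes.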

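Regularization

Overfitting is easiest to understand with polynomials: the higher the degree of x, the better the fit to the training data, but the worse the predictions may become. For example, a fourth-degree model can overemphasize fitting the original data and lose sight of the real purpose of the algorithm, which is predicting new data. Given a new value to predict, such a model performs poorly; it is overfitted.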
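The effect can be illustrated with a small sketch. The data here are made up for illustration (a straight line with fixed perturbations), and `np.polyfit` stands in for whatever fitting procedure is actually used.

```python
import numpy as np

# Illustrative data (an assumption, not from the exercise): the line y = 2x
# plus fixed, hand-picked perturbations.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = 2.0 * x_train + np.array([0.5, -0.5, 0.5, -0.5, 0.5])
x_new = np.array([5.0, 6.0])   # unseen inputs outside the training range
y_new = 2.0 * x_new            # the underlying trend at those inputs

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

linear = np.polyfit(x_train, y_train, deg=1)
quartic = np.polyfit(x_train, y_train, deg=4)  # 5 points, 5 coefficients: exact fit

# (linear, quartic) errors on the training data and on the new inputs
train_err = (mse(linear, x_train, y_train), mse(quartic, x_train, y_train))
new_err = (mse(linear, x_new, y_new), mse(quartic, x_new, y_new))
```

The fourth-degree polynomial passes through every training point, so its training error is essentially zero, yet its error on the new inputs is far larger than that of the straight line.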

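How can overfitting be addressed?

1. Discard features that do not help prediction. Features can be chosen by hand, or the feature set can be reduced with an algorithm (for example, PCA).

2. Regularize. Keep all the features, but reduce the magnitude of the parameters.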

Regularized Linear Regression

Regularized Logistic Regression Model

Cost Function for Logistic Regression
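For reference, the regularized cost in its standard form is the cross-entropy cost plus a penalty on the parameters. Here λ is the regularization parameter and n is the number of features; by convention the intercept θ_0 is not penalized. This is reconstructed from the course material rather than taken from the original post.

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Bigl[ y^{(i)} \log h_\theta\bigl(x^{(i)}\bigr)
     + \bigl(1 - y^{(i)}\bigr) \log\Bigl(1 - h_\theta\bigl(x^{(i)}\bigr)\Bigr) \Bigr]
+ \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}
```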

Gradient Descent Procedure
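A sketch of the corresponding update rules, assuming learning rate α and regularization parameter λ. The intercept θ_0 is updated without the penalty term, matching the note in `gradient_regularize` in Exercise 2.

```latex
\theta_0 := \theta_0 - \alpha\, \frac{1}{m} \sum_{i=1}^{m}
  \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_0^{(i)}

\theta_j := \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m}
  \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_j^{(i)}
  + \frac{\lambda}{m}\, \theta_j \right], \qquad j = 1, \dots, n
```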

Exercise 1: Logistic Regression

Step 1: Load and visualize the data

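import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.optimize as opt
# Columns: 试卷1 = exam 1 score, 试卷2 = exam 2 score, 合格 = passed (1) or failed (0)
data = pd.read_csv('ex2data1.txt', names=['试卷1', '试卷2', '合格'])
data.head()

data.head() shows the first few rows of the data.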

# Split the rows by label: 1 = passed, 0 = failed
data_pass = data[data['合格'] == 1]
data_fail = data[data['合格'] == 0]
# Scatter plot of the two classes
plt.rcParams['font.sans-serif'] = 'SimHei'   # font with Chinese glyphs for the labels
plt.rcParams['axes.unicode_minus'] = False   # render minus signs correctly with this font
plt.figure(figsize=(8, 7))
plt.scatter(data_pass["试卷1"], data_pass["试卷2"], marker='o', c='yellow')
plt.scatter(data_fail["试卷1"], data_fail["试卷2"], marker='x', c='red')
plt.legend(['合格', '不合格'])   # passed, failed
plt.title('考试成绩散点图')      # "Exam score scatter plot"
plt.xlabel('考试1成绩')          # "Exam 1 score"
plt.ylabel('考试2成绩')          # "Exam 2 score"
plt.show()

Step 2: Define the cost function and gradient

# Data preparation: extract the features and the labels

# Prepend a column of ones for the intercept term
ones = pd.DataFrame({'1值': np.ones(len(data))})
data_X = pd.concat([ones, data[['试卷1', '试卷2']]], axis=1)
# Labels
data_y = data['合格']
data_X.head()

data_X.shape, data_y.shape
# ((100, 3), (100,))
# Initialize θ to zeros
theta = np.zeros(data_X.shape[1])
theta.shape
# (3,)
# Convert to NumPy arrays
X = np.array(data_X.values)
y = np.array(data_y.values)
type(X), type(y), type(theta)
# (numpy.ndarray, numpy.ndarray, numpy.ndarray)

Define the cost function

# Cost function
def cost(theta, X, y):
    '''
    Input: theta, parameters (3,); X, feature data (100, 3); y, labels (100,)
    Output: the cost value
    '''
    # Hypothesis
    h = X @ theta
    g = 1 / (1 + np.exp(-h))
    # Cross-entropy loss
    first = -y * np.log(g)
    second = (1 - y) * np.log(1 - g)
    return np.mean(first - second)
cost(theta, X, y)
# 0.6931471805599453

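Define the gradient function

# Gradient function
def gradient(theta, X, y):
    '''
    Input: theta, parameters (3,); X, feature data (100, 3); y, labels (100,)
    Output: the gradient of the cost with respect to θ, shape (3,)
    '''
    # Hypothesis
    h = X @ theta
    g = 1 / (1 + np.exp(-h))

    temp = g - y
    return (X.T @ temp) / len(X)
gradient(theta, X, y)
# array([ -0.1       , -12.00921659, -11.26284221])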
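One way to gain confidence in an analytic gradient is to compare it with a finite-difference approximation. The sketch below restates `cost` and `gradient` so that it is self-contained; the helper `numerical_gradient` and the small data set are assumptions made for illustration.

```python
import numpy as np

def cost(theta, X, y):
    g = 1 / (1 + np.exp(-(X @ theta)))
    return np.mean(-y * np.log(g) - (1 - y) * np.log(1 - g))

def gradient(theta, X, y):
    g = 1 / (1 + np.exp(-(X @ theta)))
    return X.T @ (g - y) / len(X)

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Small made-up data set (an assumption for illustration only).
X_demo = np.array([[1.0, 0.5, 1.5],
                   [1.0, -1.0, 0.3],
                   [1.0, 2.0, -0.7],
                   [1.0, 0.1, 0.9]])
y_demo = np.array([1.0, 0.0, 1.0, 0.0])
theta_demo = np.array([0.1, -0.2, 0.3])

analytic = gradient(theta_demo, X_demo, y_demo)
numeric = numerical_gradient(lambda t: cost(t, X_demo, y_demo), theta_demo)
max_diff = np.max(np.abs(analytic - numeric))  # should be close to zero
```

If the two gradients disagree by more than a small tolerance, the analytic formula or its implementation is likely wrong.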

Step 3: Train the model and learn the parameters θ

# Use scipy's minimize to fit θ; Newton-CG requires the gradient, passed as jac
result = opt.minimize(fun=cost, x0=theta, args=(X, y), method='Newton-CG', jac=gradient)
result

# Fitted parameters θ
result.x
# array([-25.15336243,   0.20616795,   0.2014071 ])
# Final cost value
result.fun
# 0.20349771104070527
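With the fitted parameters, the model gives a pass probability for any pair of scores. The sketch below copies the `result.x` values shown above; the function name `pass_probability` and the example scores are assumptions chosen for illustration.

```python
import math

# Parameters copied from the result.x output above.
theta_fit = [-25.15336243, 0.20616795, 0.2014071]

def pass_probability(score1, score2, theta=theta_fit):
    """Estimated probability of passing for a pair of exam scores."""
    z = theta[0] + theta[1] * score1 + theta[2] * score2
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical student (scores chosen for illustration): 45 and 85.
p = pass_probability(45, 85)
```

For these scores the estimated probability is roughly 0.78, so the model would classify this student as passing.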

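Step 4: Evaluate the model and visualize the result

theta_final = result.x
h = X @ theta_final
g = 1 / (1 + np.exp(-h))
g.shape
# Predict 1 when the estimated probability is at least 0.5, otherwise 0
predict = [1 if p >= 0.5 else 0 for p in g]
predict[:5]
# [0, 0, 0, 1, 1]
from sklearn.metrics import classification_report  # per-class precision, recall, and F1
print(classification_report(y, predict))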
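The thresholding step can also be written without an explicit loop. This is a minimal sketch; the helper names `predict_labels` and `accuracy` and the demo values are assumptions, not part of the original exercise.

```python
import numpy as np

def predict_labels(theta, X, threshold=0.5):
    """Return 0/1 predictions by thresholding the sigmoid output."""
    prob = 1 / (1 + np.exp(-(X @ theta)))
    return (prob >= threshold).astype(int)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

# Made-up values for illustration only.
X_demo = np.array([[1.0, -2.0], [1.0, 0.0], [1.0, 3.0]])
theta_demo = np.array([0.0, 1.0])
y_demo = np.array([0, 0, 1])

pred_demo = predict_labels(theta_demo, X_demo)
acc_demo = accuracy(y_demo, pred_demo)
```

Note that a probability of exactly 0.5 is classified as 1, matching the `>=` comparison used in the exercise code.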

Visualizing the data

# Decision boundary: θ0 + θ1*x1 + θ2*x2 = 0, solved for x2
a = np.linspace(data["试卷1"].min(), data["试卷1"].max(), 50)
b = -(theta_final[0] + theta_final[1] * a) / theta_final[2]
plt.figure(figsize=(8, 7))
plt.scatter(data_pass["试卷1"], data_pass["试卷2"], marker='o', c='yellow')
plt.scatter(data_fail["试卷1"], data_fail["试卷2"], marker='x', c='red')
plt.legend(['合格', '不合格'])   # passed, failed
plt.plot(a, b)
plt.title('考试成绩散点图')      # "Exam score scatter plot"
plt.xlabel('考试1成绩')          # "Exam 1 score"
plt.ylabel('考试2成绩')          # "Exam 2 score"
plt.show()

Exercise 2: Regularized Logistic Regression

Step 1: Visualize the data

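# Same column names as in Exercise 1: two scores and a 0/1 label
data2 = pd.read_csv('ex2data2.txt', names=['试卷1', '试卷2', '合格'])
data2.head()

# Split the rows by label: 1 and 0
data_pass2 = data2[data2['合格'] == 1]
data_fail2 = data2[data2['合格'] == 0]
# Scatter plot of the two classes
plt.rcParams['font.sans-serif'] = 'SimHei'   # font with Chinese glyphs for the labels
plt.rcParams['axes.unicode_minus'] = False   # render minus signs correctly with this font
plt.figure(figsize=(8, 7))
plt.scatter(data_pass2["试卷1"], data_pass2["试卷2"], marker='o', c='yellow')
plt.scatter(data_fail2["试卷1"], data_fail2["试卷2"], marker='x', c='red')
plt.legend(['合格', '不合格'])   # passed, failed
plt.title('考试成绩散点图')      # "Exam score scatter plot"
plt.xlabel('考试1成绩')          # "Exam 1 score"
plt.ylabel('考试2成绩')          # "Exam 2 score"
plt.show()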

Step 2: Feature mapping (a degree-6 polynomial in x1 and x2)

# Feature mapping function
def feature_mapping(x1, x2, power):
    '''
    Input: x1, x2, the two original features as arrays of shape (118,); power, the polynomial degree of the mapping
    Output: the mapped feature data as an array
    '''
    d = dict()
    for i in range(power + 1):
        for j in range(i + 1):
            # Column 'a-b' holds the term x1**a * x2**b
            d['{}-{}'.format(i - j, j)] = np.power(x1, i - j) * np.power(x2, j)
    d = pd.DataFrame(d)
    return np.array(d.values)
dx1 = np.array(data2['试卷1'])
dx2 = np.array(data2['试卷2'])
y2 = np.array(data2['合格'])
dx1.shape, dx2.shape, y2.shape
# ((118,), (118,), (118,))
# Map the features
X2 = feature_mapping(dx1, dx2, 6)
X2.shape
# (118, 28)
# Initialize θ to zeros
theta2 = np.zeros(X2.shape[1])
theta2.shape
type(X2), type(y2), type(theta2)
# (numpy.ndarray, numpy.ndarray, numpy.ndarray)
X2.shape, y2.shape, theta2.shape
# ((118, 28), (118,), (28,))
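The 28 columns follow from counting the terms x1^a * x2^b with a + b at most 6. As a small sketch (the helper `mapped_terms` is a hypothetical name introduced here), the exponent pairs can be enumerated in the same order as the loops in `feature_mapping`:

```python
def mapped_terms(power):
    """List the exponent pairs (a, b) of every term x1**a * x2**b with a + b <= power."""
    return [(i - j, j) for i in range(power + 1) for j in range(i + 1)]

terms = mapped_terms(6)
n_terms = len(terms)   # equals (6 + 1) * (6 + 2) // 2
```

The first pair, (0, 0), is the constant term, so the mapped data already contains the column of ones and no separate intercept column has to be added.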

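Step 3: Cost function and gradient function

# Sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Regularized cost function
def cost_regularize(theta, X, y, lamda=1):
    '''
    Input: theta, parameters (28,); X, feature data (118, 28); y, labels (118,); lamda, the regularization parameter (default 1)
    Output: the cost value
    Note: theta[0] is not regularized
    '''
    # Hypothesis
    g = sigmoid(X @ theta)
    # Cross-entropy loss
    first = -y * np.log(g)
    second = (1 - y) * np.log(1 - g)
    # Penalty on the function argument theta (not the global theta2), skipping the intercept
    third = np.sum(np.power(theta[1:], 2)) * lamda / (len(X) * 2)
    return np.mean(first - second) + third
cost_regularize(theta2, X2, y2)
# 0.6931471805599454
# Regularized gradient function
def gradient_regularize(theta, X, y, lamda=1):
    '''
    Input: theta, parameters (28,); X, feature data (118, 28); y, labels (118,); lamda, the regularization parameter (default 1)
    Output: the gradient of the regularized cost with respect to θ, shape (28,)
    Note: the gradient for theta[0] is not regularized
    '''
    # Hypothesis
    g = sigmoid(X @ theta)

    temp = g - y
    theta_ = (X.T @ temp) / len(X)
    # Regularization term
    theta_reg = theta * lamda / len(X)
    theta_reg[0] = 0
    return theta_ + theta_reg

gradient_regularize(theta2, X2, y2)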

Step 4: Train the model and learn the parameters θ

# Use scipy's minimize to fit θ
result = opt.minimize(fun=cost_regularize, x0=theta2, args=(X2, y2), method='Newton-CG', jac=gradient_regularize)
result

Step 5: Evaluate the model and plot the decision boundary

# Model evaluation
theta_final = result.x
h = X2 @ theta_final
g = 1 / (1 + np.exp(-h))
g.shape
# (118,)

# Predict 1 when the estimated probability is at least 0.5, otherwise 0
predict = [1 if p >= 0.5 else 0 for p in g]
predict[:5]
# [1, 1, 1, 1, 1]
from sklearn.metrics import classification_report  # per-class precision, recall, and F1
print(classification_report(y2, predict))

Visualization
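The original post does not include the plotting code for this step. One common approach, sketched here under assumptions, is to evaluate θᵀ·mapped(x1, x2) on a grid and draw the contour where it equals zero. The helper names `map_features`, `boundary_grid`, and `plot_boundary` and the plotting range of -1.2 to 1.2 are illustrative choices, not the author's.

```python
import numpy as np

def map_features(x1, x2, power):
    """Same term order as feature_mapping: x1**(i-j) * x2**j for j <= i <= power."""
    cols = [np.power(x1, i - j) * np.power(x2, j)
            for i in range(power + 1) for j in range(i + 1)]
    return np.stack(cols, axis=-1)

def boundary_grid(theta, power, lo, hi, n=200):
    """Evaluate theta^T * mapped(x1, x2) on an n-by-n grid over [lo, hi]^2."""
    xx, yy = np.meshgrid(np.linspace(lo, hi, n), np.linspace(lo, hi, n))
    zz = map_features(xx.ravel(), yy.ravel(), power) @ theta
    return xx, yy, zz.reshape(xx.shape)

def plot_boundary(theta, power=6, lo=-1.2, hi=1.2):
    """Draw the zero-level contour; assumes matplotlib is installed."""
    import matplotlib.pyplot as plt
    xx, yy, zz = boundary_grid(theta, power, lo, hi)
    plt.contour(xx, yy, zz, levels=[0])
```

Calling `plot_boundary(theta_final)` after the two `plt.scatter` calls and before `plt.show()` would overlay the boundary on the data, since the sigmoid output is 0.5 exactly where θᵀ·mapped(x1, x2) is zero.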
