
Plotting a logistic regression decision boundary in Python with plt.tricontour (contour plots over irregularly spaced data)


Introduction

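I have recently been working on a logistic regression assignment that requires plotting the decision boundary. The idea is as follows:
For logistic regression, the decision boundary is the set of points where $\theta^T X = 0$, with $\theta = [\theta_0, \theta_1, \theta_2, \cdots, \theta_n]$ and $X = [X_0, X_1, X_2, \cdots, X_n]$. After substituting the trained $\theta$, we evaluate $\theta^T X$ over a grid and draw only its zero-level contour with plt.contour(xx, yy, zz, levels=[0]).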
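The reason the boundary sits at $\theta^T X = 0$ can be written out in one line. This is a short derivation that assumes the usual sigmoid hypothesis and a classification threshold of 0.5; neither is stated explicitly above, so treat both as assumptions:

$$
h_\theta(X) = \frac{1}{1 + e^{-\theta^T X}}, \qquad
h_\theta(X) \ge 0.5 \iff e^{-\theta^T X} \le 1 \iff \theta^T X \ge 0
$$

Under those assumptions, points with $\theta^T X > 0$ are predicted positive, points with $\theta^T X < 0$ are predicted negative, and the zero-level contour of $\theta^T X$ is exactly the curve separating the two.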

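In this exercise the two classes cannot be separated by a straight line, so the features first need a polynomial expansion. poly_feat returns every monomial of the two features up to degree 5 (including the constant term), for example $x_1^5, x_1 x_2^4, x_1^2 x_2^3, \cdots, x_2^5$.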

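from sklearn.preprocessing import PolynomialFeatures

# Polynomial feature transformation: expand the two raw features up to degree 5
poly_feat = PolynomialFeatures(degree=5, include_bias=True)
# X[:, 1:] drops the leading column of ones; include_bias=True adds it back
X_poly = poly_feat.fit_transform(X[:, 1:])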

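With the degree-5 polynomial transform, X, which has only three columns (the first one being the constant 1), becomes a sample matrix with 21 features. This is what makes a non-linear decision boundary possible.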
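The count of 21 can be checked by hand: a monomial $x_1^i x_2^j$ is included whenever $i + j \le 5$. The sketch below enumerates those exponent pairs with plain NumPy. The helpers poly_exponents and poly_features are hypothetical stand-ins written for this illustration, not part of scikit-learn, and the column order is simply the one this sketch chooses.

```python
import numpy as np

def poly_exponents(degree):
    """Exponent pairs (i, j) of every monomial x1**i * x2**j with i + j <= degree."""
    return [(total - j, j) for total in range(degree + 1) for j in range(total + 1)]

def poly_features(x1, x2, degree):
    """Hypothetical stand-in for a two-feature polynomial expansion."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    columns = [x1**i * x2**j for i, j in poly_exponents(degree)]
    return np.stack(columns, axis=-1)  # one row per sample, one column per monomial

exponents = poly_exponents(5)
features = poly_features([0.5, -1.0], [2.0, 3.0], degree=5)
print(len(exponents))   # 21
print(features.shape)   # (2, 21)
```

The first column is the constant term (exponents (0, 0)), which plays the role of the bias column that include_bias=True produces.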

import numpy as np
from matplotlib import pyplot as plt

# X, y, data and batch_gradient_descent come from earlier in the assignment (not shown here)
epochs = 1000000
lr = 0.01
lamb = 0
degree = 5
poly_feat = PolynomialFeatures(degree, include_bias=True)
X_poly = poly_feat.fit_transform(X[:, 1:])
theta = np.zeros((X_poly.shape[1], 1))
final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lamb)
test1 = np.array(data['Test 1'])  # feature 1, unordered
test2 = np.array(data['Test 2'])  # feature 2, unordered
score_mesh = np.zeros((test2.size, test1.size))
# construct the score mesh by iterating over every pair of feature values
for idx1, t1 in enumerate(test1):
    for idx2, t2 in enumerate(test2):
        poly = poly_feat.transform(np.array([[t1, t2]]))  # polynomial features of one grid point
        score_mesh[idx2, idx1] = (poly @ final_theta).item()  # rows follow y, columns follow x
cs = plt.contour(test1, test2, score_mesh, levels=[0])  # draw the level-0 contour only
# label the contour for the legend (ContourSet.collections is deprecated since Matplotlib 3.8)
cs.collections[0].set_label('lamb = ' + str(lamb))
# plot the data as a scatter
positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

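As the figure below shows, test1 and test2 are both unordered sequences.

The resulting decision boundary is shown in the figure below.

The boundary is a mess: several separate lines at level 0 appear. The cause is that the X and Y passed to the contour plot do not describe a regular grid. Because X and Y are neither increasing nor decreasing, neighbouring cells of the mesh are not neighbours in space, and the contouring produces multiple spurious lines. There are several ways to fix this: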
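Before calling plt.contour it can help to check whether the axis values are actually ordered. The helper below is a minimal sketch; is_monotonic is a hypothetical name introduced here, and the sample values are made up for illustration.

```python
import numpy as np

def is_monotonic(values):
    """True if values are entirely non-decreasing or entirely non-increasing."""
    steps = np.diff(np.asarray(values, dtype=float))
    return bool(np.all(steps >= 0) or np.all(steps <= 0))

unordered = np.array([0.05, -0.09, -0.21, 0.18, 0.6])   # hypothetical raw feature values
print(is_monotonic(unordered))            # False
print(is_monotonic(np.sort(unordered)))   # True
```

If the check fails for either axis, the grid-based contour is likely to show the kind of tangled lines described above.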

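Solution 1

Build a regular grid: generate monotonically increasing (or decreasing) X and Y values and evaluate the score mesh on them.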

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
lamb = 0
degree = 5
for idx, lambs in enumerate([0]):
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    xk = np.linspace(-1, 1, test1.size)  # construct an ordered sequence with np.linspace
    yk = xk
    xx, yy = np.meshgrid(xk, yk)
    score_mesh = np.zeros((yk.size, xk.size))
    for idx1, t1 in enumerate(xk):
        for idx2, t2 in enumerate(yk):
            poly = poly_feat.transform(np.array([[t1, t2]]))
            score_mesh[idx2, idx1] = (poly @ final_theta).item()  # rows follow y, columns follow x
    cs = plt.contour(xx, yy, score_mesh, levels=[0])  # draw the level-0 contour only
    # label the contour for the legend (ContourSet.collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

The resulting contour is shown in the figure below:
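The regular-grid idea can also be tried in isolation. The sketch below replaces the trained model with a hypothetical score function whose zero set is known (a circle of radius $\sqrt{0.5}$), evaluates it on the whole mesh at once instead of in a double loop, and then checks where the level-0 contour ends up. The score function, the grid size of 101 and the Agg backend are all assumptions made for the illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import pyplot as plt

def score(x1, x2):
    """Hypothetical decision score; its zero set is a circle of radius sqrt(0.5)."""
    return x1**2 + x2**2 - 0.5

xk = np.linspace(-1, 1, 101)   # strictly increasing x axis
yk = np.linspace(-1, 1, 101)   # strictly increasing y axis
xx, yy = np.meshgrid(xk, yk)   # xx[i, j] = xk[j], yy[i, j] = yk[i]
zz = score(xx, yy)             # evaluated on the whole grid at once

fig, ax = plt.subplots()
cs = ax.contour(xx, yy, zz, levels=[0])            # level-0 contour only
boundary = np.vstack(cs.allsegs[0])                # vertices of that contour
radii = np.hypot(boundary[:, 0], boundary[:, 1])   # distance of each vertex from the origin
print(radii.min(), radii.max())
```

Both printed radii should be close to $\sqrt{0.5} \approx 0.707$, which indicates that the contour follows the true boundary when the grid is ordered.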

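Solution 2

Keep the irregular data test1 and test2, but evaluate the score only at the data points and let plt.tricontour interpolate over an irregular triangulation of those points to obtain the contour: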

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
lamb = 0
degree = 5
for idx, lambs in enumerate([0]):
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    # define the unordered coordinates before they are used
    test1 = np.array(data['Test 1'])  # feature 1, unordered
    test2 = np.array(data['Test 2'])  # feature 2, unordered
    # one score per data point: no mesh is needed for tricontour
    score_flat = poly_feat.transform(np.stack([test1, test2], axis=1)) @ final_theta
    cs = plt.tricontour(test1, test2, score_flat.flatten(), levels=[0])  # level-0 contour only
    # label the contour for the legend (ContourSet.collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')
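To see plt.tricontour working on unordered points without the rest of the assignment, the sketch below uses 400 random points and a hypothetical score whose zero set is a circle of radius $\sqrt{0.5}$. The random data, the score function and the Agg backend are assumptions made for this illustration only.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 400)   # unordered x coordinates
y = rng.uniform(-1, 1, 400)   # unordered y coordinates, paired with x
z = x**2 + y**2 - 0.5         # hypothetical score, one value per (x, y) point

fig, ax = plt.subplots()
cs = ax.tricontour(x, y, z, levels=[0])   # x, y, z are 1-D arrays of equal length
boundary = np.vstack(cs.allsegs[0])       # vertices of the level-0 contour
radii = np.hypot(boundary[:, 0], boundary[:, 1])
print(radii.min(), radii.max())
```

Note that x, y and z are passed as flat arrays of equal length rather than as a 2-D mesh. The contour is only as accurate as the linear interpolation across each triangle, so sparse regions of the data give a rougher curve.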

Solution 3

Keep the original data and plt.contour, but sort the data first (test1.sort(); test2.sort()).

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
lamb = 0
degree = 5
for idx, lambs in enumerate([0]):
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    test1 = np.array(data['Test 1'])
    test2 = np.array(data['Test 2'])
    # sort the unordered sequences in place so both axes are non-decreasing
    test1.sort()
    test2.sort()
    Test1, Test2 = np.meshgrid(test1, test2)
    score_mesh = np.zeros((test2.size, test1.size))
    for idx1, t1 in enumerate(test1):
        for idx2, t2 in enumerate(test2):
            poly = poly_feat.transform(np.array([[t1, t2]]))
            score_mesh[idx2, idx1] = (poly @ final_theta).item()  # rows follow y, columns follow x
    cs = plt.contour(Test1, Test2, score_mesh, levels=[0])  # draw the level-0 contour only
    # label the contour for the legend (ContourSet.collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')
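One detail of the sorting approach is worth spelling out: each axis is sorted independently, so the sorted arrays no longer pair up as (Test 1, Test 2) samples. That is fine for grid axes, because np.meshgrid only needs the set of values along each axis, but the scatter plot should still be drawn from the original data. The small sketch below uses made-up values and np.sort (which returns a copy) in place of the in-place .sort() to make the difference visible.

```python
import numpy as np

test1 = np.array([0.3, -0.8, 0.9, -0.1])   # hypothetical unordered feature values
test2 = np.array([0.7, 0.2, -0.5, -0.9])

x_axis = np.sort(test1)   # sorted copy; test1 itself is unchanged
y_axis = np.sort(test2)
xx, yy = np.meshgrid(x_axis, y_axis)

print(np.all(np.diff(x_axis) >= 0), np.all(np.diff(y_axis) >= 0))  # True True
print(test1[0], test2[0])      # 0.3 0.7: still an original sample
print(x_axis[0], y_axis[0])    # -0.8 -0.9: axis values, not a sample
```

In the listing above the in-place sort acts on NumPy copies of the DataFrame columns, so the scatter plot, which reads from data directly, keeps the original pairs.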

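Principle and discussion

Both functions draw contours over the same irregularly spaced data, yet plt.contour and plt.tricontour behave very differently.
The reason is that plt.contour implements a contouring algorithm for regular grids, which requires X and Y to be monotonically increasing or decreasing, whereas plt.tricontour implements a contouring algorithm for irregular triangulations.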
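When plt.tricontour is given only x, y and z, it computes a Delaunay triangulation of the points itself. The sketch below builds that triangulation explicitly with matplotlib.tri.Triangulation to show what the irregular triangulation looks like as data; the 30 random points are an assumption made for the illustration.

```python
import numpy as np
from matplotlib.tri import Triangulation

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 30)   # scattered, unordered points
y = rng.uniform(-1, 1, 30)

tri = Triangulation(x, y)    # Delaunay triangulation of the scattered points
# each row holds the indices (into x and y) of one triangle's three vertices
print(tri.triangles.shape)
```

Because every triangle is defined by point indices rather than by position in a 2-D array, the order of the input points does not matter, which is why the unordered test1 and test2 work here without sorting.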

If you repost this article, please credit the source: www.mshxw.com
Article URL: https://www.mshxw.com/it/843978.html