This dataset is divided into three parts:
• A training set that your model will learn on: X, y
• A cross validation set for determining the regularization parameter: Xval, yval
• A test set for evaluating performance; these are "unseen" examples which your model did not see during training: Xtest, ytest
First, plot the training data.
# 1 Regularized Linear Regression
from scipy.io import loadmat
data=loadmat('ex5data1.mat')
X = data['X']
y = data['y']
Xval = data['Xval']
yval = data['yval']
Xtest = data['Xtest']
ytest = data['ytest']
# 1.1 Visualizing the dataset
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.scatter(X, y,c='r', marker='x')
ax.set_xlabel('Change in water level (x)')
ax.set_ylabel('Water flowing out of the dam (y)')
plt.show()
1.2 Regularized linear regression cost function
Write a function to compute the regularized linear regression cost.
Try to vectorize your code and avoid writing loops.
Using theta initialized at [1; 1] and λ = 1, you should expect to see an output of 303.993.
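For reference, the cost being implemented is the standard regularized linear regression objective; note that the intercept term θ₀ is excluded from the regularization sum:

```latex
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2
          + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2,
\qquad h_\theta(x) = \theta^{T}x
```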
# 1.2 Regularized linear regression cost function
import numpy as np

# prepend the intercept column of ones to X
X = np.matrix(np.insert(data['X'], 0, values=np.ones(data['X'].shape[0]), axis=1))

def cost(theta, X, y):
    theta = np.matrix(theta)  # scipy.optimize passes theta as a 1-D array
    m = X.shape[0]
    inner = X * theta.T - y
    return float(inner.T * inner) / (2 * m)

def costReg(theta, X, y, lamda):
    theta = np.matrix(theta)
    m = X.shape[0]
    # the intercept term theta[0] is not regularized
    reg = lamda / (2 * m) * float(theta[:, 1:] * theta[:, 1:].T)
    return cost(theta, X, y) + reg

theta = np.matrix(np.ones(X.shape[1]))
print(costReg(theta, X, y, 1))  # expect 303.993
1.3 Regularized linear regression gradient
Run your gradient function with theta initialized at [1; 1] and λ = 1.
Expect to see a gradient of [-15.30; 598.250].
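The partial derivatives being computed are the standard ones; as with the cost, the regularization term is omitted for the intercept:

```latex
\frac{\partial J(\theta)}{\partial \theta_0}
  = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_0^{(i)},
\qquad
\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)}
  + \frac{\lambda}{m}\theta_j \quad (j \ge 1)
```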
# 1.3 Regularized linear regression gradient
def gradient(theta, X, y):
    theta = np.matrix(theta)  # scipy.optimize passes theta as a 1-D array
    m = X.shape[0]
    inner = X.T * (X * theta.T - y) / m
    return np.array(inner).ravel()

def gradientReg(theta, X, y, lamda):
    theta = np.matrix(theta)
    m = X.shape[0]
    reg = (lamda / m) * np.array(theta).ravel()
    reg[0] = 0  # do not regularize the intercept term
    return gradient(theta, X, y) + reg

print(gradientReg(theta, X, y, 1))  # expect roughly [-15.30  598.250]
1.4 Fitting linear regression
Use scipy.optimize.minimize (standing in for Octave's fmincg) to optimize the cost function.
Set the regularization parameter λ to zero: since we are fitting a 2-dimensional θ, regularization will not be incredibly helpful for a θ of such low dimension.
Then plot the best-fit line.
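As a sanity check on the λ = 0 case, the unregularized fit can also be obtained in closed form from the normal equation. This is a sketch on synthetic data (a noisy line standing in for ex5data1), not part of the exercise:

```python
import numpy as np

# synthetic data standing in for the exercise data: a noisy line y = 5 + 2x
rng = np.random.default_rng(0)
x = rng.uniform(-40, 40, size=12)
y = 5 + 2 * x + rng.normal(scale=0.5, size=12)

# design matrix with an intercept column, as in the exercise
X = np.column_stack([np.ones_like(x), x])

# normal equation: theta = (X^T X)^{-1} X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to [5, 2]
```

With λ = 0 the iterative optimizer should converge to essentially this same solution, which makes the closed form a useful cross-check.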
# 1.4 Fitting linear regression
import scipy.optimize as opt

y = np.matrix(y)
final_theta = opt.minimize(fun=costReg, x0=np.array(theta).ravel(), args=(X, y, 0),
                           method='TNC', jac=gradientReg, options={'disp': True}).x
print(final_theta)

b = final_theta[0]  # intercept
m = final_theta[1]  # slope
x_plot = np.array(X[:, 1]).ravel()
fig, ax = plt.subplots()
ax.scatter(x_plot, np.array(y).ravel(), c='r', marker='x', label='Training data')
ax.plot(x_plot, m * x_plot + b, c='b', label='Prediction')
ax.set_xlabel('Change in water level (x)')
ax.set_ylabel('Water flowing out of the dam (y)')
ax.legend()
plt.show()
File "F:/AI/ex5-bias vs variance/1.py", line 25, in cost
inner=X*theta.T-y
File "D:\Anaconda3\envs\kg\lib\site-packages\numpy\matrixlib\defmatrix.py", line 218, in __mul__
return N.dot(self, asmatrix(other))
File "<__array_function__ internals>", line 6, in dot
ValueError: shapes (12,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)
This error is puzzling at first: the very same function worked fine when called directly, and removing the transpose breaks the earlier calls instead. Could it be a problem with minimize??? It is: minimize flattens x0 and passes theta to the cost function as a 1-D ndarray of shape (2,), not a (1, 2) matrix. For a 1-D array, theta.T is a no-op, so np.matrix coerces it to shape (1, 2) and X * theta.T tries to multiply (12, 2) by (1, 2), which fails. The fix is to convert theta back to a matrix (theta = np.matrix(theta)) at the top of the cost and gradient functions.
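The shape mismatch can be reproduced in isolation. This sketch uses a stand-in all-ones design matrix rather than the exercise data; it shows that a 1-D theta (as passed by minimize) misaligns, while converting it back to a matrix restores the transpose:

```python
import numpy as np

X = np.matrix(np.ones((12, 2)))  # stand-in for the (12, 2) design matrix
theta = np.ones(2)               # what minimize actually passes: a 1-D ndarray

# theta.T is a no-op on a 1-D array, so the product misaligns: (12,2) x (1,2)
try:
    X * theta.T
except ValueError as e:
    print('ValueError:', e)

# converting theta back to a matrix makes .T a real (1,2) -> (2,1) transpose
theta_m = np.matrix(theta)
result = X * theta_m.T
print(result.shape)  # (12, 1)
```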