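~~aaaaaa why is there so much stuff to read, I'm done for~~

Uh, this post is purely about calling library packages; the code written from the underlying principles is in another post.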

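Preface

Looking up an API call: the official sklearn documentation

np.random.

Gradient descent

1 Using gradient descent as an example: why preprocess?

2 Principle

2.1 Batch gradient descent

2.2 Stochastic gradient descent

3 Code

Regularization

1 How do we deal with overfitting? Regularization

2 Why is L1 sparser?

3 The L1 regularization term

4 The L2 regularization term

Lasso regression and ridge regression

1 About the regularization term:

2 Ridge regression: L2 regularization

3 Lasso regression: L1 regularization

Case study: predicting Boston house prices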

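Preface

Looking up an API call: the official sklearn documentation
# To search the API docs: open the "API Reference" page of the scikit-learn 1.1.1 documentation and Ctrl+F for a keyword

np.random.

rand(x, y): an x-by-y array of random numbers drawn uniformly from [0, 1)
randn(x, y): an x-by-y array of random numbers drawn from the standard normal distribution (mean 0, variance 1)
randint: random integers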

import numpy as np
X=2*np.random.rand(100,1)
y=4+3*X+np.random.randn(100,1)

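Gradient descent

1 Using gradient descent as an example: why preprocess?
### Problems:
1. If the step size (learning rate) is too small, convergence takes a long time
2. If the step size is too large, it is even worse: the updates can overshoot the minimum and fail to converge
3. Local optimum versus global optimum
(what affects where you end up: the random initial position of the parameters; try several learning rates)
4. If the cost function is convex this is not a problem, because there is only a global optimum
5. Features with very different value ranges also affect convergence
So standardize or normalize the features!!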
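A minimal sketch of the standardization step, assuming a made-up two-feature array whose columns differ in scale by several orders of magnitude (the data is hypothetical, not from the post). StandardScaler rescales each column to zero mean and unit variance, so no single feature dominates the gradient.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: column 0 is in the thousands, column 1 is below 1
X = np.array([[1000.0, 0.1],
              [2000.0, 0.2],
              [3000.0, 0.3],
              [4000.0, 0.4]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # learn mean/std per column, then rescale

print("column means before:", X.mean(axis=0))
print("column means after: ", X_scaled.mean(axis=0))
print("column stds after:  ", X_scaled.std(axis=0))
```

After scaling, both columns live on the same scale, which is the situation gradient descent handles best.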

For the effect of different learning rates, the code is in the other post, haha

2 Principle

2.1 Batch gradient descent
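Batch gradient descent computes the gradient of the mean squared error over the whole training set at every step. Below is a rough sketch on the same kind of synthetic data as in the preface (y = 4 + 3x + noise); the learning rate, iteration count, and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))

X_b = np.c_[np.ones((100, 1)), X]  # prepend a column of ones for the bias term
m = len(X_b)

theta = np.zeros((2, 1))  # [bias, weight]
learning_rate = 0.1       # assumed value
n_iterations = 1000       # assumed value

for _ in range(n_iterations):
    # Gradient of the MSE computed over ALL m samples
    gradients = 2 / m * X_b.T @ (X_b @ theta - y)
    theta -= learning_rate * gradients

print("theta (bias, weight):", theta.ravel())
```

The result should land close to the least-squares solution, which in turn is close to the true values 4 and 3 up to the noise.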

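2.2 Stochastic gradient descent
At each step, pick one sample at random, compute the partial derivatives (the gradient) on that sample alone, and take a descent step

! The step size can be reduced as the number of iterations grows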
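A sketch of stochastic gradient descent with a shrinking step size, again on synthetic y = 4 + 3x + noise data. The schedule `t0 / (t + t1)` and its constants are assumptions picked for illustration; any schedule that decays over time follows the same idea.

```python
import numpy as np

rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]  # prepend a column of ones for the bias term
m = len(X_b)


def learning_schedule(t, t0=5, t1=50):
    """Step size that decays as the update counter t grows (assumed form)."""
    return t0 / (t + t1)


theta = rng.standard_normal((2, 1))  # random initialization
n_epochs = 50                        # assumed value

for epoch in range(n_epochs):
    for i in range(m):
        idx = rng.integers(m)        # pick ONE random sample
        xi = X_b[idx:idx + 1]
        yi = y[idx:idx + 1]
        gradient = 2 * xi.T @ (xi @ theta - yi)  # gradient on that sample only
        theta -= learning_schedule(epoch * m + i) * gradient

print("theta (bias, weight):", theta.ravel())
```

Each update is noisy because it sees only one sample, and the decaying step size is what lets the parameters settle down instead of bouncing around the minimum.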

3 Code

Simple code that calls the API directly (the code written from the underlying principles is in another blog post):

# Note: load_boston is deprecated in scikit-learn 1.0 and removed in 1.2,
# so this code needs scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Predict with gradient descent
def linear2():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=SGDRegressor()
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

if __name__=='__main__':
    linear2()

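Regularization

1 How do we deal with overfitting? Regularization
Add a regularization term (a penalty term) to the cost so that the parameters are balanced against each other.

An example: let x=[1,1,1,1], w1=[1,0,0,0], and w2=[0.25,0.25,0.25,0.25]. The predictions w1·x and w2·x are identical (both equal 1), but we consider w2 better balanced because it spreads the weight over all features. So we add the sum of the squared weights as a penalty term. (L2)

Then the sum of squares of w1 is 1 and the sum of squares of w2 is 4 × 0.25² = 1/4. The penalty for w2 is smaller, so we consider it the better choice.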
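The arithmetic in this example can be checked directly with NumPy. This is just a sketch of the numbers above; the L1 lines at the end are an addition for comparison.

```python
import numpy as np

x = np.array([1, 1, 1, 1])
w1 = np.array([1, 0, 0, 0])
w2 = np.array([0.25, 0.25, 0.25, 0.25])

# Both weight vectors give the same prediction
pred1 = w1 @ x
pred2 = w2 @ x
print("w1 . x =", pred1)
print("w2 . x =", pred2)

# L2 penalty: sum of squared weights
l2_w1 = np.sum(w1 ** 2)
l2_w2 = np.sum(w2 ** 2)
print("L2 penalty of w1:", l2_w1)
print("L2 penalty of w2:", l2_w2)

# L1 penalty: sum of absolute weights (added for comparison)
l1_w1 = np.sum(np.abs(w1))
l1_w2 = np.sum(np.abs(w2))
print("L1 penalty of w1:", l1_w1)
print("L1 penalty of w2:", l1_w2)
```

The L2 penalties come out as 1 and 0.25, while the L1 penalties are equal for the two vectors: L1 does not reward spreading the weight out, which hints at why it behaves differently from L2.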

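2 Why is L1 sparser?
Take the right-hand figure below as an example.

In two dimensions, the L1 penalty is |x|+|y| and the L2 penalty is x²+y². The L1 constraint region (a diamond) usually touches the contour lines of the loss at a corner, which lies on a coordinate axis, so some parameters become exactly 0 and the solution is sparse. The L2 constraint region (a circle) usually touches the contours at a smooth point off the axes, so more features keep nonzero weights.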
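The sparsity effect can be seen numerically. The sketch below assumes synthetic data in which only the first two of ten features matter, and uses scikit-learn's `Lasso` (L1) and `Ridge` (L2) with an arbitrarily chosen `alpha=0.5`; the data and constants are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
# Only the first two features actually influence y (assumed setup)
true_w = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_w + 0.1 * rng.standard_normal(200)

lasso = Lasso(alpha=0.5).fit(X, y)  # L1 penalty
ridge = Ridge(alpha=0.5).fit(X, y)  # L2 penalty

lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("exact zeros, Lasso:", lasso_zeros, " Ridge:", ridge_zeros)
```

Lasso sets irrelevant coefficients to exactly zero, whereas Ridge only shrinks them toward zero.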

3 The L1 regularization term
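One common textbook way to write a squared-error cost with an L1 term is shown below. The notation is an assumption, since the post's own formula is not available: m samples, n weights w_j, hypothesis h_w, and regularization strength λ. The constant in front of the squared-error sum varies between references.

```latex
J(w) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_w\!\left(x^{(i)}\right) - y^{(i)} \right)^2
       + \lambda \sum_{j=1}^{n} \lvert w_j \rvert
```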

4 The L2 regularization term
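One common textbook way to write a squared-error cost with an L2 term is shown below. The notation is an assumption, since the post's own formula is not available: m samples, n weights w_j, hypothesis h_w, and regularization strength λ. The constant in front of the squared-error sum varies between references.

```latex
J(w) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_w\!\left(x^{(i)}\right) - y^{(i)} \right)^2
       + \lambda \sum_{j=1}^{n} w_j^{2}
```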

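Lasso regression and ridge regression
L1 regularization corresponds to Lasso regression; L2 regularization corresponds to ridge regression.

1 About the regularization term:

The weaker the regularization, the larger the weights; the stronger the regularization, the smaller the weights (this can be seen in the figure)
Regularization term = lambda × penalty term. When lambda is large, minimizing the cost forces the penalty term to be small, so the weight coefficients are small; and vice versa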
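The relationship between lambda and the size of the weights can be checked with a small sweep. This sketch assumes synthetic data and an arbitrary list of `alpha` values (scikit-learn's name for lambda) and prints the norm of the Ridge coefficients for each.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
# Assumed true weights, for illustration only
y = X @ np.array([4.0, -3.0, 2.0, -1.0, 0.5]) + rng.standard_normal(50)

alphas = [0.01, 1, 100, 10000]  # regularization strengths to try
norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    norms.append(np.linalg.norm(model.coef_))
    print(f"alpha={alpha:>8}: ||w|| = {norms[-1]:.4f}")
```

The norm of the weight vector shrinks as alpha grows, matching the statement above.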

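2 Ridge regression: L2 regularization

sklearn.linear_model.Ridge()

The API can fit the model with SAG (Stochastic Average Gradient descent) by passing solver='sag'

alpha: the penalty coefficient, i.e. the regularization strength; this is lambda
fit_intercept=True: whether to fit a bias (intercept) term
solver='auto': the solver is chosen automatically based on the type of data; 'sag' (Stochastic Average Gradient descent) is one of the options and tends to be fast when there are many samples and many features
normalize=False: if True, the features are normalized before fitting (deprecated in scikit-learn 1.0 and removed in 1.2; apply StandardScaler beforehand instead)
.coef_: the weight coefficients
.intercept_: the bias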

# Note: load_boston is deprecated in scikit-learn 1.0 and removed in 1.2,
# so this code needs scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Predict Boston house prices with ridge regression
def linear():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=Ridge()
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

if __name__=='__main__':
    linear()

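There is actually another way to apply L2 regularization: SGDRegressor(penalty='l2'), i.e. linear regression fitted by stochastic gradient descent with an L2 penalty added. (penalty='l2' is the default for SGDRegressor, so a plain SGDRegressor() call already uses it.)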

# Note: load_boston is deprecated in scikit-learn 1.0 and removed in 1.2,
# so this code needs scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Predict with gradient descent plus an L2 penalty
def linear2():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=SGDRegressor(penalty='l2')
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)


if __name__=='__main__':
    linear2()

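But this one is stochastic gradient descent, and here its results are not as good as those of the previous API (Ridge)

3 Lasso regression: L1 regularization

SGDRegressor(penalty='l1'): linear regression fitted by stochastic gradient descent with an L1 penalty added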

# Note: load_boston is deprecated in scikit-learn 1.0 and removed in 1.2,
# so this code needs scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Predict with gradient descent plus an L1 penalty
def linear2():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=SGDRegressor(penalty='l1')
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

if __name__=='__main__':
    linear2()

Case study: predicting Boston house prices

linear1() uses ordinary linear regression

linear2() uses gradient descent

linear3() uses ridge regression

# Note: load_boston is deprecated in scikit-learn 1.0 and removed in 1.2,
# so this code needs scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression,SGDRegressor,Ridge
from sklearn.metrics import mean_squared_error

# Ordinary linear regression
def linear1():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=LinearRegression()
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

# Predict with gradient descent
def linear2():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=SGDRegressor()
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

# Predict Boston house prices with ridge regression
def linear3():
    # Load and split the data
    boston=load_boston()
    x_train,x_test,y_train,y_test=train_test_split(boston.data,boston.target,random_state=22)
    # Preprocess: fit the scaler on the training set, reuse it on the test set
    transfer=StandardScaler()
    x_train=transfer.fit_transform(x_train)
    x_test=transfer.transform(x_test)
    # Estimator
    estimator=Ridge()
    estimator.fit(x_train,y_train)
    # Inspect the fitted model
    print('coef:',estimator.coef_)
    print('bias:',estimator.intercept_)
    # Evaluate the model
    y_predict=estimator.predict(x_test)
    print('predicted prices:',y_predict)
    error=mean_squared_error(y_test,y_predict)
    print('mean squared error:',error)

if __name__=='__main__':
    linear1()
    linear2()
    linear3()
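The three functions differ only in the estimator, so the shared steps can be pulled into one helper. The sketch below is one possible refactoring, not the post's own code: the helper name `evaluate` is hypothetical, and `make_regression` generates a synthetic stand-in with the same shape as the housing data (506 samples, 13 features) so the example runs on scikit-learn versions where load_boston is no longer available.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def evaluate(estimator, X, y, random_state=22):
    """Split, standardize, fit, and return the test-set mean squared error."""
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, random_state=random_state)
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)  # fit on the training set only
    x_test = scaler.transform(x_test)        # reuse the same scaling
    estimator.fit(x_train, y_train)
    return mean_squared_error(y_test, estimator.predict(x_test))


# Synthetic stand-in data (an assumption, NOT the Boston dataset)
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)

results = {}
for name, est in [("LinearRegression", LinearRegression()),
                  ("SGDRegressor", SGDRegressor(random_state=0)),
                  ("Ridge", Ridge())]:
    results[name] = evaluate(est, X, y)
    print(f"{name}: MSE = {results[name]:.2f}")
```

Passing the estimator in as an argument keeps the split and preprocessing identical across models, so any difference in the error comes from the estimator itself.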
