Machine Learning with a K-NN Classifier in Python

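Supervised machine learning: the input is training data, consisting of each object's feature values together with its class label. From this training data we fit the best classifier we can find. We then feed test data to the trained classifier and check whether its predictions match the known labels. Supervised machine learning mainly covers classification and regression.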

An example of supervised machine learning

%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

Load the data
fruits = pd.read_table('readonly/fruit_data_with_colors.txt')
fruits.head()

Create a mapping from fruit label value to fruit name to make the results easier to interpret

lookup_fruit_name = dict(zip(fruits.fruit_label.unique(), fruits.fruit_name.unique()))
lookup_fruit_name

Examining the data: look at how the training data are distributed and check for outliers

Plotting a scatter matrix

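X = fruits[['height', 'width', 'mass', 'color_score']]
y = fruits['fruit_label']
# Default split: 75% of the rows for training, 25% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# plt.get_cmap is used here because recent matplotlib releases no longer provide cm.get_cmap
cmap = plt.get_cmap('gnuplot')
# scatter_matrix lives in pd.plotting in current pandas (formerly pd.scatter_matrix)
scatter = pd.plotting.scatter_matrix(X_train, c=y_train, marker='o', s=40, hist_kwds={'bins': 15}, figsize=(9, 9), cmap=cmap)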

Plotting a 3D scatter plot

from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X_train['width'], X_train['height'], X_train['color_score'], c=y_train, marker='o', s=100)
ax.set_xlabel('width')
ax.set_ylabel('height')
ax.set_zlabel('color_score')
plt.show()

Create the classifier
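The code for this step did not survive on the page. A minimal sketch, assuming scikit-learn's `KNeighborsClassifier` is the intended class and that k is 5 (the source states neither):

```python
from sklearn.neighbors import KNeighborsClassifier

# Assumption: k = 5 neighbors; the source does not say which k it used.
knn = KNeighborsClassifier(n_neighbors=5)

# Nothing has been learned yet; the object only holds its settings.
params = knn.get_params()
print(params["n_neighbors"], params["weights"], params["metric"])
```

With the default `weights="uniform"`, each of the k nearest neighbors gets an equal vote.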

Train the classifier
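The training call is also missing from the page. The sketch below shows what fitting could look like. So that it runs without the fruit file, it uses a synthetic four-feature, four-class dataset from `make_blobs` as a stand-in (an assumption); with the real data you would pass the `X_train` and `y_train` created earlier.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic stand-in for the fruit table (4 features, 4 classes).
X, y = make_blobs(n_samples=60, centers=4, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
# For k-NN, fit() mainly stores the training samples (possibly in an
# index structure) so that neighbors can be looked up at prediction time.
knn.fit(X_train, y_train)

print(knn.classes_)        # class labels seen during training
print(knn.n_features_in_)  # number of feature columns
```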

Evaluate the classifier's accuracy
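One plausible way to do this is `score`, which for a classifier returns the fraction of test samples predicted correctly. The sketch again uses a synthetic stand-in dataset (an assumption) and checks the result against a manual computation:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic stand-in for the fruit table.
X, y = make_blobs(n_samples=60, centers=4, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# score() returns the mean accuracy on the held-out test set.
accuracy = knn.score(X_test, y_test)

# The same number computed by hand: share of predictions equal to the labels.
manual = np.mean(knn.predict(X_test) == y_test)
print(accuracy, manual)
```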

Use a new instance to check the k-NN classifier
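A sketch of classifying a single unseen sample and turning the numeric label back into a name. The dataset, the `lookup_name` dictionary (which plays the role of `lookup_fruit_name`), and the sample itself are all hypothetical:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic stand-in for the fruit table.
X, y = make_blobs(n_samples=60, centers=4, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# Hypothetical label-to-name mapping, standing in for lookup_fruit_name.
lookup_name = {0: "class A", 1: "class B", 2: "class C", 3: "class D"}

# A made-up new sample: the average of the training rows labeled 2.
new_sample = X_train[y_train == 2].mean(axis=0)

# predict() expects a 2D input, one row per sample, and returns an array.
predicted_label = knn.predict([new_sample])[0]
predicted_name = lookup_name[predicted_label]
print(predicted_label, predicted_name)
```

With the fruit data, the new sample would be a list of measurements in the same column order as the features used for training.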

Plot the decision regions for the different fruits
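The plotting code is not on the page, and the helper the author used is unknown. A common way to draw decision regions, sketched here on two synthetic features (an assumption, standing in for a pair such as height and width), is to classify every point of a dense grid and color the grid by predicted class:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

# Assumption: two synthetic features so the regions can be drawn in 2D.
X, y = make_blobs(n_samples=60, centers=4, n_features=2, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Build a grid covering the data, then classify every grid point.
step = 0.1
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                     np.arange(y_min, y_max, step))
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Colored background = predicted class; dots = training samples.
fig, ax = plt.subplots()
ax.pcolormesh(xx, yy, Z, cmap="Pastel1", shading="auto")
ax.scatter(X[:, 0], X[:, 1], c=y, cmap="Set1", edgecolor="black", s=40)
ax.set_xlabel("feature 0")
ax.set_ylabel("feature 1")
```

Small values of k tend to give jagged regions that follow individual points, while larger values smooth the boundaries.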

How the choice of K affects the accuracy of the K-NN classifier
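A sketch of this experiment: refit the classifier for each k and record the test accuracy. The range 1 to 19 and the synthetic stand-in data are assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic stand-in for the fruit table.
X, y = make_blobs(n_samples=60, centers=4, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

k_range = range(1, 20)  # assumed range; k must not exceed the training size
scores = []
for k in k_range:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores.append(knn.score(X_test, y_test))

# First k that reaches the highest test accuracy.
best_k = k_range[int(np.argmax(scores))]
for k, s in zip(k_range, scores):
    print(k, round(s, 3))
print("best k:", best_k)
```

Because the test set is small, the curve can be noisy; picking k from a single split is a rough guide rather than a reliable selection procedure.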

How the train/test split ratio affects the accuracy of the K-NN classifier
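A sketch of this experiment: for each training fraction, repeat the split with several random seeds and average the test accuracy, so that one lucky or unlucky split does not dominate. The fractions, the 50 repetitions, k = 5, and the synthetic data are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic stand-in for the fruit table.
X, y = make_blobs(n_samples=60, centers=4, n_features=4, random_state=0)

train_fractions = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
knn = KNeighborsClassifier(n_neighbors=5)

mean_scores = {}
for frac in train_fractions:
    run_scores = []
    for seed in range(50):  # repeat to average out split-to-split noise
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=frac, random_state=seed)
        knn.fit(X_train, y_train)
        run_scores.append(knn.score(X_test, y_test))
    mean_scores[frac] = float(np.mean(run_scores))

for frac, score in mean_scores.items():
    print(f"train fraction {frac:.1f}: mean accuracy {score:.3f}")
```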
