
Is it possible to tune parameters with grid search for custom kernels in scikit-learn?


One way to do this is to combine a Pipeline, SVC(kernel='precomputed'), and an sklearn estimator (a subclass of BaseEstimator and TransformerMixin) that wraps the custom kernel function.
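
To make the kernel='precomputed' part concrete, here is a minimal sketch, separate from the answer's own code, of what SVC expects in that mode: a square kernel matrix between the training samples at fit time, and a matrix of test-versus-training kernel values at predict time. The split sizes and the gamma and C values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

# Illustrative split (assumption): first 1000 samples train, the rest test.
X, y = load_digits(return_X_y=True)
X_train, y_train = X[:1000], y[:1000]
X_test, y_test = X[1000:], y[1000:]

# Kernel matrix between training samples: shape (n_train, n_train).
K_train = chi2_kernel(X_train, X_train, gamma=0.01)
# Kernel matrix for prediction: rows are test samples, columns are
# the training samples, so the shape is (n_test, n_train).
K_test = chi2_kernel(X_test, X_train, gamma=0.01)

clf = SVC(kernel='precomputed', C=10.0)
clf.fit(K_train, y_train)
y_pred = clf.predict(K_test)
print(K_train.shape, K_test.shape, y_pred.shape)
```

A transformer placed before the SVC in a Pipeline has to produce exactly these two matrices, which is why the wrapper needs to remember the training samples it was fitted on.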

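For example, sklearn includes a kernel function chi2_kernel(X, Y=None, gamma=1.0), which computes the kernel matrix between the feature vectors X and Y. The function takes a parameter gamma, which is best set by cross-validation. We can grid-search over this parameter as follows: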

    from __future__ import print_function
    from __future__ import division

    import sys

    import numpy as np
    import sklearn
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.datasets import load_digits
    from sklearn.metrics import accuracy_score
    from sklearn.metrics.pairwise import chi2_kernel
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    try:
        # scikit-learn >= 0.18
        from sklearn.model_selection import GridSearchCV, train_test_split
    except ImportError:
        # Older releases, such as 0.16.1
        from sklearn.cross_validation import train_test_split
        from sklearn.grid_search import GridSearchCV


    # Wrapper class for the custom kernel chi2_kernel
    class Chi2Kernel(BaseEstimator, TransformerMixin):
        def __init__(self, gamma=1.0):
            super(Chi2Kernel, self).__init__()
            self.gamma = gamma

        def transform(self, X):
            # Kernel values between X and the stored training samples
            return chi2_kernel(X, self.X_train_, gamma=self.gamma)

        def fit(self, X, y=None, **fit_params):
            # Keep the training samples; they define the kernel's columns
            self.X_train_ = X
            return self


    def main():
        print('python: {}'.format(sys.version))
        print('numpy: {}'.format(np.__version__))
        print('sklearn: {}'.format(sklearn.__version__))
        np.random.seed(0)

        # Get some data to evaluate
        dataset = load_digits()
        X = dataset.data
        y = dataset.target
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

        # Create a pipeline where our custom predefined kernel Chi2Kernel
        # is run before SVC.
        pipe = Pipeline([
            ('chi2', Chi2Kernel()),
            ('svm', SVC()),
        ])

        # Set the parameter 'gamma' of our custom kernel by
        # using the 'estimator__param' syntax.
        cv_params = dict([
            ('chi2__gamma', 10.0**np.arange(-9, 4)),
            ('svm__kernel', ['precomputed']),
            ('svm__C', 10.0**np.arange(-2, 9)),
        ])

        # Do grid search to get the best parameter value of 'gamma'.
        model = GridSearchCV(pipe, cv_params, cv=5, verbose=1, n_jobs=-1)
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        acc_test = accuracy_score(y_test, y_pred)

        print("Test accuracy: {}".format(acc_test))
        print("Best params:")
        print(model.best_params_)


    if __name__ == '__main__':
        main()

Output:

    python: 2.7.3 (default, Dec 18 2014, 19:10:20)
    [GCC 4.6.3]
    numpy: 1.8.0
    sklearn: 0.16.1
    Fitting 5 folds for each of 143 candidates, totalling 715 fits
    [Parallel(n_jobs=-1)]: Done   1 jobs       | elapsed:    0.4s
    [Parallel(n_jobs=-1)]: Done  50 jobs       | elapsed:    2.7s
    [Parallel(n_jobs=-1)]: Done 200 jobs       | elapsed:    9.8s
    [Parallel(n_jobs=-1)]: Done 450 jobs       | elapsed:   21.6s
    [Parallel(n_jobs=-1)]: Done 701 out of 715 | elapsed:   34.8s remaining:    0.7s
    [Parallel(n_jobs=-1)]: Done 715 out of 715 | elapsed:   35.4s finished
    Test accuracy: 0.989898989899
    Best params:
    {'chi2__gamma': 0.01, 'svm__C': 10.0, 'svm__kernel': 'precomputed'}

In your case, simply replace chi2_kernel with the function that computes your own kernel matrix.
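
As a rough sketch of that substitution, the example below wraps a hand-written kernel in the same way. The names poly_kernel and PolyKernel, the degree parameter, the 300-sample subset, and the grid values are all hypothetical choices made for illustration; the imports assume scikit-learn 0.18 or later.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def poly_kernel(X, Y, degree=2):
    """Hypothetical custom kernel: k(x, y) = (<x, y> / n_features + 1) ** degree."""
    return (np.dot(X, Y.T) / X.shape[1] + 1.0) ** degree


class PolyKernel(BaseEstimator, TransformerMixin):
    """Wraps poly_kernel so that `degree` becomes a searchable parameter."""

    def __init__(self, degree=2):
        self.degree = degree

    def fit(self, X, y=None):
        # Remember the training samples; they define the kernel's columns.
        self.X_train_ = X
        return self

    def transform(self, X):
        return poly_kernel(X, self.X_train_, degree=self.degree)


# Small subset of the digits data, scaled to [0, 1], to keep the search quick.
X, y = load_digits(return_X_y=True)
X, y = X[:300] / 16.0, y[:300]

pipe = Pipeline([
    ('kernel', PolyKernel()),
    ('svm', SVC(kernel='precomputed')),
])
param_grid = {
    'kernel__degree': [1, 2, 3],
    'svm__C': [0.1, 1.0, 10.0],
}
model = GridSearchCV(pipe, param_grid, cv=3)
model.fit(X, y)
print(model.best_params_)
```

Only the kernel function and the parameter names stored in __init__ change; the fit/transform pattern and the 'step__param' grid keys stay the same as in the chi2_kernel example.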


