
Using scikit-learn to classify into multiple categories


What you want is called multi-label classification, and scikit-learn can do it. See here: http://scikit-learn.org/dev/modules/multiclass.html

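I'm not sure what is going wrong in your example; my version of sklearn apparently doesn't have WordNGramAnalyzer. Perhaps it is a matter of using more training examples or trying a different classifier? Note, though, that the multi-label classifier expects the target to be a list of tuples/lists of labels, one per sample.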
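To make the target format concrete, here is a minimal sketch. It assumes a recent scikit-learn release in which MultiLabelBinarizer (from sklearn.preprocessing) is available; newer releases generally expect multi-label targets as a binary indicator matrix rather than raw lists of labels, and this helper converts between the two. The label ids 0 and 1 follow the New York / London convention used in this answer.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# One list of label ids per sample: 0 = New York, 1 = London.
# The third sample carries both labels.
y_labels = [[0], [1], [0, 1]]

mlb = MultiLabelBinarizer()
# fit_transform returns an array of shape (n_samples, n_classes)
# with a 1 wherever the sample has that label.
y_indicator = mlb.fit_transform(y_labels)

print(mlb.classes_)   # label ids seen during fitting
print(y_indicator)    # one row per sample, one column per label

# inverse_transform maps indicator rows back to tuples of label ids.
y_roundtrip = mlb.inverse_transform(y_indicator)
print(y_roundtrip)
```

The same binarizer can later turn a classifier's predicted indicator rows back into label ids through inverse_transform.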

The following works for me:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier

X_train = np.array(["new york is a hell of a town",
                    "new york was originally dutch",
                    "the big apple is great",
                    "new york is also called the big apple",
                    "nyc is nice",
                    "people abbreviate new york city as nyc",
                    "the capital of great britain is london",
                    "london is in the uk",
                    "london is in england",
                    "london is in great britain",
                    "it rains a lot in london",
                    "london hosts the british museum",
                    "new york is great and so is london",
                    "i like london better than new york"])
# One list of label ids per document; the last two documents carry both labels.
y_train = [[0],[0],[0],[0],[0],[0],[1],[1],[1],[1],[1],[1],[0,1],[0,1]]
X_test = np.array(['nice day in nyc',
                   'welcome to london',
                   'hello welcome to new york. enjoy it here and london too'])
target_names = ['New York', 'London']

classifier = Pipeline([
    # Unigrams and bigrams. min_n/max_n belong to the older scikit-learn API
    # used here; later releases spell this ngram_range=(1, 2).
    ('vectorizer', CountVectorizer(min_n=1, max_n=2)),
    ('tfidf', TfidfTransformer()),
    # One binary LinearSVC per label.
    ('clf', OneVsRestClassifier(LinearSVC()))])
classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)
for item, labels in zip(X_test, predicted):
    print('%s => %s' % (item, ', '.join(target_names[x] for x in labels)))

For me, this produces the output:

nice day in nyc => New York
welcome to london => London
hello welcome to new york. enjoy it here and london too => New York, London
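For readers on a current scikit-learn release, the same idea can be sketched as follows. This is an adaptation under stated assumptions, not the code the output above came from: it assumes CountVectorizer takes ngram_range=(1, 2) in place of min_n/max_n, that the targets are passed as an indicator matrix built with MultiLabelBinarizer, and it uses plain Python lists instead of NumPy arrays for the texts. The exact predictions may vary with the library version.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

X_train = ["new york is a hell of a town",
           "new york was originally dutch",
           "the big apple is great",
           "new york is also called the big apple",
           "nyc is nice",
           "people abbreviate new york city as nyc",
           "the capital of great britain is london",
           "london is in the uk",
           "london is in england",
           "london is in great britain",
           "it rains a lot in london",
           "london hosts the british museum",
           "new york is great and so is london",
           "i like london better than new york"]
y_train = [[0], [0], [0], [0], [0], [0],
           [1], [1], [1], [1], [1], [1],
           [0, 1], [0, 1]]
X_test = ["nice day in nyc",
          "welcome to london",
          "hello welcome to new york. enjoy it here and london too"]
target_names = ["New York", "London"]

# Assumption: newer releases want an indicator matrix, not lists of labels.
mlb = MultiLabelBinarizer()
Y_train = mlb.fit_transform(y_train)

classifier = Pipeline([
    ("vectorizer", CountVectorizer(ngram_range=(1, 2))),  # unigrams + bigrams
    ("tfidf", TfidfTransformer()),
    ("clf", OneVsRestClassifier(LinearSVC())),            # one LinearSVC per label
])
classifier.fit(X_train, Y_train)

# predict returns one indicator row per test document.
predicted = classifier.predict(X_test)

results = []
for item, labels in zip(X_test, mlb.inverse_transform(predicted)):
    names = ", ".join(target_names[x] for x in labels)
    results.append(names)
    print("%s => %s" % (item, names))
```

Keeping the binarizer outside the pipeline is a deliberate choice in this sketch: pipeline steps transform the features only, so the label encoding and its inverse are handled separately.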

Hope this helps.


