
pyspark mllib: why binary classification output is softmax, an explanation

Straight to the code.

I used GBDT, and the prediction results are shown below.

Only the key code is given.

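from pyspark.ml.classification import GBTClassifier

gbdt = GBTClassifier(featuresCol="features", labelCol="y", predictionCol="prediction")

# Assumed step, omitted in the original: fit the model on the training DataFrame.
model = gbdt.fit(df_train)

# Score the training set and show the label next to the model outputs.
df_train_eval = model.transform(df_train)

df_train_eval.select(*['y', 'rawPrediction', 'probability', 'prediction']).show(truncate=False)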
+---+----------------------------------------+-----------------------------------------+----------+
|y  |rawPrediction                           |probability                              |prediction|
+---+----------------------------------------+-----------------------------------------+----------+
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
|0  |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0       |
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
|1  |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0       |
+---+----------------------------------------+-----------------------------------------+----------+

y: the original label
rawPrediction: the raw prediction value
probability: the probability value
prediction: the predicted result

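Why do I say softmax rather than sigmoid?

Reason 1:

Look at the probability column: the two values in every row sum to 1, which is consistent with softmax's mutually exclusive classes.

Reason 2: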
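The row-sum observation can be checked directly. This is a minimal sketch that copies the two distinct probability rows from the output table into a NumPy array; the names `probability` and `row_sums` are only illustrative and are not part of the original code.

```python
import numpy as np

# The two distinct rows of the probability column, copied from the table above.
probability = np.array([
    [0.04364652142729318, 0.9563534785727068],
    [0.9563534785727067, 0.043646521427293306],
])

# Sum across the two classes for each row.
row_sums = probability.sum(axis=1)

print(row_sums)
print(np.allclose(row_sums, 1.0))
```

Summing to 1 is consistent with softmax, but on its own it does not rule out other normalizations, so the numeric comparison in the second reason is the stronger check.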

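import numpy as np

def softmax(x):
    # Normalize the exponentials so the outputs sum to 1.
    e_x = np.exp(x)
    return e_x / e_x.sum()

def sigmoid(x):
    # Element-wise logistic function.
    return 1. / (1 + np.exp(-x))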

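Plugging the raw prediction values from rawPrediction into the softmax and sigmoid functions confirms this: softmax([-1.5435, 1.5435]) gives approximately [0.0436, 0.9564], which matches the probability column, whereas the element-wise sigmoid gives approximately [0.176, 0.824], which does not.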
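The comparison can be reproduced with the numbers from the first row of the table. This is a sketch under a few assumptions: the values of `raw` and `prob` are copied by hand from the output above, and the softmax here subtracts the maximum before exponentiating, a common numerical-stability variant that returns the same result as the version above. The last check uses the identity that softmax over the pair [-m, m] equals sigmoid(2m) for the positive class, which is why the two views agree once the margin is doubled.

```python
import numpy as np

def softmax(x):
    # Subtracting the max does not change the result but avoids overflow.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def sigmoid(x):
    # Element-wise logistic function.
    return 1.0 / (1.0 + np.exp(-x))

# First row of the table: rawPrediction and probability for a sample with y = 1.
raw = np.array([-1.5435020027249835, 1.5435020027249835])
prob = np.array([0.04364652142729318, 0.9563534785727068])

soft = softmax(raw)          # softmax over the two raw scores
sig = sigmoid(raw)           # sigmoid applied to each raw score separately
sig2 = sigmoid(2 * raw[1])   # sigmoid of twice the positive-class margin

print("softmax:", soft, "matches probability:", np.allclose(soft, prob))
print("sigmoid:", sig, "matches probability:", np.allclose(sig, prob))
print("sigmoid(2 * margin):", sig2, "matches P(y=1):", np.isclose(sig2, prob[1]))
```

So the probability column is the softmax of rawPrediction, and applying sigmoid to each raw score on its own does not reproduce it.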
