While training with a GBDT-family model, I hit the error ValueError: X.dtype should be np.float32, got float64. The full traceback is shown below.
ValueError                                Traceback (most recent call last)
in ()
----> 1 abc.apply(X_train)

~/tmp/dataset/Augboost+FM/AugBoost.py in apply(self, X)
    461             for j in range(n_classes):
    462                 estimator = self.estimators_[i, j]
--> 463                 leaves[:, i, j] = estimator.apply(np.concatenate([X_original, X], axis=1), check_input=False)
    464
    465         return leaves

~/anaconda3/lib/python3.7/site-packages/sklearn/tree/tree.py in apply(self, X, check_input)
    464         check_is_fitted(self, 'tree_')
    465         X = self._validate_X_predict(X, check_input)
--> 466         return self.tree_.apply(X)
    467
    468     def decision_path(self, X, check_input=True):

sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree.apply()
sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree.apply()
sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree._apply_dense()

ValueError: X.dtype should be np.float32, got float64
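The message means exactly what it says: the tree only accepts np.float32, but it was given float64.
I looked at tree.py.
Following sklearn's own code path, the input should end up as 32-bit, since _validate_X_predict casts X to float32 when check_input is True. The call in the traceback passes check_input=False, though, so that cast is skipped and whatever dtype is passed in reaches the Cython code unchanged. That made me suspect my int training set was being converted to 64-bit somewhere earlier.
Looking further up in AugBoost.py, I found this code: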
X_original = X
X_normed = self.normalizer.transform(X)
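X is the X_train I passed in: a DataFrame holding int data.
Let's look at X_normed.
OK, found the problem: after normalizer.transform(), my data became a numpy.ndarray with dtype float64. So all we need to do is convert the dtype.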
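To make the dtype change easy to see, here is a minimal sketch using NumPy only. It is not the AugBoost code: `l2_normalize_rows` is a hypothetical stand-in for `self.normalizer.transform`, and the data is made up. It only illustrates that dividing integer data by a float norm produces a float64 ndarray.

```python
import numpy as np


def l2_normalize_rows(X):
    """Hypothetical stand-in for normalizer.transform: scale rows to unit L2 norm."""
    X = np.asarray(X)
    norms = np.linalg.norm(X, axis=1, keepdims=True)  # float64 for integer input
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged instead of dividing by 0
    return X / norms


# Made-up integer features, playing the role of X_train.
X_int = np.array([[3, 4], [6, 8]], dtype=np.int64)
X_normed = l2_normalize_rows(X_int)

# Inspect the container type and dtype, the same check that exposed the problem.
print(type(X_normed).__name__, X_normed.dtype)
```

Printing `.dtype` right after each transformation step is usually the quickest way to find where the precision changes.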
For background on ndarray data types, see:
https://blog.csdn.net/weixin_43181110/article/details/83996915?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1.no_search_link&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1.no_search_link
X_normed = X_normed.astype(np.float32)
With that change, the error goes away.
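One more thing may be worth checking, depending on your data. The traceback shows the normalized array being concatenated with X_original before it reaches the tree, and NumPy promotes dtypes during concatenation. The sketch below uses made-up arrays (not the AugBoost internals) and assumes X_original is int64; in that case the combined array is promoted back to float64, and casting the final array is the safer option.

```python
import numpy as np

# Made-up data: integer original features and their normalized float64 version.
X_original = np.array([[3, 4], [6, 8]], dtype=np.int64)
X_normed = np.array([[0.6, 0.8], [0.6, 0.8]])  # float64 by default

# The cast from the fix above.
X_normed = X_normed.astype(np.float32)

# Concatenation applies NumPy's type promotion across both inputs.
combined = np.concatenate([X_original, X_normed], axis=1)
print(combined.dtype)

# Casting the concatenated array yields float32 whatever the input dtypes were.
combined32 = combined.astype(np.float32, copy=False)
print(combined32.dtype)
```

If `combined.dtype` already prints float32 for your inputs, the single cast on X_normed is all you need.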



