You should fit the MinMaxScaler using the training data, then apply the fitted scaler to the testing data before making predictions.
In summary:
- Step 1: fit the scaler on the TRAINING data
- Step 2: use the scaler to transform the TRAINING data
- Step 3: use the transformed training data to fit the predictive model
- Step 4: use the scaler to transform the TEST data
- Step 5: predict using the trained model (step 3) and the transformed TEST data (step 4).
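To make the reason for this order concrete, here is a minimal sketch of what the scaler actually learns. The toy arrays are made up for illustration and are not part of the original answer. Fitting stores the training minimum and maximum, and `transform` reuses those values, so a test value outside the training range lands outside [0, 1].

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy single-feature data, invented for this illustration.
X_train = np.array([[1.0], [5.0], [9.0]])
X_test = np.array([[3.0], [13.0]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # learns min and max from the training data only

train_min = scaler.data_min_  # minimum seen during fit: 1.0
train_max = scaler.data_max_  # maximum seen during fit: 9.0

# transform() applies (x - train_min) / (train_max - train_min)
X_test_scaled = scaler.transform(X_test)
manual = (X_test - train_min) / (train_max - train_min)

print(X_test_scaled.ravel())  # 3 maps to 0.25; 13 exceeds the training max and maps to 1.5
```

Refitting the scaler on the test data would instead map 13 to 1.0, which no longer matches the scale the model was trained on.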
Example using data:

    import pandas as pd
    from sklearn import preprocessing
    from sklearn.svm import SVC

    min_max_scaler = preprocessing.MinMaxScaler()

    # training data
    df = pd.DataFrame({
        'A': [1, 2, 3, 7, 9, 15, 16, 1, 5, 6, 2, 4, 8, 9],
        'B': [15, 12, 10, 11, 8, 14, 17, 20, 4, 12, 4, 5, 17, 19],
        'C': ['Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N', 'Y', 'Y'],
    })

    # fit and transform the training data and use them for the model training
    df[['A', 'B']] = min_max_scaler.fit_transform(df[['A', 'B']])
    df['C'] = df['C'].apply(lambda x: 0 if x.strip() == 'N' else 1)

    # fit the model (no model was defined here originally; SVC is an
    # assumption for illustration, any scikit-learn estimator would do)
    model = SVC()
    model.fit(df[['A', 'B']], df['C'])

    # after the model training on the transformed training data, define the testing data df_test
    df_test = pd.DataFrame({'A': [25, 67, 24, 76, 23], 'B': [2, 54, 22, 75, 19]})

    # before the prediction of the test data, ONLY APPLY the scaler on them
    df_test[['A', 'B']] = min_max_scaler.transform(df_test[['A', 'B']])

    # test the model
    y_predicted_from_model = model.predict(df_test[['A', 'B']])

Example using the iris data:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.svm import SVC

    data = datasets.load_iris()
    X = data.data
    y = data.target

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # steps 1 and 2: fit the scaler on the training data and transform it
    scaler = MinMaxScaler()
    X_train_scaled = scaler.fit_transform(X_train)

    # step 3: fit the model on the scaled training data
    model = SVC()
    model.fit(X_train_scaled, y_train)

    # step 4: only transform the test data with the already fitted scaler
    X_test_scaled = scaler.transform(X_test)

    # step 5: predict
    y_pred = model.predict(X_test_scaled)

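
As an aside that goes beyond the examples above, scikit-learn's `make_pipeline` can bundle the scaler and the model so that the five steps happen in the right order automatically. This is only a sketch of one possible alternative; the names `pipe` and `y_pred_pipe` are chosen here for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The pipeline fits the scaler on the training data, transforms it,
# and then fits the SVC on the scaled result (steps 1 to 3).
pipe = make_pipeline(MinMaxScaler(), SVC())
pipe.fit(X_train, y_train)

# predict() only applies the already fitted scaler to the test data
# before calling the model (steps 4 and 5).
y_pred_pipe = pipe.predict(X_test)
```

With this approach the scaler cannot be refit on the test data by accident, since `predict` never calls `fit`.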
Hope this helps.
See also this post: https://towardsdatascience.com/everything-you-need-to-know-about-min-max-normalization-in-python-b79592732b79



