You can use `get_dummies` together with `reindex` over all possible categories:
    import pandas as pd

    df1 = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene']})
    df2 = pd.DataFrame({'A': ['Me', 'Myself', 'and']})
    df3 = pd.DataFrame({'A': ['Me', 'Myself', 'or', 'Irene']})

    # Collect every category that appears in any of the frames.
    all_categories = pd.concat([df1.A, df2.A, df3.A]).unique()
    print(all_categories)
    # ['Me' 'Myself' 'and' 'Irene' 'or']

    # dtype=int keeps the 0/1 output; pandas >= 2.0 returns booleans by default.
    df1 = pd.get_dummies(df1.A, dtype=int).reindex(columns=all_categories, fill_value=0)
    print(df1)
    #    Me  Myself  and  Irene  or
    # 0   1       0    0      0   0
    # 1   0       1    0      0   0
    # 2   0       0    1      0   0
    # 3   0       0    0      1   0

    df2 = pd.get_dummies(df2.A, dtype=int).reindex(columns=all_categories, fill_value=0)
    print(df2)
    #    Me  Myself  and  Irene  or
    # 0   1       0    0      0   0
    # 1   0       1    0      0   0
    # 2   0       0    1      0   0

    df3 = pd.get_dummies(df3.A, dtype=int).reindex(columns=all_categories, fill_value=0)
    print(df3)
    #    Me  Myself  and  Irene  or
    # 0   1       0    0      0   0
    # 1   0       1    0      0   0
    # 2   0       0    0      0   1
    # 3   0       0    0      1   0
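If you have more than a few frames, the same idea can be wrapped in a small helper. This is only a sketch of the approach above: the function name `one_hot_aligned` and its parameters are made up for illustration and are not part of pandas, and it assumes every frame has the column you want to encode.

```python
import pandas as pd


def one_hot_aligned(frames, column):
    """One-hot encode `column` in each frame against the union of categories.

    Hypothetical helper (not a pandas API): returns one encoded DataFrame
    per input frame, all sharing the same columns in the same order.
    """
    # Union of categories, in order of first appearance across the frames.
    all_categories = pd.concat([f[column] for f in frames]).unique()
    return [
        # reindex adds the categories a frame is missing, filled with 0.
        pd.get_dummies(f[column], dtype=int).reindex(
            columns=all_categories, fill_value=0
        )
        for f in frames
    ]


df1 = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene']})
df2 = pd.DataFrame({'A': ['Me', 'Myself', 'and']})
df3 = pd.DataFrame({'A': ['Me', 'Myself', 'or', 'Irene']})

encoded = one_hot_aligned([df1, df2, df3], 'A')
for frame in encoded:
    print(list(frame.columns))
```

Each encoded frame keeps its own rows but gets the full set of columns, so the results can be stacked or fed to a model that expects a fixed feature layout.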


