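There is no need for a new column holding the joined values: by default, merge performs an inner join on both columns. If you also need the values of df2.index, add reset_index: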
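
    import pandas as pd

    df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
    df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                        'col2': [1, 3, 5, 10, 12, 12]})

    # reset_index turns df2's index into a regular column, so it survives the merge
    df3 = pd.merge(df1, df2.reset_index(), on=['col1', 'col2'])
    print(df3)
      col1  col2  index
    0   cx     1      0
    1  cx2    12      4
    2  cx2    12      5

If both indexes are needed: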

    # index_x comes from df1, index_y from df2
    df4 = pd.merge(df1.reset_index(), df2.reset_index(), on=['col1', 'col2'])
    print(df4)
       index_x col1  col2  index_y
    0        0   cx     1        0
    1        2  cx2    12        4
    2        2  cx2    12        5

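As a minimal sketch of how those index columns might be used afterwards, the example below names them explicitly instead of relying on the default `_x`/`_y` suffixes, then selects the matching rows from df2. The names `df1_index`, `df2_index`, `matches`, `df1_hits` and `df2_hits` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                    'col2': [1, 3, 5, 10, 12, 12]})

# Name each index before resetting it, so the merged columns are
# called df1_index / df2_index (hypothetical names) rather than index_x / index_y.
left = df1.rename_axis('df1_index').reset_index()
right = df2.rename_axis('df2_index').reset_index()
matches = pd.merge(left, right, on=['col1', 'col2'])

# Original index labels of the rows that matched in each frame.
df1_hits = matches['df1_index'].unique().tolist()
df2_hits = matches['df2_index'].tolist()

# Select the matching rows of df2 by their original labels.
print(df2.loc[df2_hits])
```

With the sample data, `df1_hits` is `[0, 2]` and `df2_hits` is `[0, 4, 5]`.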
For only the intersection of the two DataFrames:

    df5 = pd.merge(df1, df2, on=['col1', 'col2'])
    # if both DataFrames contain only these 2 columns, `on` can be omitted:
    # df5 = pd.merge(df1, df2)
    print(df5)
      col1  col2
    0   cx     1
    1  cx2    12
    2  cx2    12
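
The pair ('cx2', 12) appears twice in the intersection because df2 contains that row twice. If each matching pair should be listed only once, one possible follow-up (an assumption about the desired result, not something the original answer requires) is to drop the duplicates after merging. The names `both` and `unique_both` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                    'col2': [1, 3, 5, 10, 12, 12]})

# Default inner merge on the shared columns; ('cx2', 12) is matched twice.
both = pd.merge(df1, df2)

# Keep each matching (col1, col2) pair once and renumber the rows.
unique_both = both.drop_duplicates().reset_index(drop=True)
print(unique_both)
```

Alternatively, calling `df2.drop_duplicates()` before the merge should give the same two rows here, though it discards the information about which df2 rows matched.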



