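Use duplicated with the parameter keep=False to select all duplicated rows, then groupby on all columns and convert each group's index values to a tuple. Finally, convert the resulting Series to a list: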
    df = df[df.duplicated(keep=False)]

    df = df.groupby(list(df)).apply(lambda x: tuple(x.index)).tolist()
    print (df)
    [(1, 6), (2, 4), (3, 5)]
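If you also want to see the duplicated values, start from the filtered DataFrame (before it is converted to a list) and use reset_index: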
    df1 = (df.groupby(df.columns.tolist())
             .apply(lambda x: tuple(x.index))
             .reset_index(name='idx'))
    print (df1)
       param_a  param_b  param_c     idx
    0        0        0        0  (1, 6)
    1        0        2        1  (2, 4)
    2        2        1        1  (3, 5)
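For anyone who wants to run both steps end to end, here is a minimal self-contained sketch. The sample data is hypothetical (it is not taken from the question) and was chosen so that rows 1/6, 2/4 and 3/5 are duplicates; the variable names dupes, grouped, index_groups and with_values are also just illustrative. Depending on the pandas version, groupby(...).apply may emit a deprecation warning about grouping columns.

```python
import pandas as pd

# Hypothetical sample data: rows 1/6, 2/4 and 3/5 are exact duplicates,
# row 0 is unique and should be dropped by the filter.
df = pd.DataFrame({
    "param_a": [1, 0, 0, 2, 0, 2, 0],
    "param_b": [1, 0, 2, 1, 2, 1, 0],
    "param_c": [1, 0, 1, 1, 1, 1, 0],
})

# keep=False marks every member of a duplicate set, not just the repeats.
dupes = df[df.duplicated(keep=False)]

# Group on all columns; each group is one set of identical rows,
# and its index labels are collected into a tuple.
grouped = dupes.groupby(list(dupes.columns)).apply(lambda g: tuple(g.index))

# Variant 1: only the index groups, as a plain list of tuples.
index_groups = grouped.tolist()
print(index_groups)

# Variant 2: the duplicated values next to their index groups.
with_values = grouped.reset_index(name="idx")
print(with_values)
```

Keeping the filtered frame and the grouped Series under separate names avoids overwriting df, so both variants can be derived from the same intermediate result.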



