This groups the names:

    from fuzzywuzzy import fuzz

    combined_list = ['rakesh', 'zakesh', 'bikash', 'zikash', 'goldman LLC', 'oldman LLC']
    combined_list.append('bakesh')
    print('input names:', combined_list)

    grs = list()  # groups of names whose pairwise similarity ratio is > 80
    for name in combined_list:
        for g in grs:
            # join the first group in which the name matches every member
            if all(fuzz.ratio(name, w) > 80 for w in g):
                g.append(name)
                break
        else:
            # no existing group accepted the name, so start a new one
            grs.append([name, ])

    print('output groups:', grs)
    outlist = [el for g in grs for el in g]
    print('output list:', outlist)

produces

    input names: ['rakesh', 'zakesh', 'bikash', 'zikash', 'goldman LLC', 'oldman LLC', 'bakesh']
    output groups: [['rakesh', 'zakesh', 'bakesh'], ['bikash', 'zikash'], ['goldman LLC', 'oldman LLC']]
    output list: ['rakesh', 'zakesh', 'bakesh', 'bikash', 'zikash', 'goldman LLC', 'oldman LLC']

As you can see, the names are grouped correctly, but the order may not be what you want.
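
For reuse, the same greedy grouping can be wrapped in a function. The following is only a minimal sketch: the names `group_names`, `similarity`, `threshold`, and `scorer` are made up for illustration, and the default scorer uses `difflib.SequenceMatcher` from the standard library as a stand-in for `fuzz.ratio`, so scores may differ slightly from what fuzzywuzzy reports. Passing `scorer=fuzz.ratio` should give the original behaviour.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity of two strings on a 0-100 scale (stand-in for fuzz.ratio)."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def group_names(names, threshold=80, scorer=similarity):
    """Greedily group names: a name joins the first group in which it
    scores above `threshold` against every member, else starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if all(scorer(name, member) > threshold for member in group):
                group.append(name)
                break
        else:
            # the inner loop finished without break: no group fits
            groups.append([name])
    return groups


names = ['rakesh', 'zakesh', 'bikash', 'zikash',
         'goldman LLC', 'oldman LLC', 'bakesh']
groups = group_names(names)
flat = [name for group in groups for name in group]
print(groups)
print(flat)
```

Groups come out in the order in which their first member appears in the input, and each later name is appended to the group it joins. That is why 'bakesh' ends up next to 'rakesh' and 'zakesh' rather than at the end. If a different order is needed, the groups or the flattened list can be sorted afterwards.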



