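This should be faster than a regex for plain literal separators (see the timings further down), and you can pass in whatever list of separators you need: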

    def split(txt, seps):
        default_sep = seps[0]

        # we skip seps[0] because that's the default separator
        for sep in seps[1:]:
            txt = txt.replace(sep, default_sep)
        return [i.strip() for i in txt.split(default_sep)]

How to use it:

    >>> split('ABC ; DEF123,GHI_JKL ; MN OP', (',', ';'))
    ['ABC', 'DEF123', 'GHI_JKL', 'MN OP']

Performance test:

    import timeit
    import re

    TEST = 'ABC ; DEF123,GHI_JKL ; MN OP'
    SEPS = (',', ';')

    # regex-based splitter, compiled once up front
    rsplit = re.compile("|".join(SEPS)).split

    print(timeit.timeit(lambda: [s.strip() for s in rsplit(TEST)]))
    # 1.6242462980007986

    print(timeit.timeit(lambda: split(TEST, SEPS)))
    # 1.3588597209964064

And with a longer input string:

    TEST = 100 * 'ABC ; DEF123,GHI_JKL ; MN OP , '

    print(timeit.timeit(lambda: [s.strip() for s in rsplit(TEST)]))
    # 130.67168392999884

    print(timeit.timeit(lambda: split(TEST, SEPS)))
    # 50.31940778599528
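
One caveat about the regex side of the comparison: joining the separators with `"|"` works for `,` and `;`, but it would probably misbehave if a separator were itself a regex metacharacter such as `|` or `.`. Below is a minimal sketch of a safer variant that escapes each separator with `re.escape`. The helper name `make_regex_split` and the sample text are my own, for illustration only; `split` is repeated so the snippet runs on its own.

```python
import re


def split(txt, seps):
    """Replace-based splitter from the answer above."""
    default_sep = seps[0]
    for sep in seps[1:]:
        txt = txt.replace(sep, default_sep)
    return [i.strip() for i in txt.split(default_sep)]


def make_regex_split(seps):
    """Hypothetical helper: regex splitter with each separator escaped."""
    pattern = re.compile("|".join(re.escape(sep) for sep in seps))
    return lambda txt: [s.strip() for s in pattern.split(txt)]


# Separators that are special characters in a regex
SEPS = ("|", ".", ";")
regex_split = make_regex_split(SEPS)
text = "ABC | DEF.GHI ; JKL"

print(split(text, SEPS))
print(regex_split(text))
```

Both calls should give the same list here, since `str.replace` treats separators as literal text and the escaped pattern does too.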



