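Why roll your own when heapq.merge() is in the standard library? Unfortunately, it does not take a key argument (one was only added in Python 3.5), so you have to do the decorate-merge-undecorate dance yourself: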
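
    from itertools import imap  # Python 2 only
    from operator import itemgetter
    import heapq

    def extract_timestamp(line):
        """Extract timestamp and convert to a form that gives the
        expected result in a comparison
        """
        return line.split()[1]  # for example

    with open("log1.txt") as f1, open("log2.txt") as f2:
        sources = [f1, f2]
        with open("merged.txt", "w") as dest:
            # Decorate: pair each line with its timestamp, lazily.
            decorated = [
                ((extract_timestamp(line), line) for line in f)
                for f in sources]
            # Merge the already-sorted streams by (timestamp, line).
            merged = heapq.merge(*decorated)
            # Undecorate: keep only the line from each pair.
            undecorated = imap(itemgetter(-1), merged)
            dest.writelines(undecorated)

Every step above is "lazy". Because I avoid file.readlines(), the lines of the files are read only as needed. The same goes for the decoration step, which uses generator expressions rather than list comprehensions.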
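heapq.merge() is lazy as well: it needs only one item from each input iterator at a time to make the necessary comparisons. Finally, I use itertools.imap(), the lazy variant of the built-in map(), to undecorate.

(In Python 3, map() itself has become lazy, so you can use that instead.)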
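
For readers on Python 3.5 or later, here is a minimal sketch of the same merge using the key parameter of heapq.merge(), which makes the decorate and undecorate steps unnecessary. The name merge_logs is a hypothetical helper introduced for illustration, and the timestamp is still assumed to be the second whitespace-separated field of each line.

```python
import heapq


def extract_timestamp(line):
    """Return the sort key of a log line.

    Assumption for illustration: the timestamp is the second
    whitespace-separated field and compares correctly as a string.
    """
    return line.split()[1]


def merge_logs(sources, dest):
    """Lazily merge already-sorted iterables of lines into dest.

    `sources` is a list of iterables (for example open files) and
    `dest` is any object with a writelines() method.
    """
    # heapq.merge() accepts key= since Python 3.5, so no tuples are needed.
    dest.writelines(heapq.merge(*sources, key=extract_timestamp))


# Usage with real files would look like this:
# with open("log1.txt") as f1, open("log2.txt") as f2, \
#         open("merged.txt", "w") as dest:
#     merge_logs([f1, f2], dest)
```

As in the original version, each input must already be sorted by timestamp. heapq.merge() only interleaves the sorted streams, it does not sort them.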

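The laziness of heapq.merge() described above can be observed directly. The following is a small sketch for Python 3, where traced is a hypothetical helper that records every item pulled from a source. It is only meant to illustrate the behaviour, not to be part of the merge itself.

```python
import heapq

consumed = []  # records every (source name, item) pulled from a source


def traced(name, items):
    """Yield items one by one, recording each pull (hypothetical helper)."""
    for item in items:
        consumed.append((name, item))
        yield item


merged = heapq.merge(traced("a", [1, 4, 7]), traced("b", [2, 5, 8]))

# merge() returns a generator, so nothing has been read from the sources yet.
assert consumed == []

# Asking for the first result pulls one item from each source, no more.
first = next(merged)
print(first, consumed)
```

With real log files in place of the lists, this means only one pending line per file is held in memory at any time.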


