If you don't want to use collections.Counter, you can write your own function:
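
    import sys

    filename = sys.argv[1]

    # Read the whole file and split it on whitespace.
    with open(filename) as fp:
        words = fp.read().split()

    unwanted_chars = ".,-_"  # and so on: add any other characters to strip

    wordfreq = {}
    for raw_word in words:
        # Trim unwanted characters from both ends of the word.
        word = raw_word.strip(unwanted_chars)
        if word not in wordfreq:
            wordfreq[word] = 0
        wordfreq[word] += 1

For something better, look at regular expressions.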
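
As a rough illustration of the regular-expression route, here is a minimal sketch using the standard `re` module. The function name `word_frequencies`, the pattern `[a-z0-9']+`, and the decision to lowercase the text are assumptions made for this example, not part of the answer above; change the pattern to match whatever you count as a word.

```python
import re


def word_frequencies(text):
    """Return a dict mapping each word in text to its number of occurrences."""
    wordfreq = {}
    # Assumed definition of a "word": a run of letters, digits, or apostrophes.
    # Lowercasing first makes the count case-insensitive (also an assumption).
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        wordfreq[word] = wordfreq.get(word, 0) + 1
    return wordfreq


print(word_frequencies("The cat, the hat - and THE bat."))
# {'the': 3, 'cat': 1, 'hat': 1, 'and': 1, 'bat': 1}
```

Because `re.findall` pulls out only the matching runs, punctuation anywhere in the text is ignored, so there is no list of unwanted characters to maintain.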



