Finding the Most Frequently Occurring Elements in Python (Python Sequences)

1. Problem

You have a sequence and want to find the elements that occur most often in it.

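2. Solution

The collections.Counter class is designed for exactly this kind of problem. Its instances have a most_common() method that returns the most frequent elements, ordered from the highest count to the lowest. For example: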

>>> words = [
...    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
...    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
...    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
...    'my', 'eyes', "you're", 'under'
...    ]
>>> from collections import Counter
>>> word_counter = Counter(words)
>>> top_three = word_counter.most_common(3)
>>> top_three
[('eyes', 8), ('the', 5), ('look', 4)]
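
As a small complement to the session above, here is a minimal sketch of two other ways to call most_common(). The votes list and the variable names are made up for illustration and are not part of the original example.

```python
from collections import Counter

# Hypothetical sample data, chosen only for illustration
votes = ['red', 'blue', 'red', 'green', 'blue', 'red']
vote_counter = Counter(votes)

# most_common(1) returns a one-element list of (element, count) pairs,
# so index 0 gives the single most frequent element
top_element, top_count = vote_counter.most_common(1)[0]

# With no argument, most_common() returns every element,
# ordered from the highest count to the lowest
full_ranking = vote_counter.most_common()

print(top_element, top_count)
print(full_ranking)
```

Elements with equal counts are returned in the order they were first encountered, so it is safer not to rely on a particular order among ties unless that behaviour is what you want.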

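3. Discussion

Counter is implemented as a subclass of dict. What sets it apart is that a Counter instance uses the hashable elements of the sequence as keys and the number of times each element occurs as values. For example: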

>>> word_counter['not']
1
>>> word_counter['eyes']
8
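
One difference from a plain dict is worth seeing in code: looking up an element that was never counted gives 0 instead of raising KeyError. The sketch below uses a made-up string ('banana') and hypothetical variable names.

```python
from collections import Counter

# Hypothetical sample data: count the letters of a short string
letter_counter = Counter('banana')

# A missing element reports a count of 0 rather than raising KeyError
missing_count = letter_counter['z']

# The lookup alone does not add the key to the counter
still_absent = 'z' not in letter_counter

print(letter_counter['a'], missing_count, still_absent)
```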

If you want to update the counts in a Counter instance from another sequence, you can do it by hand:

>>> more_words = ['why', 'are', 'you', 'not', 'looking', 'in', 'my', 'eyes']
>>> for word in more_words:
...    word_counter[word] += 1
...
>>> word_counter['eyes']
9

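Alternatively, instead of writing the loop yourself, you can call the Counter instance's update() method:

>>> word_counter.update(more_words)

In fact, Counter instances combine easily with common mathematical operations. For example: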
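
A minimal sketch of how update() behaves, using a made-up tally counter: unlike dict.update(), it adds to the existing counts instead of replacing them, and it accepts either an iterable of elements or a mapping of element to count.

```python
from collections import Counter

# Hypothetical starting counts: a -> 2, b -> 1
tally = Counter(['a', 'b', 'a'])

# Passing an iterable adds one to the count of each element it yields
tally.update(['a', 'c'])

# Passing a mapping adds the given counts to the existing ones
tally.update({'b': 5})

print(tally)
```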

>>> a = Counter(words)
>>> b = Counter(more_words)
>>> a
Counter({'eyes': 8, 'the': 5, 'look': 4, 'into': 3, 'my': 3, 'around': 2, 'not': 1, "don't": 1, "you're": 1, 'under': 1})
>>> b
Counter({'why': 1, 'are': 1, 'you': 1, 'not': 1, 'looking': 1, 'in': 1, 'my': 1, 'eyes': 1})
>>> c = a + b
>>> c
Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2, 'around': 2, "don't": 1, "you're": 1, 'under': 1, 'why': 1, 'are': 1, 'you': 1, 'looking': 1, 'in': 1})
>>> d = a - b
>>> d
Counter({'eyes': 7, 'the': 5, 'look': 4, 'into': 3, 'my': 2, 'around': 2, "don't": 1, "you're": 1, 'under': 1})
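
Note that 'not' is missing from d: the - operator drops any element whose count falls to zero or below. The sketch below, built on made-up stock and sold counters, contrasts this with the subtract() method, which keeps such entries, and also shows the & (minimum) and | (maximum) operators that Counter provides.

```python
from collections import Counter

# Hypothetical counters, for illustration only
stock = Counter(apple=3, pear=1)
sold = Counter(apple=1, pear=2)

# The - operator keeps only elements whose resulting count is positive
remaining = stock - sold

# subtract() modifies the counter in place and keeps zero or negative counts
ledger = Counter(stock)
ledger.subtract(sold)

# & takes the minimum of each count, | takes the maximum
shared = stock & sold
largest = stock | sold

print(remaining, ledger, shared, largest)
```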

As the examples above show, whenever you need to tally data, Counter is generally more convenient than counting by hand with a plain dict.
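
To make the comparison concrete, here is a rough sketch of the same tally done both ways. The sample list and the variable names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample data
sample_words = ['look', 'into', 'my', 'eyes', 'look', 'eyes', 'eyes']

# Plain dict: every increment has to handle the "first time seen" case
manual_counts = {}
for word in sample_words:
    manual_counts[word] = manual_counts.get(word, 0) + 1

# Finding the most frequent element needs an explicit max() over the items
manual_top = max(manual_counts.items(), key=lambda pair: pair[1])

# Counter: counting and ranking are both built in
counter_counts = Counter(sample_words)
counter_top = counter_counts.most_common(1)[0]

print(manual_top, counter_top)
```

Both approaches produce the same counts; Counter mainly saves the bookkeeping code.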
