
Contents

1. Sets
2. Dictionaries
3. Exercise: word frequency count

1. Sets

"""
1. Dictionaries and sets
1). Sets (unordered, no duplicate elements)
- Creation
s = {}          # s is not a set; it is an empty dict
s = set()       # this is how to create an empty set
- Membership tests: in, not in
- Set methods:
    Add:    add, update
    Delete: pop, remove (raises KeyError if the value does not exist), discard (does nothing if the value does not exist)
    Query:  intersection (s1 & s2), union (s1 | s2), difference (s1 - s2), issubset, issuperset, isdisjoint
"""
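
The notes above can be tried out with a short sketch. The sample values are made up for illustration; only built-in set operations are used.

```python
# {} creates an empty dict, so an empty set has to be created with set().
empty = set()
s = {1, 2, 3, 3}          # the duplicate 3 is dropped

# Membership tests
has_two = 2 in s
no_nine = 9 not in s

# Adding elements
s.add(4)                  # add a single element
s.update([5, 6])          # add every element of an iterable

# Deleting elements
s.remove(6)               # fine here, because 6 is present
s.discard(100)            # 100 is missing: discard silently does nothing
try:
    s.remove(100)         # 100 is missing: remove raises KeyError
    remove_failed = False
except KeyError:
    remove_failed = True

# Set operations
s1, s2 = {1, 2, 3}, {2, 3, 4}
intersection = s1 & s2
union = s1 | s2
difference = s1 - s2
print(intersection, union, difference)
print({2, 3}.issubset(s1), s1.issuperset({2, 3}), s1.isdisjoint({7, 8}))
```

Use discard when a missing element is not an error, and remove when it should be reported.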

2. Dictionaries

"""
2). Dictionaries
- Creation
d = {'name':'westos', 'age':18, 'city':'西安'}    # key-value pairs
- Membership tests: in, not in (note: these check the keys only)
    print('name' in d)      # True
    print('westos' in d)    # False
- Common methods
    Add:    d[key] = value, update, d.setdefault()
            d.setdefault(key, value): does nothing if key exists, adds it otherwise
    Delete: pop, popitem, clear
    Modify: d[key] = value, update
            d[key] = value: overwrites if key exists, adds it otherwise
    Query:  keys(), values(), items(), d[key], d.get(key), d.get(key, default_value)
            d[key] raises KeyError if key does not exist
            d.get(key) returns None if key does not exist
"""
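
A minimal sketch of the dictionary operations listed above. The keys 'lang', 'level', 'team' and 'email' are sample names chosen for illustration.

```python
d = {'name': 'westos', 'age': 18}

# Membership tests look at keys only, never at values.
name_is_key = 'name' in d
value_is_key = 'westos' in d

# Adding and modifying
d['age'] = 19                         # key exists: the value is overwritten
d['lang'] = 'python'                  # key missing: the pair is added
d.setdefault('age', 0)                # key exists: nothing changes
d.setdefault('level', 1)              # key missing: adds 'level': 1
d.update({'age': 20, 'team': 'ops'})  # overwrite and add in one call

# Querying
missing = d.get('email')              # no such key: returns None
fallback = d.get('email', 'n/a')      # no such key: returns the default
try:
    d['email']                        # no such key: raises KeyError
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Deleting
removed = d.pop('team')               # removes the key and returns its value
last_key, last_value = d.popitem()    # removes the most recently inserted pair (Python 3.7+)
print(d)
```

d.get is the safer lookup when a key may be absent; d[key] is appropriate when a missing key indicates a bug.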

3. Exercise: word frequency count

song.txt:
hello python
hello k8s
hello redis
hello mysql
hello lnmp
hello hadoop
hello java
hello matlab
k8s is good
k8s is best

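Method 1:

"""
Skills required:
    1. File operations
    2. Splitting strings
    3. Dictionary operations

Task: word frequency count
    1. Read the file song.txt
        with open(filename) as f: content = f.read()
    2. Go through every word in the file and count how many times each one occurs
    - Split the content into words
        content = "hello python hello java"
        words = content.split()
    - Count the occurrences of each word
        {'hello':2, 'python':1, 'java':1}

"""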

# 1. Load all the words from the file
with open('song.txt') as f:
    #f.seek(0, 0)
    words = f.read().split()
    #print(words)

# 2. Count
result = {}             # empty dict
for word in words:      # add each word to the dict and update its count as we go
    if word in result:  # is the word already a key in the dict?
        result[word] += 1
    else:
        result[word] = 1

# Side note: pretty-print the result
import pprint
pprint.pprint(result)
# {'best': 1,
#  'good': 1,
#  'hadoop': 1,
#  'hello': 8,
#  'is': 2,
#  'java': 1,
#  'k8s': 3,
#  'lnmp': 1,
#  'matlab': 1,
#  'mysql': 1,
#  'python': 1,
#  'redis': 1}

# 3. Get the 5 most frequent words
#    First, do the counting
from collections import Counter
counter = Counter(words)        # count the words
print(counter)
print(type(counter))
# Counter({'hello': 8, 'k8s': 3, 'is': 2, 'python': 1, 'redis': 1, 'mysql': 1, 'lnmp': 1, 'hadoop': 1, 'java': 1, 'matlab': 1, 'good': 1, 'best': 1})
# <class 'collections.Counter'>

# Get the 5 most frequent words
result = counter.most_common(5) # the 5 most common words
print(result)
# [('hello', 8), ('k8s', 3), ('is', 2), ('python', 1), ('redis', 1)]
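
The counting loop in Method 1 can be written more compactly, and the top-N step can be done without Counter. The sketch below uses a short inline sample string in place of song.txt (an assumption, so that it runs without the file).

```python
from collections import defaultdict

# Inline sample text standing in for the contents of song.txt (assumption).
text = "hello python hello k8s k8s is good k8s is best"
words = text.split()

# Variant 1: dict.get with a default of 0 replaces the if/else branch.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Variant 2: defaultdict(int) creates missing keys with the value 0.
dd_counts = defaultdict(int)
for word in words:
    dd_counts[word] += 1

# Top 3 without Counter: sort the (word, count) pairs by count, descending.
# sorted() is stable, so words with equal counts keep their first-seen order.
top3 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top3)
```

For anything beyond an exercise, Counter.most_common is usually the clearer choice; the variants here mainly show what it does underneath.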

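Method 2:

# Method 2:
with open('song.txt', 'r') as f:
    data = f.read().split()         # split into words
    set_data = set(data)            # deduplicate the words (a set is unordered)
    for item in set_data:           # count each distinct word
        count = data.count(item)
        print(f'{item} appears {count} time(s)')
# Sample output (set iteration order is arbitrary and may differ between runs):
# python appears 1 time(s)
# mysql appears 1 time(s)
# good appears 1 time(s)
# k8s appears 3 time(s)
# hello appears 8 time(s)
# redis appears 1 time(s)
# best appears 1 time(s)
# lnmp appears 1 time(s)
# is appears 2 time(s)
# matlab appears 1 time(s)
# java appears 1 time(s)
# hadoop appears 1 time(s)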
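
Because a set has no defined order, Method 2 can print its lines in a different order on each run. A minimal sketch of one way to make the output reproducible is to sort the distinct words first; the inline sample string again stands in for song.txt (an assumption).

```python
# Inline sample text standing in for the contents of song.txt (assumption).
data = "hello python hello k8s k8s is good k8s is best".split()

lines = []
for item in sorted(set(data)):       # sorted() returns the distinct words as an ordered list
    count = data.count(item)         # list.count rescans the whole list on every call
    lines.append(f'{item} appears {count} time(s)')
print('\n'.join(lines))
```

Since list.count walks the full list for every distinct word, this approach gets slow on large files; the single-pass dictionary count in Method 1 scales better.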
