Find the forum's URL and extract the following information for each trending topic:
Reply/view counts
Title
Posting time
Post link
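The four fields listed above could be held in a small record type. The following is only a minimal sketch: the `HotPost` class, the `make_post` helper, and the sample values are hypothetical and do not come from the original post. It also uses `urljoin` from the standard library to build the absolute link, on the assumption that the page's links are site-relative.

```python
from dataclasses import dataclass
from urllib.parse import urljoin

BASE_URL = 'https://bbs.hupu.com'  # site root, taken from the scraper's URL


@dataclass
class HotPost:
    """One row of the hot list (hypothetical container, not from the original post)."""
    reply_read: str  # reply/view counts as shown on the page
    title: str
    post_time: str
    url: str         # absolute link to the post


def make_post(reply_read, title, post_time, href):
    # urljoin copes with both relative ('/123456.html') and already-absolute hrefs
    return HotPost(reply_read.strip(), title.strip(), post_time.strip(),
                   urljoin(BASE_URL, href))


# Placeholder values for illustration only, not real forum data
post = make_post(' 12 / 3456 ', 'Example title', '05-01 12:00', '/123456.html')
print(post.url)
```

Keeping the fields in one object makes it easier to sort, filter, or save the posts later instead of only printing them.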
The forum's hot list:
The result looks like this:
The code is as follows:
import requests
from bs4 import BeautifulSoup

# Fetch the hot-topics page and parse the HTML
url = 'https://bbs.hupu.com/gp-hot'
res = requests.get(url)
content = res.text
soup = BeautifulSoup(content, 'html.parser')

# Each post on the hot list sits in an element with this class
all_body = soup.find_all(class_='bbs-sl-web-post-body')
for i in all_body:
    title = i.find('a')                     # link element holding the title
    post_time = i.find(class_='post-time')  # posting time
    reply = i.find(class_='post-datum')     # reply/view counts
    post_url1 = title['href']               # relative link to the post
    post_url = 'https://bbs.hupu.com' + post_url1
    # \033[...m are ANSI escape codes that highlight the output in the terminal
    print(
        '''Reply/Read -> \033[7;36;40m [{0}] \033[0m
Title -> \033[7;36;1m [{1}] \033[0m
post_time -> [{2}]
post_url -> [{3}]
'''.format(reply.text, title.text, post_time.text, post_url))
    print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
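The loop above assumes that every row contains a link, a time, and a counter. `find()` returns `None` when nothing matches, so a row with a different layout would raise an error. The following is a minimal sketch of a more defensive version. The `parse_posts` function and the `SAMPLE_HTML` string are hypothetical: the sample is made-up markup that only mimics the class names used above, so the parsing step can be tried without a network request.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the class names used by the scraper; not real forum data.
SAMPLE_HTML = '''
<div class="bbs-sl-web-post-body">
  <a href="/1.html">First post</a>
  <span class="post-datum">10 / 200</span>
  <span class="post-time">05-01 12:00</span>
</div>
<div class="bbs-sl-web-post-body">
  <span class="post-datum">broken row without a link</span>
</div>
'''


def parse_posts(html):
    """Return a list of dicts, skipping rows that lack any expected element."""
    soup = BeautifulSoup(html, 'html.parser')
    posts = []
    for body in soup.find_all(class_='bbs-sl-web-post-body'):
        title = body.find('a')
        post_time = body.find(class_='post-time')
        reply = body.find(class_='post-datum')
        # find() returns None when nothing matches, so check before using .text
        if title is None or post_time is None or reply is None:
            continue
        posts.append({
            'reply_read': reply.text.strip(),
            'title': title.text.strip(),
            'post_time': post_time.text.strip(),
            'url': 'https://bbs.hupu.com' + title.get('href', ''),
        })
    return posts


posts = parse_posts(SAMPLE_HTML)
print(posts)
```

To use it on the live page, pass `res.text` from the request to `parse_posts` instead of the sample string.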



