Scraping Forum Content with a Python HTML Crawler

Python  Updated: 2026-04-08 04:05:54  Published: 1428 days ago

Find the forum URL and extract the following information for each hot topic:

Reply count / view count

Title

Post time

Post link
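
The four fields above can be held in a small record type before they are printed or stored. The following is a minimal sketch and not part of the original script: the names `HotPost`, `split_counts`, and `make_post` are hypothetical, and it assumes the reply/view text looks like `35 / 12048` (two integers separated by a slash). The real page may format that value differently, in which case both counts are left as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def split_counts(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Split a 'replies/views' string into two ints.

    Returns (None, None) when the text does not have the assumed shape.
    """
    parts = [p.strip() for p in text.split('/')]
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return None, None
    return int(parts[0]), int(parts[1])


@dataclass
class HotPost:
    """One entry of the hot list (hypothetical record type)."""
    title: str
    post_time: str
    url: str
    replies: Optional[int]
    views: Optional[int]


def make_post(title: str, post_time: str, url: str, datum: str) -> HotPost:
    """Build a HotPost from the raw strings scraped for one entry."""
    replies, views = split_counts(datum)
    return HotPost(title.strip(), post_time.strip(), url, replies, views)
```

Keeping the counts as integers makes it possible to sort or filter the entries later, for example by number of replies.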

The forum's hot list:

The output looks like this:

The code is as follows:

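import requests
from bs4 import BeautifulSoup

# Hot-list page of the forum
url = 'https://bbs.hupu.com/gp-hot'
res = requests.get(url, timeout=10)
res.raise_for_status()  # stop early if the request failed
content = res.text
soup = BeautifulSoup(content, 'html.parser')

# Each hot post is wrapped in an element with this class
all_body = soup.find_all(class_='bbs-sl-web-post-body')
for i in all_body:
    title = i.find('a')                      # first link: title text and relative URL
    post_time = i.find(class_='post-time')   # post time
    reply = i.find(class_='post-datum')      # reply count / view count
    post_url1 = title['href']
    post_url = 'https://bbs.hupu.com' + post_url1
    # \033[...m sequences are ANSI escape codes that highlight text in the terminal;
    # \033[0m resets the style
    print(
        '''Reply/Read -> \033[7;36;40m [{0}] \033[0m
Title -> \033[7;36;1m  [{1}] \033[0m
post_time -> [{2}]
post_url -> [{3}]
'''.format(reply.text, title.text, post_time.text, post_url))
    print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')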
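
Two details of the script can be isolated for clarity: turning the relative `href` into a full link, and wrapping text in ANSI escape codes for terminal highlighting. The sketch below uses only the standard library. `absolute_url` and `highlight` are hypothetical helper names, the path `/12345.html` is an illustrative value, and `urljoin` is swapped in for the plain string concatenation used in the script. For root-relative links both approaches give the same result, while `urljoin` also leaves an already absolute `href` unchanged.

```python
from urllib.parse import urljoin

BASE_URL = 'https://bbs.hupu.com'
RESET = '\033[0m'  # ANSI code that restores the default terminal style


def absolute_url(href: str, base: str = BASE_URL) -> str:
    """Resolve an href from the page against the forum's base URL."""
    return urljoin(base, href)


def highlight(text: str, codes: str = '7;36;40') -> str:
    """Wrap text in an ANSI style sequence.

    The default codes mean reverse video (7), cyan foreground (36),
    black background (40), matching the style used for the reply count.
    """
    return '\033[{}m{}{}'.format(codes, text, RESET)


if __name__ == '__main__':
    print(absolute_url('/12345.html'))
    print(highlight('[35 / 12048]'))
```

Terminals that do not understand ANSI codes will show the escape sequences as literal characters, so the highlighting is best treated as optional.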

