As shown below:
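from bs4 import BeautifulSoup

path = './web/new_index.html'
# Read the local HTML file and parse it with the lxml parser
with open(path, 'r') as f:
    Soup = BeautifulSoup(f.read(), 'lxml')
# Select every article title link with a CSS selector
titles = Soup.select('ul > li > div.article-info > h3 > a')
for title in titles:
    print(title.text)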
Output:
Sardinia's top 10 beaches
How to get tanned
How to be an Aussie beach bum
Summer's cheat sheet
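# Here,
titles = Soup.select('ul > li > div.article-info > h3 > a')
# selects the same elements on this page as
titles = Soup.select('h3 a')
# print(title.text) is equivalent to print(title.get_text()).
# print(title.string) gives the same output when the link contains only text, as it does here.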
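The equivalences above hold for this page but not for every document. The sketch below uses a small hypothetical HTML fragment (not the article's page) and the built-in 'html.parser' instead of 'lxml', both assumptions made only so the example is self-contained. It shows where the child selector and the descendant selector differ, and where .string differs from .text.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment for illustration only; it is not the article's page.
html = """
<ul>
  <li><div class="article-info"><h3><a href="#">Plain title</a></h3></div></li>
  <li><div class="article-info"><h3><span><a href="#">Wrapped <b>bold</b> title</a></span></h3></div></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# '>' requires a direct parent/child relationship at every step.
child_matches = soup.select("ul > li > div.article-info > h3 > a")
# A space matches any descendant, however deeply nested.
descendant_matches = soup.select("h3 a")

print(len(child_matches))       # 1: the second <a> is wrapped in a <span>
print(len(descendant_matches))  # 2

plain, wrapped = descendant_matches
print(plain.text, "|", plain.get_text(), "|", plain.string)
print(wrapped.text)    # Wrapped bold title
print(wrapped.string)  # None, because the tag has more than one child
```

When every link holds a single piece of text, as in the article's page, all three accessors print the same thing; get_text() is the safer choice if the markup may contain nested tags.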
The following code can also be used:
import bs4

path = './web/new_index.html'
# Same steps as above, using the module-qualified name
with open(path, 'r') as f:
    Soup = bs4.BeautifulSoup(f.read(), 'lxml')
titles = Soup.select('h3 a')
for title in titles:
    print(title.string)
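A possible variation on reading the local file: BeautifulSoup also accepts an open file object, and passing an explicit encoding to open() avoids depending on the platform default. The sketch below is only an illustration; it writes a temporary stand-in file (an assumption, so the example runs without './web/new_index.html') and uses 'html.parser' rather than 'lxml' to avoid the extra dependency.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Stand-in for './web/new_index.html' so the sketch runs anywhere (assumption).
html = (
    "<ul><li><div class='article-info'>"
    "<h3><a href='#'>Summer's cheat sheet</a></h3>"
    "</div></li></ul>"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(html)
    path = tmp.name

# Open with an explicit encoding and hand the file object straight to the parser.
with open(path, "r", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

os.remove(path)  # clean up the temporary file

titles = [a.get_text(strip=True) for a in soup.select("h3 a")]
print(titles)
```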
Original HTML (text content only):
Article
- Sardinia's top 10 beaches
  white sands and turquoise waters
  4.5
- How to get tanned
  hot bikini girls on beach
  5.0
- How to be an Aussie beach bum
  To make the most of your visit
  3.5
- Summer's cheat sheet
  choosing a beach in Cape Cod
  3.0
© Mugglecoding
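Each entry in the page pairs a title with a description and a rating, so the same select approach can collect all three per item. The markup below is hypothetical: only the path ul > li > div.article-info > h3 > a comes from the selector used earlier, while the p.description and span.rate elements are invented names for illustration. 'html.parser' is used instead of 'lxml' so the sketch has no extra dependency.

```python
from bs4 import BeautifulSoup

# HYPOTHETICAL markup: the class names 'description' and 'rate' are assumptions,
# not taken from the original page.
html = """
<ul>
  <li>
    <div class="article-info">
      <h3><a href="#">Sardinia's top 10 beaches</a></h3>
      <p class="description">white sands and turquoise waters</p>
    </div>
    <span class="rate">4.5</span>
  </li>
  <li>
    <div class="article-info">
      <h3><a href="#">How to be an Aussie beach bum</a></h3>
      <p class="description">To make the most of your visit</p>
    </div>
    <span class="rate">3.5</span>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

articles = []
for li in soup.select("ul > li"):
    # select_one returns the first match inside this <li>, or None
    title = li.select_one("div.article-info > h3 > a")
    description = li.select_one("p.description")
    rate = li.select_one("span.rate")
    articles.append({
        "title": title.get_text(strip=True),
        "description": description.get_text(strip=True),
        "rating": float(rate.get_text(strip=True)),
    })

# Keep only the titles rated 4.0 or higher
top_rated = [a["title"] for a in articles if a["rating"] >= 4.0]
print(top_rated)
```

Looping over each li and querying inside it keeps the title, description, and rating of one article together, which a single flat select over the whole page would not guarantee.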