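You can use a CSS selector to pull out the span you want by its title text:

    from bs4 import BeautifulSoup

    soup = BeautifulSoup("""<div><div><span title="000 Plus Minimum RAM Requirement">1 GB</span> </div></div>""", "xml")
    # *= matches when the attribute value contains the given substring
    print(soup.select_one("span[title*=RAM]").text)

That finds the span whose title attribute contains RAM, which is equivalent to saying in Python:

    if "RAM" in span["title"]: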
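To make that equivalence concrete, here is a minimal sketch that runs the selector and an explicit loop side by side. The sample markup and the variable names are made up for illustration, and it uses the built-in `html.parser` so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the snippet above.
html = """
<div>
  <span title="000 Plus Minimum RAM Requirement">1 GB</span>
  <span title="000 Plus Minimum Direct X Requirement">DX 9</span>
  <span>no title here</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS substring match on the title attribute.
selector_texts = [span.text for span in soup.select("span[title*=RAM]")]

# The same filter written as a plain Python loop.
loop_texts = [
    span.text
    for span in soup.find_all("span")
    if span.has_attr("title") and "RAM" in span["title"]
]

print(selector_texts)
print(selector_texts == loop_texts)
```

Note the `has_attr` check in the loop: the selector silently skips spans without a title, whereas `span["title"]` would raise a `KeyError` on them.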
Or use find with re.compile:

    import re

    # a compiled regex passed as an attribute filter matches anywhere in the value
    print(soup.find("span", title=re.compile("RAM")).text)

To get all the data:
    import requests
    from bs4 import BeautifulSoup

    r = requests.get("http://www.game-debate.com/games/index.php?g_id=21580&game=000%20Plus").content
    soup = BeautifulSoup(r, "lxml")

    # the RAM requirement sits in its own container
    cont = soup.select_one("div.systemRequirementsRamContent")
    ram = cont.select_one("span")
    print(ram["title"], ram.text)

    # the remaining requirements share the same pair of classes
    for span in soup.select("div.systemRequirementsSmallerBox.sysReqGameSmallBox span"):
        print(span["title"], span.text)

This gives you:
    000 Plus Minimum RAM Requirement 1 GB
    000 Plus Minimum Operating System Requirement Win Xp 32
    000 Plus Minimum Direct X Requirement DX 9
    000 Plus Minimum Hard Disk Drive Space Requirement 500 MB
    000 Plus GD Adjusted Operating System Requirement Win Xp 32
    000 Plus GD Adjusted Direct X Requirement DX 9
    000 Plus GD Adjusted Hard Disk Drive Space Requirement 500 MB
    000 Plus Recommended Operating System Requirement Win Xp 32
    000 Plus Recommended Hard Disk Drive Space Requirement 500 MB
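
If you would rather keep the results than print them, a sketch along these lines collects each title/value pair into a dict. It runs against a small inline sample rather than the live page: the markup is an assumption reconstructed from the class names and output above, and `collect_requirements` is a hypothetical helper name, not something from the site or from Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: the class names come from the selectors above,
# but the real page layout may differ.
SAMPLE = """
<div class="systemRequirementsRamContent">
  <span title="000 Plus Minimum RAM Requirement">1 GB</span>
</div>
<div class="systemRequirementsSmallerBox sysReqGameSmallBox">
  <span title="000 Plus Minimum Operating System Requirement">Win Xp 32</span>
</div>
<div class="systemRequirementsSmallerBox sysReqGameSmallBox">
  <span title="000 Plus Minimum Direct X Requirement">DX 9</span>
</div>
"""


def collect_requirements(html):
    """Map each span's title attribute to its stripped text."""
    soup = BeautifulSoup(html, "html.parser")
    # One selector list covers both containers; span[title] skips
    # any span that has no title attribute.
    spans = soup.select(
        "div.systemRequirementsRamContent span[title], "
        "div.systemRequirementsSmallerBox.sysReqGameSmallBox span[title]"
    )
    return {span["title"]: span.get_text(strip=True) for span in spans}


requirements = collect_requirements(SAMPLE)
for title, value in requirements.items():
    print(title, value)
```

To use it on the real page, you could pass the `r` bytes fetched above to `collect_requirements` instead of `SAMPLE`.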



