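The Python standard library does have an HTML parser (`htmllib`), but it isn't very useful and has been deprecated since Python 2.6. This kind of thing is really easy with BeautifulSoup: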
    from BeautifulSoup import BeautifulSoup
    from os.path import basename, splitext

    soup = BeautifulSoup(my_html_string)
    for img in soup.findAll('img'):
        img['src'] = 'cid:' + splitext(basename(img['src']))[0]
    my_html_string = str(soup)
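To make the `src` rewrite easier to follow, here is a small standard-library-only sketch of just the path manipulation. The helper name `to_cid` and the sample path are made up for illustration and are not part of the snippet above.

```python
from os.path import basename, splitext


def to_cid(src):
    """Turn an image path into a cid: reference (hypothetical helper)."""
    # basename() drops the directory part: 'images/logo.png' -> 'logo.png'
    filename = basename(src)
    # splitext() splits off the last extension: 'logo.png' -> ('logo', '.png')
    stem = splitext(filename)[0]
    return 'cid:' + stem


print(to_cid('images/logo.png'))  # cid:logo
```

Note that this assumes the Content-ID of each attached image is its file name without the extension; if your attachments use different IDs, the mapping has to match them.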


