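I know this can be "traumatic", but for automatically generated pages like this one, where you just want to grab the damn images and never come back, a quick-n-dirty regular expression that matches the desired pattern tends to be my choice (not depending on Beautiful Soup is a big advantage):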

    # Python 2 (urllib.urlopen and the print statement)
    import urllib, re

    source = urllib.urlopen('http://www.cbssports.com/nba/draft/mock-draft').read()

    ## the loop below just prints each link;
    ## if you want to actually download, set this flag to True
    actually_download = False

    ## every image name is an abbreviation composed of capital letters, so...
    for link in re.findall(r'http://sports.cbsimg.net/images/nba/logos/30x30/[A-Z]*\.png', source):
        print link
        if actually_download:
            filename = link.split('/')[-1]
            urllib.urlretrieve(link, filename)

Hope this helps!
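
If you are on Python 3, the same regex idea could look roughly like the sketch below. It is only a sketch under a few assumptions: the helper names (`find_logo_links`, `fetch_page`, `download_links`) are made up for illustration, the page is assumed to decode as UTF-8, and dropping duplicate links is a convenience I added rather than something the snippet above does. The demo at the bottom runs on an invented HTML fragment, so nothing is fetched unless you call `fetch_page` yourself.

```python
import re
from urllib.request import urlopen, urlretrieve

# Same pattern as above, with the dots escaped and + instead of *,
# so an empty team abbreviation does not match.
LOGO_PATTERN = re.compile(
    r"http://sports\.cbsimg\.net/images/nba/logos/30x30/[A-Z]+\.png"
)


def find_logo_links(html):
    """Return the unique logo URLs found in html, in order of first appearance."""
    seen = []
    for link in LOGO_PATTERN.findall(html):
        if link not in seen:
            seen.append(link)
    return seen


def fetch_page(url):
    """Download a page and decode it to text (UTF-8 is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def download_links(links):
    """Save each link under its own file name in the current directory."""
    for link in links:
        filename = link.split("/")[-1]
        urlretrieve(link, filename)


# Offline demonstration on a made-up HTML fragment (not real page content).
sample_html = (
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/BOS.png">'
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/LAL.png">'
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/BOS.png">'
)
print(find_logo_links(sample_html))
```

To run it against the real page, something like `download_links(find_logo_links(fetch_page('http://www.cbssports.com/nba/draft/mock-draft')))` should do, assuming the page still serves its logos from that path.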



