The problem is that the server returns Gzip-compressed data. Try this:
    # -*- coding: utf-8 -*-
    from __future__ import print_function

    import gzip
    import StringIO
    import urllib2

    from BeautifulSoup import BeautifulSoup

    url = 'http://iccna.blog.sohu.com/164572951.html'
    response = urllib2.urlopen(url)
    data = response.read()
    data = StringIO.StringIO(data)
    gzipper = gzip.GzipFile(fileobj=data)
    html = gzipper.read()
    soup = BeautifulSoup(html, fromEncoding='gbk')
    print(soup)
On my system the Chinese characters still do not display correctly, but this should point you in the right direction.
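For anyone on Python 3, the same idea can be sketched as a small helper that gunzips the body only when it is actually compressed and then decodes it. This is a rough sketch, not a tested fix for this particular page: `decode_body` is a hypothetical name, and the `gbk` default is simply the encoding assumed in the code above.

```python
import gzip


def decode_body(raw, content_encoding=None, charset="gbk"):
    """Return text from a raw HTTP body, gunzipping it first if needed.

    Hypothetical helper; charset="gbk" is an assumption carried over
    from the answer above, not something detected from the page.
    """
    # Gzip streams start with the magic bytes 0x1f 0x8b. Checking them
    # covers servers that compress without sending Content-Encoding.
    is_gzip = (content_encoding or "").lower() == "gzip" or raw[:2] == b"\x1f\x8b"
    if is_gzip:
        raw = gzip.decompress(raw)
    # errors="replace" substitutes U+FFFD for bytes invalid in the charset
    # instead of raising UnicodeDecodeError.
    return raw.decode(charset, errors="replace")
```

With `urllib.request`, this could be called as `decode_body(response.read(), response.headers.get("Content-Encoding"))`. If the text still looks wrong in a terminal, the console encoding may be the cause, so writing the result to a UTF-8 file is one way to check.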



