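In both Python 2 and 3, use io.StringIO for unicode objects and io.BytesIO for bytes objects. This keeps the code forward compatible, since these two classes are all that Python 3 offers.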
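As a minimal sketch of that rule (the variable names here are illustrative, not from the original answer), text goes into io.StringIO, bytes go into io.BytesIO, and crossing between the two takes an explicit encode or decode step:

```python
import io

text = u"\u2603 snowman"        # text: unicode in Python 2, str in Python 3
raw = text.encode("utf-8")      # bytes in both versions

text_buf = io.StringIO(text)    # text buffer: accepts only text
byte_buf = io.BytesIO(raw)      # binary buffer: accepts only bytes

text_read = text_buf.read()
raw_read = byte_buf.read()

# Moving between the two kinds of buffer needs an explicit codec step.
round_trip = io.BytesIO(text.encode("utf-8")).read().decode("utf-8")
```

Because the io module behaves the same way in both versions, this snippet should run unchanged on Python 2.7 and Python 3.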
Here is a better test (for Python 2 and 3) that does not include the cost of converting from numpy to str/bytes:
    import numpy as np
    import string

    b_data = np.random.choice(list(string.printable), size=1000000).tobytes()
    u_data = b_data.decode('ascii')
    u_data = u'\u2603' + u_data[1:]  # add a non-ascii character

And then:

    import io

    %timeit io.StringIO(u_data)
    %timeit io.StringIO(b_data)
    %timeit io.BytesIO(u_data)
    %timeit io.BytesIO(b_data)
In Python 2, you can also test:

    import StringIO, cStringIO

    %timeit cStringIO.StringIO(u_data)
    %timeit cStringIO.StringIO(b_data)
    %timeit StringIO.StringIO(u_data)
    %timeit StringIO.StringIO(b_data)

Some of these will fail with an error about non-ASCII characters.
Python 3.5 results:

    >>> %timeit io.StringIO(u_data)
    100 loops, best of 3: 8.61 ms per loop
    >>> %timeit io.StringIO(b_data)
    TypeError: initial_value must be str or None, not bytes
    >>> %timeit io.BytesIO(u_data)
    TypeError: a bytes-like object is required, not 'str'
    >>> %timeit io.BytesIO(b_data)
    The slowest run took 6.79 times longer than the fastest. This could mean that an intermediate result is being cached
    1000000 loops, best of 3: 344 ns per loop
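The two TypeError cases can also be checked outside IPython. The sketch below uses a hypothetical helper, try_buffer (not part of the original answer), that reports either the buffer class that was created or the exception type that was raised:

```python
import io

def try_buffer(factory, value):
    """Return the buffer class name on success, or the exception class name on TypeError."""
    try:
        return type(factory(value)).__name__
    except TypeError as exc:
        return type(exc).__name__

u_data = u"\u2603abc"   # text containing a non-ASCII character
b_data = b"abc"         # raw bytes

outcomes = {
    "StringIO(u_data)": try_buffer(io.StringIO, u_data),
    "StringIO(b_data)": try_buffer(io.StringIO, b_data),
    "BytesIO(u_data)": try_buffer(io.BytesIO, u_data),
    "BytesIO(b_data)": try_buffer(io.BytesIO, b_data),
}

for call in sorted(outcomes):
    print("%s -> %s" % (call, outcomes[call]))
```

Only the matching pairs (text into StringIO, bytes into BytesIO) succeed. The io classes never convert implicitly.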
Python 2.7 results (run on a different machine):

    >>> %timeit io.StringIO(u_data)
    1000 loops, best of 3: 304 µs per loop
    >>> %timeit io.StringIO(b_data)
    TypeError: initial_value must be unicode or None, not str
    >>> %timeit io.BytesIO(u_data)
    TypeError: 'unicode' does not have the buffer interface
    >>> %timeit io.BytesIO(b_data)
    10000 loops, best of 3: 77.5 µs per loop
    >>> %timeit cStringIO.StringIO(u_data)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2603' in position 0: ordinal not in range(128)
    >>> %timeit cStringIO.StringIO(b_data)
    1000000 loops, best of 3: 448 ns per loop
    >>> %timeit StringIO.StringIO(u_data)
    1000000 loops, best of 3: 1.15 µs per loop
    >>> %timeit StringIO.StringIO(b_data)
    1000000 loops, best of 3: 1.19 µs per loop
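If IPython's %timeit is not available, a rough equivalent can be written with the standard timeit module. This is only a sketch under a few assumptions: it builds the test data with random instead of numpy to stay stdlib-only, uses a smaller input (N = 100000) to keep the run short, and introduces a hypothetical helper named best_time. Its numbers will not match the figures above.

```python
import io
import random
import string
import timeit

N = 100000  # assumption: smaller than the original 1,000,000 characters

# Text with one non-ASCII character up front, plus an ASCII-only bytes payload.
u_data = u"\u2603" + u"".join(random.choice(string.printable) for _ in range(N - 1))
b_data = u_data[1:].encode("ascii")

def best_time(func, number=100, repeat=3):
    """Best per-call time in seconds over `repeat` runs of `number` calls each."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

string_io_time = best_time(lambda: io.StringIO(u_data))
bytes_io_time = best_time(lambda: io.BytesIO(b_data))

print("io.StringIO(u_data): %.3g s per call" % string_io_time)
print("io.BytesIO(b_data):  %.3g s per call" % bytes_io_time)
```

Taking the minimum over several repeats mirrors the "best of 3" that %timeit reports in the output above.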



