Python UTF-8 encoding

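You do not need to encode data that is already encoded. When you try to do that, Python first tries to decode it to `unicode` before it can encode it back to UTF-8. That implicit decode is what fails here: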

    >>> data = u'\u00c3'            # Unicode data
    >>> data = data.encode('utf8')  # encoded to UTF-8
    >>> data
    '\xc3\x83'
    >>> data.encode('utf8')         # Try to *re*-encode it
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
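To make the hidden step visible, here is a minimal sketch that spells out by hand what Python 2 does implicitly. The helper name `reencode_like_python2` is made up for illustration, and it assumes the implicit decode uses the ASCII codec, which is what the traceback reports. It is written so that it also runs on Python 3, where byte strings have no `.encode()` method at all.

```python
# Unicode text, then its UTF-8 encoded form (two bytes: 0xc3 0x83).
text = u'\u00c3'
encoded = text.encode('utf8')


def reencode_like_python2(data):
    """Hypothetical helper: decode with ASCII first, then encode to UTF-8.

    This mirrors the implicit decode that Python 2 performs when you call
    .encode() on a byte string.
    """
    return data.decode('ascii').encode('utf8')


try:
    reencode_like_python2(encoded)
    failed = False
    message = ''
except UnicodeDecodeError as exc:
    # The ASCII decode cannot handle byte 0xc3, so we never reach the encode.
    failed = True
    message = str(exc)

print(failed, message)
```

The error comes from the decode, not from the encode you asked for, which is why the exception is a `UnicodeDecodeError`.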

Just write your data to the file directly; there is no need to encode data that is already encoded.

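If you build up `unicode` values instead, you do have to encode them before they can be written to a file. In that case use `codecs.open()`, which returns a file object that encodes `unicode` values to UTF-8 for you.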
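As a rough sketch of that approach, the snippet below writes a couple of Unicode strings through `codecs.open()` and then reads the raw bytes back to confirm they are UTF-8. The artist names and the temporary file path are invented for the example.

```python
import codecs
import os
import tempfile

# Sample Unicode values (made up for this example).
names = [u'Bj\u00f6rk', u'Sigur R\u00f3s']

# Write to a scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'artists.txt')

# The file object returned by codecs.open() encodes each write to UTF-8.
with codecs.open(path, 'w', 'utf8') as out:
    for name in names:
        out.write(name + u'\n')

# Read the file back in binary mode to inspect the bytes on disk.
with open(path, 'rb') as raw:
    raw_bytes = raw.read()

print(repr(raw_bytes))
```

You never call `.encode()` yourself here; the file object handles it at write time, so there is nothing left to encode twice.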

You also really don't want to write out the UTF-8 BOM, unless you have to support Microsoft tools that cannot read UTF-8 otherwise (such as MS Notepad).
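If you are unsure whether your output carries a BOM, a small check like the following may help. It compares Python's plain `utf-8` codec with the `utf-8-sig` codec, which prepends the BOM; the sample text is arbitrary.

```python
import codecs

text = u'caf\u00e9'

plain = text.encode('utf-8')         # no BOM
with_bom = text.encode('utf-8-sig')  # BOM (EF BB BF) followed by the same bytes

print(repr(plain))
print(repr(with_bom))

# Decoding with utf-8-sig strips a leading BOM if one is present.
round_trip = with_bom.decode('utf-8-sig')
```

So if you do need to feed such a tool, picking `utf-8-sig` as the encoding is one way to get the BOM without writing the bytes by hand.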

For your MySQL insert problem, you need to do two things:

- Add `charset='utf8'` to your `MySQLdb.connect()` call.
- Use `unicode` objects, not `str` objects, when querying or inserting, and use SQL parameters so the MySQL connector can do the right thing for you:

```
artiste = artiste.decode('utf8')  # it is already UTF8, decode to unicode

c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))
# ...
c.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))
```
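If you want to try the decode-once, then-parameterize pattern without a MySQL server, here is a sketch that uses the standard-library `sqlite3` module as a stand-in. This is a substitution, not MySQLdb: `sqlite3` uses `?` placeholders where MySQLdb uses `%s`, and it has no `charset` argument. The table layout and the sample artist name are assumptions made for the example.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL connection.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute(
    'CREATE TABLE artistes '
    '(id INTEGER PRIMARY KEY, nom TEXT, status INTEGER, path TEXT)'
)

raw = b'Beyonc\xc3\xa9'       # UTF-8 bytes, e.g. as read from a file
artiste = raw.decode('utf8')  # decode once; work with Unicode text from here on

# Pass values as parameters; never format them into the SQL string yourself.
cur.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=?', (artiste,))
if not cur.fetchone()[0]:
    cur.execute(
        'INSERT INTO artistes(nom, status, path) VALUES(?, 99, ?)',
        (artiste, artiste + u'/'),
    )
conn.commit()

cur.execute('SELECT nom, path FROM artistes')
row = cur.fetchone()
print(row)
```

The point carries over to MySQLdb: the driver receives Unicode values and takes care of encoding them for the connection, so your code never concatenates encoded bytes into a query.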

It may actually work better if you use `codecs.open()` to decode the contents automatically instead:

    import codecs
    import MySQLdb as mdb  # the snippet refers to the module as mdb

    # `index` and `artists_inserted` come from your surrounding code
    sql = mdb.connect('localhost','admin','ugo&(-@F','music_vibration', charset='utf8')

    with codecs.open('config/index/'+index, 'r', 'utf8') as findex:
        for line in findex:
            if u'#artiste' not in line:
                continue

            # line is already unicode here, no decode needed
            artiste=line.split(u'[:::]')[1].strip()

            # look up and insert each artist inside the loop
            cursor = sql.cursor()
            cursor.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))
            if not cursor.fetchone()[0]:
                cursor = sql.cursor()
                cursor.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))
                artists_inserted += 1
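To check the reading side on its own, without a database, something like the following sketch could be used. It writes a tiny index file and parses it with `codecs.open()`. The file contents are invented; only the `#artiste` marker and the `[:::]` separator are taken from the code above.

```python
import codecs
import os
import tempfile

# Build a small UTF-8 index file (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), 'index.txt')
sample = (
    u'#artiste[:::]Beyonc\u00e9\n'
    u'#album[:::]Lemonade\n'
    u'#artiste[:::] Sigur R\u00f3s \n'
)
with open(path, 'wb') as f:
    f.write(sample.encode('utf8'))

artistes = []
# codecs.open() decodes each line to Unicode text as it is read.
with codecs.open(path, 'r', 'utf8') as findex:
    for line in findex:
        if u'#artiste' not in line:
            continue
        artistes.append(line.split(u'[:::]')[1].strip())

print(artistes)
```

Every value in `artistes` is already Unicode text, so it can be passed to the query parameters as-is, with no `.decode()` or `.encode()` call in between.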

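You may want to read up on Unicode, UTF-8, and encodings. I can recommend the following articles:

- The Python Unicode HOWTO
- Pragmatic Unicode by Ned Batchelder
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky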
