Use `map` to perform the lookup:
    In [46]:
    df['1st'] = df['1st'].map(idxDict)
    df

    Out[46]:
      1st  2nd
    0   a    2
    1   b    4
    2   c    6
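For a version that can be run on its own, here is a minimal sketch. The contents of `df` and `idxDict` are assumptions chosen to reproduce the output above; how they were originally built is not shown here.

```python
import pandas as pd

# Assumed inputs: integer codes in '1st' and a dict translating them to letters.
df = pd.DataFrame({'1st': [0, 1, 2], '2nd': [2, 4, 6]})
idxDict = {0: 'a', 1: 'b', 2: 'c'}

# Series.map looks up each value of the column in the dict.
df['1st'] = df['1st'].map(idxDict)
print(df)
```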
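Note that values with no matching key in the dict come back as `NaN`. If the column already contains `NaN` values that should be skipped rather than passed to the mapping, you can pass `na_action='ignore'`.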
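A small sketch of how missing keys and missing values behave, as far as I understand `Series.map`: with a dict, a value that has no key comes back as `NaN`, while `na_action='ignore'` only affects values that are already missing. The series and the `lookup` dict below are made up for illustration.

```python
import pandas as pd

s = pd.Series(['x', 'y', 'q'])      # 'q' has no entry in the dict
lookup = {'x': 'a', 'y': 'b'}       # hypothetical mapping

mapped = s.map(lookup)              # unmatched 'q' becomes NaN
kept = s.map(lookup).fillna(s)      # fall back to the original value instead
replaced = s.replace(lookup)        # replace leaves unmatched values as they are

# na_action='ignore' skips missing values instead of passing them to the mapper,
# so str.upper is never called on the missing entry.
s2 = pd.Series(['x', None])
upper = s2.map(str.upper, na_action='ignore')

print(mapped, kept, replaced, upper, sep='\n\n')
```

If unmatched values should survive unchanged, the `fillna` fallback or `replace` is probably the safer choice than relying on `na_action`.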
You can also use `df['1st'].replace(idxDict)`. As for the question about efficiency, here are some timings:
    In [69]:
    %timeit df['1st'].replace(idxDict)
    %timeit df['1st'].map(idxDict)

    1000 loops, best of 3: 1.57 ms per loop
    1000 loops, best of 3: 1.08 ms per loop

    In [70]:
    %%timeit
    for k,v in idxDict.items():
        df['1st'] = df['1st'].replace(k, v)

    100 loops, best of 3: 3.25 ms per loop
So `map` is roughly 3x faster than the loop here, and about 1.5x faster than a single `replace` call.
On a larger dataset:
    In [3]:
    df = pd.concat([df]*10000, ignore_index=True)
    df.shape

    Out[3]:
    (30000, 2)

    In [4]:
    %timeit df['1st'].replace(idxDict)
    %timeit df['1st'].map(idxDict)

    100 loops, best of 3: 18 ms per loop
    100 loops, best of 3: 4.31 ms per loop

    In [5]:
    %%timeit
    for k,v in idxDict.items():
        df['1st'] = df['1st'].replace(k, v)

    100 loops, best of 3: 18.2 ms per loop
For the 30K-row df, `map` is about 4x faster, so it scales better than either `replace` or looping.
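To repeat the comparison outside IPython, something like the following sketch with the standard `timeit` module should work. The data and `idxDict` are assumed, and unlike the loop timed above, `with_loop` does not overwrite `df['1st']`, so every run starts from the same input. Absolute numbers will vary with the pandas version and the machine.

```python
import timeit
import pandas as pd

# Assumed inputs: 30,000 rows of integer codes and a dict translating them.
df = pd.DataFrame({'1st': [0, 1, 2] * 10000, '2nd': [2, 4, 6] * 10000})
idxDict = {0: 'a', 1: 'b', 2: 'c'}

def with_map():
    return df['1st'].map(idxDict)

def with_replace():
    return df['1st'].replace(idxDict)

def with_loop():
    # One replace call per key, without mutating df.
    out = df['1st']
    for k, v in idxDict.items():
        out = out.replace(k, v)
    return out

for fn in (with_map, with_replace, with_loop):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per loop")
```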



