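- Create the data
- Set a few values to NaN
- Drop NaN
- Remove every row or column that contains NaN
- Replace NaN
- Replace every NaN with another value
- Check whether data is missing
- Print the positions of the missing data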
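import string
import numpy as np
import pandas as pd
hangIndex = list(string.ascii_letters[:5])  # row labels 'a'..'e' ("hang" is pinyin for row)
table = pd.DataFrame(np.arange(20).reshape(5, 4), index=hangIndex, columns=pd.date_range('20211010', periods=4))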
Run:
table:
   2021-10-10  2021-10-11  2021-10-12  2021-10-13
a           0           1           2           3
b           4           5           6           7
c           8           9          10          11
d          12          13          14          15
e          16          17          18          19
table.iloc[0,0] = np.nan
table.iloc[1:2,:2] = np.nan
table['20211013'] = np.nan
Run:
table:
   2021-10-10  2021-10-11  2021-10-12  2021-10-13
a         NaN         1.0           2         NaN
b         NaN         NaN           6         NaN
c         8.0         9.0          10         NaN
d        12.0        13.0          14         NaN
e        16.0        17.0          18         NaN
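table.dropna(
    axis=0,
    how='any'
)
- axis=0 operates on rows
- axis=1 operates on columns
- how='any' drops a row or column as soon as it contains at least one NaN
- how='all' drops a row or column only when every value in it is NaN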
Run:
table:
Empty DataFrame
Columns: [2021-10-10 00:00:00, 2021-10-11 00:00:00, 2021-10-12 00:00:00, 2021-10-13 00:00:00]
Index: []
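The four combinations of axis and how described above can be compared side by side. The following is only a sketch on a small hypothetical frame named demo (its values and column names are assumptions for illustration, not the table used above).

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 frame, separate from `table` above.
demo = pd.DataFrame(
    {
        "x": [1.0, np.nan, 3.0],
        "y": [np.nan, np.nan, 6.0],
        "z": [np.nan, np.nan, np.nan],
    },
    index=["a", "b", "c"],
)

rows_any = demo.dropna(axis=0, how="any")  # every row has a NaN in "z", so nothing is left
rows_all = demo.dropna(axis=0, how="all")  # only row "b" is entirely NaN
cols_any = demo.dropna(axis=1, how="any")  # every column has a NaN, so no columns are left
cols_all = demo.dropna(axis=1, how="all")  # only column "z" is entirely NaN

print(rows_all)
print(cols_all)
```

dropna returns a new DataFrame each time, so demo itself keeps all of its rows and columns.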
table.fillna(value=-1)
Run:
table:
   2021-10-10  2021-10-11  2021-10-12  2021-10-13
a        -1.0         1.0           2        -1.0
b        -1.0        -1.0           6        -1.0
c         8.0         9.0          10        -1.0
d        12.0        13.0          14        -1.0
e        16.0        17.0          18        -1.0
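fillna also accepts a dict that maps column labels to fill values, which may be useful when one replacement value does not suit every column. A minimal sketch, assuming a hypothetical two-column frame named demo:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame({"x": [1.0, np.nan], "y": [np.nan, 4.0]})

filled = demo.fillna(value=-1)                      # same value for every NaN
per_column = demo.fillna(value={"x": 0, "y": 99})   # one fill value per column

print(filled)
print(per_column)
print(demo.isna().sum().sum())  # the original frame still holds its NaN entries
```

Like dropna, fillna returns a new DataFrame by default, so the result has to be assigned to a name if it is needed later.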
table.isnull()
Run:
table:
   2021-10-10  2021-10-11  2021-10-12  2021-10-13
a        True       False       False        True
b        True        True       False        True
c       False       False       False        True
d       False       False       False        True
e       False       False       False        True
table.isna()
Run:
table:
   2021-10-10  2021-10-11  2021-10-12  2021-10-13
a        True       False       False        True
b        True        True       False        True
c       False       False       False        True
d       False       False       False        True
e       False       False       False        True
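isnull() and isna() print the same mask because isnull is an alias of isna. The boolean mask can also be summed to count missing values. A small sketch, again on a hypothetical frame named demo:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame({"x": [np.nan, 2.0, 3.0], "y": [np.nan, np.nan, 6.0]})

mask = demo.isna()
same = mask.equals(demo.isnull())       # both methods return the same boolean mask
per_column = mask.sum()                 # True counts as 1, so this counts NaN per column
any_missing = bool(mask.values.any())   # is there at least one NaN anywhere?

print(same)
print(per_column)
print(any_missing)
```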
# "hang" is pinyin for row, "lie" for column
for lie, (label, contents) in enumerate(table.isna().items()):
    for hang, content in enumerate(contents):
        if content:
            print((hang, lie))
- items() yields one column at a time, so the positions are printed in column order
Run:
(0, 0)
(1, 0)
(1, 1)
(0, 3)
(1, 3)
(2, 3)
(3, 3)
(4, 3)
value = table.isna().values
for i in range(len(value)):          # rows
    for j in range(len(value[0])):   # columns
        if value[i][j]:
            print((i, j))
Run:
(0, 0)
(0, 3)
(1, 0)
(1, 1)
(1, 3)
(2, 3)
(3, 3)
(4, 3)
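The nested loops above visit every cell in Python. One possible vectorized alternative is np.argwhere, which returns the (row, column) positions of all True entries in row order; applying it to the transposed mask gives column order instead. This is a sketch on a hypothetical 3x2 frame named demo, not the author's method:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only; float dtype so NaN fits without upcasting.
demo = pd.DataFrame(
    np.arange(6, dtype=float).reshape(3, 2),
    index=list("abc"),
    columns=["x", "y"],
)
demo.iloc[0, 0] = np.nan
demo.iloc[1, 1] = np.nan
demo.iloc[2, 0] = np.nan

mask = demo.isna().to_numpy()

# Row-by-row order, like the nested index loops.
row_major = [(int(r), int(c)) for r, c in np.argwhere(mask)]
# Column-by-column order, like iterating over items().
col_major = [(int(r), int(c)) for c, r in np.argwhere(mask.T)]
# Translate integer positions back to index and column labels.
labels = [(demo.index[r], demo.columns[c]) for r, c in row_major]

print(row_major)
print(col_major)
print(labels)
```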



