Python Data Cleaning 4: The pandas DataFrame Data Structure and Its Common Methods

·Create a DataFrame with pandas.DataFrame
·pandas.DataFrame(data, index, columns, dtype)
·data can be a list, a NumPy array, or a dict
·index gives the row labels; columns gives the column names (column labels)
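The parameters above can be sketched together in one call. This is a minimal illustration only; the sample values and the labels 'r1', 'r2', 'a', 'b' are made up for the example and do not come from the article's data.

```python
import pandas as pd

# Hypothetical sample data: two rows, two columns.
data = [[1, 2], [3, 4]]

# index sets the row labels, columns sets the column labels,
# and dtype forces every column to the given type.
df = pd.DataFrame(data, index=['r1', 'r2'], columns=['a', 'b'], dtype=float)

print(df)
print(df.loc['r2', 'b'])  # look up one cell by row label and column label
```

If index is omitted, pandas falls back to a default integer index starting at 0, which is what the list and dict examples in this article rely on.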

Creating a DataFrame from a list

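import numpy as np
import pandas as pd

# Each inner list is one row (name, age, sex); ages are ints so the column is numeric
list1 = [['张三', 23, '男'], ['李四', 27, '女'], ['王二', 26, '女']]
df1 = pd.DataFrame(list1, columns=['姓名', '年龄', '性别'])
print(df1)
print(df1.head(5))  # first 5 rows; this frame has only 3, so all are shown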

The head function returns the first n rows of the DataFrame; n defaults to 5.

DataFrame.head(n=5)
    Return the first n rows.

Parameters:
    n : int, default 5
        Number of rows to select.

Returns:
    obj_head : type of caller
        The first n rows of the caller object.

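As the signature of head() shows, its parameter n defaults to 5, so calling head() with no argument returns the first 5 rows (or all rows, if the object has fewer than 5).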
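A short sketch of how n changes the result of head(). The ten-row frame and its column name 'x' are hypothetical and chosen only so that the default of 5 is visible.

```python
import pandas as pd

# Hypothetical frame with 10 rows so the default n=5 is noticeable.
df = pd.DataFrame({'x': range(10)})

first_default = df.head()        # n defaults to 5
first_three = df.head(3)         # explicit n
all_but_last_two = df.head(-2)   # negative n: every row except the last 2

print(len(first_default), len(first_three), len(all_but_last_two))
```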

Creating a DataFrame from a dict

# Each key becomes a column label; each list holds that column's values
df2 = pd.DataFrame({'姓名': ['张三', '李四', '王二'], '年龄': [23, 27, 29], '性别': ['男', '女', '女']})
print(df2)

Creating a DataFrame from an array

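# A NumPy array has a single dtype, so these mixed values are all stored as strings
array1 = np.array([['张三', 23, '男'], ['李四', 27, '女'], ['王二', 29, '女']])
df3 = pd.DataFrame(array1, columns=['姓名', '年龄', '性别'], index=['a', 'b', 'c'])

print(df3)
print(df3.values)            # underlying data as a NumPy array
print(df3.index)             # row index labels
print(df3.columns.tolist())  # column labels as a list
print(df3.ndim)              # number of dimensions: 2
print(df3.shape)             # (rows, columns): (3, 3)
print(df3.size)              # total number of elements: 9
print(df3.dtypes)            # data type of each column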
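One thing worth checking in the array example is the type of the age column. A NumPy array holds a single dtype, so mixing names and numbers turns the numbers into strings. The sketch below rebuilds the same array and converts the age column with astype(int); the conversion step is an addition for illustration and is not part of the original example.

```python
import numpy as np
import pandas as pd

# Same data as the array example: names, ages, and sex in one array.
array1 = np.array([['张三', 23, '男'], ['李四', 27, '女'], ['王二', 29, '女']])
print(array1.dtype)  # a Unicode string dtype, not a numeric one

df3 = pd.DataFrame(array1, columns=['姓名', '年龄', '性别'], index=['a', 'b', 'c'])
print(repr(df3.loc['a', '年龄']))  # the age is stored as text

# Convert the age column to integers before doing arithmetic on it.
ages = df3['年龄'].astype(int)
print(ages.sum())
```

Building the frame from a list of lists or a dict, as in the earlier examples, avoids the problem because each column keeps its own type.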

Commonly used attributes of Series and DataFrame are listed below

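Name	Description
values	Returns the values of all elements in the object
index	Returns the row index
dtypes	Returns the data type(s) of the elements
shape	Returns the shape of the object's data
size	Returns the number of elements in the object
ndim	Returns the number of dimensions of the object
columns	Returns the column labels (DataFrame only)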
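To see how the entries in the table differ between the two structures, here is a small side-by-side sketch. The Series and DataFrame contents are hypothetical sample values.

```python
import pandas as pd

# Hypothetical sample objects sharing the same row index.
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4.0, 5.0, 6.0]}, index=['a', 'b', 'c'])

# A Series is one-dimensional; a DataFrame is two-dimensional.
print(s.ndim, s.shape, s.size)
print(df.ndim, df.shape, df.size)

# dtypes gives one type for a Series and one type per column for a DataFrame.
print(s.dtypes)
print(df.dtypes)

# columns exists only on the DataFrame.
print(hasattr(s, 'columns'), hasattr(df, 'columns'))
```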
