The numpy library - basic operations (Part 1)

Importing the numpy library:

import numpy as np

1. Creating arrays
1.1 Creating arrays with array
1.1.1 Creating a one-dimensional array

>>> import numpy as np
>>> a1 = np.array([1,2,3,4,5,6])       # the argument is a list
>>> a1
array([1, 2, 3, 4, 5, 6])
>>> type(a1)
<class 'numpy.ndarray'>

>>> a2 = np.array((1,2,3,4,5,6))
>>> a2
array([1, 2, 3, 4, 5, 6])
>>> type(a2)
<class 'numpy.ndarray'>

1.1.2 Creating a two-dimensional array

>>> b1 = np.array([[1,2,3],[4,5,6]])
>>> b1
array([[1, 2, 3],
       [4, 5, 6]])

>>> b2 = np.array([(0,0,0),(1,2,3),(4,5,6)])
>>> b2
array([[0, 0, 0],
       [1, 2, 3],
       [4, 5, 6]])

1.1.3 Creating a three-dimensional array

>>> c1 = np.array([[[1,2,3],
                [4,5,6]],
               [[1,1,1],
                [2,2,2]],
               [[4,4,4],
                [5,5,6]]])
>>> c1
array([[[1, 2, 3],
        [4, 5, 6]],

       [[1, 1, 1],
        [2, 2, 2]],

       [[4, 4, 4],
        [5, 5, 6]]])
>>> type(c1)
<class 'numpy.ndarray'>

1.1.4 Notes

Besides integers, array elements can also be strings, booleans, floats, complex numbers, and so on.

import numpy as np
d1 = np.array(['b', '中国', '100003600'])
d2 = np.array([True, False])
d3 = np.array([-1, 2, 10])
d4 = np.array([10.5, 6.2222, 7.248888])
d5 = np.array([10+2j, 8j, 2.1+3j])
print('d1:', d1,'\n','d2:', d2,'\n',
      'd3:', d3,'\n','d4:', d4,'\n','d5:', d5)
print('d1.dtype:',d1.dtype)    # use dtype to inspect the element type of an array
print('d2.dtype:',d2.dtype)
print('d3.dtype:',d3.dtype)
print('d4.dtype:',d4.dtype)
print('d5.dtype:',d5.dtype)
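print('d2.all:', np.all(d2))
print('d3.all:', np.all(d3))  # np.all tests whether every element is True (np.alltrue was removed in NumPy 2.0)
>>> output:
    d1: ['b' '中国' '100003600'] 
 	d2: [ True False] 
 	d3: [-1  2 10] 
 	d4: [10.5       6.2222    7.248888] 
 	d5: [10. +2.j  0. +8.j  2.1+3.j]
	d1.dtype: <U9
	d2.dtype: bool
	d3.dtype: int64    # platform dependent: int32 on Windows with NumPy 1.x
	d4.dtype: float64
	d5.dtype: complex128
	d2.all: False
	d3.all: True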
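numpy requires every element of an array to have the same type, so one array cannot hold, say, both integers and strings. If elements of different types are passed in, the array function automatically converts them to a common type.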
import numpy as np
e1 = np.array(['OK?', 10, '岁', 0.3, False])   # the input values have different types
print('e1:', e1)
print('e1.dtype:', e1.dtype)
e2 = np.array([1, 0.2, 2.3333333])
print('e2:', e2)
print('e2.dtype:', e2.dtype)
e3 = np.array([1, 0.2, 2.3333333, 1+8j])
print('e3:', e3)
print('e3.dtype:', e3.dtype)
>>> output:
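    e1: ['OK?' '10' '岁' '0.3' 'False']
	e1.dtype: <U32
	e2: [1.        0.2       2.3333333]
	e2.dtype: float64
	e3: [1.       +0.j 0.2      +0.j 2.3333333+0.j 1.       +8.j]
	e3.dtype: complex128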
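As a small illustration of the automatic conversion described above, the sketch below checks the promoted types for purely numeric input and shows that the dtype argument of np.array can be used to choose the type explicitly. The variable names are made up for this example.

```python
import numpy as np

# Mixed int/float input is promoted to float64.
mixed_float = np.array([1, 0.2, 2.5])
# Adding one complex value promotes every element to complex128.
mixed_complex = np.array([1, 0.2, 1 + 8j])
# Passing dtype explicitly overrides the inferred type.
forced_float = np.array([1, 2, 3], dtype=np.float64)

print(mixed_float.dtype)    # float64
print(mixed_complex.dtype)  # complex128
print(forced_float)         # [1. 2. 3.]
```

Stating dtype explicitly is usually the safer choice when the element type matters for later calculations.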
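1.2 Other common ways to create arrays
1.2.1 The arange() function
Generates an array of ordered values over a given range, stepping by a given increment.
Signature: numpy.arange([start, ]stop, [step, ]dtype=None), where start is the first value, stop is the end value (not included), step is the increment, and dtype sets the element type of the resulting array.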
import numpy as np
f1 = np.arange(1, 10, 2, dtype = 'float64')
# Start at 1, stop before 10 (10 is excluded), take every 2nd number, and store all elements as 'float64'.
f2 = np.arange(5)
f3 = np.arange(0, 5)
f4 = np.arange(0, 5, 0.5)
f5 = np.arange(5, 0, -1)
print('f1:', f1)
print('f2:', f2)
print('f3:', f3)
print('f4:', f4)
print('f5:', f5)
>>> output:
    f1: [1. 3. 5. 7. 9.]
	f2: [0 1 2 3 4]
	f3: [0 1 2 3 4]
	f4: [0.  0.5 1.  1.5 2.  2.5 3.  3.5 4.  4.5]
	f5: [5 4 3 2 1]
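Because stop is excluded, the number of elements arange produces is ceil((stop - start) / step). The sketch below checks this for the f4 example and shows the same values built with linspace, which takes the element count directly and is generally the more predictable choice when the step is not an integer. The variable names are chosen for this example only.

```python
import math
import numpy as np

start, stop, step = 0, 5, 0.5

by_step = np.arange(start, stop, step)
expected_len = math.ceil((stop - start) / step)

# Same values, but specified by count instead of by step.
by_count = np.linspace(start, stop, expected_len, endpoint=False)

print(len(by_step), expected_len)      # 10 10
print(np.allclose(by_step, by_count))  # True
```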
 
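1.2.2 The linspace() function
Returns an array of evenly spaced samples over the given interval.
Signature: numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0). start and stop are required; the other parameters are optional. num is the number of samples, endpoint=True means stop is included as the last sample, and retstep=True makes the function return a tuple (samples, step) instead of just the array.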
import numpy as np
g1 = np.linspace(0,1,2)
g2 = np.linspace(0,4,4)
g3 = np.linspace(0,4,10)
print('g1:', g1)
print('g2:', g2)
print('g3:', g3)
g4 = np.linspace(0,4,4,endpoint = False)
print('g4:', g4)
g5 = np.linspace(0,4,4,retstep = True)
print('g5:', g5)
g6 = np.linspace(0,4,4,endpoint = False,retstep = True)
print('g6:', g6)
>>> output:
    g1: [0. 1.]
	g2: [0.         1.33333333 2.66666667 4.        ]
	g3: [0.         0.44444444 0.88888889 1.33333333 1.77777778 2.22222222
 		 2.66666667 3.11111111 3.55555556 4.        ]
	g4: [0. 1. 2. 3.]
	g5: (array([0.        , 1.33333333, 2.66666667, 4.        ]), 1.3333333333333333)
	g6: (array([0., 1., 2., 3.]), 1.0)
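The step values returned by retstep follow directly from the parameters: (stop - start) / (num - 1) when the endpoint is included, and (stop - start) / num when it is not. A minimal check, using the same numbers as g5 and g6:

```python
import math
import numpy as np

start, stop, num = 0, 4, 4

_, step_closed = np.linspace(start, stop, num, retstep=True)
_, step_open = np.linspace(start, stop, num, endpoint=False, retstep=True)

# endpoint=True: num samples span num - 1 gaps.
print(math.isclose(step_closed, (stop - start) / (num - 1)))  # True
# endpoint=False: num samples span num gaps.
print(math.isclose(step_open, (stop - start) / num))          # True
```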
 
1.2.3 The zeros() function
Creates an array filled with 0.
Signature: numpy.zeros(shape, dtype=float, order='C'), where order selects the memory layout: C style (row-major, 'C') or Fortran style (column-major, 'F').
import numpy as np
z1 = np.zeros(5)
z2 = np.zeros((3,3))
print('z1:',z1)
print('z2:',z2)
>>> output:
    z1: [0. 0. 0. 0. 0.]
	z2: [[0. 0. 0.]
 		[0. 0. 0.]
 		[0. 0. 0.]]
 
1.2.4 The ones() function
Creates an array filled with 1.
Signature: numpy.ones(shape, dtype=float, order='C').
import numpy as np
o1 = np.ones(5)
o2 = np.ones((2,3))
print('o1:',o1)
print('o2:',o2)
>>> output:
    o1: [1. 1. 1. 1. 1.]
	o2: [[1. 1. 1.]
 		[1. 1. 1.]]
 
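1.2.5 The empty() function
Creates an array without initializing its entries. The values are whatever happens to be in memory, so they can differ from run to run (compare output1 and output2 below).
Signature: numpy.empty(shape, dtype=float, order='C').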
import numpy as np
e1 = np.empty(5)
e2 = np.empty((4,4))
print('e1:',e1)
print('e2:',e2)
>>> output1:
    e1: [9.90263869e+067 8.01304531e+262 2.60799828e-310 0.00000000e+000
 		0.00000000e+000]
	e2: [[6.23042070e-307 4.67296746e-307 1.69121096e-306 1.29061074e-306]
 		 [1.89146896e-307 7.56571288e-307 3.11525958e-307 1.24610723e-306]
 		 [1.37962320e-306 1.29060871e-306 2.22518251e-306 1.33511969e-306]
 		 [1.78022342e-306 1.05700345e-307 1.11261027e-306 4.84184333e-322]]
>>> output2:
    e1: [ 6.23042070e-307  1.42417221e-306  1.37961641e-306 -2.65972661e-207
  		7.22947795e+223]
	e2: [[6.23042070e-307 4.67296746e-307 1.69121096e-306 1.29061074e-306]
         [1.89146896e-307 7.56571288e-307 3.11525958e-307 1.24610723e-306]
 		 [1.37962320e-306 1.29060871e-306 2.22518251e-306 1.33511969e-306]
 		 [1.78022342e-306 1.05700345e-307 3.11521884e-307 3.72363246e-317]]
 
1.2.6 The logspace() function
Returns numbers spaced evenly on a logarithmic scale.
Signature: numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0), where base is the base of the logarithm and defaults to 10.
import numpy as np
h1 = np.logspace(2.0, 3.0, 5)
print('h1:', h1)
>>> output:
    h1: [ 100.          177.827941    316.22776602  562.34132519 1000.        ]
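One way to read logspace is as base raised to the values of linspace(start, stop, num). The sketch below checks this for the h1 example and adds a base-2 case; h2 and exponents are names introduced for this illustration.

```python
import numpy as np

exponents = np.linspace(2.0, 3.0, 5)
h1 = np.logspace(2.0, 3.0, 5)

# logspace(start, stop, num) matches 10 ** linspace(start, stop, num).
print(np.allclose(h1, 10.0 ** exponents))  # True

# With base=2 the samples are 2**0, 2**1, 2**2, 2**3.
h2 = np.logspace(0, 3, 4, base=2)
print(h2)  # [1. 2. 4. 8.]
```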
 
1.2.7 The full() function
Returns an array filled with a given value.
Signature: numpy.full(shape, fill_value, dtype=None, order='C'), where fill_value is the value to fill the array with.
import numpy as np
f1 = np.full(5,10)
f2 = np.full((3,3), 8)
f3 = np.full((3,3), np.inf)    # inf is positive infinity
f4 = np.full((3,3), '中国')
print('f1:', f1)
print('f2:', f2)
print('f3:', f3)
print('f4:', f4)
>>> output:
    f1: [10 10 10 10 10]
	f2: [[8 8 8]
 		 [8 8 8]
 		 [8 8 8]]
	f3: [[inf inf inf]
 		 [inf inf inf]
 	 	 [inf inf inf]]
	f4: [['中国' '中国' '中国']
 		 ['中国' '中国' '中国']
		 ['中国' '中国' '中国']]
 
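1.2.8 The eye() function
Returns a two-dimensional array with 1 on a diagonal and 0 everywhere else.
Signature: numpy.eye(N, M=None, k=0, dtype=float, order='C'), where N is the number of rows, M is the number of columns (M=N by default), and k selects the diagonal: 0 is the main diagonal, a positive value is an upper diagonal, and a negative value is a lower diagonal.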
import numpy as np
e1 = np.eye(4)
e2 = np.eye(4,4, 2)
e3 = np.eye(4,4, -2)
print('e1:', e1)
print('e2:', e2)
print('e3:', e3)
>>> output:
    e1: [[1. 0. 0. 0.]
		 [0. 1. 0. 0.]
		 [0. 0. 1. 0.]
		 [0. 0. 0. 1.]]
	e2: [[0. 0. 1. 0.]
		 [0. 0. 0. 1.]
		 [0. 0. 0. 0.]
		 [0. 0. 0. 0.]]
	e3: [[0. 0. 0. 0.]
		 [0. 0. 0. 0.]
		 [1. 0. 0. 0.]
		 [0. 1. 0. 0.]]
 
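1.2.9 The repeat() function
Creates an array in which each element is repeated N times.
Signature: numpy.repeat(a, repeats, axis=None), where a is the input array (or an array-like such as a list), repeats is the number of times each element is repeated, and, for multi-dimensional arrays, axis selects the direction along which to repeat.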
import numpy as np
r1 = np.repeat([0,1,0], 5)
print('r1:', r1)
>>> output:
    r1: [0 0 0 0 0 1 1 1 1 1 0 0 0 0 0]
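The axis parameter only matters for multi-dimensional input. The following sketch, with a small 2x2 array m chosen for illustration, shows the three cases: no axis (the input is flattened first), axis=0 (rows are repeated) and axis=1 (columns are repeated).

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

flat = np.repeat(m, 2)          # axis=None: flatten, then repeat each element
rows = np.repeat(m, 2, axis=0)  # repeat each row
cols = np.repeat(m, 2, axis=1)  # repeat each column

print(flat)  # [1 1 2 2 3 3 4 4]
print(rows)
print(cols)
```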
 
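1.3 Using array attributes
ndim attribute: the number of dimensions of the array.
shape attribute: the shape of the array.
size attribute: the number of elements in the array.
dtype attribute: the element type of the array.
itemsize attribute: the size of one element in bytes.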
import numpy as np
t1 = np.array([['a','b','c'],['d','e','f'],['g','h','i']])
print('t1:', t1)
print('t1.ndim:', t1.ndim)
print('t1.shape:', t1.shape)
print('t1.size:', t1.size)
print('t1.dtype:', t1.dtype)
print('t1.itemsize:', t1.itemsize)
>>> output:
    t1: [['a' 'b' 'c']
		 ['d' 'e' 'f']
		 ['g' 'h' 'i']]
	t1.ndim: 2
	t1.shape: (3, 3)
	t1.size: 9
	t1.dtype: <U1
	t1.itemsize: 4
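These attributes are related to one another: ndim is the length of shape, size is the product of the entries of shape, and the total memory used by the elements (the nbytes attribute) is size times itemsize. A short check on the same t1 array:

```python
import math
import numpy as np

t1 = np.array([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']])

print(t1.ndim == len(t1.shape))              # True
print(t1.size == math.prod(t1.shape))        # True: 9 == 3 * 3
print(t1.nbytes == t1.size * t1.itemsize)    # True: 36 == 9 * 4
```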
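1.4 Using array methods
reshape method: changes the shape of the array.
all method: returns True if all of the selected elements are nonzero, otherwise False.
any method: returns True if at least one element is nonzero, otherwise False.
copy method: creates a copy of the array.
astype method: changes the element type of the array.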
import numpy as np
t1 = np.arange(9)
# reshape method
t2 = t1.reshape(3,3)  # turn the 1-D array into a 2-D array with 3 rows and 3 columns
print('t1:', t1)
print('t2:', t2)
# all method
t3 = np.ones(9).all()
t4 = np.array([1,0,2])
t5 = np.array([[1,0,2],[1,2,3]])
print('t3:',t3)
print('t4.all:', t4.all())
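print('t5.all(axis=1):', t5.all(axis=1))  # checks, row by row, whether every element in the row is nonzero
# axis names the dimension that is reduced: axis=0 collapses the rows and gives one result per column; axis=1 collapses the columns and gives one result per row.
# any method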
t6 = np.array([[1,0,2],[0,0,2]])
print('t6.any(axis=1):', t6.any(axis=1))
print('t6.any(axis=0):', t6.any(axis=0))
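# copy method
t7 = np.array([['a', 'b', 'c'],
               ['d', 'e', 'f'],
               ['g', 'h', 'i']])
t8 = t7             # plain assignment: t8 is just another name for the same array
t9 = np.copy(t7)    # an independent copy, same as t7.copy()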
print('id(t7):', id(t7))
print('id(t8):', id(t8))
print('id(t9):', id(t9))
# astype method
t10 = np.ones(9, dtype = int)
print('t10:', t10)
print('t10.astype(float):', t10.astype(float))
>>> output:
    t1: [0 1 2 3 4 5 6 7 8]
	t2: [[0 1 2]
		 [3 4 5]
		 [6 7 8]]
	t3: True
	t4.all: False
	t5.all(axis=1): [False  True]
	t6.any(axis=1): [ True  True]
	t6.any(axis=0): [ True False  True]
	id(t7): 2182816274448
	id(t8): 2182816274448
	id(t9): 2182816274544
	t10: [1 1 1 1 1 1 1 1 1]
	t10.astype(float): [1. 1. 1. 1. 1. 1. 1. 1. 1.]
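The identical id values for t7 and t8 mean that both names refer to one array, so a change made through one name is visible through the other, while a copy is unaffected. The sketch below demonstrates this and also shows that reshape on a contiguous array returns a view that shares data with the original. All variable names here are illustrative.

```python
import numpy as np

original = np.arange(6)
alias = original                    # same object, no data copied
duplicate = original.copy()         # independent data
reshaped = original.reshape(2, 3)   # a view onto the same data

original[0] = 99

print(alias[0])        # 99: alias sees the change
print(duplicate[0])    # 0: the copy does not
print(reshaped[0, 0])  # 99: the reshaped view sees it too
print(np.shares_memory(original, reshaped))   # True
print(np.shares_memory(original, duplicate))  # False
```

When a function must not modify its input, taking a copy first is the straightforward safeguard.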
 
2. Indexing and slicing
2.1 Basic indexing
Basic indexing on numpy arrays works the same way as on Python lists: square brackets [] are used to access the values.
import numpy as np
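# Reading and writing a single element of a 1-D array
n1=np.arange(10)
print('n1:', n1)
print('n1[9]:', n1[9])
print('n1[-1]:', n1[-1])
n1[0]=10   # assign by index (the element at index 0)
print('n1:', n1)
# Reading and writing a single element of a 2-D array
n2=n1.reshape(2,5)
print('n2:', n2)
print('n2[1,0]:', n2[1,0])
n2[1,1]=-1    # assign by index (row 2, column 2)
print('n2[1,1]:', n2[1,1])
# Reading and writing a single element of a 3-D array
n3=np.arange(12).reshape(2,2,3)
print('n3:', n3)
print('n3[1,0,0]:', n3[1,0,0])  # 1 picks the 2nd block (axis 0), the middle 0 the 1st row (axis 1), the last 0 the 1st column (axis 2)
print('n3[1,1,1]:', n3[1,1,1])
n3[1,0,2]=-1    # assign by index (axis 0: 2nd block, axis 1: 1st row, axis 2: 3rd column)
print('n3:', n3)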
>>> output:
    n1: [0 1 2 3 4 5 6 7 8 9]
	n1[9]: 9
	n1[-1]: 9
	n1: [10  1  2  3  4  5  6  7  8  9]
	n2: [[10  1  2  3  4]
		 [ 5  6  7  8  9]]
	n2[1,0]: 5
	n2[1,1]: -1
	n3: [[[ 0  1  2]
		  [ 3  4  5]]
		 [[ 6  7  8]
 		  [ 9 10 11]]]  # the value 8 sits at axis 0: 2nd block, axis 1: 1st row, axis 2: 3rd column
	n3[1,0,0]: 6
	n3[1,1,1]: 10
	n3: [[[ 0  1  2]
		  [ 3  4  5]]
	 	 [[ 6  7 -1]
		  [ 9 10 11]]]
 
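2.1.1 Omitting indices
When you do not want to spell out every dimension, the trailing indices can be replaced by an ellipsis (...) or simply left out.
Leading indices can only be skipped with "...". Writing n3[,2] is not allowed and raises an error.
Indices in the middle can likewise only be skipped with "...". Leaving the middle index out directly raises an error.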
import numpy as np
n3=np.arange(12).reshape(2,2,3)
print('n3:', n3)
print('n3[1,...]:',n3[1,...])  # trailing indices omitted
print('n3[1,]:',n3[1,])  
print('n3[...,2]:',n3[...,2])   # leading indices omitted
print('n3[1,...,2]:',n3[1,...,2])   # middle index omitted
>>> output:
    n3: [[[ 0  1  2]
  		  [ 3  4  5]]
		 [[ 6  7  8]
 		  [ 9 10 11]]]   # the value 8 sits at axis 0: 2nd block, axis 1: 1st row, axis 2: 3rd column
	n3[1,...]: [[ 6  7  8]
			    [ 9 10 11]]
	n3[1,]: [[ 6  7  8]
 			 [ 9 10 11]]
	n3[...,2]: [[ 2  5]
  				[ 8 11]]
	n3[1,...,2]: [ 8 11]
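An ellipsis stands for as many full slices (:) as are needed to cover the dimensions that are not written out. The following check spells out the equivalent explicit form for each of the three cases on the same n3 array:

```python
import numpy as np

n3 = np.arange(12).reshape(2, 2, 3)

# Each ellipsis expands to the ':' slices that are missing.
print(np.array_equal(n3[1, ...], n3[1, :, :]))     # True
print(np.array_equal(n3[..., 2], n3[:, :, 2]))     # True
print(np.array_equal(n3[1, ..., 2], n3[1, :, 2]))  # True
```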
 
2.1.2 Indexing the array returned by an index
import numpy as np
n4=np.arange(4).reshape(2,2)
print('n4:', n4)
print('n4[1][1]:', n4[1][1])  # gives the same element as n4[1,1]
print('n4[1,1]:', n4[1,1])
>>> output:
    n4: [[0 1]
		 [2 3]]
	n4[1][1]: 3
	n4[1,1]: 3
 
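2.2 Slicing
The basic form of a slice is [b:e:s]: b is the start index, e is the stop index (the element at index e itself is not included, i.e. a half-open interval in the mathematical sense), and s is the step, which defaults to 1. Any of b, e and s may be omitted.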
import numpy as np
# Slicing a 1-D array
s1=np.arange(1,10)
print('s1:', s1)
print('s1[1:4]:', s1[1:4])   # elements at indices 1 to 3
print('s1[:5]:', s1[:5])   # elements at indices 0 to 4
print('s1[5:]:', s1[5:])   # elements from index 5 to the end
print('s1[:-1]:', s1[:-1])   # every element except the last one
print('s1[:]:', s1[:])   # all elements
print('s1[::2]:', s1[::2])  # step of 2
# Slicing a 2-D array
s2=np.arange(9).reshape(3,3)
print('s2:', s2)
# Row slices of a 2-D array
print('s2[1:3]:', s2[1:3])    # rows 2 and 3, equivalent to s2[1:3,] or s2[1:3,:]
print('s2[1:3,]:', s2[1:3,])
print('s2[1:3,:]:', s2[1:3,:])
print('s2[:2]:', s2[:2])   # rows 1 and 2
print('s2[2:]:', s2[2:])   # row 3 as a 2-D sub-array of shape (1, 3); s2[2] gives the same row as a 1-D array
# Row slice with a fixed column
print('s2[:,2]:', s2[:,2])    # all rows, column 3
# Column slices of a 2-D array
print('s2[:,:2]:', s2[:,:2])    # all rows, columns 1 and 2
print('s2[...,:]:', s2[...,:])  # all rows, all columns
print('s2[1,2:]:', s2[1,2:])    # row 2, column 3
# Slicing a 3-D array
s3=np.array([[['Tom',10,'boy'],['John',11,'girl']],[['Alice',12,'girl'],['Kite',11,'boy']]])
print('s3:', s3)
print('s3[1,1,:]:', s3[1,1,:])      # axis 0: 2nd block, axis 1: 2nd row, axis 2: all columns
print('s3[0,:,:2]:', s3[0,:,:2])    # axis 0: 1st block, axis 1: all rows, axis 2: columns 1 and 2
>>> output:
    s1: [1 2 3 4 5 6 7 8 9]
	s1[1:4]: [2 3 4]
	s1[:5]: [1 2 3 4 5]
	s1[5:]: [6 7 8 9]
	s1[:-1]: [1 2 3 4 5 6 7 8]
	s1[:]: [1 2 3 4 5 6 7 8 9]
	s1[::2]: [1 3 5 7 9]
	s2: [[0 1 2]
 		 [3 4 5]
		 [6 7 8]]
	s2[1:3]: [[3 4 5]
			  [6 7 8]]
	s2[1:3,]: [[3 4 5]
			   [6 7 8]]
	s2[1:3,:]: [[3 4 5]
			    [6 7 8]]
	s2[:2]: [[0 1 2]
 			 [3 4 5]]
	s2[2:]: [[6 7 8]]
	s2[:,2]: [2 5 8]
	s2[:,:2]: [[0 1]
			   [3 4]
 			   [6 7]]
	s2[...,:]: [[0 1 2]
			    [3 4 5]
 				[6 7 8]]
	s2[1,2:]: [5]
	s3: [[['Tom' '10' 'boy']
  		  ['John' '11' 'girl']]
		 [['Alice' '12' 'girl']
  		  ['Kite' '11' 'boy']]]
	s3[1,1,:]: ['Kite' '11' 'boy']
	s3[0,:,:2]: [['Tom' '10']
 				 ['John' '11']]
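One point worth knowing about slices is that a basic slice of a numpy array is a view, not a copy: writing through the slice changes the original array. The sketch below shows this on s1 and shows that calling .copy() on the slice avoids it. The names window and safe are introduced for this example.

```python
import numpy as np

s1 = np.arange(1, 10)

window = s1[1:4]   # a view onto s1, no data copied
window[0] = -2     # writes through to s1[1]
print(s1)          # [ 1 -2  3  4  5  6  7  8  9]

safe = s1[1:4].copy()  # independent copy of the same slice
safe[0] = 100
print(s1[1])       # -2: s1 is unchanged by the write to safe
```

This differs from Python lists, where slicing always produces a new list.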
 
2.3 Fancy indexing
Fancy indexing uses all the elements of an integer array as index values; it is also called array indexing.
2.3.1 Integer array indexing
import numpy as np
# Indexing a 1-D array
fi1=np.array(['Tom猫','加菲猫','波斯猫','黑猫','英国短脸猫','田园猫'])
f1=np.array([1,2,4,5])
print('fi1[f1]:', fi1[f1])   # drop the entries that are not cat breeds
# Indexing a 2-D array
# A 1-D integer array used as the index selects whole rows
fi2=np.array([['Tom猫',1,200],['加菲猫',10,1000],['波斯猫',5,2000],['黑猫',2,180],['英国短脸猫',8,1800],['田园猫',20,100]])
f2=np.array([1,2,3])   # a 1-D array selecting rows 2, 3 and 4
print('fi2[f2]:', fi2[f2])
# A row-index array and a column-index array select the elements at the paired positions
fi3=np.array([[0,-1,9],[8,1,10],[-2,8,3]])
print('fi3:', fi3)
x=np.array([[0,1,2]])   # row indices
y=np.array([0,1,1])          # column indices
print('fi3[x,y]:', fi3[x,y])    # elements at positions (0,0), (1,1) and (2,1)
>>> output:
    fi1[f1]: ['加菲猫' '波斯猫' '英国短脸猫' '田园猫']
	fi2[f2]: [['加菲猫' '10' '1000']
			  ['波斯猫' '5' '2000']
 			  ['黑猫' '2' '180']]
	fi3: [[ 0 -1  9]
 		  [ 8  1 10]
		  [-2  8  3]]
	fi3[x,y]: [[0 1 8]]
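When two index arrays are given, numpy pairs them element by element: the i-th result is the element at (rows[i], cols[i]). The sketch below uses 1-D index arrays named rows and cols (names chosen here) and compares the result with picking each element by hand.

```python
import numpy as np

fi3 = np.array([[0, -1, 9],
                [8, 1, 10],
                [-2, 8, 3]])
rows = np.array([0, 1, 2])
cols = np.array([0, 1, 1])

picked = fi3[rows, cols]
# The same selection written out one pair at a time.
by_hand = np.array([fi3[r, c] for r, c in zip(rows, cols)])

print(picked)                           # [0 1 8]
print(np.array_equal(picked, by_hand))  # True
```

With 1-D index arrays the result is 1-D; the 2-D result [[0 1 8]] in the output above comes from x being a 2-D array.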
 
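2.3.2 Boolean array indexing
 Boolean indexing with a boolean array of the same shape as the indexed array returns a new 1-D array containing the selected elements.
 
 A 1-D boolean array can also be used to select whole rows.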
  
import numpy as np
# Boolean indexing
s4=np.arange(9).reshape(3,3)
print('s4:', s4)
b1=np.array([[True,True,False],[False,True,False],[False,False,True]])
print('b1:', b1)
print('s4[b1]:', s4[b1])
b2=b1[:,1]
print('b2:', b2)
print('s4[b2]:', s4[b2])
# Boolean indexing by whole rows
b3=np.array([True,False,False])
print('s4[b3]:', s4[b3])
>>> output:
    s4: [[0 1 2]
 		 [3 4 5]
 		 [6 7 8]]
	b1: [[ True  True False]
 		 [False  True False]
 		 [False False  True]]
	s4[b1]: [0 1 4 8]
	b2: [ True  True False]
	s4[b2]: [[0 1 2]
			 [3 4 5]]
	s4[b3]: [[0 1 2]]
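In practice the boolean array is rarely typed out by hand; it usually comes from a comparison on the array itself. A minimal sketch on the same s4 array, with mask and row_mask as illustrative names:

```python
import numpy as np

s4 = np.arange(9).reshape(3, 3)

mask = s4 > 4            # boolean array with the same shape as s4
print(s4[mask])          # [5 6 7 8]

# A 1-D mask built from the first column selects whole rows.
row_mask = s4[:, 0] >= 3
print(s4[row_mask])      # rows whose first element is at least 3
```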
 
2.4 Iteration
An array is a collection of elements, so it can be iterated over to read them one after another.
import numpy as np
# Iterating over a 1-D array
print('g:')
d1=np.arange(3)
for g in d1:
    print(g)
# Iterating over a 2-D array
print('------------------')
print('g1:')
d2=np.array([['Tom',1,10],['John',2,100],['Mike',3,200]])
for g1 in d2:
    print(g1)
>>> output:
    g:
		0
		1
		2
	------------------
	g1:
		['Tom' '1' '10']
		['John' '2' '100']
		['Mike' '3' '200']
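As the output shows, iterating over a 2-D array yields one row at a time. To visit every element individually, one option is the flat attribute, which iterates over the array as if it were 1-D. The small integer array below is made up for this sketch.

```python
import numpy as np

d2 = np.array([[1, 2],
               [3, 4]])

# A plain for loop over a 2-D array yields its rows.
rows = [row.tolist() for row in d2]
print(rows)    # [[1, 2], [3, 4]]

# .flat visits every element in row-major order.
items = [int(x) for x in d2.flat]
print(items)   # [1, 2, 3, 4]
```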
 
END 
Editor | sxlibe