
NumPy Key Points

程序员文章站 2024-01-18 15:11:01
  I have recently been working through the book Python for Data Analysis (《利用python进行数据分析》). Below are some key points summarized from Chapter 4.
1.ndarray
  An ndarray is an N-dimensional array object.
  Creating an ndarray:
In [4]: import numpy
In [5]: data = [[1,2,3],[4,5,6]]
In [6]: arr = numpy.array(data, dtype=numpy.int32)
In [7]: arr
Out[7]: array([[1, 2, 3],
               [4, 5, 6]])

  Checking the size of each dimension:

In [9]: arr.shape
Out[9]: (2, 3)

  Checking the array's data type:

In [10]: arr.dtype
Out[10]: dtype('int32')
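
  Besides shape and dtype, an ndarray exposes a few other descriptive attributes. The following is a minimal sketch that reuses the same data as above; it is only meant to show what each attribute reports.

```python
import numpy

data = [[1, 2, 3], [4, 5, 6]]
arr = numpy.array(data, dtype=numpy.int32)

print(arr.ndim)      # number of dimensions: 2
print(arr.shape)     # size of each dimension: (2, 3)
print(arr.size)      # total number of elements: 6
print(arr.dtype)     # element type: int32
print(arr.itemsize)  # bytes per element: 4
```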

  Other ways to create arrays:

In [11]: numpy.zeros((3,6))  # create an array of shape (3, 6) filled with zeros
Out[11]:
array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])
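
  numpy.zeros has several siblings that follow the same pattern. The sketch below shows a few of them; the shapes and fill values are arbitrary choices for illustration.

```python
import numpy

ones = numpy.ones((2, 3))                           # all ones, float64 by default
sevens = numpy.full((2, 3), 7)                      # every element set to 7
zeros_int = numpy.zeros((2, 3), dtype=numpy.int32)  # zeros with an explicit dtype
like = numpy.zeros_like(sevens)                     # zeros with the shape and dtype of `sevens`
identity = numpy.eye(3)                             # 3x3 identity matrix

print(ones.dtype)   # float64
print(like.shape)   # (2, 3)
```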

  arange is similar to Python's built-in range, but returns an ndarray:

In [12]: numpy.arange(15)
Out[12]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
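
  Like range, arange also accepts start, stop and step arguments, and unlike range the step may be a float. A small sketch with arbitrary values follows; numpy.linspace is included as an alternative when the number of points matters more than the step size.

```python
import numpy

a = numpy.arange(2, 10, 3)      # start=2, stop=10 (exclusive), step=3
b = numpy.arange(0, 1, 0.25)    # a float step is allowed
c = numpy.linspace(0, 1, 5)     # 5 evenly spaced points, endpoint included

print(a)  # [2 5 8]
print(b)  # [0.   0.25 0.5  0.75]
print(c)  # [0.   0.25 0.5  0.75 1.  ]
```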

  Converting the dtype:

In [15]: farr = arr.astype(numpy.float64)
In [16]: farr.dtype
Out[16]: dtype('float64')

  Note: when floating-point numbers are converted to integers, the fractional part is truncated (not rounded).
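
  The truncation can be seen directly with a few sample values (chosen here only for illustration). astype always returns a new array, so the original is left unchanged.

```python
import numpy

f = numpy.array([3.7, -1.2, 0.5, -2.9])
i = f.astype(numpy.int32)   # fractional part dropped, i.e. truncated toward zero

print(i)  # [ 3 -1  0 -2]
print(f)  # the source array is untouched: [ 3.7 -1.2  0.5 -2.9]

# Strings that represent numbers can be converted the same way.
s = numpy.array(["1.25", "-9.6", "42"]).astype(numpy.float64)
print(s)  # [ 1.25 -9.6  42.  ]
```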

  A slice of an array is a view onto the original array, not a copy of its data, so modifying the slice is reflected in the original array:
In [2]: arr = numpy.arange(10)
In [3]: arr_slice = arr[5:8]
In [4]: arr_slice[0] = 123456
In [5]: arr
Out[5]:
array([     0,      1,      2,      3,      4, 123456,      6,      7,      8,      9])

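  Note: NumPy behaves this way because, when working with large amounts of data, frequent copying would degrade performance.

  To get a copy of a slice rather than a view, use copy:
In [7]: arr2 = arr[5:8].copy()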
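
  One way to check whether two arrays share data is numpy.shares_memory. The sketch below contrasts a slice with its copy; the variable names are illustrative.

```python
import numpy

arr = numpy.arange(10)
view = arr[5:8]            # a view: no data is copied
copied = arr[5:8].copy()   # an independent copy

print(numpy.shares_memory(arr, view))    # True
print(numpy.shares_memory(arr, copied))  # False

view[:] = 0      # writes through to arr
copied[:] = -1   # leaves arr alone
print(arr)       # [0 1 2 3 4 0 0 0 8 9]
```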

  Both a scalar value and an array can be assigned to a selection of an ndarray:

In [13]: data = [[[1,2,3],[4,5,6]],[[4,5,6],[7,8,9]]]
In [14]: arr = numpy.array(data)
In [15]: arr2 = arr[0].copy()
In [16]: arr[0] = 123
In [17]: arr
Out[17]:
array([[[123, 123, 123],
        [123, 123, 123]],
       [[  4,   5,   6],
        [  7,   8,   9]]])
In [18]: arr[0] = arr2
In [19]: arr
Out[19]:
array([[[1, 2, 3],
        [4, 5, 6]],
       [[4, 5, 6],
        [7, 8, 9]]])
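
  The assigned value is broadcast to the shape of the selection, so a scalar fills the whole block and a shorter array is repeated where the shapes allow it. A minimal sketch with made-up values:

```python
import numpy

arr = numpy.zeros((2, 2, 3), dtype=numpy.int64)

arr[0] = 7                           # scalar: fills the entire 2x3 block
arr[1] = [1, 2, 3]                   # 1-D sequence: repeated for both rows of arr[1]
arr[1, 0] = numpy.array([9, 9, 9])   # array matching the shape of the selected row

print(arr[0].tolist())  # [[7, 7, 7], [7, 7, 7]]
print(arr[1].tolist())  # [[9, 9, 9], [1, 2, 3]]
```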

  Boolean array indexing can be combined with slices:

In [1]: arr[name=="liu", :2]
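
  The line above does not show how name and the data array were defined. The sketch below fills them in with made-up values (both arrays are assumptions for illustration) so that the selection can be run end to end.

```python
import numpy

# Hypothetical data: one label per row of arr.
name = numpy.array(["liu", "wang", "liu", "zhao"])
arr = numpy.arange(12).reshape((4, 3))

mask = name == "liu"        # elementwise comparison gives a boolean array
print(mask)                 # [ True False  True False]

selected = arr[mask, :2]    # rows where mask is True, first two columns
print(selected.tolist())    # [[0, 1], [6, 7]]

others = arr[~mask]         # ~ inverts the mask
print(others.tolist())      # [[3, 4, 5], [9, 10, 11]]
```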

  To select a subset of rows in a specific order, index with a list or ndarray of integers:

In [9]: arr
Out[9]:
array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])
In [10]: arr[[4,3,0,6]]
Out[10]:
array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])
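
  The listing above does not show how arr was built. The sketch below assumes one simple construction (each row filled with its own index) and also shows that negative indices select rows counted from the end.

```python
import numpy

# Assumed construction: an 8x4 array whose row i contains the value i.
arr = numpy.empty((8, 4))
for i in range(8):
    arr[i] = i

rows = arr[[4, 3, 0, 6]]        # rows 4, 3, 0, 6 in that order
from_end = arr[[-3, -5, -7]]    # rows 5, 3, 1 (counted from the end)

print(rows[:, 0].tolist())      # [4.0, 3.0, 0.0, 6.0]
print(from_end[:, 0].tolist())  # [5.0, 3.0, 1.0]
```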

  Reshaping a one-dimensional array into a two-dimensional one:

In [11]: arr = numpy.arange(32).reshape((8,4))
In [12]: arr
Out[12]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

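  Fancy indexing with numpy.ix_, which selects the rectangular region formed by the given rows and columns:

In [13]: arr[numpy.ix_([1,5,7,2],[0,3,1,2])]
Out[13]:
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])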

  Note: unlike slicing, fancy indexing copies the data into a new array.
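
  Passing two index lists directly behaves differently from numpy.ix_: the lists are paired element by element instead of forming a grid. The sketch below contrasts the two and also checks that the result is a copy.

```python
import numpy

arr = numpy.arange(32).reshape((8, 4))

# Paired indexing: picks elements (1,0), (5,3), (7,1), (2,2).
paired = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
print(paired.tolist())   # [4, 23, 29, 10]

# Grid indexing: every listed row combined with every listed column.
block = arr[numpy.ix_([1, 5, 7, 2], [0, 3, 1, 2])]
print(block.shape)       # (4, 4)

# The result is a copy, so writing to it does not touch arr.
block[0, 0] = -1
print(arr[1, 0])         # 4
```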

  Transposing data (transpose):
In [14]: arr
Out[14]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
In [15]: arr.T
Out[15]:
array([[ 0,  4,  8, 12, 16, 20, 24, 28],
       [ 1,  5,  9, 13, 17, 21, 25, 29],
       [ 2,  6, 10, 14, 18, 22, 26, 30],
       [ 3,  7, 11, 15, 19, 23, 27, 31]])
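
  The transpose returns a view, so no data is copied. A small sketch on a 2x3 array (chosen for brevity) shows this, together with a common use of the transpose in a matrix product.

```python
import numpy

arr = numpy.arange(6).reshape((2, 3))

# A common use of the transpose: the matrix product of arr.T and arr.
gram = numpy.dot(arr.T, arr)
print(gram.shape)   # (3, 3)

# arr.T shares memory with arr.
t = arr.T
print(numpy.shares_memory(arr, t))   # True

# Writing through the view changes the original array.
t[0, 1] = 100       # t[0, 1] is the same element as arr[1, 0]
print(arr[1, 0])    # 100
```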

  For higher-dimensional arrays, transpose takes a tuple of axis numbers that specifies how the axes are permuted:

In [16]: arr = numpy.arange(16).reshape((2,2,4))
In [17]: arr
Out[17]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
In [18]: arr.transpose((1,0,2))
Out[18]:
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],
       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])
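
  With the axis tuple (1, 0, 2), the first two axes are exchanged and the last stays in place, so element [i, j, k] of the result is element [j, i, k] of the original. The sketch below checks this and compares it with swapaxes, which performs the same exchange for a single pair of axes.

```python
import numpy

arr = numpy.arange(16).reshape((2, 2, 4))
swapped = arr.transpose((1, 0, 2))

# Element [i, j, k] of the result equals element [j, i, k] of the original.
print(swapped[0, 1].tolist())   # [8, 9, 10, 11], i.e. arr[1, 0]

# swapaxes(0, 1) exchanges just those two axes and gives the same result.
print(numpy.array_equal(swapped, arr.swapaxes(0, 1)))   # True

# .T on a 3-D array reverses all axes.
print(arr.T.shape)              # (4, 2, 2)
```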

 

2.Data processing with arrays
In [2]: point = numpy.arange(-5,5,0.01)
In [3]: xs, ys = numpy.meshgrid(point, point)
In [4]: ys
Out[4]:
array([[-5.  , -5.  , -5.  , ..., -5.  , -5.  , -5.  ],
       [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],
       [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],
       ...,
       [ 4.97,  4.97,  4.97, ...,  4.97,  4.97,  4.97],
       [ 4.98,  4.98,  4.98, ...,  4.98,  4.98,  4.98],
       [ 4.99,  4.99,  4.99, ...,  4.99,  4.99,  4.99]])
 
In [6]: import matplotlib.pyplot as plt
In [7]: z = numpy.sqrt(xs**2+ ys**2)
In [8]: z
Out[8]:
array([[7.07106781, 7.06400028, 7.05693985, ..., 7.04988652, 7.05693985,7.06400028],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,7.05692568],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,7.04985815],
       ...,
       [7.04988652, 7.04279774, 7.03571603, ..., 7.0286414 , 7.03571603,7.04279774],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,7.04985815],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,7.05692568]])
In [9]: plt.imshow(z,cmap=plt.cm.gray);plt.colorbar()
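
  What meshgrid does is easier to see on a tiny grid. The sketch below uses three x values and two y values (arbitrary numbers picked so the distances come out round) and leaves the plotting step out.

```python
import numpy

x = numpy.array([0.0, 3.0, 6.0])   # illustrative x coordinates
y = numpy.array([0.0, 4.0])        # illustrative y coordinates

xs, ys = numpy.meshgrid(x, y)
print(xs.shape, ys.shape)   # (2, 3) (2, 3)

# xs repeats x in every row; ys repeats y in every column:
# xs = [[0, 3, 6],      ys = [[0, 0, 0],
#       [0, 3, 6]]            [4, 4, 4]]

# Elementwise arithmetic then evaluates a function over the whole grid at once.
z = numpy.sqrt(xs**2 + ys**2)
print(z[1, 1])              # 5.0, the distance from the origin to (3, 4)
```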

 

3.Expressing conditional logic as array operations
In [9]: xarr = numpy.array([1.1,1.2,1.3,1.4,1.5])
In [10]: yarr=numpy.array([2.1,2.2,2.3,2.4,2.5])
In [11]: cond =numpy.array([True,False,True,True,False])
In [12]: numpy.where(cond,xarr,yarr)
Out[12]: array([1.1, 2.2, 1.3, 1.4, 2.5])

The second and third arguments do not have to be arrays; either one can be a scalar:

In [9]: numpy.where(arr>0,2,-2)
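
numpy.where is the vectorized counterpart of a Python conditional expression applied element by element. The sketch below compares it with the equivalent list comprehension and then shows scalar and mixed scalar/array arguments on a small made-up array.

```python
import numpy

xarr = numpy.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = numpy.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = numpy.array([True, False, True, True, False])

# Pure-Python equivalent: slower on large arrays and limited to 1-D input.
by_loop = [x if c else y for x, y, c in zip(xarr, yarr, cond)]
by_where = numpy.where(cond, xarr, yarr)
print(by_where.tolist() == by_loop)   # True

# Illustrative 2-D input.
arr = numpy.array([[1.5, -0.3], [-2.0, 0.7]])

signs = numpy.where(arr > 0, 2, -2)     # both branches are scalars
print(signs.tolist())                   # [[2, -2], [-2, 2]]

mixed = numpy.where(arr > 0, 2, arr)    # scalar where positive, original value otherwise
print(mixed.tolist())                   # [[2.0, -0.3], [-2.0, 2.0]]
```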