
Python Data Analysis: NumPy

程序员文章站 2022-03-20 18:05:17

(This column is my set of study notes from a MOOC course; I will update it when I find shortcomings later.)

Defining a list data structure:

>>> L1 = [[1, 3, 5], [2, 4, 6]]
>>> print('Type of L1:', type(L1))
Type of L1: <class 'list'>
>>> print('L1:', L1)
L1: [[1, 3, 5], [2, 4, 6]]

A faster data structure, the ndarray (an ndarray holds elements of a single data type):

>>> import numpy as np

>>> L1 = [[1, 3, 5], [2, 4, 6]]
>>> L2 = np.array(L1)   # an ndarray holds elements of a single data type

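>>> print('Type of L2:', type(L2))
Type of L2: <class 'numpy.ndarray'>

>>> print('L2:\n', L2)
L2:
 [[1 3 5]
 [2 4 6]]

>>> print('Shape of L2:', L2.shape)     # shape of L2 (rows, columns)
Shape of L2: (2, 3)

>>> print('Dimensions of L2:', L2.ndim)      # number of dimensions of L2
Dimensions of L2: 2

>>> print('dtype of L2:', L2.dtype)      # element type; the default integer type is platform-dependent (int32 here, int64 on many systems)
dtype of L2: int32

>>> print('Bytes per element of L2:', L2.itemsize)    # size of each element in bytes (4 for int32)
Bytes per element of L2: 4

>>> print('Size of L2:', L2.size)      # total number of elements in L2
Size of L2: 6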

>>> L3 = np.array(L1, dtype=np.float64)   # store the data as floats (the np.float alias was removed in NumPy 1.24)
>>> print(L3.dtype)
float64
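The single-dtype rule is easiest to see when the input mixes types. The sketch below is an illustration, not part of the original notes. It shows NumPy upcasting mixed input to one common dtype, and how an explicit dtype avoids depending on the platform default. The variable names `mixed`, `ints`, and `floats` are made up for this example.

```python
import numpy as np

# Mixed int/float input is upcast to one common dtype (float64 here).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                # float64

# An explicit dtype avoids relying on the platform default integer size.
ints = np.array([[1, 3, 5], [2, 4, 6]], dtype=np.int64)
print(ints.itemsize)              # 8 bytes per element
print(ints.nbytes)                # 48 = size (6) * itemsize (8)

# astype() returns a converted copy; the original array is unchanged.
floats = ints.astype(np.float64)
print(floats.dtype, ints.dtype)   # float64 int64
```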

Commonly used NumPy arrays:

>>> import numpy as np

>>> print(np.zeros([2, 4]))     # 2 rows by 4 columns of zeros
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
 
>>> print(np.ones([3, 5]))      # 3 rows by 5 columns of ones
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
 
>>> print(np.random.rand())     # a single random number, uniform on [0, 1)
0.5346756487556099

>>> print(np.random.rand(2, 4))     # 2 rows by 4 columns of random numbers
[[0.18197163 0.07243006 0.07054381 0.14984757]
 [0.44981785 0.99545415 0.53630171 0.84512616]]
 
>>> print(np.random.randint(1, 20))     # a single random integer in [1, 20); the upper bound 20 is excluded
10

>>> print(np.random.randint(1, 20, 3))     # 3 random integers in [1, 20)
[16  1 15]

>>> print(np.random.randn(2, 4))    # 2 rows by 4 columns drawn from the normal distribution with mean 0 and variance 1
[[-1.43757873 -1.64088587 -0.2095437   0.93030802]
 [ 1.93776804 -0.06551866 -0.81996059  1.25876278]]
 
>>> print(np.random.choice([1, 2, 3, 4, 5]))    # pick one of these 5 numbers at random
5
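The outputs above change on every run. If reproducible draws are needed, one option is NumPy's newer `Generator` interface, created with `np.random.default_rng`. This is an addition to the original notes, and the seed value 42 is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seed 42 is an arbitrary choice

uniform = rng.random((2, 4))           # uniform on [0, 1), like np.random.rand(2, 4)
ints = rng.integers(1, 20, size=3)     # integers in [1, 20), like np.random.randint(1, 20, 3)
normal = rng.standard_normal((2, 4))   # standard normal, like np.random.randn(2, 4)
pick = rng.choice([1, 2, 3, 4, 5])     # one element chosen at random

# Re-creating the generator with the same seed reproduces the same draws.
rng2 = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng2.random((2, 4))))   # True
```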

Common NumPy operations (each is applied element-wise to every item in the array):

>>> import numpy as np

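# Generate an arithmetic sequence over 1 to 10: start at 1, step 2 (the stop value 11 is excluded)
# arange() returns an ndarray; the built-in range() returns a lazy range object in Python 3 (a list in Python 2)
>>> array = np.arange(1, 11, 2)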
>>> print(array)    # [1,3,5,7,9]
[1 3 5 7 9]

>>> print(np.exp(array))    # [e^1,e^3,e^5,e^7,e^9]
[2.71828183e+00 2.00855369e+01 1.48413159e+02 1.09663316e+03
 8.10308393e+03]
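To make "element-wise" concrete, the sketch below compares `np.exp` with an equivalent pure-Python loop over the same values. It is an illustration added to these notes. The names `as_list`, `vectorised`, and `looped` are chosen only for this example.

```python
import math
import numpy as np

array = np.arange(1, 11, 2)                 # ndarray: [1 3 5 7 9]
as_list = list(range(1, 11, 2))             # range must be materialised to get a list

vectorised = np.exp(array)                  # one call, applied to every element
looped = [math.exp(x) for x in as_list]     # equivalent pure-Python loop

print(np.allclose(vectorised, looped))      # True
print(array * 2)                            # [ 2  6 10 14 18]
print(np.sqrt(array ** 2))                  # [1. 3. 5. 7. 9.]
```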

sum():

>>> import numpy as np

>>> L = np.array([[[1, 2, 3, 4],
...                [5, 6, 7, 8]],
...               [[9, 10, 11, 12],
...                [13, 14, 15, 16]]
...               ])

>>> print(L.sum())	# sum of all elements
136

>>> print(L.sum(axis=0))    # first element is 1+9=10, last is 8+16=24
[[10 12 14 16]
 [18 20 22 24]]
 
>>> print(L.sum(axis=1))    # first element is 1+5=6, last is 12+16=28
[[ 6  8 10 12]
 [22 24 26 28]]
 
>>> print(L.sum(axis=2))    # first element is 1+2+3+4=10, last is 13+14+15+16=58
[[10 26]
 [42 58]]
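One way to read the `axis` argument: summing over axis k removes dimension k from the shape. The sketch below rebuilds the same 2x2x4 array with `np.arange(...).reshape(...)`, which is a shortcut and not how the original builds it, and prints the resulting shapes.

```python
import numpy as np

L = np.arange(1, 17).reshape(2, 2, 4)   # same values as the sum() example

print(L.shape)                 # (2, 2, 4)
print(L.sum(axis=0).shape)     # (2, 4): axis 0 removed
print(L.sum(axis=1).shape)     # (2, 4): axis 1 removed
print(L.sum(axis=2).shape)     # (2, 2): axis 2 removed

# keepdims=True keeps the summed axis with length 1.
print(L.sum(axis=2, keepdims=True).shape)   # (2, 2, 1)

# A tuple of axes sums over several axes at once.
print(L.sum(axis=(1, 2)))      # [ 36 100]
```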

Note the difference between a list and an array:

>>> import numpy as np

>>> L1 = [10, 20, 30, 40]
>>> L2 = [5, 6, 7, 8]
>>> L3 = L1 + L2	# concatenation: L2 is appended after L1
>>> print(L3)
[10, 20, 30, 40, 5, 6, 7, 8]

>>> array1 = np.array([10, 20, 30, 40])
>>> array2 = np.array([5, 6, 7, 8])
>>> array3 = array1 + array2    # element-wise addition; -, *, / and ** (raise to the given power) work the same way
>>> print(array3)
[15 26 37 48]

>>> array4 = np.concatenate((array1, array2), axis=0)	# concatenating arrays
>>> print(array4)
[10 20 30 40  5  6  7  8]

>>> print(np.split(array1, 2))	# split array1 into 2 equal parts
[array([10, 20]), array([30, 40])]
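The same contrast applies to `*`: a list is repeated, while an array is multiplied element by element. The sketch below also shows `np.array_split`, which, unlike `np.split`, accepts a number of parts that does not divide the length evenly. Both are additions to the original notes.

```python
import numpy as np

L1 = [10, 20, 30, 40]
array1 = np.array(L1)

print(L1 * 2)        # list repetition: [10, 20, 30, 40, 10, 20, 30, 40]
print(array1 * 2)    # element-wise:    [20 40 60 80]

# np.split needs an even division; np.array_split tolerates uneven ones.
parts = np.array_split(array1, 3)
print([p.tolist() for p in parts])   # [[10, 20], [30], [40]]
```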

Matrix multiplication:

>>> import numpy as np

>>> array1 = np.array([10, 20, 30, 40])
>>> array2 = np.array([5, 6, 7, 8])

>>> array11 = array1.reshape([2, 2])
>>> array12 = array2.reshape([2, 2])
>>> print('array11:\n', array11, '\n', 'array12:\n', array12)
array11:
 [[10 20]
 [30 40]] 
 array12:
 [[5 6]
 [7 8]]
 
>>> array13 = np.dot(array11, array12)  # matrix multiplication
>>> print('array13:\n', array13)
array13:
 [[190 220]
 [430 500]]
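Because `*` on arrays is element-wise, it is easy to confuse with the matrix product. The sketch below, added for illustration, puts the two side by side and uses the `@` operator, which for 2-D arrays gives the same result as `np.dot`. The short names `a` and `b` stand in for `array11` and `array12`.

```python
import numpy as np

a = np.array([[10, 20], [30, 40]])
b = np.array([[5, 6], [7, 8]])

print(a * b)    # element-wise product
# [[ 50 120]
#  [210 320]]

print(a @ b)    # matrix product
# [[190 220]
#  [430 500]]

print(np.array_equal(a @ b, np.dot(a, b)))   # True
```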

Matrix-related operations:

>>> import numpy as np
>>> from numpy.linalg import inv, det, eig

>>> print(np.eye(3))    # 3x3 identity matrix
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
 
>>> L = np.array([[1, 2],
...               [3, 4]])

>>> print('Inverse matrix:\n', inv(L))   # inverse matrix
Inverse matrix:
 [[-2.   1. ]
 [ 1.5 -0.5]]
 
>>> print('Transposed matrix:\n', L.transpose())    # transposed matrix
Transposed matrix:
 [[1 3]
 [2 4]]
 
>>> print('Determinant:\n', format(det(L), '.2f'))   # determinant (rounded to 2 decimal places)
Determinant:
 -2.00
 
>>> print('Eigenvalues and eigenvectors:\n', eig(L))    # eigenvalues and eigenvectors
Eigenvalues and eigenvectors:
 (array([-0.37228132,  5.37228132]), array([[-0.82456484, -0.41597356],
       [ 0.56576746, -0.90937671]]))
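The printed results can be checked against their definitions. The sketch below is an added sanity check, not part of the original notes. It confirms that `L @ inv(L)` is the identity and that each column of the eigenvector matrix satisfies L v = λ v, using `np.allclose` because floating-point results are only approximately exact.

```python
import numpy as np
from numpy.linalg import inv, det, eig

L = np.array([[1, 2], [3, 4]])

# A matrix times its inverse should give the identity (up to rounding).
print(np.allclose(L @ inv(L), np.eye(2)))      # True

# det(L) = 1*4 - 2*3 = -2, up to floating-point error.
print(np.isclose(det(L), -2.0))                # True

# Each column of `vectors` is an eigenvector: L v = lambda v.
values, vectors = eig(L)
for i in range(len(values)):
    v = vectors[:, i]
    print(np.allclose(L @ v, values[i] * v))   # True
```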

Systems of linear equations:

>>> import numpy as np
>>> from numpy.linalg import solve

>>> A = np.array([[1, 2],
...               [3, 4]])      # A is a 2x2 matrix

>>> B = np.array([[17],
...               [41]])        # B is a 2x1 matrix

>>> print(solve(A, B))          # solve AX = B for X
[[7.]
 [5.]]

#   [[1, 2],  [[7],  =  [[17],
#    [3, 4]]   [5]]      [41]]
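The solution can be verified by substituting it back into AX = B. The sketch below is an added check. It also computes X with an explicit inverse for comparison. `solve` is generally the preferred route in practice because it avoids forming the inverse.

```python
import numpy as np
from numpy.linalg import solve, inv

A = np.array([[1, 2], [3, 4]])
B = np.array([[17], [41]])

X = solve(A, B)
print(np.allclose(A @ X, B))          # True: X satisfies AX = B
print(np.allclose(inv(A) @ B, X))     # True: same answer via the explicit inverse
```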

Correlation coefficients:

>>> import numpy as np

>>> print(np.corrcoef([1, 0, 1], [0, 2, 1]))    # correlation matrix: [[r(X,X), r(X,Y)], [r(Y,X), r(Y,Y)]]
[[ 1.        -0.8660254]
 [-0.8660254  1.       ]]
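The off-diagonal entry is the Pearson correlation coefficient, r = Σ(x−x̄)(y−ȳ) / sqrt(Σ(x−x̄)² · Σ(y−ȳ)²). The sketch below is an added illustration that computes it by hand for the same data and compares it with `np.corrcoef`. The names `dx`, `dy`, and `r` are chosen for this example.

```python
import numpy as np

x = np.array([1, 0, 1], dtype=float)
y = np.array([0, 2, 1], dtype=float)

dx = x - x.mean()    # deviations from the mean of x
dy = y - y.mean()    # deviations from the mean of y
r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(round(float(r), 7))                        # -0.8660254
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))    # True
```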

Building a polynomial:

>>> import numpy as np

>>> print(np.poly1d([5, 8, 4, 3]))
   3     2
5 x + 8 x + 4 x + 3
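A `poly1d` object can be called like a function as well as printed. The sketch below is an addition to the original notes. It evaluates the polynomial at x = 2, which is an arbitrary sample point, and takes its derivative.

```python
import numpy as np

p = np.poly1d([5, 8, 4, 3])    # 5x^3 + 8x^2 + 4x + 3

print(p(2))                    # 83 = 5*8 + 8*4 + 4*2 + 3
print(p.order)                 # 3
print(p.deriv().coeffs)        # [15 16  4], i.e. 15x^2 + 16x + 4
print(np.polyval([5, 8, 4, 3], 2))   # 83, same evaluation without poly1d
```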

氷鸢鸢鸢
2020.8.4