Python Multithreading Design, Explained with Examples

程序员文章站 (Programmer Article Station), 2022-04-21 20:25:08

◆ Process:

An executing instance of a program is called a process.

Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.

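A program cannot run on its own. It runs only after it has been loaded into memory and the system has allocated resources to it, and a program in execution like this is called a process. The difference between a program and a process is that a program is a set of instructions, a static description of what the process will run, whereas a process is one execution of the program, which is a dynamic concept.

In multiprogramming, several programs are loaded into memory at the same time and run concurrently under the scheduling of the operating system. It is exactly this design that greatly improves CPU utilization. Processes give every user the impression of having the CPU to themselves, so the process was introduced as the means of implementing multiprogramming on the CPU.

Multiprogramming:

Several mutually independent programs are kept in memory at the same time and run interleaved with one another under the control of a supervisor program. Two or more programs are simultaneously somewhere between their start and their end. This is called multiprogramming. Its characteristics at run time are: multiple programs, parallel at the macro level, serial at the micro level.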

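★ Shortcomings of processes:

Processes have many shortcomings, but two stand out:

 A process with only one flow of control can do only one thing at a time. If you want to do two or more things at once, such a process cannot help.

 If a process blocks during execution, for example while waiting for input, the whole process is suspended. Even work inside the process that does not depend on that input cannot proceed.

For example, when we chat in QQ, QQ runs as a single process. If it could do only one thing at a time, how could it listen for keyboard input, listen for messages that other people send you, and display those messages on the screen, all at the same moment?

This is the problem that threads solve.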

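◆ Thread:

A thread is the smallest unit that the operating system can schedule for execution. It is contained in a process and is the unit that actually does the work within the process. A thread is a single sequential flow of control within a process. One process can run several threads concurrently, each carrying out a different task.

How it works: the CPU gives you the illusion that it is performing several computations (running several threads) at the same time. In reality, the execution context of each thread is recorded and the CPU switches quickly among the threads.

A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions, while a process is a bunch of resources associated with a computation.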

A process can have one or many threads.

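◆ Differences between processes and threads:

1. Threads in the same process share the process's address space; separate processes each have their own address space.
2. Threads in the same process can access the process's data directly; a child process only gets its own copy of the parent process's data.
3. Threads in the same process can communicate with each other directly; processes can interact with other processes only through inter-process communication (IPC).
4. A new thread is cheap to create; creating a new process requires duplicating the parent process.
5. Threads in the same process can exercise control over one another; a process can control only its child processes.
6. Changes to the main thread can affect the behavior of the other threads; changes to a parent process do not affect its child processes.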
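
The first three points can be illustrated with a minimal sketch. The names shared_items, seen_pids, guard and worker are hypothetical and chosen only for this illustration. Every thread appends to the same list object and reports the same process id, because all of them live in one address space.

```python
import os
import threading

shared_items = []          # one list object, visible to every thread
seen_pids = set()          # process ids reported by the threads
guard = threading.Lock()   # protects the two containers above

def worker(tag):
    # Nothing is copied: the thread works on the process's own objects.
    with guard:
        shared_items.append(tag)
        seen_pids.add(os.getpid())

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_items))         # [0, 1, 2, 3]
print(seen_pids == {os.getpid()})   # True: all threads ran in this process
```

A child process, by contrast, would work on its own copy of shared_items, and the parent would not see the appended values without some form of IPC.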

◆ Python GIL (Global Interpreter Lock):

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

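The gist of the above: no matter how many threads you start and how many CPUs you have, CPython calmly lets only one thread execute Python bytecode at any given moment.

This article analyzes the effect of the GIL on Python multithreading thoroughly and is highly recommended: https://www.dabeaz.com/python/UnderstandingGIL.pdf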
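
One rough way to observe this is to time a CPU-bound function run twice in a row and then in two threads. The sketch below assumes a standard CPython build with the GIL enabled; count_down and N are hypothetical names, and the measured times depend on the machine, so no expected figures are given.

```python
import threading
import time

def count_down(n):
    # Pure-Python, CPU-bound loop: it holds the GIL while it runs.
    while n > 0:
        n -= 1

N = 1_000_000

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same work in two threads.
start = time.perf_counter()
workers = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in workers:
    t.start()
for t in workers:
    t.join()
threaded = time.perf_counter() - start

print("sequential: %.3fs" % sequential)
print("threaded:   %.3fs" % threaded)
```

With the GIL in place, the threaded run is typically no faster than the sequential one. Threads still pay off for I/O-bound work such as the time.sleep() calls used in the examples of this article, because a blocked thread releases the GIL.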

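◆ Two ways to use threads:

★ Direct call: pass the function to run as the target argument of the Thread constructor

# coding:utf-8
import threading
import time

def action(arg):
    time.sleep(1)
    print('the arg is:%s\r' % arg)

for i in range(4):
    t = threading.Thread(target=action, args=(i,))
    t.start()

print('main thread end!')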

★ Inheritance: subclass Thread and override run()

# coding:utf-8
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, arg):
        super().__init__()  # Note: the parent class initializer must be called explicitly.
        self.arg = arg

    def run(self):  # the function that each thread will run
        time.sleep(1)
        print('the arg is:%s\r' % self.arg)

for i in range(4):
    t = MyThread(i)
    t.start()

print('main thread end!')

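◆ Constructor:

Thread(group=None, target=None, name=None, args=(), kwargs={}) 
  group: thread group; not implemented yet, and the library reference states that it must be None; 
  target: the callable to run; 
  name: the thread name; 
  args/kwargs: the positional/keyword arguments passed to target.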
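
As a small illustration of these parameters, the sketch below passes name, args and kwargs together. The function greet, the results dictionary and the thread name greeter-1 are hypothetical and exist only for this example.

```python
import threading

results = {}

def greet(greeting, name="world"):
    # Store the message under the name of the thread that produced it.
    results[threading.current_thread().name] = "%s, %s" % (greeting, name)

t = threading.Thread(
    target=greet,
    name="greeter-1",          # thread name
    args=("hello",),           # positional arguments for greet
    kwargs={"name": "thread"}  # keyword arguments for greet
)
t.start()
t.join()

print(results)  # {'greeter-1': 'hello, thread'}
```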

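◆ Instance methods:

★ is_alive() (isAlive() in Python 2): returns whether the thread is running. "Running" means after it has been started and before it terminates. 
★ getName()/setName(name): get/set the thread name. In Python 3 the name attribute is preferred. 
★ isDaemon()/setDaemon(bool): get/set whether the thread is a daemon (background) thread. The default is False, a foreground thread. It must be set before start(). In Python 3 the daemon attribute is preferred.
  With a daemon thread, the daemon thread runs while the main thread runs; once the main thread has finished, the program exits and the daemon thread is stopped whether or not it has finished its work.
  With a foreground thread, the foreground thread runs while the main thread runs; once the main thread has finished, the program waits for the foreground thread to finish as well before it stops.
★ start(): start the thread. 
★ join([timeout]): block the calling thread until the thread whose join() was called terminates or the optional timeout expires.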
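
The following sketch walks through these methods in Python 3 spelling. It uses threading.Event, which this article does not otherwise cover, purely as a convenient way to keep the thread alive until told to stop; the names stop and wait_for_stop are hypothetical.

```python
import threading

stop = threading.Event()

def wait_for_stop():
    stop.wait()  # block until the main thread calls stop.set()

t = threading.Thread(target=wait_for_stop, name="waiter")
t.daemon = True                      # must be set before start()

alive_before_start = t.is_alive()    # False: not started yet
t.start()

t.join(timeout=0.1)                  # gives up after 0.1 s, thread still runs
alive_after_timeout = t.is_alive()   # True

stop.set()                           # let the thread finish
t.join()                             # now waits until it has terminated
alive_after_join = t.is_alive()      # False

print(t.name, alive_before_start, alive_after_timeout, alive_after_join)
# waiter False True False
```

Note that join(timeout) returns None in both cases, so is_alive() is the way to find out whether the timeout expired.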

◆ isDaemon()/setDaemon(bool):

setDaemon(True) makes the thread a daemon thread: it runs while the main thread runs, and once the main thread has finished, the daemon thread is stopped whether or not it has finished its work.

Example:

# coding:utf-8
import threading
import time

def action(arg):
    time.sleep(1)
    print('sub thread start!the thread name is:%s\r' % threading.current_thread().name)
    print('the arg is:%s\r' % arg)

for i in range(4):
    t = threading.Thread(target=action, args=(i,))
    t.daemon = True  # make the thread a daemon thread (same as t.setDaemon(True))
    t.start()

print('main_thread end!')

Output:

main_thread end!

◆ join()

Blocks the calling thread until the thread whose join() was called terminates or the given timeout expires. Even when the thread has been made a daemon with setDaemon(True), the main thread still waits for the child thread to finish.

# coding:utf-8
import threading
import time

def action(arg):
    time.sleep(arg)
    print('the sub thread name is:%s    ' % threading.current_thread().name)
    print('the sleep time is:%ss   ' % arg)

thread_list = []    # list that holds the threads
for i in range(2):
    t = threading.Thread(target=action, args=(i,))
    t.daemon = True
    thread_list.append(t)

for t in thread_list:
    t.start()

for t in thread_list:
    t.join()

print('main_thread end!')

Output (Python 3.10 and later append the target to the default thread name, for example Thread-1 (action)):

the sub thread name is:Thread-1    
the sleep time is:0s   
the sub thread name is:Thread-2    
the sleep time is:1s   
main_thread end!

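An inappropriate use of join() that makes the multithreaded program run sequentially:

# coding:utf-8
import threading
import time

def action(arg):
    time.sleep(1)
    print('sub thread start!the thread name is:%s    ' % threading.current_thread().name)
    print('the arg is:%s   ' % arg)

for i in range(2):
    t = threading.Thread(target=action, args=(i,))
    t.daemon = True
    t.start()
    t.join()  # waits here before the next thread is even created

print('main_thread end!')

Output (Python 3.10 and later append the target to the default thread name, for example Thread-1 (action)):

sub thread start!the thread name is:Thread-1    
the arg is:0   
sub thread start!the thread name is:Thread-2    
the arg is:1   
main_thread end!

Note: the run takes 2 seconds in total!! Each thread is started only after join() has waited for the previous one to finish, so the "multithreading" loses its point.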
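
To make the difference measurable, the sketch below times both patterns with a shorter sleep of 0.2 seconds. The helper names run_serialized and run_concurrent are hypothetical; the exact timings vary from machine to machine, so none are quoted.

```python
import threading
import time

def action(arg):
    time.sleep(0.2)  # stands in for some blocking work

def run_serialized(n):
    # start() immediately followed by join(): one thread at a time.
    start = time.perf_counter()
    for i in range(n):
        t = threading.Thread(target=action, args=(i,))
        t.start()
        t.join()
    return time.perf_counter() - start

def run_concurrent(n):
    # Start every thread first, then join them all.
    start = time.perf_counter()
    threads = [threading.Thread(target=action, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serialized = run_serialized(3)
concurrent = run_concurrent(3)
print("serialized: %.2fs, concurrent: %.2fs" % (serialized, concurrent))
```

The serialized run needs at least three sleeps back to back, while the concurrent run lets the three sleeps overlap.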

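◆ Thread locks

Threads are scheduled unpredictably: after one thread has executed some n instructions, the CPU may move on to another thread. We use locks so that several threads operating on the same in-memory resource at the same time do not corrupt it.

Lock (primitive lock) is the lowest-level synchronization primitive available. When a Lock is locked, it is not owned by any particular thread. A Lock has two states, locked and unlocked, and two basic methods.

You can think of a Lock as having a lock pool: when a thread requests the lock, it is placed in the pool and leaves it once it has obtained the lock. Threads in the pool are in the synchronization-blocked state of the thread state diagram.

RLock (reentrant lock) is a synchronization primitive that the same thread can acquire multiple times. RLock uses the notions of "owning thread" and "recursion level"; when locked, an RLock is owned by one thread. The thread that owns the RLock can call acquire() again, and to free the lock it must call release() the same number of times.

You can think of an RLock as containing a lock pool and a counter with initial value 0. Every successful acquire()/release() call changes the counter by +1/-1, and the lock is unlocked when the counter is 0.

In short: a Lock is global and belongs to no thread in particular, whereas an RLock belongs to the thread that acquired it.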

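Constructors:

Lock()
RLock()

Note: RLock() is recommended.

Instance methods:

★ acquire(blocking=True, timeout=-1): try to acquire the lock. The calling thread blocks until it obtains the lock or, if a timeout is given, until timeout seconds have passed. Returns whether the lock was acquired. 
★ release(): release the lock. The lock must have been acquired before this is called, otherwise an exception is raised.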
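
A short sketch of the return values and the exception, using the Python 3 signatures of threading.Lock. The variable names first, second, third and release_error are hypothetical.

```python
import threading

lock = threading.Lock()

first = lock.acquire()                 # True: the lock was free
second = lock.acquire(timeout=0.1)     # False: gives up after 0.1 s
third = lock.acquire(blocking=False)   # False: returns at once, no waiting
lock.release()                         # matches the one successful acquire

try:
    lock.release()                     # the lock is already unlocked
    release_error = None
except RuntimeError as exc:
    release_error = exc                # "release unlocked lock"

print(first, second, third)            # True False False
print(type(release_error).__name__)    # RuntimeError
```

The second and third calls fail because a plain Lock is not reentrant, even for the thread that already holds it.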

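★ Usage example:

# coding:utf-8

import threading
import time

gl_num = 0
lock = threading.RLock()


# A call to acquire() blocks the thread until it obtains the lock
# or, if the optional timeout argument is given, until timeout seconds have passed.
# It returns whether the lock was acquired.
def Func():
    lock.acquire()
    global gl_num
    gl_num += 1
    time.sleep(1)
    print(gl_num)
    lock.release()


for i in range(4):
    t = threading.Thread(target=Func)
    t.start()

Result:

The four threads print 1, 2, 3 and 4 about one second apart. Every access to the global variable must first obtain the lock, which keeps the shared data safe.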
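
Locks also work as context managers, which releases the lock even if the protected code raises an exception. The sketch below is one possible variant of the counter idea, with the hypothetical names counter and add_many, and without the sleep so that it finishes quickly.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        with lock:          # acquire() on entry, release() on exit
            counter += 1    # read-modify-write happens under the lock

threads = [threading.Thread(target=add_many, args=(50_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 200000
```

Because every increment happens while the lock is held, no update can be lost and the total is always 4 x 50,000.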

◆ Lock compared with RLock

★ Lock

# coding:utf-8
import threading

lock = threading.Lock()  # Lock object

lock.acquire()
lock.acquire()  # deadlock: the program blocks here forever
lock.release()  # never reached
lock.release()

print(lock.acquire())

★ RLock (recursive lock)

# coding:utf-8
import threading

rLock = threading.RLock()  # RLock object
rLock.acquire()
rLock.acquire()  # within the same thread, the program does not block
rLock.release()
rLock.release()
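
Reentrancy matters in practice when one locked method calls another method that takes the same lock. The Account class below is a hypothetical illustration, not something from the threading module.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.RLock()

    def deposit(self, amount):
        with self._lock:              # may be a nested acquisition
            self.balance += amount

    def deposit_all(self, amounts):
        with self._lock:              # first acquisition by this thread
            for amount in amounts:
                self.deposit(amount)  # acquires the same RLock again

account = Account(100)
t = threading.Thread(target=account.deposit_all, args=([10, 20, 30],))
t.start()
t.join()

print(account.balance)  # 160
```

With a plain Lock in place of the RLock, the nested call to deposit() would block forever, just like the double acquire() in the Lock example.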

◆ Semaphore

A mutex lets only one thread modify the data at a time, whereas a Semaphore lets a fixed number of threads operate at the same time.

Example:

# coding:utf-8
import threading
import time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" % n)
    semaphore.release()


if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(2)  # at most 2 threads may run at the same time

    for i in range(6):
        t = threading.Thread(target=run, args=(i,))
        t.start()

# Busy-wait until only the main thread is left.
while threading.active_count() != 1:
    pass
else:
    print('----all threads done---')

Output (the order within each pair may vary):

run the thread: 0
run the thread: 1


run the thread: 3
run the thread: 2


run the thread: 5
run the thread: 4


----all threads done---
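
To check that the limit really holds, one option is to count how many threads are inside the protected region at once. In the sketch below, active, max_active and state_lock are hypothetical bookkeeping names added for the measurement, and the sleep is shortened to 0.1 seconds.

```python
import threading
import time

semaphore = threading.BoundedSemaphore(2)  # at most 2 threads inside
state_lock = threading.Lock()              # protects the two counters
active = 0                                 # threads currently inside
max_active = 0                             # highest value active reached

def run(n):
    global active, max_active
    with semaphore:                        # acquire on entry, release on exit
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.1)                    # simulated work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=run, args=(i,)) for i in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("max threads inside at once:", max_active)
```

The reported maximum never exceeds the value passed to BoundedSemaphore. Joining the threads also replaces the busy-wait loop on threading.active_count() and does not burn CPU while waiting.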
