Python (8): Threads and Processes

Date: 2017-03-24 10:25:49

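Threads

1. What is a thread?

A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process and is the unit that actually does the work. A thread is a single sequential flow of control within a process; one process can run several threads concurrently, each carrying out a different task.

2. The Python GIL (Global Interpreter Lock) (background only)

No matter how many threads you start or how many CPUs you have, CPython calmly allows only one thread to execute Python bytecode at any given moment.

The first thing to be clear about is that the GIL is not a feature of the Python language; it is a concept introduced by one implementation of the Python interpreter, CPython. Compare C++: it is a language (syntax) standard, and different compilers such as GCC, Intel C++ and Visual C++ can compile it into executable code. Python is the same: one piece of code can be run by different implementations such as CPython, PyPy and Jython, and Jython, for example, has no GIL. But because CPython is the default Python runtime in most environments, many people equate CPython with Python and take the GIL for a flaw of the language itself. So, to be clear up front: the GIL is not a feature of Python, and Python does not have to depend on it.

This presentation dissects the effect of the GIL on Python multithreading in depth and is strongly recommended: http://www.dabeaz.com/python/UnderstandingGIL.pdf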
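
The GIL restricts threads that are executing Python bytecode, but blocking calls such as time.sleep release it while they wait. The sketch below illustrates that effect under the assumption that sleep stands in for real I/O; io_task and run_in_threads are hypothetical names used only for this illustration.

```python
import threading
import time


def io_task(delay):
    """Stand-in for an I/O-bound call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)


def run_in_threads(n_threads, delay):
    """Run io_task in n_threads threads and return the elapsed wall-clock time."""
    threads = [threading.Thread(target=io_task, args=(delay,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 0.2)
# The four waits overlap, so the total is usually close to one delay,
# not four delays added together.
print("elapsed: %.2f s" % elapsed)
```

Threads that spend their time computing in pure Python would not overlap like this under CPython, which is why the process-based tools later in this article exist.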

3. The Python threading module

The threading module is built on top of the _thread module. _thread handles and controls threads in a low-level, primitive way, while threading wraps _thread and provides a more convenient API for working with threads.

A thread can be used in two ways:

1) Passing a target function

import threading
import time
def sayhi(num):  # the function each thread will run
    print("running on number:%s" % num)
    time.sleep(3)
if __name__ == '__main__':
    t1 = threading.Thread(target=sayhi, args=(1,))  # create a thread; target is the function, args is a tuple of its arguments
    t2 = threading.Thread(target=sayhi, args=(2,))  # create another thread
    t1.start()  # start the thread
    t2.start()  # start the other thread
    print(t1.name)  # the thread's name
    print(t2.name)

2) Subclassing Thread

import threading
import time
class MyThread(threading.Thread):
    def __init__(self, num):
        super().__init__()
        self.num = num
    def run(self):  # the code each thread will run
        print("running on number:%s" % self.num)
        time.sleep(3)
if __name__ == '__main__':
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()

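Python supports threads through two standard-library modules, _thread (named thread in Python 2) and threading. _thread provides low-level, primitive threads and a simple lock.

Module-level functions provided by threading:

threading.current_thread(): returns the Thread object of the current thread.
threading.enumerate(): returns a list of the threads that are currently alive, meaning started and not yet finished; threads that have not been started or have already terminated are not included.
threading.active_count(): returns the number of threads currently alive; the result equals len(threading.enumerate()).

The camelCase spellings currentThread() and activeCount() are older aliases, deprecated since Python 3.10.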
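
A short sketch of these module-level functions in use. The worker function, the two Event objects and the thread name "demo-worker" are assumptions made for the illustration; the events only keep the worker alive while the main thread inspects it.

```python
import threading


def worker(ready, release):
    ready.set()      # tell the main thread that this thread is running
    release.wait()   # stay alive until the main thread has finished inspecting


ready = threading.Event()
release = threading.Event()
t = threading.Thread(target=worker, args=(ready, release), name="demo-worker")

before = threading.active_count()   # threads alive before the worker starts
t.start()
ready.wait()

names = [th.name for th in threading.enumerate()]  # names of all live threads
during = threading.active_count()                  # one more than before
print(threading.current_thread().name)             # name of the calling thread
print(names)

release.set()   # let the worker finish
t.join()
```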

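Besides these functions, the threading module provides the Thread class for working with threads. Thread offers the following methods:

run(): the method that represents the thread's activity.
start(): starts the thread's activity.
join([timeout]): waits until the thread terminates. This blocks the calling thread until the thread whose join() method was called terminates, either normally or through an unhandled exception, or until the optional timeout elapses.
is_alive(): returns whether the thread is alive. The older spelling isAlive() was removed in Python 3.9.
getName(): returns the thread's name.
setName(): sets the thread's name. Current Python prefers the name attribute; getName() and setName() are deprecated since Python 3.10.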
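
The following sketch shows how join() with a timeout interacts with is_alive(). The function slow and the delays are arbitrary choices for the illustration.

```python
import threading
import time


def slow():
    time.sleep(0.5)


t = threading.Thread(target=slow, name="slow-worker")
t.start()

t.join(timeout=0.1)                  # gives up waiting after about 0.1 s
alive_after_timeout = t.is_alive()   # the thread is still sleeping

t.join()                             # no timeout: block until the thread ends
alive_after_join = t.is_alive()

print(t.name, alive_after_timeout, alive_after_join)
```

join() returns None whether or not the timeout expired, so checking is_alive() afterwards is the way to tell the two cases apart.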

4. Join & Daemon

join() blocks the calling thread until the joined thread has finished; other threads keep running in the meantime.

import threading, time
def run(n, sleep_time):
    print("test...", n)
    time.sleep(sleep_time)
    print("test...done", n)
if __name__ == '__main__':
    t1 = threading.Thread(target=run, args=("t1", 2))
    t2 = threading.Thread(target=run, args=("t2", 3))
    # both threads run concurrently; the main thread waits for t1 to finish, while t2 keeps running
    t1.start()
    t2.start()
    t1.join()  # wait for t1
    print("main thread")
# Program output
# test... t1
# test... t2
# test...done t1
# main thread
# test...done t2

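Daemon threads

t.setDaemon(flag) takes a boolean that marks the thread as a daemon (background) thread or a non-daemon (foreground) thread; the default is False. It must be called before start(); calling it on a started thread raises RuntimeError. Current Python prefers the daemon attribute (t.daemon = True); setDaemon() is deprecated since Python 3.10. A background thread runs alongside the main thread, and when the main thread finishes, the background thread is stopped whether or not it has completed. A foreground thread also runs alongside the main thread, but when the main thread finishes, the program waits for the foreground thread to complete before it exits.

import threading, time
def run(n):
    print('[%s]------running----\n' % n)
    time.sleep(2)
    print('--done--')
def main():
    for i in range(5):
        t = threading.Thread(target=run, args=[i, ])
        t.start()
        t.join(1)
        print('starting thread', t.name)
m = threading.Thread(target=main, args=[])
m.daemon = True  # make m a daemon thread: when the program's main thread exits, m exits too,
# and the threads started by m (which inherit its daemon flag) exit with it, finished or not
m.start()
m.join(timeout=2)
print("---main thread done----")
# Typical output (the exact interleaving and the thread names can vary)
# [0]------running----
# starting thread Thread-2
# [1]------running----
# --done--
# ---main thread done----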
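
Two details of the daemon flag are easy to check in isolation: it can only be changed before the thread starts, and a new thread inherits the flag from the thread that creates it. A minimal sketch, with background, parent and child_flags as hypothetical names:

```python
import threading
import time


def background():
    time.sleep(0.2)


t = threading.Thread(target=background)
t.daemon = True        # allowed: the thread has not been started yet
t.start()

error = None
try:
    t.daemon = False   # too late: the thread has already been started
except RuntimeError as exc:
    error = exc

child_flags = []


def parent():
    # A thread created here inherits the daemon flag of the creating thread.
    child = threading.Thread(target=background)
    child_flags.append(child.daemon)


p = threading.Thread(target=parent, daemon=True)
p.start()
p.join()
t.join()
print(type(error).__name__, child_flags)
```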

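5. Thread lock (mutex)

When threads operate on data, several threads modifying the same data at once can produce unpredictable results. Locks were introduced to keep the data correct.

Example: suppose every element of list A is 0. One thread prints the elements from front to back while another sets them to 1 from back to front. The printed output then shows some elements as 0 and some as 1, which is an inconsistent view of the data. A lock solves this problem.

Without a lock:

import time
import threading
def addNum():
    global num  # every thread works on this global variable
    print('--get num:', num)
    time.sleep(1)
    num -= 1  # decrement the shared variable
num = 100  # the shared variable
thread_list = []
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)
for t in thread_list:  # wait for all threads to finish
    t.join()
print('final num:', num)

With a lock:

import time
import threading
def addNum():
    global num  # every thread works on this global variable
    print('--get num:', num)
    time.sleep(1)
    lock.acquire()  # take the lock before modifying the data
    num -= 1  # decrement the shared variable
    lock.release()  # release the lock after the modification
num = 100  # the shared variable
thread_list = []
lock = threading.Lock()  # create the global lock
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)
for t in thread_list:  # wait for all threads to finish
    t.join()
print('final num:', num)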
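
A lock can also be used as a context manager, which releases it even if the protected code raises an exception. The sketch below is an alternative spelling of the acquire()/release() pair above, not the original author's code; total and add_many are hypothetical names.

```python
import threading

total = 0
lock = threading.Lock()


def add_many(times):
    global total
    for _ in range(times):
        with lock:      # acquire() on entry, release() on exit, even on error
            total += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)  # 80000: every increment ran while holding the lock
```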

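GIL vs. Lock

A sharp reader might ask: since Python already has a GIL that lets only one thread run at a time, why is a lock still needed here? Note that the lock here is a user-level lock and has nothing to do with the GIL. The GIL only guarantees that one thread at a time executes bytecode; it protects the interpreter's own state, not your data. A statement such as num -= 1 takes several bytecode steps (read num, subtract, store the result), and the interpreter may switch threads between those steps, so two threads can read the same old value and one of the updates is lost. The user-level lock makes the whole read-modify-write sequence exclusive.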
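
The standard dis module can list the bytecode behind num -= 1, which makes the separate load and store steps visible. This is only an inspection sketch; the exact instruction names between the load and the store differ across Python versions, and decrement is a hypothetical name.

```python
import dis

num = 100


def decrement():
    global num
    num -= 1


# One opname per bytecode instruction of decrement().
ops = [ins.opname for ins in dis.get_instructions(decrement)]
print(ops)
# The list contains a LOAD_GLOBAL followed later by a STORE_GLOBAL:
# a thread switch between the two is what loses an update.
```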

6. Recursive lock (RLock)

Put simply, a recursive lock lets a thread that already holds the lock acquire it again, for example when a function holding the lock calls other functions that take the same lock.

import threading, time

def run1():
    print("grab the first part data")
    lock.acquire()
    global num
    num += 1
    lock.release()
    return num
def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2
def run3():
    lock.acquire()
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res, res2)
if __name__ == '__main__':
    num, num2 = 0, 0
    lock = threading.RLock()
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()
    while threading.active_count() != 1:
        print(threading.active_count())
    else:
        print('----all threads done---')
        print(num, num2)

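Difference between threading.RLock and threading.Lock:

An RLock can be acquired several times by the same thread; a Lock cannot. With an RLock, acquire and release must come in pairs: after n calls to acquire, n calls to release are needed before the lock is really released.

import threading
lock = threading.Lock()  # a Lock object
lock.acquire()
lock.acquire()  # deadlock: the second acquire blocks forever
lock.release()
lock.release()

import threading
rLock = threading.RLock()  # an RLock object
rLock.acquire()
rLock.acquire()  # within the same thread this does not block
rLock.release()
rLock.release()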
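
The same difference can be observed without deadlocking the program by passing blocking=False to acquire(), which returns False instead of waiting. A minimal sketch; the variable names are arbitrary.

```python
import threading

lock = threading.Lock()
lock.acquire()
second_plain = lock.acquire(blocking=False)       # False: a Lock is not re-entrant
lock.release()

rlock = threading.RLock()
rlock.acquire()
second_reentrant = rlock.acquire(blocking=False)  # True: same thread may re-acquire
rlock.release()
rlock.release()                                   # two acquires need two releases

extra_release_error = None
try:
    rlock.release()                               # one release too many
except RuntimeError as exc:
    extra_release_error = exc

print(second_plain, second_reentrant)  # False True
```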

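Processes

1. Multiple processes with multiprocessing

multiprocessing is Python's package for managing multiple processes, and it works across platforms. Much like threading.Thread, it lets you create a process with a multiprocessing.Process object, which can run a function written in your Python program. A Process object is used in much the same way as a Thread object.

Create a Process instance and start it with start().

join() waits for the child process to finish before the caller continues, and is commonly used to synchronize processes.

from multiprocessing import Process
import time
def f(name):
    time.sleep(2)
    print('hello', name)
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()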
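
After join() returns, a Process object still carries information about the finished child, such as its exit code. A small sketch, assuming a child that ends through sys.exit; worker, run_and_report and the name "demo-process" are hypothetical.

```python
import multiprocessing
import sys


def worker(code):
    # End the child process with the given exit status.
    sys.exit(code)


def run_and_report(code):
    """Start a child, wait for it, and return (name, is_alive, exitcode)."""
    p = multiprocessing.Process(target=worker, args=(code,), name="demo-process")
    p.start()
    p.join()
    return p.name, p.is_alive(), p.exitcode


if __name__ == "__main__":
    print(run_and_report(0))
    print(run_and_report(3))
```

The if __name__ == "__main__" guard matters here: on platforms that start children by importing the main module, unguarded code would try to start processes again inside every child.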

Here is a program that compares the IDs of the main process and the child process:

from multiprocessing import Process
import os
def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process ID:', os.getppid())
    print('current process ID:', os.getpid())
    print("\n\n")
def f(name):
    info('\033[31;1mcalled from child process function f\033[0m')
    print('hello', name)
if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

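2. Inter-process communication

Processes do not share memory. To exchange data between two processes you can use Queue, Pipe or Manager:

1) Queue and Pipe only pass data from one process to another;

2) Manager shares data between processes, so several processes can modify the same data.

2.1 Queue

A Queue lets several processes put objects in and several processes take objects out, first in, first out. (Usage is much like queue.Queue, which is used with threads.)

from multiprocessing import Process, Queue
def f(qq):
    qq.put([42, None, "hello"])
    qq.put([43, None, "HI"])
if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())
    print(q.get())
    p.join()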
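
When the consumer does not know in advance how many items will arrive, a common convention is to send an end-of-stream marker. The sketch below assumes None is never a real item and uses it as that marker; SENTINEL, producer, drain and main are hypothetical names.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed end-of-stream marker; real items must never be None


def producer(q, items):
    """Put every item on the queue, then the sentinel."""
    for item in items:
        q.put(item)
    q.put(SENTINEL)


def drain(q):
    """Collect items from the queue until the sentinel arrives."""
    received = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        received.append(item)
    return received


def main():
    q = Queue()
    p = Process(target=producer, args=(q, [1, 2, 3]))
    p.start()
    result = drain(q)   # read before join() so the child can flush its data
    p.join()
    return result


if __name__ == "__main__":
    print(main())
```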

2.2 Pipe

A Pipe is also first in, first out: messages are received in the order they were sent.

from multiprocessing import Process, Pipe
def f(conn):
    conn.send([42, None, 'message from the child'])
    conn.send([42, None, 'another message from the child'])
    print("received from the parent:", conn.recv())
    conn.close()
if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # prints "[42, None, 'message from the child']"
    print(parent_conn.recv())  # prints "[42, None, 'another message from the child']"
    parent_conn.send("Come home for dinner!")  # received by conn.recv() in the child
    p.join()
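
Pipe() is two-way by default. When data only needs to flow in one direction, Pipe(duplex=False) returns a receive-only end and a send-only end. A minimal sketch of a child reporting one result to its parent; child and main are hypothetical names.

```python
from multiprocessing import Pipe, Process


def child(send_end, numbers):
    # Send a single result back through the write end, then close it.
    send_end.send(sum(numbers))
    send_end.close()


def main():
    # With duplex=False the first connection can only receive
    # and the second can only send.
    recv_end, send_end = Pipe(duplex=False)
    p = Process(target=child, args=(send_end, [1, 2, 3]))
    p.start()
    total = recv_end.recv()
    p.join()
    return total


if __name__ == "__main__":
    print(main())
```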

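2.3 Manager

A Manager works like server-client communication, much like what we do on the Internet. One process acts as the server and creates the Manager, which actually holds the resources. Other processes reach the Manager either through objects passed as arguments or by its address; once connected, they operate on the resources held by the server. If the firewall allows it, a Manager can even be used across several computers, mimicking a real network setting.

from multiprocessing import Process, Manager
import os
def f(d, l):
    d[os.getpid()] = os.getpid()
    l.append(os.getpid())
    print(l)
if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()  # a dict that can be shared and passed between processes
        l = manager.list(range(5))  # a list that can be shared and passed between processes
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l))
            p.start()
            p_list.append(p)
        for res in p_list:  # wait for the results
            res.join()
        print(d)
        print(l)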
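
Sharing data through a Manager does not make compound updates atomic: shared["count"] += 1 is a read followed by a write, so concurrent processes can lose updates just as threads can. The sketch below guards the update with a lock obtained from the same Manager; increment, main and the "count" key are hypothetical.

```python
from multiprocessing import Manager, Process


def increment(shared, lock, times):
    for _ in range(times):
        with lock:                 # make the read-modify-write step exclusive
            shared["count"] += 1


def main(n_procs=4, times=50):
    with Manager() as manager:
        shared = manager.dict(count=0)   # dict held by the manager process
        lock = manager.Lock()            # lock proxy that can be passed to children
        procs = [Process(target=increment, args=(shared, lock, times))
                 for _ in range(n_procs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return shared["count"]


if __name__ == "__main__":
    print(main())
```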

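3. Process pool

A process pool creates several processes. They are like soldiers standing by, ready to carry out tasks (programs). One pool can hold many such soldiers.

A pool offers two ways to submit a task:

1) Serial: apply (blocks until the result is ready)

2) Parallel: apply_async (returns immediately)

from multiprocessing import Pool
import time
import os
def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())
    return i + 100
def Bar(arg):
    '''Callback function'''
    print("-->>exec done:", arg, os.getpid())
if __name__ == "__main__":
    pool = Pool(processes=3)  # at most 3 worker processes in the pool
    print("main process", os.getpid())
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,), callback=Bar)
    print('end')
    pool.close()
    pool.join()  # wait for the workers to finish all tasks (close() must be called first); without join() the program exits straight away

The point of a callback is that it runs in the parent process, which can be more efficient. For example, when results go into a database, a callback lets the parent process connect to the database once; doing the same work in the child processes would need one connection per child.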
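
The difference between the two submission methods also shows in what they return: apply gives back the result itself, while apply_async gives back an AsyncResult whose get() waits for the value. A minimal sketch, with add_100 and main as hypothetical names:

```python
from multiprocessing import Pool


def add_100(i):
    return i + 100


def main():
    with Pool(processes=3) as pool:
        # apply: each call blocks until its own result is ready.
        serial = [pool.apply(add_100, (i,)) for i in range(3)]
        # apply_async: all tasks are submitted first, results collected later.
        pending = [pool.apply_async(add_100, (i,)) for i in range(3)]
        parallel = [r.get(timeout=10) for r in pending]
    return serial, parallel


if __name__ == "__main__":
    print(main())
```

The results are collected inside the with block because leaving it terminates the pool.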

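4. Other (lock)

lock: a lock around printing to the screen, so that output from different processes does not get mixed up.

from multiprocessing import Process, Lock
def f(l, i):
    # acquire the lock
    l.acquire()
    try:
        print('hello world', i)
    finally:
        # release the lock
        l.release()
    # the screen is shared: the lock keeps each message intact, but it does not guarantee the order of the messages
if __name__ == '__main__':
    # create the lock
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()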

Original article: http://www.cnblogs.com/yinuoxiaofang/p/6610027.html
