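Threads
Concepts
A thread is the smallest unit of program execution. A standard thread consists of a thread ID, the current instruction pointer, a register set, and a stack. A thread is an entity within a process and is the basic unit that the system schedules and dispatches independently. A thread owns essentially no system resources of its own, but it shares the resources owned by its process with the other threads of that process. One thread can create or terminate another, and multiple threads in the same process can execute concurrently.
Because threads constrain one another, a thread's execution is intermittent, so a thread has three basic states: ready, running, and blocked. Ready means the thread has everything it needs to run and is waiting for a processor. Running means the thread holds a processor and is executing. Blocked means the thread is waiting for an event and logically cannot execute. Every application has at least one process and one thread; a thread is a single flow of control within a program. When several threads work inside one process, this is called multithreading.
Example: a robot is made of several parts, such as a head and hands. These parts are produced by different workshops, and each workshop runs several production lines at the same time. The robot factory corresponds to the application, each workshop to a process, and each production line to a thread.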
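Multithreading in Python
import threading
import time
We created 20 threads and then handed control over to the operating system's scheduler, which decides according to its scheduling algorithm when each thread runs and executes their instructions on the CPU in time slices.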
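The idea of starting 20 threads and letting the scheduler run them can be sketched as follows. This is only an illustrative sketch: the `worker` function, the `finished` list, and the short sleep standing in for real work are assumptions, not part of the original listing.

```python
import threading
import time

finished = []  # indexes of workers that have completed (hypothetical bookkeeping)

def worker(index):
    # Hypothetical task: sleep briefly to stand in for real work.
    time.sleep(0.01)
    finished.append(index)  # list.append is thread-safe in CPython

# Create 20 thread objects; nothing runs until start() is called.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(20)]

for t in threads:
    t.start()   # the thread becomes runnable; the scheduler decides when it runs

for t in threads:
    t.join()    # wait for every worker so the final print sees all results

print(sorted(finished))
```

The order in which the workers finish is up to the scheduler, which is why the list is sorted before printing.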
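Below are the main methods of the Thread class:
start()
Puts the thread into the ready state, where it waits to be scheduled on the CPU.
setName()
Sets the thread's name (in Python 3, assigning to the name attribute is preferred).
getName()
Returns the thread's name (in Python 3, reading the name attribute is preferred).
join()
Blocks the calling thread (typically the main thread) until the child thread finishes.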
Wait until the thread terminates.
This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception or until the optional timeout occurs.
When the timeout argument is present and not None, it should be a floating point number specifying a timeout for the operation in seconds (or fractions thereof). As join() always returns None, you must call isAlive() after join() to decide whether a timeout happened – if the thread is still alive, the join() call timed out.
When the timeout argument is not present or None, the operation will block until the thread terminates.
A thread can be join()ed many times.
join() raises a RuntimeError if an attempt is made to join the current thread as that would cause a deadlock. It is also an error to join() a thread before it has been started and attempts to do so raises the same exception.
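The above is the documented definition of join(). In short, the calling thread (here, the main thread) stays suspended until the thread whose join() method was called terminates. The block also ends if that thread terminates through an unhandled exception or if the specified timeout elapses. (In Python 3 the liveness check mentioned above is spelled is_alive().)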
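The behaviour of join() with and without a timeout, together with thread naming, might look like this in Python 3. The task `slow_task`, the name "worker-1", and the chosen durations are illustrative assumptions.

```python
import threading
import time

def slow_task():
    # Hypothetical task that takes about half a second.
    time.sleep(0.5)

t = threading.Thread(target=slow_task, name="worker-1")
print(t.name)               # the name attribute replaces getName()/setName()
t.start()

t.join(timeout=0.05)        # join() returns None whether or not the thread finished
timed_out = t.is_alive()    # still alive, so the join must have timed out

t.join()                    # no timeout: block until the thread terminates
finished = not t.is_alive()

print(timed_out, finished)
```

Because join() always returns None, checking is_alive() afterwards is the only way to tell a timeout from a normal finish.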
First, let's look at the result without using join():
import threading
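Here we can see that the main thread runs to the end of its own code without waiting for the child threads to finish. (The process itself stays alive until every non-daemon thread has ended.) In many cases we need the main thread to wait until the child threads have completed their work before it moves on, so this approach does not meet our needs.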
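A small sketch of this behaviour, under the assumption that a 0.2 second sleep stands in for the child's work and that a shared `events` list (a hypothetical helper) records the order of events:

```python
import threading
import time

events = []  # records the order in which things happen

def worker():
    time.sleep(0.2)               # stand-in for real work
    events.append("worker done")

t = threading.Thread(target=worker)
t.start()

# No join(): the main thread carries on immediately.
events.append("main done")

snapshot = list(events)           # what the main thread sees at this point
print(snapshot)
```

At the time of the print, the worker is still sleeping, so only the main thread's entry has been recorded.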
Now let's look at the same thing using join():
import threading
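With join(), the main thread waits for its children before continuing. The following is a minimal sketch; the three workers, the `events` list, and the sleep duration are assumptions chosen for illustration.

```python
import threading
import time

events = []  # records the order in which things happen

def worker(index):
    time.sleep(0.1)                       # stand-in for real work
    events.append(f"worker {index} done")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()

for t in threads:
    t.join()                              # main blocks until this worker ends

events.append("main done")                # recorded only after all workers finished
print(events)
```

The workers may finish in any order, but the main thread's entry is always last.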
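Threads in the same process share global data, so unsynchronized updates to a shared variable can interfere with one another. Consider the following program:
import threading
import time

result = 0

def changeResutl(n):
    # Add and then subtract the same amount; the net effect should be zero.
    global result
    result += n
    result -= n

def threadWorker(n):
    for i in range(10000):
        changeResutl(n)

for i in range(10):
    t1 = threading.Thread(target=threadWorker, args=(20,))
    t2 = threading.Thread(target=threadWorker, args=(10,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(result)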
One run printed the following:
0
0
-10
-10
10
10
10
10
30
40
The cause is that an update such as the following is not an atomic operation:
result += 1
The interpreter effectively carries it out in two steps:
tmp = result + 1
result = tmp
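One way to see that the update is split into a separate read and write is to inspect the bytecode with the standard dis module. This is a sketch only: the function name `increment` is an assumption, and the exact opcode names between the load and the store vary across Python versions.

```python
import dis

result = 0

def increment():
    global result
    result += 1

# Collect the opcode names the interpreter executes for increment().
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# The global is loaded first and stored back later, as separate instructions.
load_index = opnames.index("LOAD_GLOBAL")
store_index = opnames.index("STORE_GLOBAL")
print(load_index < store_index)
```

A thread switch can happen between the load and the store, which is what makes the lost update possible.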
If the two threads happen to run one after the other, the result is correct:
result = 0
t1: tmp = result + 20  ## tmp = 20
t1: result = tmp       ## result = 20
t1: tmp = result - 20  ## tmp = 0
t1: result = tmp       ## result = 0
t2: tmp = result + 10  ## tmp = 10
t2: result = tmp       ## result = 10
t2: tmp = result - 10  ## tmp = 0
t2: result = tmp       ## result = 0
result = 0
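If a thread switch happens in the middle of an update, however, the steps can interleave like this (each thread has its own tmp):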
result = 0
t1: tmp = result + 20  ## tmp = 20
t2: tmp = result + 10  ## tmp = 10
t2: result = tmp       ## result = 10
t1: result = tmp       ## result = 20
t2: tmp = result - 10  ## tmp = 10
t2: result = tmp       ## result = 10
t1: tmp = result - 20  ## tmp = -10
t1: result = tmp       ## result = -10
result = -10
The final result is then wrong, because the two threads' updates got mixed up while the value was being modified.
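Real thread switches are not reproducible on demand, so the sketch below simulates the two orderings deterministically instead of using real threads. Each simulated thread is a generator that pauses after every read or write, and a schedule decides which one advances next. The names `worker`, `run`, `serial`, and `interleaved` are assumptions introduced for this illustration.

```python
result = 0

def worker(n):
    # Each yield marks a point where a thread switch could occur.
    global result
    tmp = result + n   # read
    yield
    result = tmp       # write
    yield
    tmp = result - n   # read
    yield
    result = tmp       # write
    yield

def run(schedule):
    # Replay one step of the named simulated thread per schedule entry.
    global result
    result = 0
    workers = {"t1": worker(20), "t2": worker(10)}
    for name in schedule:
        next(workers[name])
    return result

serial = ["t1"] * 4 + ["t2"] * 4
interleaved = ["t1", "t2", "t2", "t1", "t2", "t2", "t1", "t1"]

print(run(serial))       # 0
print(run(interleaved))  # -10
```

The serial schedule leaves the value at 0, while the interleaved schedule reproduces the lost update and ends at -10.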
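Guarding the update with a lock prevents this, because only one thread at a time can hold the lock:
result = 0
lock = threading.Lock()

def changeResutl(n):
    global result
    lock.acquire()      # wait until no other thread holds the lock
    result += n
    result -= n
    lock.release()      # let the next waiting thread proceed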
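A complete, runnable version of the locked update might look like the sketch below. It uses the lock as a context manager, which releases the lock even if the guarded code raises; the snake_case names `change_result` and `thread_worker` are renamed for this example and are not the original identifiers.

```python
import threading

result = 0
lock = threading.Lock()

def change_result(n):
    global result
    with lock:           # acquired on entry, released on exit, even on error
        result += n
        result -= n

def thread_worker(n):
    for _ in range(10000):
        change_result(n)

t1 = threading.Thread(target=thread_worker, args=(20,))
t2 = threading.Thread(target=thread_worker, args=(10,))
t1.start()
t2.start()
t1.join()
t2.join()

print(result)
```

Since each add-and-subtract pair now runs as one uninterrupted unit, the shared value returns to zero after both threads finish.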
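A semaphore is used in the same way, but it admits up to a fixed number of threads at once (10 here), so it limits concurrency rather than making the update exclusive: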
result = 0
semaphore = threading.BoundedSemaphore(10)

def changeResutl(n):
    global result
    semaphore.acquire()     # blocks only when 10 threads already hold the semaphore
    result += n
    result -= n
    semaphore.release()
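To make the concurrency limit visible, the following sketch counts how many threads are inside the guarded region at the same time. The limit of 3, the ten workers, and the counter variables are assumptions for illustration; a separate lock protects the counters themselves.

```python
import threading
import time

LIMIT = 3
semaphore = threading.BoundedSemaphore(LIMIT)
counter_lock = threading.Lock()   # protects the two counters below
active = 0                        # threads currently inside the region
max_active = 0                    # highest value active ever reached

def worker():
    global active, max_active
    with semaphore:               # at most LIMIT threads get past this line
        with counter_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.02)          # stand-in for real work
        with counter_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_active)
```

However the ten workers are scheduled, the recorded maximum never exceeds the semaphore's limit.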
A condition variable lets one thread wait until another thread signals that some state has changed:
cv = threading.Condition()

# Consume one item
cv.acquire()
while not an_item_is_available():
    cv.wait()
get_an_available_item()
cv.release()

# Produce one item
cv.acquire()
make_an_item_available()
cv.notify()
cv.release()
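The above is a producer-consumer model. The consumer acquires the lock and checks whether the condition holds. If it does not, the consumer calls wait(), which releases the lock and blocks until it is notified; once the condition holds, the consumer proceeds and then releases the lock. The producer first acquires the lock, makes an item available, and then calls notify() to wake a thread that is blocked in wait(). Note that notify() does not release the lock itself. It only signals the waiting thread, which cannot resume until the producer performs the release. Finally, the producer releases the lock.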
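A runnable sketch of this producer-consumer pattern is shown below. The shared `items` queue, the `consumed` list, and the count of five items are assumptions made for the example; the condition is used as a context manager, which acquires and releases its lock automatically.

```python
import threading
from collections import deque

items = deque()               # shared buffer (hypothetical)
cv = threading.Condition()
consumed = []                 # what the consumer has taken so far

def consumer(count):
    for _ in range(count):
        with cv:
            while not items:  # re-check the condition after every wakeup
                cv.wait()     # releases the lock while blocked
            consumed.append(items.popleft())

def producer(count):
    for i in range(count):
        with cv:
            items.append(i)
            cv.notify()       # wakes one waiter; lock is released when the with block ends

c = threading.Thread(target=consumer, args=(5,))
p = threading.Thread(target=producer, args=(5,))
c.start()
p.start()
p.join()
c.join()

print(consumed)
```

The while loop around wait() matters: the consumer checks the buffer before waiting, so an item produced before the consumer starts is not missed.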
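Global Interpreter Lock (GIL)
At the interpreter level, Python (specifically CPython) allows only one thread in a program to execute Python bytecode at any given moment. No matter how many threads are started, only one is actually running at a time, so for CPU-bound work a multithreaded program can sometimes be less efficient than a single-threaded one. To avoid this problem, use multiple processes instead.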
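The effect can be observed with a rough timing comparison of a CPU-bound loop run twice in sequence and then in two threads. This is only a sketch: the `count_down` function and the loop size are assumptions, the timings depend on the machine and Python version, and a build without the GIL would behave differently.

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU-bound loop with no I/O.
    while n > 0:
        n -= 1

N = 2_000_000

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads.
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
start = time.perf_counter()
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s  threaded: {threaded:.3f}s")
```

On an interpreter with the GIL, the two timings are usually close to each other, since the threads take turns rather than running in parallel. The standard multiprocessing module is the usual way to spread such work across separate processes.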