wait/notify 机制是解决生产者消费者问题的良药。它的核心逻辑是基于条件变量的锁机制处理。所以,它们到底是什么关系?wait()时是否需要持有锁? notify()是否需要持有锁?先说答案:都需要持有锁。
wait需要持有锁的原因是,你肯定需要知道在哪个对象上进行等待,如果不持有锁,将无法做到对象变更时进行实时感知通知的作用。与此同时,为了让其他线程可以操作该值的变化,它必须要先释放掉锁,然后在该节点上进行等待。不持有锁而进行wait,可能会导致长眠不起。而且,如果不持有锁,则当wait之后的操作,都可能是错的,因为可能这个数据已经过时,其实也叫线程不安全了。总之,一切为了安全,单独的wait做不成这事。
notify需要持有锁的原因是,它要保证线程的安全,只有它知道数据变化了,所以它有权力去通知其他线程数据变化。而且通知完之后,不能立即释放锁,即必须在持有锁的情况下进行通知,否则notify后续的工作的线程安全性将无法保证,尽量它是在lock的范围内,但却因为锁释放,将导致不可预期的结果。而且在notify的时候,并不能真正地将对应的线程唤醒,即不能从操作系统层面唤醒线程,因为此时当前通知线程持有锁,而此时如果将其他等待线程唤醒,它们将立即参与到锁的竞争中来,而这时的竞争是一定会失败的,这可能会导致被唤醒的线程立即又进入等待队列,更糟糕的是它可能再也不会被唤醒 了。所以不能将在持有锁的时,将对应的线程真正唤醒,我们看到的notify只是从语言上下文级别,将它从等待队列转移到同步队列而已,对此操作系统一无所知。
1. 实验验证
我们通过一个实验来看一下,wait/和notify是否会在持有锁的情况下进行。
private ReentrantLock mainLock =new ReentrantLock();
@Testpublicvoid testWaitNotify()throws InterruptedException {
Condition c1= mainLock.newCondition();
Condition c3= mainLock.newCondition();
CountDownLatch t1StartLatch=new CountDownLatch(2);
Thread t1=new Thread(() -> {
mainLock.lock();try {
System.out.println(LocalDateTime.now()+ " - t1 start");
c1.await();
System.out.println(LocalDateTime.now()+ " - t1 c1 await out");// 过早通知问题,导致无法测试下一步// c3.await();// System.out.println(LocalDateTime.now() + " - t1 c2 await out"); t1StartLatch.await();
System.out.println(LocalDateTime.now()+ " - t1 sleeping");
SleepUtil.sleepMillis(10_000L);
c1.signalAll();
c3.signalAll();
System.out.println(LocalDateTime.now()+ " - t1 notified, sleeping again");
SleepUtil.sleepMillis(10_000L);
System.out.println(LocalDateTime.now()+ " - t1 out");
}catch (Exception e) {
System.err.println("t1 exception ");
e.printStackTrace();
}finally {
mainLock.unlock();
}
},"t1");
Thread t2=new Thread(() -> {
mainLock.lock();try {
t1StartLatch.countDown();
System.out.println(LocalDateTime.now()+ " - t2 c1 signal");
c1.signalAll();
System.out.println(LocalDateTime.now()+ " - t2 wait");
c1.await();
System.out.println(LocalDateTime.now()+ " - t2 out");
}catch (Exception e) {
System.err.println("t2 exception ");
e.printStackTrace();
}finally {
mainLock.unlock();
}
},"t2");
Thread t3=new Thread(() -> {
mainLock.lock();try {
t1StartLatch.countDown();
System.out.println(LocalDateTime.now()+ " - t3 c3 signal");
c3.signalAll();
System.out.println(LocalDateTime.now()+ " - t3 wait");
c3.await();
System.out.println(LocalDateTime.now()+ " - t3 out");
}catch (Exception e) {
System.err.println("t2 exception ");
e.printStackTrace();
}finally {
mainLock.unlock();
}
},"t3");
t1.start();
t2.start();
t3.start();
t1.join();
System.out.println(LocalDateTime.now()+ " - main t1 out");
t2.join();
System.out.println(LocalDateTime.now()+ " - main t2 out");
t3.join();
System.out.println(LocalDateTime.now()+ " - main t3 out");
}
大概意思是,针对同一个锁,wait之后,是否可以被其他线程进入临界区?如果wait之前不通知进入,wait之后能进入,说明wait依赖于锁,而且会释放当前锁。notify之后,wait()是否会立即执行,如果必须等到notify的模块完成后,才执行,说明notify是必须要依赖于锁的。
结果如下:
2022-03-27T20:09:43.588 - t1 start2022-03-27T20:09:43.603 - t2 c1 signal2022-03-27T20:09:43.603 - t2 wait2022-03-27T20:09:43.603 - t3 c3 signal2022-03-27T20:09:43.603 - t3 wait2022-03-27T20:09:43.603 - t1 c1 await out2022-03-27T20:09:43.603 - t1 sleeping2022-03-27T20:09:53.605 - t1 notified, sleeping again2022-03-27T20:10:03.612 - t1 out2022-03-27T20:10:03.612 - t2 out2022-03-27T20:10:03.612 - main t1 out2022-03-27T20:10:03.612 - t3 out2022-03-27T20:10:03.612 - main t2 out2022-03-27T20:10:03.612 - main t3 out2022-03-27T20:11:39.982 - t1 start2022-03-27T20:11:39.982 - t2 c1 signal2022-03-27T20:11:39.982 - t2 wait2022-03-27T20:11:39.982 - t3 c3 signal2022-03-27T20:11:39.982 - t3 wait2022-03-27T20:11:39.982 - t1 c1 await out2022-03-27T20:11:39.982 - t1 sleeping2022-03-27T20:11:49.989 - t1 notified, sleeping again2022-03-27T20:11:59.990 - t1 out2022-03-27T20:11:59.990 - t2 out2022-03-27T20:11:59.990 - main t1 out2022-03-27T20:11:59.990 - t3 out2022-03-27T20:11:59.990 - main t2 out2022-03-27T20:11:59.990 - main t3 out
2. wait/notify 的实现机制
我们以AQS的实现机制为线索,探索wait/notify机制。它在唤醒操作队列时,设置状态为 SIGNAL , 但它实际不执行操作系统唤醒。
// java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#signalAll/**
* Moves all threads from the wait queue for this condition to
* the wait queue for the owning lock.
*
*@throws IllegalMonitorStateException if {@link #isHeldExclusively}
* returns {@code false}*/publicfinalvoid signalAll() {if (!isHeldExclusively())thrownew IllegalMonitorStateException();
Node first= firstWaiter;if (first !=null)
doSignalAll(first);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#doSignalAll/**
* Removes and transfers all nodes.
*@param first (non-null) the first node on condition queue*/privatevoid doSignalAll(Node first) {
lastWaiter= firstWaiter =null;do {
Node next= first.nextWaiter;
first.nextWaiter=null;
transferForSignal(first);
first= next;
}while (first !=null);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#transferForSignal/**
* Transfers a node from a condition queue onto sync queue.
* Returns true if successful.
*@param node the node
*@return true if successfully transferred (else the node was
* cancelled before signal)*/finalboolean transferForSignal(Node node) {/*
* If cannot change waitStatus, the node has been cancelled.*/if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))returnfalse;/*
* Splice onto queue and try to set waitStatus of predecessor to
* indicate that thread is (probably) waiting. If cancelled or
* attempt to set waitStatus fails, wake up to resync (in which
* case the waitStatus can be transiently and harmlessly wrong).*/
Node p= enq(node);int ws = p.waitStatus;// 不到万不得已,不会真正唤醒等待中的队列,从而满足notify无法将线程唤醒的作用,或者说线程仍然在操作系统的等待队列上// 它只是将当前线程移动到本语文的同步队列中,以下线程下次运行过来时可以通过该限制if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
LockSupport.unpark(node.thread);returntrue;
}/**
* Inserts node into queue, initializing if necessary. See picture above.
*@param node the node to insert
*@return node's predecessor*/private Node enq(final Node node) {for (;;) {
Node t= tail;if (t ==null) {// Must initializeif (compareAndSetHead(new Node()))
tail= head;
}else {
node.prev= t;if (compareAndSetTail(t, node)) {
t.next= node;return t;
}
}
}
}// java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#await()/**
* Implements interruptible condition wait.
* <ol>
* <li> If current thread is interrupted, throw InterruptedException.
* <li> Save lock state returned by {@link #getState}.
* <li> Invoke {@link #release} with saved state as argument,
* throwing IllegalMonitorStateException if it fails.
* <li> Block until signalled or interrupted.
* <li> Reacquire by invoking specialized version of
* {@link #acquire} with saved state as argument.
* <li> If interrupted while blocked in step 4, throw InterruptedException.
* </ol>*/publicfinalvoid await()throws InterruptedException {if (Thread.interrupted())thrownew InterruptedException();
Node node= addConditionWaiter();// 进来等待队列,先释放锁,此时进入线程不安全状态int savedState = fullyRelease(node);int interruptMode = 0;// 此判断只是本语文级别的等待队列限制// notify 时只能满足这个条件,而不会将线程从操作系统挂起队列中唤醒,即不会进行 LockSupport.unpark()while (!isOnSyncQueue(node)) {// 交由操作系统进行线程挂起
LockSupport.park(this);if ((interruptMode = checkInterruptWhileWaiting(node)) != 0)break;
}// 重新进行锁的获取,尝试if (acquireQueued(node, savedState) && interruptMode != THROW_IE)
interruptMode= REINTERRUPT;if (node.nextWaiter !=null)// clean up if cancelled unlinkCancelledWaiters();if (interruptMode != 0)
reportInterruptAfterWait(interruptMode);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#acquireQueued/**
* Acquires in exclusive uninterruptible mode for thread already in
* queue. Used by condition wait methods as well as acquire.
*
*@param node the node
*@param arg the acquire argument
*@return {@code true} if interrupted while waiting*/finalboolean acquireQueued(final Node node,int arg) {boolean failed =true;try {boolean interrupted =false;for (;;) {final Node p = node.predecessor();// 获取当锁,则替换head后返回// 而 tryAcquire() 则由各自策略实现if (p == head && tryAcquire(arg)) {
setHead(node);
p.next=null;// help GC
failed =false;return interrupted;
}// 如果获取不到锁,则重新进入操作系统等待队列if (shouldParkAfterFailedAcquire(p, node) &&
parkAndCheckInterrupt())
interrupted=true;
}
}finally {if (failed)
cancelAcquire(node);
}
}
所以,总结:
1. wait将会释放持有的锁;
2. wait将会加入到语言级别的等待队列,同时也会提交给操作系统的等待队列,做到真正的线程挂起;
3. wait将会在被操作系统唤醒后,重新进行新一轮的锁获取尝试,返回时已携带回原有的锁,从外部看起来就像锁一直都在一样;
4. notify不会真正的唤醒等待的线程,而只是将各等待线程从语言级别的等待队列移出,到语言级别的同步队列;
5. notify只有在极端情况下,才会做到线程的真正唤醒作用,比如中断,但这被唤醒的线程将无法正常进行业务操作,所以也是安全的;
6. 只有在整体的锁在进行 unlock() 的时候,才会唤醒线程,使其重新参与锁的竞争;
3. lock/unlock 流程
同样的AQS的实现为线索,lock/unlock 流程如下:
// java.util.concurrent.locks.ReentrantLock#lock/**
* Acquires the lock.
*
* <p>Acquires the lock if it is not held by another thread and returns
* immediately, setting the lock hold count to one.
*
* <p>If the current thread already holds the lock then the hold
* count is incremented by one and the method returns immediately.
*
* <p>If the lock is held by another thread then the
* current thread becomes disabled for thread scheduling
* purposes and lies dormant until the lock has been acquired,
* at which time the lock hold count is set to one.*/publicvoid lock() {
sync.lock();
}// java.util.concurrent.locks.ReentrantLock.NonfairSync#lock/**
* Performs lock. Try immediate barge, backing up to normal
* acquire on failure.*/finalvoid lock() {if (compareAndSetState(0, 1))
setExclusiveOwnerThread(Thread.currentThread());else
acquire(1);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#acquire/**
* Acquires in exclusive mode, ignoring interrupts. Implemented
* by invoking at least once {@link #tryAcquire},
* returning on success. Otherwise the thread is queued, possibly
* repeatedly blocking and unblocking, invoking {@link
* #tryAcquire} until success. This method can be used
* to implement method {@link Lock#lock}.
*
*@param arg the acquire argument. This value is conveyed to
* {@link #tryAcquire} but is otherwise uninterpreted and
* can represent anything you like.*/publicfinalvoid acquire(int arg) {if (!tryAcquire(arg) &&// 同上wait时的锁争抢操作 acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
selfInterrupt();
}// java.util.concurrent.locks.ReentrantLock#unlock/**
* Attempts to release this lock.
*
* <p>If the current thread is the holder of this lock then the hold
* count is decremented. If the hold count is now zero then the lock
* is released. If the current thread is not the holder of this
* lock then {@link IllegalMonitorStateException} is thrown.
*
*@throws IllegalMonitorStateException if the current thread does not
* hold this lock*/publicvoid unlock() {
sync.release(1);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#release/**
* Releases in exclusive mode. Implemented by unblocking one or
* more threads if {@link #tryRelease} returns true.
* This method can be used to implement method {@link Lock#unlock}.
*
*@param arg the release argument. This value is conveyed to
* {@link #tryRelease} but is otherwise uninterpreted and
* can represent anything you like.
*@return the value returned from {@link #tryRelease}*/publicfinalboolean release(int arg) {if (tryRelease(arg)) {
Node h= head;// 直接唤醒头节点(真正的唤醒)if (h !=null && h.waitStatus != 0)
unparkSuccessor(h);returntrue;
}returnfalse;
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#unparkSuccessor/**
* Wakes up node's successor, if one exists.
*
*@param node the node*/privatevoid unparkSuccessor(Node node) {/*
* If status is negative (i.e., possibly needing signal) try
* to clear in anticipation of signalling. It is OK if this
* fails or if status is changed by waiting thread.*/int ws = node.waitStatus;if (ws < 0)
compareAndSetWaitStatus(node, ws,0);/*
* Thread to unpark is held in successor, which is normally
* just the next node. But if cancelled or apparently null,
* traverse backwards from tail to find the actual
* non-cancelled successor.*/
Node s= node.next;if (s ==null || s.waitStatus > 0) {
s=null;for (Node t = tail; t !=null && t != node; t = t.prev)if (t.waitStatus <= 0)
s= t;
}// 真正唤醒线程,只有一个线程将被唤醒if (s !=null)
LockSupport.unpark(s.thread);
}
总结: lock/unlock 是一个真正的上锁解锁操作,上锁时如未成功,则进行park()进行操作系统挂起,解锁时将头节点unpark()交由操作系统调度。
4. 唤醒多个等待线程
如何唤醒多个等待线程?共享锁有这个需求,其实也是notifyAll 的表面语义所在。
// java.util.concurrent.locks.AbstractQueuedSynchronizer#releaseShared/**
* Releases in shared mode. Implemented by unblocking one or more
* threads if {@link #tryReleaseShared} returns true.
*
*@param arg the release argument. This value is conveyed to
* {@link #tryReleaseShared} but is otherwise uninterpreted
* and can represent anything you like.
*@return the value returned from {@link #tryReleaseShared}*/publicfinalboolean releaseShared(int arg) {if (tryReleaseShared(arg)) {
doReleaseShared();returntrue;
}returnfalse;
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#doReleaseShared/**
* Release action for shared mode -- signals successor and ensures
* propagation. (Note: For exclusive mode, release just amounts
* to calling unparkSuccessor of head if it needs signal.)*/privatevoid doReleaseShared() {/*
* Ensure that a release propagates, even if there are other
* in-progress acquires/releases. This proceeds in the usual
* way of trying to unparkSuccessor of head if it needs
* signal. But if it does not, status is set to PROPAGATE to
* ensure that upon release, propagation continues.
* Additionally, we must loop in case a new node is added
* while we are doing this. Also, unlike other uses of
* unparkSuccessor, we need to know if CAS to reset status
* fails, if so rechecking.*/for (;;) {
Node h= head;if (h !=null && h != tail) {int ws = h.waitStatus;if (ws == Node.SIGNAL) {if (!compareAndSetWaitStatus(h, Node.SIGNAL, 0))continue;// loop to recheck cases// 唤醒头节点 unparkSuccessor(h);
}// 因为上一头节点刚刚被设置为0,说明正在执行中,设置当前head为 PROPAGATEelseif (ws == 0 &&
!compareAndSetWaitStatus(h, 0, Node.PROPAGATE))continue;// loop on failed CAS }// 即尽量只设置一个 head 节点即可// 除非在这期间发生变更if (h == head)// loop if head changedbreak;
}
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#acquireSharedInterruptibly/**
* Acquires in shared mode, aborting if interrupted. Implemented
* by first checking interrupt status, then invoking at least once
* {@link #tryAcquireShared}, returning on success. Otherwise the
* thread is queued, possibly repeatedly blocking and unblocking,
* invoking {@link #tryAcquireShared} until success or the thread
* is interrupted.
*@param arg the acquire argument.
* This value is conveyed to {@link #tryAcquireShared} but is
* otherwise uninterpreted and can represent anything
* you like.
*@throws InterruptedException if the current thread is interrupted*/publicfinalvoid acquireSharedInterruptibly(int arg)throws InterruptedException {if (Thread.interrupted())thrownew InterruptedException();if (tryAcquireShared(arg) < 0)
doAcquireSharedInterruptibly(arg);
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#doAcquireSharedInterruptibly/**
* Acquires in shared interruptible mode.
*@param arg the acquire argument*/privatevoid doAcquireSharedInterruptibly(int arg)throws InterruptedException {final Node node = addWaiter(Node.SHARED);boolean failed =true;try {for (;;) {final Node p = node.predecessor();if (p == head) {int r = tryAcquireShared(arg);if (r >= 0) {// 共享式锁的传播性质实现 setHeadAndPropagate(node, r);
p.next=null;// help GC
failed =false;return;
}
}if (shouldParkAfterFailedAcquire(p, node) &&
parkAndCheckInterrupt())thrownew InterruptedException();
}
}finally {if (failed)
cancelAcquire(node);
}
}// java.util.concurrent.locks.AbstractQueuedSynchronizer#setHeadAndPropagate/**
* Sets head of queue, and checks if successor may be waiting
* in shared mode, if so propagating if either propagate > 0 or
* PROPAGATE status was set.
*
*@param node the node
*@param propagate the return value from a tryAcquireShared*/privatevoid setHeadAndPropagate(Node node,int propagate) {
Node h= head;// Record old head for check below setHead(node);/*
* Try to signal next queued node if:
* Propagation was indicated by caller,
* or was recorded (as h.waitStatus either before
* or after setHead) by a previous operation
* (note: this uses sign-check of waitStatus because
* PROPAGATE status may transition to SIGNAL.)
* and
* The next node is waiting in shared mode,
* or we don't know, because it appears null
*
* The conservatism in both of these checks may cause
* unnecessary wake-ups, but only when there are multiple
* racing acquires/releases, so most need signals now or soon
* anyway.*/if (propagate > 0 || h ==null || h.waitStatus < 0 ||
(h= head) ==null || h.waitStatus < 0) {
Node s= node.next;// 递归进行唤醒下一线程节点,从而级联唤醒if (s ==null || s.isShared())
doReleaseShared();
}
}/**
* Release action for shared mode -- signals successor and ensures
* propagation. (Note: For exclusive mode, release just amounts
* to calling unparkSuccessor of head if it needs signal.)*/privatevoid doReleaseShared() {/*
* Ensure that a release propagates, even if there are other
* in-progress acquires/releases. This proceeds in the usual
* way of trying to unparkSuccessor of head if it needs
* signal. But if it does not, status is set to PROPAGATE to
* ensure that upon release, propagation continues.
* Additionally, we must loop in case a new node is added
* while we are doing this. Also, unlike other uses of
* unparkSuccessor, we need to know if CAS to reset status
* fails, if so rechecking.*/for (;;) {
Node h= head;if (h !=null && h != tail) {int ws = h.waitStatus;if (ws == Node.SIGNAL) {if (!compareAndSetWaitStatus(h, Node.SIGNAL, 0))continue;// loop to recheck cases unparkSuccessor(h);
}elseif (ws == 0 &&
!compareAndSetWaitStatus(h, 0, Node.PROPAGATE))continue;// loop on failed CAS }if (h == head)// loop if head changedbreak;
}
}
总结: 多个线程的唤醒,主要是使用了级联唤醒的机制,在做共享锁时,根据现有的情况,进行唤醒下一线程。而当线程调度很快或算法不确定时,就会给人一种所有线程一起被唤醒工作的效果。