27:01 [slide 16] note that the C++ standard (c++23 and prior) doesn’t provide any assurance that the atomic-read-modify-write operations are wait-free, and all one can portably check/verify is if an operation is lock-free (vs. blocking). In fact, many popular C++ compiler implement fetch_add on x86 as a lock-free cas-loop. This means that it might be impossible to implement portable wait-free algorithms in “pure” c++ at the moment. Daniel addresses some of this in 1:01:09 and 1:03:37
Couldn't read() just return 1 if it sees that val is zero? Obviously, we can't return 0 because that would allow situations where the value switches from 1 to 0 to 1, but if we just lie and say that the decrement hasn't finished until it committed the zero flag, read could return 1, I think this would be entirely consistent for outside observers. One advantage of this would be that we don't need the helper flag anymore. Another advantage is that read always just reads and never branches and writes.
@@freax13 I thought the same during the talk. I believe you are right, in this case the read linearizes before the decrement if the decrement isn't done yet. It should be perfectly valid.
Yeah, I wanted to comment the same thing. It seems a lot simpler. Maybe it's not correct somehow, but I don't see how. If the counter is 0 then it is in the progress of being set to zero... which by the way could be interrupted by an increment, and it seems weird and incorrect to let a read change the counter. It kinda violates const correctness and there's another problem: Let's say there are 3 threads/processes with 3 different priorities: H, M, L (High, Mid, Low). L tries to decrement to 0, then M interrupts and tries to increment (prevent setting it to 0), then H reads. Now H "helps" a lower priority to decrement the value to 0. This seems like priority inversion. M should be able to increment the counter before it gets set to 0. Which would be the case if read just returned 1 when value was 0. It was 1 before the decrement and could still be 1 after the decrement and parallel increment.
4:23 [slide 4] note that the precondition for decrement seems suspicious in a concurrent world - what if the counter is 1 and 2 threads call decrement. The precondition is violated although neither of the threads could “protect itself” from the violation. In practice it’s fine because the typical behavior is that every thread that calls decrement does so after having earlier called increment_if_not_zero() and received a true result. Daniel discusses it in 44:10 Great talk Daniel - definitely worthy of the big room 😂
RE legality of calling decrement around 44:00, I think "you know the counter is not 0" is accurate but can be made more specific: the ways you know the counter is not 0 are (1) you constructed the counter and have not decremented it yet, or (2) you incremented the counter successfully and have not decremented it yet. If you're in such a code branch, you can decrement the counter. The logic here can be perfectly static - or the legality of decrementing can be carried around in a variable - but it's almost automatic to get that right, if you've written a logical program.
At 48:46, would it have been possible to just return 1 in the case where the counter is zero but without the flags? Would saying that read linearize before decrement work too? Because if the counter is at zero, then you know that at the beginning of decrement you were at one. If the decrement operation is not done, you could assume it's still one. If another thread increments in the meantime, then that 1 was true. Otherwise, it was as if decrement was called after since it will finish after the load. Am I right to assume this?
Interesting I got the same results with my own testing, so I've gone with wait free for a game development environment but it looks like a tested swapping system might be best.
27:01 [slide 16] note that the C++ standard (c++23 and prior) doesn’t provide any assurance that the atomic-read-modify-write operations are wait-free, and all one can portably check/verify is if an operation is lock-free (vs. blocking). In fact, many popular C++ compiler implement fetch_add on x86 as a lock-free cas-loop. This means that it might be impossible to implement portable wait-free algorithms in “pure” c++ at the moment.
Daniel addresses some of this in 1:01:09 and 1:03:37
Couldn't read() just return 1 if it sees that val is zero? Obviously, we can't return 0 because that would allow situations where the value switches from 1 to 0 to 1, but if we just lie and say that the decrement hasn't finished until it committed the zero flag, read could return 1, I think this would be entirely consistent for outside observers. One advantage of this would be that we don't need the helper flag anymore. Another advantage is that read always just reads and never branches and writes.
@@freax13 But how we return 0 if it is 0 and other threads finished work?
@@freax13 I thought the same during the talk. I believe you are right, in this case the read linearizes before the decrement if the decrement isn't done yet. It should be perfectly valid.
@@freax13 I had the same thought. Anyone able to poke a hole in this?
Yeah, I wanted to comment the same thing. It seems a lot simpler. Maybe it's not correct somehow, but I don't see how.
If the counter is 0 then it is in the progress of being set to zero... which by the way could be interrupted by an increment, and it seems weird and incorrect to let a read change the counter. It kinda violates const correctness and there's another problem:
Let's say there are 3 threads/processes with 3 different priorities: H, M, L (High, Mid, Low).
L tries to decrement to 0, then
M interrupts and tries to increment (prevent setting it to 0), then
H reads. Now H "helps" a lower priority to decrement the value to 0. This seems like priority inversion. M should be able to increment the counter before it gets set to 0.
Which would be the case if read just returned 1 when value was 0. It was 1 before the decrement and could still be 1 after the decrement and parallel increment.
4:23 [slide 4] note that the precondition for decrement seems suspicious in a concurrent world - what if the counter is 1 and 2 threads call decrement. The precondition is violated although neither of the threads could “protect itself” from the violation. In practice it’s fine because the typical behavior is that every thread that calls decrement does so after having earlier called increment_if_not_zero() and received a true result. Daniel discusses it in 44:10
Great talk Daniel - definitely worthy of the big room 😂
The way to ensure that would be to allow a thread to call decrement iff it constructed the counter or it has previously succeeded at an increment.
excellent talk, thanks!
RE legality of calling decrement around 44:00, I think "you know the counter is not 0" is accurate but can be made more specific: the ways you know the counter is not 0 are (1) you constructed the counter and have not decremented it yet, or (2) you incremented the counter successfully and have not decremented it yet. If you're in such a code branch, you can decrement the counter.
The logic here can be perfectly static - or the legality of decrementing can be carried around in a variable - but it's almost automatic to get that right, if you've written a logical program.
I can't imagine myself using this technique in a million years, but such an interesting talk!
Pretty mind blowing and really creative solution!
At 48:46, would it have been possible to just return 1 in the case where the counter is zero but without the flags? Would saying that read linearize before decrement work too? Because if the counter is at zero, then you know that at the beginning of decrement you were at one. If the decrement operation is not done, you could assume it's still one. If another thread increments in the meantime, then that 1 was true. Otherwise, it was as if decrement was called after since it will finish after the load. Am I right to assume this?
13:22 OMG LIKE THE GAME
Great
Interesting I got the same results with my own testing, so I've gone with wait free for a game development environment but it looks like a tested swapping system might be best.
Great