Hans Boehm gave a keynote address about C++11’s support for concurrency. It was a nice overview of major features and, of course, the most interesting topic, atomics and weak atomics. The official story is that if you use locks and strong atomics, you get the DRF guarantee: if the program has no data races, it will behave in a sequentially consistent manner. How do you prove that you have no data races? You enumerate all possible interleavings, and if you can’t find one where two conflicting memory accesses happen next to each other, you’re golden. That’s more or less what the Java memory model guarantees (and what Posix tried to standardize). However, C++ offers the programmer a way to relax the sequential consistency constraints without introducing data races. Now, if you spin it this way, it sounds like a really cool thing. Hey, look, my program is data-race free! And, get this, I don’t have to suffer sequential consistency! The natural question is, what does it buy me that the C++ Standard doesn’t treat “memory_order_relaxed” accesses as data races? I would like to hear that programs with weak atomics have well-defined semantics, even if the semantics are so complex that proofs of correctness of even the simplest algorithms are non-existent. But as far as I know this is not “really” true (maybe “sort of” true?). I tried to get straight answers from Hans, but he chooses his words very carefully, like a UN diplomat. I’ll see him again at HotPar and I’ll press him some more.
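For concreteness, here is the classic store-buffering example (my own sketch, not anything from Hans’s slides) showing what the relaxation means: with sequentially consistent atomics the outcome r1 == 0 && r2 == 0 is impossible, but with memory_order_relaxed it is allowed, and either way the program contains no data races.

```cpp
#include <atomic>
#include <thread>
#include <iostream>

std::atomic<bool> x(false), y(false);
int r1, r2;

void thread1() {
    x.store(true, std::memory_order_relaxed);
    r1 = y.load(std::memory_order_relaxed);
}

void thread2() {
    y.store(true, std::memory_order_relaxed);
    r2 = x.load(std::memory_order_relaxed);
}

int main() {
    std::thread t1(thread1), t2(thread2);
    t1.join(); t2.join();
    // With seq_cst stores and loads, at least one thread must see the
    // other's store, so r1 == 0 && r2 == 0 is impossible. With relaxed
    // ordering both threads may read the old values -- yet there is no
    // data race, because every conflicting access is atomic.
    std::cout << r1 << " " << r2 << "\n";
}
```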
Hans’s talk was followed by Tony Van Eerd’s presentation on lock-free programming. I liked Tony’s attitude, which was “Use Locks!” Indeed, you should look at lock-free algorithms as a last resort. He showed a few examples that were hair-raising. Even the simplest lock-free linked list is a challenge. It’s really hard to spot the danger areas, like the ABA problem, where the node you’re pointing at gets deallocated and reallocated when you’re not looking. Your CAS succeeds, because the addresses match, but your update ends up in the great bucket in the sky. The lock-free circular queue of integers, with only one thread pushing and one thread popping, turned out to be a minefield. Tony claimed that it should work with weak, relaxed-memory-order atomics. But, of course, no formal proof is on the horizon. I stared at the code for minutes and it sort of made sense to me, but who knows? Hans stared at it some more and tentatively murmured that it’s probably okay. The bottom line: this is some really scary stuff.
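For reference, here is one common formulation of such a single-producer/single-consumer ring buffer, sketched from memory rather than copied from Tony’s slides. Note that this version does not rely on fully relaxed ordering: each index gets a release store in the thread that owns it and an acquire load in the other thread, and that pairing is what publishes the slot’s contents.

```cpp
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer of ints. Each index is
// written by exactly one thread; the release store on an index pairs with
// the acquire load in the other thread.
template<std::size_t Size>
class SpscQueue {
    int buf_[Size];
    std::atomic<std::size_t> head_{0};  // next slot to pop, written only by the consumer
    std::atomic<std::size_t> tail_{0};  // next slot to push, written only by the producer
public:
    bool push(int v) {                  // call from the producer thread only
        std::size_t t = tail_.load(std::memory_order_relaxed);
        std::size_t next = (t + 1) % Size;
        if (next == head_.load(std::memory_order_acquire))
            return false;               // full
        buf_[t] = v;
        tail_.store(next, std::memory_order_release);            // publish buf_[t]
        return true;
    }
    bool pop(int& v) {                  // call from the consumer thread only
        std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;               // empty
        v = buf_[h];
        head_.store((h + 1) % Size, std::memory_order_release);  // free the slot
        return true;
    }
};
```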
Then I spent half a day with Hartmut and Joel: me trying to understand Proto and them trying to understand monads. I think we’ve learned a lot from each other, and the new formulation of Proto using monads is getting closer and closer. We have sort of nailed the definition of a monadic “function” in C++. I think we should call these things “hybrid” monads, because they blend compile-time and runtime aspects of C++. Fascinating stuff!
May 18, 2011 at 11:38 pm
>> “Indeed, you should look at lock-free algorithms as a last resort.”
I’ve been taking the exact opposite approach since I discovered the little gem of a library boost::lockfree (which is still in the Boost review queue, if I understand correctly) and its lock-free fifo. Now my basic approach to concurrency in C++ is to combine boost::lockfree::fifo with boost::thread to create an event-loop thread that is fed through a lock-free fifo. Doing so enables a design that uses absolutely no shared-state concurrency, but is essentially message passing concurrency in its purest form. Until now, this approach has seemed to me an absolute blessing. Locks and shared-state concurrency in bigger projects seem to lead to complexities that make liveness issues both relatively likely to occur and extremely hard to track down and fix. Message passing concurrency seems like the best way to avoid problems like that, and boost::lockfree::fifo seems like a perfect way to implement it in C++. Would you see anything wrong with this assertion?
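To illustrate, here is a stripped-down sketch of that pattern. It is written against the name boost::lockfree::queue (the spelling the library ended up with after review); the fifo currently under review may have a somewhat different interface, and the messages here are plain ints for brevity.

```cpp
#include <boost/lockfree/queue.hpp>
#include <boost/thread.hpp>

// A lock-free mailbox feeding a single event-loop thread; -1 is used as a
// shutdown sentinel so no shared flag is needed.
boost::lockfree::queue<int> mailbox(1024);

void event_loop() {
    int msg;
    for (;;) {
        if (mailbox.pop(msg)) {
            if (msg == -1) return;       // shutdown message received
            // handle msg -- everything touched here is owned by this thread,
            // so there is no shared mutable state to protect
        } else {
            boost::this_thread::yield(); // nothing queued; don't spin hot
        }
    }
}

int main() {
    boost::thread worker(event_loop);
    for (int i = 0; i < 100; ++i)
        while (!mailbox.push(i)) {}      // retry if the push cannot complete
    while (!mailbox.push(-1)) {}         // ask the loop to finish
    worker.join();
}
```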
May 19, 2011 at 12:28 am
Implementing lock-free algorithms is so much easier in a GC world like Java where there is no identity reuse and hence no ABA problem to deal with. Indeed many of the core java.util.concurrent algorithms simply don’t work without GC.
Of course, you still have to worry if you are using the primitives…
May 19, 2011 at 2:53 am
@Jed, funny you should mention GC. In my view Tim Blechmann did an amazing job on his lock-free fifo. This lock-free fifo allows me to do lock-free message passing concurrency in C++. Message passing concurrency, in turn, besides removing the risk of liveness problems, removes one further argument for GC.
By the way, I’m currently struggling with a hypothesis that the old pointer-stealing ‘problem’ in auto_ptr is, from a message passing concurrency point of view, in fact an absolute blessing. That is, we make a parallel event loop out of a lock-free queue of auto_ptr and a boost::thread, and the need for any remaining (slow, liveness-problem-prone) thread-safe shared-state concurrency code, or high-overhead serialization code, seems to disappear entirely.
In any case, so far GC to me seems like part of the problem that lock free algorithms could help us solve.
May 19, 2011 at 9:06 am
@Jed: Indeed, we had a discussion about GC and ABA. I think that was the reason why Maurice Herlihy and Nir Shavit used Java in their excellent book.
@Rob: The message about avoiding lock-free programming is more about “don’t do it at home.” If a lock-free algorithm is safely encapsulated in a trusted library, and it doesn’t leak the lock-freedom (which, as Tony discussed, may happen in some very interesting ways), you should use it. Passing unique_ptr’s as messages is something I blogged about, but with the usual qualifications: make sure there aren’t any aliases inside the objects you’re pointing to. (See my blog on uniqueness and on ownership systems.)
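For illustration, here is a minimal sketch of the idea (not the code from my blog post; the names are made up). Since boost::lockfree::queue can only hold trivially copyable elements, the unique_ptr releases its raw pointer into the queue and the receiver re-wraps it, so at any moment exactly one side owns the message. This of course relies on the queue’s push and pop providing the release/acquire ordering that publishes the message’s contents.

```cpp
#include <boost/lockfree/queue.hpp>
#include <memory>

struct Message { int payload; };    // must not alias anything the sender keeps

// Ownership travels as a raw pointer inside the lock-free queue.
boost::lockfree::queue<Message*> channel(128);

void send(std::unique_ptr<Message> m) {
    Message* raw = m.release();     // give up ownership before publishing
    while (!channel.push(raw)) {}   // retry until the pointer is enqueued
}

std::unique_ptr<Message> receive() {
    Message* raw = 0;
    if (channel.pop(raw))
        return std::unique_ptr<Message>(raw);  // take ownership back
    return std::unique_ptr<Message>();         // nothing queued
}
```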
May 22, 2011 at 11:25 am
Yeah, it’s probably okay. However, one should think twice before including it in a general-purpose library. The fact that we enqueue and dequeue plain integers does not mean that we are not passing pointers.
Here are some examples where it won’t work.
May 22, 2011 at 1:25 pm
@dmitri: Exactly! You can’t use the relaxed queue to publish other data (data[x] and g_X in these examples). This is how relaxed atomics can leak into the client’s code. And, if I understand correctly, it doesn’t even matter if data[x] or g_X are strong atomics, right?
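Here is a minimal sketch of the kind of leak I have in mind (my own made-up names, not the code from Dmitriy’s examples): the index travels through the relaxed “queue,” but nothing orders the write of data[1] before the consumer’s read of it.

```cpp
#include <atomic>

int data[2];                  // payload the producer wants to publish
std::atomic<int> slot(-1);    // stands in for one cell of the relaxed queue

void producer() {
    data[1] = 42;                                  // plain write
    slot.store(1, std::memory_order_relaxed);      // "enqueue" the index
}

void consumer() {
    int i = slot.load(std::memory_order_relaxed);  // "dequeue" the index
    if (i != -1) {
        // No release/acquire pair on slot, hence no happens-before edge:
        // this read races with the producer's write and may not see 42.
        // Making data[] an array of seq_cst atomics removes the race but
        // still doesn't force 42 to be visible here.
        int v = data[i];
        (void)v;
    }
}
```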
May 23, 2011 at 8:14 am
> And, if I understand correctly, it doesn’t even matter if data[x] or g_X are strong atomics, right?
Yes.