Saturday, April 05, 2008

Herlihy & Shavit: The Art of Multiprocessor Programming

(Disclaimer: I've taken courses from Maurice Herlihy at Brown.)

So! Concurrency's the hot thing. Intel says so. The technical bookshelves are groaning with books about concurrent programming. Most of these books are occult thread package tutorials; they walk you through using Java's monitors, or pthreads' conditions and mutexes, or Windows' Events or what have you, often with the same shopworn collection of toy problems: philosophers producing elves who dine on ... eh.

The Art of Multiprocessor Programming is different. It is, in part, a theory book. Don't panic! The use of formal reasoning is tasteful; a teaspoon or so of actual math is necessary when designing concurrent programs, because our intuitions can fool us. However, the authors don't overwhelm the reader; an undergraduate-level background in CS theory is not necessary to appreciate the book. If you can read the word "tuple" without giggling, you'll be fine.

More importantly, though, the actual data structures and algorithms that are highlighted in this book are not the same-old. Sure, mutual exclusion is discussed, but it's with a focus on implementing mutual exclusion, which is a topic dear to me. Most strikingly, the majority of the examples in the book are of lock-free data structures. You'll find lock-free linked lists, hash tables, trees, etc.
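To give a flavor of what "lock-free" means in practice, here's a minimal sketch of a Treiber-style stack in Java (my own illustration, not an example lifted from the book): no thread ever holds a lock, so a stalled thread can't block anyone else; each operation just retries a compare-and-swap until it wins.

    import java.util.concurrent.atomic.AtomicReference;

    // Minimal Treiber-style lock-free stack (illustrative sketch only).
    public class LockFreeStack<T> {
        private static class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> top =
            new AtomicReference<Node<T>>(null);

        public void push(T value) {
            Node<T> node = new Node<T>(value);
            Node<T> oldTop;
            do {
                oldTop = top.get();
                node.next = oldTop;
            } while (!top.compareAndSet(oldTop, node)); // another thread won; retry
        }

        public T pop() {
            Node<T> oldTop;
            Node<T> newTop;
            do {
                oldTop = top.get();
                if (oldTop == null) return null;        // empty stack
                newTop = oldTop.next;
            } while (!top.compareAndSet(oldTop, newTop));
            return oldTop.value;
        }
    }

In a garbage-collected language this much is safe; port it to C, where nodes get recycled, and you immediately run into the ABA problem, which is exactly the kind of subtlety the book dwells on.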

In my day job, I treat lock-free concurrent data structures as, well, not quite black magic, but certainly Powerful Medicine. They take a lot of thought; bugs can hide in them for years; reasoning about their performance is non-trivial, even compared to locking; and frankly, not every programmer is up to maintaining such code. But when you need them, you need them. If the multi-core popular press is to be believed, more and more of us will be needing them in the near future. Inasmuch as this is the first off-the-shelf collection of useful, debugged lock-free data structures, it's an invaluable contribution. Go buy it. Even if you never use a counting network, it will make you a better programmer. It's also the most writerly book about programming you'll read all year; Herlihy's deadpan humor is a welcome break from the super-serious tone programming books usually adopt.

However. ...

Reading between the lines a bit, I see this book as part of a larger project, which I refer to as "pop concurrency"*. "PC" is a loose confederation of researchers in programming languages, compilers, architecture, and operating systems whose implicit goal is to make concurrent programming, well, "easy." Much, much easier, anyway. Of course, sequential programming isn't all that easy. Before the current fashion in concurrency, the "software engineering" community was haranguing us about the expense, difficulty, and intractability of programming. We've made only incremental progress in making sequential programming easier; garbage collectors and safe languages are probably the most important contributions so far, and they are not order-of-magnitude revolutions over traditional tools.

It's cheap and easy to make negative predictions about ambitious research agendas. E.g., making fun of strong AI people stopped being fun for me about 10 years ago. But at the risk of seeming to take a cheap shot, I am really, really skeptical about how easy we can make concurrent programming. Even if the thicket of hardware and software problems around transactional memory** is cleared, I suspect TM only seems easy because it's the devil we don't know. After all, locks sound easy too: just identify the shared data and wrap mutexes around it! Sounds simple enough, doesn't it? It took years, and lots of big systems built this way, before problems like priority inversion were even framed, and many years thereafter before they were solved. Paraphrasing Fred Brooks, there's no silver bullet; concurrency is hard not because we're Western linear Cartesian dualist squares, or because of bad notation, but because it's actually hard. Like all software, the easy parts are quickly discovered and factored, and the code that's left to be written represents real problems to solve.
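
For contrast, here's roughly what the "just wrap mutexes around it" recipe looks like in Java (again my sketch, not the book's). It's perfectly correct in isolation, and that's the trap: priority inversion, lock convoys, and deadlock once a second lock appears all show up only when the whole system is running.

    // The naive recipe: identify the shared data, wrap a mutex around it.
    // Correct on its own; the problems only appear at system scale.
    // (Illustrative sketch, not an example from the book.)
    public class SharedCounter {
        private final Object lock = new Object();
        private long count = 0;

        public void increment() {
            synchronized (lock) {   // a low-priority thread preempted here
                count++;            // stalls every high-priority waiter:
            }                       // priority inversion in miniature
        }

        public long get() {
            synchronized (lock) {
                return count;
            }
        }
    }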

* Warning: I made this term up. Do not use it when trying to sound intelligent.
** E.g.: how do you debug a TM program? When you're stepping through a transaction, won't it basically always fail? System calls during transactions? I/O? What if you've DMA'ed into or out of a piece of physical memory that's being transacted-upon? Smarter people than me are thinking about these problems, but man. They are non-trivial.