Monday, October 7, 2013

The Big Myth of Functional Programming: Immutability Solves All Problems

I'm not against immutability. Indeed, I think that many concurrency problems can be eliminated by making data structures immutable. Most of those problems, however, are caused because their developers never really thought about concurrent access. And while switching to immutable data structures may solve some problems, it creates others — and in my opinion, those others are much more difficult to debug.

The underlying problem is that the whole point of a program is to modify state. A program that takes no inputs and produces no outputs is worthless, except as a way to turn an expensive CPU into a space heater.

But if the purpose of a program is to modify state, how does it hold the state that's being modified? A “pure” functional program must use the arguments and local variables of its functions. There is a top-level function, which creates the program's initial state, passes that initial state to functions, and gets back the final state.

And if your program has a single thread, or multiple independent threads, that's all you need to do (I also like the idea of decomposition into a tree of independent threads). But if your program consists of communicating threads, the purely functional, immutable data approach is not sufficient: different threads will hold different representations of what they consider the same data. Also known as a race condition.

The easiest way to solve such problems is to introduce a well-designed concurrent data structure (such as Java's ConcurrentHashMap) that is shared between threads. This data structure holds the canonical view of shared state: each thread reads the map whenever it needs to know the latest data. However, a concurrent map by itself isn't sufficient to solve race conditions: it guarantees atomic puts and gets, but not updates.

A better alternative, in my opinion, is to follow the Go mantra of “share by communicating, rather than communicate by sharing.” In other words, wrap your shared state in a data structure that has a message queue between it and the rest of the program. The rest of your program appears to be fully functional: mutations are just function invocations, and you can choose to implement your shared data using immutable objects.

This approach doesn't completely eliminate races: there is still a race between writes and reads (also known as “stale data”). However, there is no way to eliminate that particular race. No matter how hard you try, you can never guarantee that multiple independent threads have a consistent view of the world.

But stale data in one part of your program is far better than missing data due to an unrealistic expectation of what immutable data structures give you.


Rüdiger Möller said...

A big con against the "immutable" hype (and Actor approaches) is current CPU architecture. All these approaches permanently flood CPU caches with data copies/newly allocated objects.
Todays applications (wether distributed or local-multithreaded) spend most of the time "getting data to the CPU (and back to mem)"). Immutability and Actor patterns do a bad job in this regard.

IMO Best performance is reached using 'Disruptor'-alike patterns which basically center on mutating state/objects in a static region of memory.

kdgregory said...

My initial response to questions of performance and Scala is "you picked the wrong language." But that's needlessly contentious and not really true. I've looked at the code that scalac generates, and it's clear that its developers are thinking about performance.

I think that the real answer is more like "for 99.9% of applications, absolute CPU performance isn't important." Most of the applications that I've seen end up IO-bound. And the dream of functionalists is that you can easily scale horizontally (I don't buy into that dream, but again, I don't think it matters).

There are, of course, some applications that are CPU-bound. As you noted in another comment, we might end up in a world where Java is seen as the choice for machine performance, while other languages are used for programmer performance (much like the C++/Java divide of 10 years ago).

Finally, if you're interested in the disruptor pattern, you might like this presentation that a former colleague of mine did: