Friday, October 25, 2013

Deprecation and Technical Debt

API designers deprecate classes and methods as a signal that they shouldn't be used: usually because there's a better way to do the same thing, but occasionally (as in the case of Java's Date constructors) because the existing functionality has unintended consequences. Deprecation is the first step in the controlled breakage of backwards compatibility, a warning for users that they can't rely on the functionality always being present.
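
In Scala, for example, the @deprecated annotation gives the compiler that signal; anyone who calls the flagged method gets a warning at compile time. Here's a minimal sketch (the Formats object and its methods are my own invention for illustration):

object Formats {

  // the old method stays so that existing callers still compile, but
  // the compiler now warns whenever it is used
  @deprecated("use formatIso8601 instead", "1.2")
  def formatDate(d: java.util.Date): String = formatIso8601(d)

  def formatIso8601(d: java.util.Date): String =
    new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss").format(d)
}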

Of course, in the real world, deprecated functionality rarely gets removed. The Date constructors have been with us since Java 1.0.x and show no sign of going away. Indeed, of all the major frameworks that I've used, the only one that takes deprecation seriously is Spring — and even then, only minor things seem to get removed; you can still compile a program written for Spring 1.x with only a few changes.

However, leaving deprecated code in your codebase is a form of technical debt. You run the risk that the functionality will one day be removed, and that you'll have to scramble to replace it. But more important, if your code contains a mix of deprecated and non-deprecated features, you have to expend extra effort to maintain it.

The worst case that I've seen was a company that built websites for clients. One of their features was a content management system: a tool that allowed the client to create their own content, which was inserted into the page when various conditions were met. Standard stuff, but this company actually had three incompatible content management systems, all home-grown, each developed to replace its predecessor.

The reason that all three systems remained in active use was that there was a cost to convert to the newer framework, and neither the company nor their clients were willing to pay that cost. In effect, they chose to leave deprecated code in their codebase.

Since the frameworks were all built in-house, there was no reason to worry about them going away unexpectedly. But what did go away were the developers who had built the frameworks, and with them the knowledge of how the frameworks worked. Although the older frameworks didn't have a lot of bugs, when one did appear it was a crisis situation, with a mad search to find someone with enough knowledge to track it down and fix it. Even without bugs, the lack of knowledge meant that every enhancement to the client site took extra time, as each new developer learned enough to avoid breaking the CMS.

This situation continued for as long as I worked at that company; for all I know, they've added a fourth or fifth system to the mix. Although the old systems imposed an unnecessary cost on each project, that incremental cost was always lower than the cost to upgrade. But over time the costs added up, and the clients paid much more than an outright conversion would have cost (which the company didn't mind, as it increased revenue).

Worse, developers did all they could to avoid contact with the older systems, or to have their names on any checkins associated with them. Getting pulled onto a CMS problem meant that your own project schedule would slip, and that upset project managers would waste still more of your time. Much easier to let the other team sink or swim on their own.

I'm not saying that ignoring deprecations will create dysfunction and discord within your development team. But do you want to risk it?

Friday, October 18, 2013

Immutability and Thread-Safety, Part 2

I've had a few offline conversations about my last post, and decided to dive a little deeper into concurrency and immutability. As an example, I'll use a shared map: the sort of thing you might use to hold session data in a web application. The first implementation simply wraps Scala's immutable.Map:

class SharedMap {

  // put() replaces this reference with a new immutable map; note that
  // nothing here controls concurrent access to the reference
  var theMap : Map[String,String] = Map.empty

  def size = theMap.size

  def get(key:String) = theMap.get(key)

  def put(key:String, value:String) {
    theMap = theMap + (key -> value)
  }
}

Because theMap refers to an immutable map, you have the following guarantee: put() will not interfere with get(). If the two happen to be called at the same time, the get will use the old instance of the map, and won't have to worry about partial changes due to the put. This is a powerful guarantee, but it does not mean that this code is thread-safe.
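
To see why that guarantee holds, consider this small sketch: adding an entry builds an entirely new map, so a reader that grabbed the old reference still sees a complete, consistent structure.

val before: Map[String,String] = Map("a" -> "1")
val after = before + ("b" -> "2")   // builds a new map; 'before' is untouched

println(before.size)   // 1
println(after.size)    // 2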

The problem is that this implementation allows an update race: if two threads call put() at the same time, each builds its own new map from the same immutable base map. Neither's changes include the other's, and only one of them will be saved as the new base map.

Adding an entry to an immutable map is a very fast operation. So fast that you might never hit the update race in testing. But in a production environment, with dozens or hundreds of concurrent updates, it is bound to happen, and you'll lose data. To see this happen rather spectacularly, run the following program.*

object ConcurrentUpdateRunner extends App {

  val map = new SharedMap

  val pool = java.util.concurrent.Executors.newFixedThreadPool(20)

  // submit 100 tasks, each of which adds a single entry to the map
  val futures = (1 to 100).map(ii => {
    val callable = new java.util.concurrent.Callable[String]() {
      def call() = {
        val x = Thread.currentThread.getName
        map.put(ii.toString, x)
        x
      }
    }
    pool.submit(callable)
  })

  // wait for all tasks to complete before examining the map
  val results = futures.map(_.get)
  println(s"number of executed Callables: ${results.size}")

  println(s"final map size: ${map.size}")

  pool.shutdown()
}

So what can you do?

One option is to introduce synchronization. For a simple data structure like this map, that's the route that I'd take, although some people see any use of synchronization as taboo.

// the reference must be volatile so that unsynchronized gets see the
// most recent put
@volatile var theMap : Map[String,String] = Map.empty

def get(key:String) = theMap.get(key)

def put(key:String, value:String) {
  synchronized { theMap = theMap + (key -> value) }
}

Note that I synchronized just put(), something that I warned against in a previous post. But that warning applies only to mutable data structures. Since the underlying map in this example is immutable, we can get away with half-synchronization and avoid contention between get and put; the @volatile annotation ensures that unsynchronized gets still see the most recently written reference.

A second alternative is to forgo the Scala library and use Java's ConcurrentHashMap.

At first glance, this appears to be a perfect solution: you don't have to think about concurrent access, just rely on the fact that Doug Lea understands it far better than you. However, a ConcurrentHashMap only guarantees thread safety of individual gets and puts. For our trivial map example this is sufficient. For most real-world applications it isn't, because you need atomic updates of data that's already in the map.
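
To make that limitation concrete, here's a sketch of the classic read-modify-write problem, using a hypothetical counter: the get and the put are each atomic, but the sequence is not, so a concurrent increment can be silently lost. The standard fix is a compare-and-swap loop built on replace(), which ConcurrentMap has provided since Java 5.

import java.util.concurrent.ConcurrentHashMap

val counts = new ConcurrentHashMap[String,Integer]()
counts.put("hits", 0)

// BROKEN: another thread can update the entry between the get and
// the put, and its increment will be lost
def incrementRacy() {
  val current = counts.get("hits")   // assumes the key is present
  counts.put("hits", current + 1)
}

// SAFE: replace() succeeds only if the entry still holds the value
// we read, so we retry until no other thread got in between
def incrementSafe() {
  var done = false
  while (!done) {
    val current = counts.get("hits")
    done = counts.replace("hits", current, current + 1)
  }
}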

A third option, which I mentioned in my last post, is to introduce a message queue between the map and its consumers. This is not a particularly novel idea: Odersky suggests it in Programming in Scala (in the section “Good actors style”), and Armstrong uses a similar example in Programming Erlang. Here is one possible implementation, using Scala actors (which, although deprecated, are usable without the need for an additional library in the classpath).

import scala.actors.DaemonActor

class ActorMap1 {

  case class SizeMessage()
  case class GetMessage(key:String)
  case class PutMessage(key:String, value:String)

  val theActor = new DaemonActor {
    var theMap : Map[String,String] = Map.empty

    def act() {
      loop {
        receive {
          case SizeMessage() => {
            sender ! theMap.size
          }
          case GetMessage(key) => {
            sender ! theMap.get(key)
          }
          case PutMessage(key,value) => {
            theMap = theMap + (key -> value)
          }
        }
      }
    }
  }

  theActor.start

  def size = {
    val fut = theActor !! SizeMessage()
    fut.apply
  }

  def get(key:String) = {
    val fut = theActor !! GetMessage(key)
    fut.apply
  }

  def put(key:String, value:String) {
    theActor ! PutMessage(key, value)
  }
}

This example adds quite a bit of complexity to the shared map: new classes to hold messages for the actor, and a facade object to handle the communications. And it still has a significant flaw: there's only one loop, handling both gets and puts. We've achieved thread safety, at the cost of introducing a bottleneck for concurrent operations.

In the case of an in-memory map, this bottleneck probably isn't significant. Unless you have dozens of threads constantly hitting the map, to the exclusion of other actions, the occasional waits will go unnoticed. But what if this map is actually a write-through cache for a database or socket? In that case, the put will take significant time, and contention becomes a real possibility. To resolve this problem, we can add another actor to the mix.

import scala.actors.DaemonActor

class ActorMap2 {

  case class SizeMessage()
  case class GetMessage(key:String)
  case class PutMessage(key:String, value:String)
  case class InternalPutMessage(key:String, value:String)

  val getActor = new DaemonActor {
    var theMap : Map[String,String] = Map.empty

    def act() {
      loop {
        receive {
          case SizeMessage() => {
            sender ! theMap.size
          }
          case GetMessage(key) => {
            sender ! theMap.get(key)
          }
          case InternalPutMessage(key,value) => {
            theMap = theMap + (key -> value)
          }
        }
      }
    }
  }

  val putActor = new DaemonActor {
    def act() {
      loop {
        receive {
          case PutMessage(key,value) => {
            // some long-running operation
            getActor ! InternalPutMessage(key, value)
          }
        }
      }
    }
  }

  getActor.start
  putActor.start

  def size = {
    val fut = getActor !! SizeMessage()
    fut.apply
  }

  def get(key:String) = {
    val fut = getActor !! GetMessage(key)
    fut.apply
  }

  def put(key:String, value:String) {
    putActor ! PutMessage(key, value)
  }
}

Yet more complexity, and it still isn't complete: put() doesn't give the caller any way to learn that an error occurred, and real-world actor systems need supervisors. Plus, there's still the possibility that parts of the application will hold onto stale data (a problem that can never truly be solved).
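
If callers do need to learn about failures, one option (a sketch of my own, not part of the examples above) is to make put a round trip as well: putActor replies with a status message once the long-running operation completes. The tradeoff is that the caller now blocks until the put finishes, giving up the decoupling that motivated the second actor.

case class PutReply(error: Option[Exception])

val putActor = new DaemonActor {
  def act() {
    loop {
      receive {
        case PutMessage(key, value) => {
          try {
            // some long-running operation
            getActor ! InternalPutMessage(key, value)
            sender ! PutReply(None)
          } catch {
            case ex: Exception => sender ! PutReply(Some(ex))
          }
        }
      }
    }
  }
}

def put(key:String, value:String) {
  val fut = putActor !! PutMessage(key, value)
  fut.apply match {
    case PutReply(Some(ex)) => throw ex
    case _ => // success
  }
}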

But that was the entire point of my last post. It isn't easy to correctly implement concurrent applications, and immutable data structures don't help you solve the bigger issues. If you rely on them to do so, you will be burned. You need to understand how the larger application mutates state, and choose an appropriate strategy for managing those mutations.

Finally, although in this example it appears to be a case of hunting mosquitoes with a shotgun, decomposition into independent actors does solve many of the bigger problems of concurrency. But that's simply because it forces you into the techniques of information hiding and limited interfaces that have been part of software engineering since the 1970s.


* I used a thread pool, rather than actors, to gain better control over the number of threads. I'm also running on a multi-core processor, which means that I'm really hitting the map from multiple threads at the same time. Switching between single-core and multi-core machines tends to highlight different threading bugs.

Monday, October 7, 2013

The Big Myth of Functional Programming: Immutability Solves All Problems

I'm not against immutability. Indeed, I think that many concurrency problems can be eliminated by making data structures immutable. Most of those problems, however, arise because the developers never really thought about concurrent access. And while switching to immutable data structures may solve some problems, it creates others — and in my opinion, those others are much more difficult to debug.

The underlying problem is that the whole point of a program is to modify state. A program that takes no inputs and produces no outputs is worthless, except as a way to turn an expensive CPU into a space heater.

But if the purpose of a program is to modify state, how does it hold the state that's being modified? A “pure” functional program must hold it in the arguments and local variables of its functions: a top-level function creates the program's initial state, passes it down through the call chain, and gets back the final state.
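
Here's a toy sketch of what that looks like in Scala: the state lives entirely in arguments and return values, and the top-level function folds it through the inputs.

// all state lives in arguments and return values
def applyEvent(state: Map[String,Int], event: String): Map[String,Int] =
  state + (event -> (state.getOrElse(event, 0) + 1))

// the top-level function threads the initial state through every input
def run(events: List[String]): Map[String,Int] =
  events.foldLeft(Map.empty[String,Int])(applyEvent)

// run(List("a", "b", "a")) == Map("a" -> 2, "b" -> 1)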

And if your program has a single thread, or multiple independent threads, that's all you need to do (I also like the idea of decomposition into a tree of independent threads). But if your program consists of communicating threads, the purely functional, immutable data approach is not sufficient: different threads will hold different representations of what they consider the same data. Also known as a race condition.

The easiest way to solve such problems is to introduce a well-designed concurrent data structure (such as Java's ConcurrentHashMap) that is shared between threads. This data structure holds the canonical view of shared state: each thread reads the map whenever it needs to know the latest data. However, a concurrent map by itself isn't sufficient to solve race conditions: it guarantees atomic puts and gets, but not atomic read-modify-write updates.

A better alternative, in my opinion, is to follow the Go mantra of “share memory by communicating, rather than communicating by sharing memory.” In other words, wrap your shared state in a data structure that has a message queue between it and the rest of the program. The rest of your program appears to be fully functional: mutations are just function invocations, and you can choose to implement the shared data using immutable objects.
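
Here's a minimal sketch of that idea, using a plain BlockingQueue and a single owner thread rather than an actor library (the QueuedMap class is my own illustration): the owner thread is the only code that ever touches the map, and readers get an immutable snapshot.

import java.util.concurrent.LinkedBlockingQueue

class QueuedMap {
  private case class Put(key: String, value: String)

  private val queue = new LinkedBlockingQueue[Put]()

  // volatile so that reader threads see the owner's latest write
  @volatile private var theMap: Map[String,String] = Map.empty

  private val owner = new Thread {
    override def run() {
      while (true) {
        val msg = queue.take()   // blocks until a request arrives
        theMap = theMap + (msg.key -> msg.value)
      }
    }
  }
  owner.setDaemon(true)
  owner.start()

  // an immutable snapshot; it may lag the most recent put
  def snapshot: Map[String,String] = theMap

  def put(key: String, value: String) {
    queue.put(Put(key, value))
  }
}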

This approach doesn't completely eliminate races: there is still a race between writes and reads (also known as “stale data”). However, there is no way to eliminate that particular race. No matter how hard you try, you can never guarantee that multiple independent threads have a consistent view of the world.

But stale data in one part of your program is far better than missing data due to an unrealistic expectation of what immutable data structures give you.