Monday, May 5, 2014

Thoughts on Scala After Using It for Six Months

I started working with Scala professionally last September. At the time, I wrote a post about the features of the language that I liked and didn't like, from the perspective of an experienced developer new to the language. My plan was to write a retrospective after several months of use, looking at how my feelings about those features had changed over time.

I wrote several drafts that followed that theme, but liked none of them. While I certainly built up a list of likes and dislikes, I found that those were overwhelmed by two general themes: I find the language clumsy, and it requires too much mental effort — effort that is taken away from whatever problem I'm trying to solve.

With that sentence, you may think the rest of this post is a (long) rant. That isn't my intent. I'm not trying to convince anyone that Scala is a horrible language; in fact, there are many parts of it that I quite like. It's simply one person's experience and opinion, with illustrative examples. Feel free to disagree with my points and with the examples I chose.

I'll start with clumsiness. The example that comes to mind most quickly is the special syntax needed for one function with a variable argument list to call another with the same argument list:

  def foo(xs: Int*) = ???
  
  def bar(xs: Int*) = {
    foo(xs)     // this won't compile
    foo(xs:_*)  // this will
  } 

I have no doubt that there is a theoretical reason for this behavior. From the practical perspective, however, it's annoying: I can't take a parameter and pass it as an argument, even though the values are declared identically. It's not a big annoyance (and I find that I use varargs far less in Scala than in Java), but every time that I run into it — or any other syntactical oddity — I have to stop and think about why the compiler is complaining.*

And that is part of my second issue: high mental effort, because I need to think about what the compiler is doing rather than what my code is doing. Two examples of this are type inference and for comprehensions.

My issues with Scala type inference surprised me. I've worked with duck-typed languages, and never felt that their complete absence of type information impeded my understanding of code. With Scala, however, I don't know what I'd do without my IDE showing me type info on hover (and, unfortunately, it doesn't do that very well).

I think that the reason is that Scala functions tend to be more complex, with chains of higher-order function calls that may themselves require type inferencing from other functions. Such code usually makes perfect sense, but requires the programmer to carefully piece together what's happening at each step. I've adopted the habit of adding a return type specification to every function and minimizing the complexity of anonymous functions, and find these go a long way toward demystifying such constructs.

Something that I find harder to resolve are for-comprehensions. I first worked with for-comprehensions (aka list-comprehensions) in Python, and the mental model from that experience was reinforced when I learned Erlang. In both of these languages, a for-comprehension translates into nested iteration (with access to enclosing scope). **

Scala, by comparison, translates a for-comprehension into map() and flatMap() calls. On the one hand, this lets you do cool things like stringing together operations that return an Option: if any of the operations return None, the operation short-circuits and returns None. On the other hand, it makes you dependent on how a particular class implements those methods.

Here's a somewhat contrived example that represents a common use of for-comprehensions: flattening a hierarchical data structure.

val data = List(
            "foo" -> List("foo", "bar", "baz"),
            "argle" -> List("argle", "bargle", "wargle"))

val result = for {
  (key, values) <- data
  value <- values
} yield (key, value) 

The result is a list of tuples: ("foo", "foo"), ("foo", "bar"), and so on. Now a slight change:

val data = Map(
            "foo" -> List("foo", "bar", "baz"),
            "argle" -> List("argle", "bargle", "wargle"))

val result = for {
  (key, values) <- data
  value <- values
} yield (key, value) 

This comprehension translates into an identical sequence of calls, but now the outermost flatMap() is called on Map rather than List. This means that every tuple produced by yield is added to the map, with its first member as the key. And that means that the result contains only two entries, as repeated tuples with the same key are discarded.

You can look at this, say that it's not something you're likely to do, and moreover, that programmers should know the types of their data. But consider the case where data is actually a function call that's defined in some other module. That function originally returned the List, but then some developer noted that the keys are all unique, needed a Map for some other piece of code, and made the change. At that point, your for-comprehension has silently broken.

I'm going to give one more for-comprehension example; this one doesn't compile.

val data = Map(
            "foo" -> List("foo", "bar", "baz"),
            "argle" -> List("argle", "bargle", "wargle"))

val key = "foo"
val result = for {
  values <- data.get(key)
  value <- values
} yield (key, value)

The problem here is that Map.get() returns an Option, and Option.flatMap() expects a function that returns an Option. But the generator value <- values returns a List. To make this compile, you need to turn the Option into a sequence:

val result = for {
  values <- data.get(key).toSeq
  value <- values
} yield (key, value)

The overall issue is one of mental models. In Erlang, for-comprehensions represent a very simple mental model that can be applied identically in all cases. In Scala, the mental model seems equally simple at first, but in reality it changes depending on runtime data types. To use a Scala for-comprehension effectively, the programmer has to spend time thinking about how a particular class implements map() and flatMap() — and hope that nobody else mucks with the data.

How much mental effort? That's hard to quantify, but subjectively I feel that I take twice as long to do a task in Scala, even when the task is one that's most naturally implemented in a functional style.

Perhaps this is an indictment of my mental capacity, rather than the language. Or perhaps six months just isn't enough time to become productive with Scala. Either of those cases, however, begs the question of whether Scala is an appropriate language for an average development team. Because, regardless of whatever nice features a language provides, you don't want to choose a language that reduces productivity.


* My comment when I ran into this issue: “Whatever would Scala programmers do without an underbar to smooth over the rough spots?”

** To me, coming from a database background, the Erlang approach is very natural: it's equivalent to a query, with joins and predicates. In Scala, as long as every term produces a Seq, the behavior is identical.

1 comment:

Rüdiger Möller said...

I went over the spec and did some smallish samples and had the impression this will never get a mainstream language as it requires deep understanding of the underlying mechanics. C++ suffers from similar issues to a lesser degree, however there is no real competition for C++.
Imo a language must have a simple and small set of rules, so complexity grows out of composing complex constructs out of simple parts (e.g. c, lisp, smalltalk, java). Scala has a very broad set of rules to be aware of and that's why it probably will stay a niche language unlike ruby or python.