Friday, July 19, 2013

What is Compelling?

For all that Java got right, I'm looking for something new. I can't claim that my motivation is a good one: I'm starting to feel that Java is déclassé, occupying a “mainstream niche” in which it provides database access to web applications. Yes, there are some interesting things happening with Big Data and Java, but most of the interesting projects that I've seen look to alternative languages even if they keep the JVM.

However, that raises the question “What is compelling about other languages?” Is there anything in the non-Java space that has a story similar to the one I saw from Java in 1999? There has been a lot of language development in the last decade or so, with new languages appearing and older languages gaining in popularity. Most of these languages have features that Java doesn't. Do any of those features present a compelling case to leave Java behind?

Before continuing, I want to state that I do not consider one language to be more “powerful” than another. That term gets a lot of use and abuse, but it's meaningless — at least if both languages are Turing-complete. Instead, I look at a language based on how expressive it is for a particular purpose. For example, APL is an extremely expressive language for an extremely narrow purpose.

My purpose is best described as “applications that transform quantum data, possibly in a distributed fashion.” This is almost a synonym for “general purpose” computing, but highlights a few things that I find important:

  • Applications have multiple parts, and will run for an extended period of time. This implies that I want easy modularization, and that performance is important.
  • Quantum data is a fancy way of saying that I want to work with individual pieces of information, on both the input and output side. So statistical analysis features are not terribly important to me, but data manipulation features are. As a side note, I don't accept the LISPer's view that data-is-code-is-data.
  • Transformation has similar implications.
  • My view of “distributed computing” has more to do with how work is partitioned, rather than where those pieces run. Features that support partitioning are important, particularly partitioning of independent concurrent tasks. Once you partition your processing, you can move those pieces anywhere.

So, given this mindset, what are some features that Java doesn't have and that I find interesting?

Interpreted versus Compiled
One of the big things that drew me to Java was its elimination of the traditional build cycle: I didn't have to take a nap while building the entire project. This point was driven home to me last year, talking with a neighbor who is doing iOS development. He was touting how fast the latest high-end Macs could build his projects; I pointed out how the IDE on my nine-year-old desktop PC made that irrelevant.

Build cycles haven't completely gone away with Java: deploying a large Spring web-app, for example, takes a noticeable amount of time. But deployment takes time, even with an “interpreted” framework like Rails. For that matter, client-side JavaScript requires a browser refresh. But saving a few seconds isn't really important.

The REPL (read-eval-print loop), on the other hand, is a big win for interpreted languages. I came of age as a programmer when interactive debugging was ascendant, and I like having my entire environment available to examine and modify.

Duck Typing
Another of the benefits of so-called “scripting” languages is that objects are malleable; they are not rigidly defined by class. If you invoke a method on an object, and that object supports the method, great. If not, what happens depends on the language; in many, an “unimplemented method handler” is called, allowing functionality to be added on the fly. I like that last feature; it was one of the things that drew me to Rails (although I didn't stay there). And I think that function dispatch in a duck-typed language is closer to the “objects respond to messages” ideal of OOP purists.
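For comparison, the closest thing I know of in plain Java to an “unimplemented method handler” is a dynamic proxy. The sketch below is only an approximation: Java still insists on a declared interface, so this is not true duck typing. The names Greeter and MethodMissingDemo are made up for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical interface: Java still needs the method names declared somewhere.
interface Greeter {
    String greet(String name);
    String farewell(String name);
}

class MethodMissingDemo {
    // Returns a Greeter that has no hand-written method bodies at all:
    // every call is routed through a single catch-all handler.
    static Greeter dynamicGreeter() {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                // Build the response from the name of the method that was called.
                return method.getName() + ", " + args[0] + "!";
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = dynamicGreeter();
        System.out.println(greeter.greet("world"));     // prints greet, world!
        System.out.println(greeter.farewell("world"));  // prints farewell, world!
    }
}
```

The handler plays the role of the catch-all, but the compiler will still reject a call to any method that Greeter does not declare.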

On a practical basis, I find myself wishing for duck-typing every time I have to repeat a long parameterized type specification in Java. Especially when I have to repeat that specification both to declare and assign a variable. I've often wished for something like typedef, to reduce typing and provide names for generic structures.

But … there's a certain level of comfort in knowing that my compiler will catch my misspellings; I make a lot of them. And at the extreme, the flexibility of this model is no different than the idea of a single method named doSomething that takes a map and returns a map.

Lambdas, Closures, Higher-order Functions
At some point in the last dozen years, function pointers became mainstream. I use that term intentionally: the idea of a function as something that can be passed around by your program is fundamentally based on the ability to hold a pointer to that function. It's a technique that every C programmer knows, but few use (and if you've ever seen the syntax of a non-trivial function pointer, you know why). It's a feature that is present in Java, albeit with the boilerplate of a class definition and the need to rigidly (and repeatedly) define your function's signature. I think that, as a practical thing, duck typing makes lambdas easier: a function is just another malleable object.
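To make the boilerplate concrete, here is a minimal sketch of passing a function around in Java with an anonymous inner class. The interface IntFn and the helper map are hypothetical names of my own, not library APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical single-method interface: this is the rigid "signature" of the function.
interface IntFn {
    int apply(int value);
}

class FunctionObjects {
    // A higher-order function: applies fn to every element of the input.
    static List<Integer> map(List<Integer> input, IntFn fn) {
        List<Integer> result = new ArrayList<>();
        for (int value : input) {
            result.add(fn.apply(value));
        }
        return result;
    }

    public static void main(String[] args) {
        // The "function pointer": a whole class definition wrapped around one expression.
        IntFn square = new IntFn() {
            @Override
            public int apply(int value) {
                return value * value;
            }
        };
        System.out.println(map(Arrays.asList(1, 2, 3), square)); // prints [1, 4, 9]
    }
}
```

The only interesting part is value * value; everything else restates the signature.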

That boilerplate is a shame, because the idea of a lambda is extremely useful. For example, reading rows from a JDBC query requires about a dozen lines of boilerplate code, which hides the code that actually processes the data. Ditto for file processing, or list processing in general.
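As an illustration of the file-processing case, the following sketch pushes the read loop into a helper so that the caller supplies only the per-line logic. I've used a Reader rather than JDBC so that it runs without a database; LineHandler and forEachLine are hypothetical names, and wrapping the IOException in a RuntimeException is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical callback interface: the code that actually processes the data.
interface LineHandler {
    void handle(String line);
}

class LineProcessor {
    // Owns the boilerplate (buffering, looping, closing); returns the number of lines seen.
    static int forEachLine(Reader source, LineHandler handler) {
        try (BufferedReader reader = new BufferedReader(source)) {
            int count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                handler.handle(line);
                count++;
            }
            return count;
        } catch (IOException e) {
            // Assumption for this sketch: callers don't want a checked exception.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        final StringBuilder upper = new StringBuilder();
        forEachLine(new StringReader("alpha\nbeta\n"), new LineHandler() {
            @Override
            public void handle(String line) {
                upper.append(line.toUpperCase()).append(' ');
            }
        });
        System.out.println(upper.toString().trim()); // prints ALPHA BETA
    }
}
```

Note that the handler captures the local variable upper, so even this small example is already a closure.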

While I like lambdas, closures quite frankly scare me. Especially if used with threads. But that's a topic for another post.

Lazy Evaluation
I don't believe that “functional” languages are solely about writing your code as a series of higher-order functions. To me, a critical part of the functional approach is the idea that functions will lazily evaluate their arguments. The example-du-jour calculates the squares of a limited set of integers. It appears to take the entire set of integers as an argument, but lazy evaluation means that those values are only created when needed.

As people have pointed out, lazy evaluation can be implemented using an iterator in Java. Something to think about is that it also resembles a process feeding a message queue.
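A minimal sketch of the iterator approach, using the squares example: the iterator claims to cover every positive integer, but a value is only computed when the consumer asks for it. LazySquares, take, and the produced counter are names I've invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// An "infinite" sequence of squares; nothing is computed until next() is called.
class LazySquares implements Iterator<Long> {
    private long current = 1;
    int produced = 0; // how many values were actually computed (for demonstration only)

    @Override
    public boolean hasNext() {
        return true; // conceptually unbounded (ignoring long overflow)
    }

    @Override
    public Long next() {
        long n = current++;
        produced++;
        return n * n;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Pulls at most limit values from the source, and no more than that.
    static List<Long> take(Iterator<Long> source, int limit) {
        List<Long> result = new ArrayList<>();
        while (result.size() < limit && source.hasNext()) {
            result.add(source.next());
        }
        return result;
    }

    public static void main(String[] args) {
        LazySquares squares = new LazySquares();
        System.out.println(take(squares, 5));  // prints [1, 4, 9, 16, 25]
        System.out.println(squares.produced);  // prints 5
    }
}
```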

Structured Data Abstraction
One of the things I like about Groovy is that everything looks like an object, even XML. Combined with the null-safe navigation operator, you can replace a dozen lines of code (or an XPath) with a very natural foo?.bar?.baz. If your primary goal is data manipulation, features like this are extremely valuable. Unfortunately, the list of languages that support them is very short.
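For contrast, this is roughly what foo?.bar?.baz expands to when written by hand in Java. The classes Foo, Bar, and Baz are placeholders taken from the expression above, not real types.

```java
// Placeholder data classes mirroring the foo?.bar?.baz expression.
class Baz {
    final String name;
    Baz(String name) { this.name = name; }
}

class Bar {
    final Baz baz;
    Bar(Baz baz) { this.baz = baz; }
}

class Foo {
    final Bar bar;
    Foo(Bar bar) { this.bar = bar; }
}

class NullSafeNavigation {
    // Hand-written equivalent of Groovy's foo?.bar?.baz:
    // returns null as soon as any link in the chain is null.
    static Baz baz(Foo foo) {
        if (foo == null) {
            return null;
        }
        Bar bar = foo.bar;
        if (bar == null) {
            return null;
        }
        return bar.baz;
    }

    public static void main(String[] args) {
        Foo complete = new Foo(new Bar(new Baz("found")));
        Foo broken = new Foo(null);
        System.out.println(baz(complete).name); // prints found
        System.out.println(baz(broken));        // prints null
    }
}
```

Each additional level of nesting adds another null check, which is where the dozen lines come from.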

A Different Concurrency Paradigm
Concurrent programming isn't easy. Programming itself isn't easy: for any non-trivial application you have to maintain a mental model of how that application's internal state changes over time, and which parts get to change what state. A concurrent application raises the bar by introducing non-determinism: two (or more) free-running threads of execution that update the same state at what seem to be arbitrary times. Java provided a model for synchronizing these threads of execution, at least within a single process, but Java synchronization brings with it the specter of contention: threads sitting idle waiting for a mutex.
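A small sketch of the Java model described above: several threads update one piece of shared state, and a synchronized method keeps the result correct at the cost of making the threads take turns. SharedCounter and runDemo are illustrative names only.

```java
// Shared mutable state guarded by the object's intrinsic lock (its mutex).
class SharedCounter {
    private long count = 0;

    // Only one thread at a time may be inside a synchronized method of this
    // object; any other caller sits idle until the lock is released.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // Starts threadCount threads that each increment the same counter.
    static long runDemo(int threadCount, final int incrementsPerThread) {
        final SharedCounter counter = new SharedCounter();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < incrementsPerThread; j++) {
                        counter.increment();
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

The total is always correct, but the threads spend much of their time queued on the same lock, which is the contention problem.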

Concurrency has been a hidden issue for years: even a simple web-app may have concurrent requests for the same user, accessing the same session data. With the rise of multi-core processors, the issue will become more visible, especially as programmers try to exploit performance gains via parallel operations. While Java's synchronization tools work well for “accidentally concurrent” programs, I don't believe they're sufficient for full-on concurrent applications.

New languages have the opportunity to create new approaches to managing concurrency, and two of the leading approaches are transactional memory and actors. Of the two, I far prefer the latter.
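To show what I mean by actors, here is a bare-bones sketch in plain Java: the state belongs to a single thread, and everyone else communicates by dropping messages into a mailbox. This is my own simplification, not the API of any actor library; CounterActor, its STOP sentinel, and runDemo are invented for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A toy actor: one thread, one mailbox, and state that no other thread touches.
class CounterActor implements Runnable {
    private static final int STOP = Integer.MIN_VALUE; // hypothetical shutdown message

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private final Thread thread = new Thread(this, "counter-actor");
    private long total = 0; // modified only by the actor's own thread

    void start() {
        thread.start();
    }

    // Senders never touch the state; they only enqueue a message.
    void send(int amount) {
        mailbox.add(amount);
    }

    // Asks the actor to stop, waits for it to finish, then reads the final state.
    long stopAndGetTotal() throws InterruptedException {
        mailbox.add(STOP);
        thread.join(); // join() makes the actor's writes visible to the caller
        return total;
    }

    @Override
    public void run() {
        try {
            while (true) {
                int message = mailbox.take(); // blocks until a message arrives
                if (message == STOP) {
                    return;
                }
                total += message;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts several sender threads that each send the message 1 repeatedly.
    static long runDemo(int senders, final int messagesPerSender) {
        final CounterActor actor = new CounterActor();
        actor.start();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < messagesPerSender; j++) {
                        actor.send(1);
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread sender : threads) {
                sender.join();
            }
            return actor.stopAndGetTotal();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

The queue still synchronizes internally, but the application code no longer reasons about which thread may change the total; only the actor does.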

Those are the features that interest me; other people will have different ones. But given those desired features, along with my overall goals for what I'm doing with computers, I've surveyed the landscape of languages and thought about which could replace Java for me. That's the topic of my next post.
