Saturday, June 4, 2016

Target Fixation

Motorcyclists have a saying: you go where you look. If an animal runs out in front of you, or there's a patch of sand in the middle of the corner, or a Corvette is coming the other way, your first response has to be to look elsewhere. If not, you'll almost certainly hit whatever it is that you didn't want to hit.

Another name for this phenomenon is target fixation, and that name was driven home to me quite literally (and painfully) in a paintball game many years ago. I was slowly and carefully positioning myself to shoot one of the other players, when all of a sudden I felt a paintball hit the middle of my back. I was so fixated on my target that I stopped paying attention to what was around me.

I suspect that target fixation was an enormous help to our hunter-gatherer ancestors stalking their dinner. They would only get one chance to bring down their quarry, and didn't have the benefit of high-powered rifles and telescopic sights. To a modern human, surrounded by opportunities to fixate on the wrong thing, it's not so great.

Physical dangers are one thing, but we're also faced with intellectual dangers. If you focus too closely on the scary thing that's right in front of you, you'll ignore all the pitfalls that lie just beyond. This is a particular concern for software developers, who may adopt and implement a particular design without taking the time to think of the ways that it can fail, or of alternative designs that are simpler and more robust.

For example, you might implement a web application that requires shared state, and become so fixated on transactional access to that state that you don't think about contention … until you start running at scale, and discover the delays that synchronization introduces. If you weren't fixated on concurrent access, you might have thought of better ways to share the state without the need for transactions.

So, how to avoid becoming fixated? In the physical world, where fixation has potentially deadly consequences, training programs focus on prevention via ritual. For motorcyclists, the ritual is “SEE”: search, evaluate, execute. For pilots, there are many rituals, but one that was burned into my brain is aviate, navigate, communicate.

For software development, I think that a preemptive “five whys” exercise is a useful way to avoid design fixation. This exercise is usually used after a problem occurs, to identify the root cause of the problem and potential solutions: you keep asking “why did this happen” until there are no more answers. Recast as a preemptive exercise, it is meant to challenge, and ultimately validate, the assumptions that underlie your design.

Returning to the concurrency example, the first question might be “why do I want to prevent concurrent access?” One possible answer is “this is inventory data, and we don't want two customers to buy the last item.” That could lead to several other questions, such as “do I need to use a database transaction?” and “do I need to make that guarantee at this point in the process?”
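To make those questions concrete, here is a minimal sketch of one possible answer. It is written in Clojure, and it assumes that the inventory lives in a single process's memory (which a real store almost certainly would not). It guarantees that the last item is sold only once using a compare-and-set loop rather than a transaction. The names inventory and try-reserve! are invented for illustration.

```clojure
;; Hypothetical in-memory inventory: one widget left.
(def inventory (atom {:widget 1}))

(defn try-reserve!
  "Takes one unit of sku if any remain. Returns true on success,
  false if the item is sold out. Retries if another thread changed
  the inventory between the read and the write."
  [sku]
  (loop []
    (let [current @inventory
          qty     (get current sku 0)]
      (cond
        (zero? qty) false
        (compare-and-set! inventory current (update current sku dec)) true
        :else (recur)))))

(println (try-reserve! :widget))   ; true
(println (try-reserve! :widget))   ; false
```

Whether this is actually simpler than a transaction depends on where the data has to live; the point is only that asking the question surfaces the alternative.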

The chief danger in this exercise is “analysis paralysis,” which is itself a form of target fixation. To move forward, you must accept that you are making assumptions, and be comfortable that they're valid assumptions. If you fixate on the possibility that your assumptions are invalid, you'll never move.

You also need to recognize that, while target fixation is often dangerous, it can have a positive side: preventing you from paying attention to irrelevant details.

I had a real-world experience of this sort a few weeks ago, while riding my motorcycle on a twisting country road: I saw a pickup truck coming the other way and not keeping to his lane. With a closing speed in excess of 100 miles per hour there wasn't much time to make a decision, and not many good decisions to make. I could continue as I was going, assume that the driver would see me and be able to keep within his lane; if I was wrong in that assumption, my trip would be over. I could get on the brakes hard, now, but would come to a stop at the exact point where the pickup would leave his lane while exiting the corner.

My best option was to stop just past the apex of the corner, which would be where the pickup was most likely to be within his lane. I fixated on that spot, and let the muscle memory of 100,000+ miles balance the braking and turning forces necessary to get me there. I have no idea how close the truck came to hitting me; my riding partner said that it was an “oh shit” moment. But once I picked my destination, the pickup truck and everything around me simply disappeared.

Which leads me to think that there might be another name for the phenomenon: “flow.”

Monday, May 23, 2016

How (and When) Clojure Compiles Your Code

When I started working with Clojure, one of the challenges I faced was understanding exactly what Clojure was doing with my code. I was intimately familiar with how Java and the JVM work, and tried to slot Clojure into that mental model, but found too many cases where my model didn't quite represent reality. The docs weren't much help: they talked about the features of the language (including compilation), but didn't provide detail on what “automatically compiled to JVM bytecode on the fly” actually meant.

I think that detail is important, especially if you come to Clojure from Java or are running your Clojure code within a Java-centric framework. And I see enough questions on the Internet to realize that not a lot of people actually understand how Clojure works. So this post demonstrates a few key points that are the basis of my new mental model.

I'm using Clojure 1.8 for examples, but I believe that everything that I say is correct for versions as early as 1.6 and probably before.

Clojure is a scripting language

I'll start with some definitions:
  • Compiled languages translate source code into an artifact that is then loaded, unchanged, into the runtime. Any decisions in the code rely on state maintained within the executing program.
  • Scripting languages load the source code into the runtime, and execute that code as it is loading. Scripts may make decisions based on state that they manage, as well as global state that has been set by other scripts.

Clojure is very much a scripting language, even though it compiles its scripts into JVM bytecode. All Clojure source files are processed one expression at a time: the reader reads characters from the source until it finds a valid expression (a list of symbols delimited by balanced parentheses), expands any macros, compiles the result to bytecode, then executes that bytecode.

It doesn't matter whether the source code is entered by hand into the REPL, or read from a file as part of a (require ...) form. It's processed a single top-level expression at a time.
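One way to see this one-form-at-a-time behavior is to load a "script" whose second form cannot compile. The sketch below uses load-string for convenience; undefined-fn and loaded-first are made-up names, and the script is deliberately broken.

```clojure
;; Two top-level forms in one "script". The second cannot compile,
;; because undefined-fn does not exist anywhere.
(def script "(def loaded-first :yes) (undefined-fn 1)")

(def outcome
  (try
    (load-string script)
    (catch Exception e :second-form-failed)))

;; The first form was compiled and executed before the reader
;; reached the second one, so its var exists despite the failure.
(println outcome)                           ; :second-form-failed
(println (deref (resolve 'loaded-first)))   ; :yes
```

A language that compiled the whole file as a unit would have rejected it before defining anything.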

Note my term: “top-level” expression. You won't find this term in the Clojure docs; they refer to “expressions” and “forms” more-or-less interchangeably, but don't differentiate between expressions that are nested within other expressions. The reason that I do will become apparent later on.

In my opinion, this form of evaluation is what gives macros their power (the oft-proclaimed homoiconicity of the language simply means that they're easier to write). A macro is able to use any information that has already been loaded into the runtime, including variables that have been created earlier by the same or a different script.
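As a small illustration of a macro using state that an earlier form put into the runtime, consider the following sketch. The names unit-factors and to-metres are hypothetical.

```clojure
;; An ordinary var, defined by an earlier top-level form.
(def unit-factors {:km 1000 :m 1})

;; The lookup in unit-factors happens when the macro is expanded,
;; not when the generated code runs: the expansion contains the
;; literal factor rather than a reference to the map.
(defmacro to-metres [unit n]
  (let [factor (get unit-factors unit)]
    `(* ~factor ~n)))

(println (to-metres :km 3))   ; 3000
```

If the def were moved below the defmacro's first use, the expansion would fail, which is exactly the load-order sensitivity that you'd expect from a scripting language.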

Top-level expressions turn into classes

Here's an interesting experiment: start up a Clojure REPL with the following command (using the correct path to the Clojure distribution JAR):

java -XX:+TraceClassLoading -jar clojure-1.8.0.jar

You'll see a long list of classloading messages flash by before you end up at the REPL. Now enter a simple expression, such as (+ 1 2); you'll see more messages, as the Clojure runtime loads the classes that it needs. Enter that same expression again, and you'll see something like this:

user=> (+ 1 2)
[Loaded user$eval3 from __JVM_DefineClass__]
3

This message indicates that Clojure compiled that expression to bytecode and then loaded the newly-created class to execute it. The class definition is still in memory, and you can inspect it. For example, you can look at its superclass (I've removed the now-distracting classloader messages):

user=> (.getSuperclass (Class/forName "user$eval3"))
clojure.lang.AFunction

AFunction is a class within the Clojure runtime; it is a subclass of AFn, which implements the invoke() method. With this knowledge, it's apparent that the evaluation of this simple expression has four steps:

  1. Parse the expression (including macro expansion, which doesn't apply to this case) and generate the bytes that correspond to a Java .class file.
  2. Pass these bytes to a classloader, getting a Java Class back.
  3. Instantiate this class.
  4. Call invoke on the resulting object.

You can, in fact, do all of this by hand, provided that you know the classname:

user=> (.invoke (.newInstance (Class/forName "user$eval3")))
3
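The generated eval class names change from session to session, so here is a sketch of the same steps that doesn't depend on them. It assumes a function literal that closes over no locals, so that its class has a no-argument constructor; three and fn-class are names invented for the example.

```clojure
;; A function literal with no closed-over locals.
(def three (fn [] (+ 1 2)))

;; Steps 1 and 2 have already happened: the class exists.
(def fn-class (class three))

(println (.getName (.getSuperclass fn-class)))   ; clojure.lang.AFunction

;; Steps 3 and 4 by hand: instantiate the class, then call invoke.
(println (.invoke (.newInstance fn-class)))      ; 3
```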

OK, so far so good. Now I want to show why earlier I called out a distinction between “top-level” expressions and nested expressions:

user=> (* 3 (+ 1 2))
[Loaded user$eval5 from __JVM_DefineClass__]
9

Here we have an expression that contains a nested expression. However, note that only one class was generated as a result. In theory, every expression could turn into its own class. Clojure takes a more pragmatic approach, which is a good thing for our memory footprint.

Variables are wrapped in objects

To outward appearances, a Clojure variable is similar to a final Java variable: you assign it (once) with def, and retrieve its value simply by inserting the variable in an expression:

user=> (def x 10)
#'user/x

user=> (class x)
java.lang.Long

user=> (+ x 2)
12

A hint of the truth can be seen if you attempt to use an unbound var in an expression:

user=> (def y)
#'user/y

user=> (+ 2 y)

ClassCastException clojure.lang.Var$Unbound cannot be cast to java.lang.Number  clojure.lang.Numbers.add (Numbers.java:128)

In fact, variables are instances of clojure.lang.Var, which provides functions to get and set the variable's value. When you reference a variable within an expression, that reference translates into a method call that retrieves the actual value.

This allows a great deal of flexibility, including the ability to redefine variables. Application code can do this on a per-thread basis using binding and set!, or within a call tree using with-redefs. The Clojure runtime does much more, such as redefining all variables when you reload a namespace.
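A short sketch of that indirection, assuming default compiler settings (that is, without direct linking). The names *level*, current-level, and greeting are made up for the example.

```clojure
;; A dynamic var can be rebound per-thread with binding.
(def ^:dynamic *level* 1)
(defn current-level [] *level*)

(println (class (var *level*)))    ; clojure.lang.Var
(println (deref (var *level*)))    ; 1

(println (binding [*level* 2] (current-level)))   ; 2
(println (current-level))                         ; 1

;; with-redefs temporarily replaces the root value of any var,
;; and restores it when the body exits.
(defn greeting [] "hello")
(println (with-redefs [greeting (fn [] "redefined")] (greeting)))   ; redefined
(println (greeting))                                                ; hello
```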

A namespace is not a class

For someone coming from a Java background, this is perhaps the hardest thing to grasp. A namespace definition certainly looks like a class definition: you have a dot-delimited namespace identifier, which corresponds to the path where you save the source code. And when you invoke a function from a namespace, you use the same syntax that you would to invoke a static method from a Java class.

The first hint that the two aren't equivalent is that the ns macro doesn't enclose the definitions within the namespace. Another is that you can switch between namespaces at will and add new definitions to each:

user=> (ns foo)
nil
foo=> (def x 123)
#'foo/x
foo=> (ns bar)
nil
bar=> (def x 456)
#'bar/x
bar=> (ns foo)
nil
foo=> (def y 987)
#'foo/y

You could take the above code snippets, save them in an arbitrary file, and then use the load-file function to execute that file as a script. In fact, you could write your entire application, with namespaces, as a single script.
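You can also poke at a namespace as an ordinary runtime object, without ever switching into it. A minimal sketch, using the invented namespace name demo.alpha:

```clojure
;; Create a namespace and add vars to it from the outside.
(create-ns 'demo.alpha)
(intern 'demo.alpha 'x 123)
(intern 'demo.alpha 'y 987)

;; The namespace is an instance of a runtime class, not a class itself.
(println (class (find-ns 'demo.alpha)))            ; clojure.lang.Namespace

;; It is essentially a mutable map of symbols to vars.
(println (sort (keys (ns-interns 'demo.alpha))))   ; (x y)
(println (deref (ns-resolve 'demo.alpha 'x)))      ; 123
```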

But most (sane) people don't do that. Instead, they create one source file per namespace, store that file in a directory derived from the namespace name, and use the require function to load it (or more often, a :require directive in some other ns declaration).

Loading code from the classpath: require and load

The :require directive is another point of confusion for a Java developer starting Clojure. It certainly looks like the import statement that we already know, especially when it's used in an ns invocation:

(ns example.main
  (:require [example.utils :as utils]))

In reality, :require is almost, but not quite, entirely unlike import. The Java compiler uses import to load definitions from an already-compiled class so that they can be referenced by the class that's currently being compiled. On a superficial level, the Clojure runtime does the same thing when it sees :require, but it does this by loading (and compiling) the source code for that namespace.

OK, there are some caveats to that statement. First is that require only loads a namespace once, unless you specify the :reload option. So if the required namespace has already been loaded, it won't be loaded again. And if the namespace has already been compiled, and the source file is older than the compiled files, then the runtime loads the already compiled form. But still, there's a lot of stuff happening as the result of a seemingly simple directive.

So, let's dig into the behavior of require, along with its step-brother load. Earlier I wrote about using load-file to load an arbitrary file into the REPL. Here's that file, followed by the command to load and run it:

(ns foo)
(def x 123)

(ns bar)
(def x 456)

(ns user)
(do (println "myscript!") (+ foo/x bar/x))

user=> (load-file "src/example/myscript.clj")
myscript!
579

When you load the file, it creates definitions within the two namespaces, then invokes an expression to add them. After loading the file, you can access those variables from the REPL:

user=> (* foo/x bar/x)
56088

The load function is similar, but loads files relative to the classpath. It also assumes a .clj extension. I'm using Leiningen, so my classpath is everything under src; therefore, I can load the same file like so:

user=> (load "example/myscript")
myscript!
nil

Wait a second, what happened to the expression at the end of the script? It was still evaluated — the println executed — but the result was discarded and load returned nil.

Now let's try loading this same script with require:

user=> (require 'example.myscript :reload)
myscript!
nil

Different syntax, same result. The two variables are defined in their respective namespaces, and the stand-alone expression was evaluated. So what's the difference?

The first difference is that require gives you a bunch of options. For example, you can use :as to create a short alias for the namespace, so that you don't have to reference its vars with fully-qualified names. The way that the runtime uses these flags is probably worthy of a post of its own.

Another difference is that require is a little smarter about loading scripts: it only loads (and compiles) a script if it hasn't already done so — unless, of course, you use the :reload or :reload-all options, like I did here. Omitting that option, we see that a second require doesn't invoke the println.

user=> (require 'example.myscript)
nil
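The bookkeeping behind both differences is visible from the REPL. This sketch uses clojure.set only because it ships with Clojure; the alias cs is arbitrary.

```clojure
;; require records the library name so that it isn't loaded twice...
(require '[clojure.set :as cs])
(println (contains? (loaded-libs) 'clojure.set))   ; true

;; ...and :as records an alias in whatever namespace is current.
(println (ns-name (get (ns-aliases *ns*) 'cs)))    ; clojure.set
```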

Compiling your code (or, :gen-class doesn't do what you might think)

As you've seen above, the Clojure runtime normally compiles your code when it's loaded, producing the bytes of a .class file but not writing them to the filesystem. However, there are times that you want a real, on-disk class. For example, so that you can invoke that class from Java (note that you'll still need the Clojure JAR on your classpath). Or so that you can reduce startup time for a Clojure application, by avoiding load-time compilation (although I think this is probably premature optimization).

The compile function turns Clojure scripts into classes:

user=> (compile 'example.foo)
example.foo

That was simple enough. Note, however, that I was running in lein repl, which sets the *compile-path* runtime global to a directory that it knows exists. If you try to execute this function from the clojure.main REPL, it will fail unless you create the directory classes.
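At the plain clojure.main REPL, something like the following might save a step. It's only a sketch: compile-to! is a made-up helper, and it assumes that the namespace's source file can be found on the classpath.

```clojure
(defn compile-to!
  "Hypothetical helper: makes sure out-dir exists, then compiles
  the namespace named by ns-sym into that directory."
  [ns-sym out-dir]
  ;; compile does not create the output directory for you
  (.mkdirs (java.io.File. out-dir))
  (binding [*compile-path* out-dir]
    (compile ns-sym)))

;; Usage, assuming example/foo.clj is on the classpath:
;; (compile-to! 'example.foo "classes")
```

To load the compiled classes in a later session, the output directory itself has to be on the classpath.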

Here's the example file that I compiled:

(ns example.foo)

(def x 123)

(defn what [] "I'm compiled!")

(defn add2 [x] (+ 2 x))

And here are the classes that it produced:

-rw-rw-r--   1 kgregory kgregory     3008 May  7 09:29 target/base+system+user+dev/classes/example/foo__init.class
-rw-rw-r--   1 kgregory kgregory      683 May  7 09:29 target/base+system+user+dev/classes/example/foo$add2.class
-rw-rw-r--   1 kgregory kgregory     1320 May  7 09:29 target/base+system+user+dev/classes/example/foo$fn__1194.class
-rw-rw-r--   1 kgregory kgregory     1503 May  7 09:29 target/base+system+user+dev/classes/example/foo$loading__5569__auto____1192.class
-rw-rw-r--   1 kgregory kgregory      513 May  7 09:29 target/base+system+user+dev/classes/example/foo$what.class

If you're a bytecode geek like me, you'll of course run javap -c on those files to see what they contain (especially fn__1194, which doesn't appear anywhere in the source!). Have at it. For everyone else, here are the two things I think are important:

  • Every function turns into its own class. If you've been reading along, you already knew that.
  • The foo__init class is responsible for pulling all of the other classes into memory, creating instances of those classes, and assigning them to vars in the namespace.

If you use Leiningen, you've probably noted that it adds a :gen-class directive to the main class of any “app” project that it creates. If you skim the docs for gen-class you might think this will produce a Java class that exposes all of your namespace's functions. Let's see what really happens, by adding a :gen-class directive to the example script:

(ns example.foo
  (:gen-class))

When you compile, the list of classes now looks like this:

-rw-rw-r--   1 kgregory kgregory     1823 May  7 09:31 target/base+system+user+dev/classes/example/foo.class
-rw-rw-r--   1 kgregory kgregory     3009 May  7 09:31 target/base+system+user+dev/classes/example/foo__init.class
-rw-rw-r--   1 kgregory kgregory      683 May  7 09:31 target/base+system+user+dev/classes/example/foo$add2.class
-rw-rw-r--   1 kgregory kgregory     1320 May  7 09:31 target/base+system+user+dev/classes/example/foo$fn__1194.class
-rw-rw-r--   1 kgregory kgregory     1505 May  7 09:31 target/base+system+user+dev/classes/example/foo$loading__5569__auto____1192.class
-rw-rw-r--   1 kgregory kgregory      513 May  7 09:31 target/base+system+user+dev/classes/example/foo$what.class

Everything's the same, except that we now have foo.class. Looking at this class with javap, we find that it contains overrides of the basic Object methods: equals(), hashCode(), toString(), and clone(). It also creates a Java-standard main() function, which looks for the Clojure-standard -main (which doesn't exist for our script, so will fail if invoked). But it doesn't expose any of your functions.

Reading the doc more closely, if you want to use :gen-class to expose your functions, you need to specify the exposed functions in the directive itself — and use a specified naming format that separates the Clojure method implementations from the names exposed to Java.
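As a rough sketch of what that looks like for add2 from the example file: the generated class name example.Foo is my assumption, and the leading dash on the implementation function is gen-class's default prefix.

```clojure
(ns example.foo
  (:gen-class
    ;; assumed name for the generated Java class
    :name example.Foo
    ;; [method-name [parameter-types] return-type]; the metadata
    ;; asks for a static method rather than an instance method
    :methods [^:static [add2 [long] long]]))

;; The Java-visible method add2 delegates to the var named -add2.
(defn -add2 [x] (+ 2 x))
```

After compiling, Java code should be able to call something like example.Foo.add2(3L). When the file is simply loaded rather than compiled, the :gen-class directive does nothing and -add2 is just another function.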

Pitfalls of compiling your code

Let's change the namespace declaration on foo, so that it requires bar:

(ns example.foo
  (:require [example.bar :as bar]))

This results in the expected classes for foo, but also several for bar (which doesn't define any functions):

-rw-rw-r--   1 kgregory kgregory     2219 May  7 09:39 target/base+system+user+dev/classes/example/bar__init.class
-rw-rw-r--   1 kgregory kgregory     1320 May  7 09:39 target/base+system+user+dev/classes/example/bar$fn__1196.class
-rw-rw-r--   1 kgregory kgregory     1503 May  7 09:39 target/base+system+user+dev/classes/example/bar$loading__5569__auto____1194.class
-rw-rw-r--   1 kgregory kgregory     3009 May  7 09:39 target/base+system+user+dev/classes/example/foo__init.class
-rw-rw-r--   1 kgregory kgregory      683 May  7 09:39 target/base+system+user+dev/classes/example/foo$add2.class
-rw-rw-r--   1 kgregory kgregory     1320 May  7 09:39 target/base+system+user+dev/classes/example/foo$fn__1198.class
-rw-rw-r--   1 kgregory kgregory     1891 May  7 09:39 target/base+system+user+dev/classes/example/foo$loading__5569__auto____1192.class
-rw-rw-r--   1 kgregory kgregory      513 May  7 09:39 target/base+system+user+dev/classes/example/foo$what.class

This makes perfect sense: if you want to ahead-of-time compile one namespace, you probably don't want its dependencies to be compiled at runtime. But recognize that the tree of dependencies can run very deep, and will include any third-party libraries that you use (poking around Clojars, there aren't a lot of libraries that come precompiled).

There is one other detail of compilation that may cause concern: require loads a namespace from the file(s) with the latest modification time. If you have both source and compiled classes on your classpath, this could mean that you're not loading what you think you are. Fortunately, in practice this primarily affects work in the REPL: Leiningen removes the target directory as part of the jar and uberjar tasks, so you won't produce an artifact with a source/class mismatch.

Wrap-up

This has been a long post, so I'll wrap up with what I consider the main points.

  • Startup times for Clojure applications will be longer than for normal Java applications, because of the additional step of compiling and evaluating each expression. This isn't going to be an issue if you've written a long-running server in Clojure, but it does add significant overhead to short-running programs (so Clojure is even less appropriate for small command-line utilities than Java).
  • Pay attention to the Clojure version used by your dependencies, because they might rely on functions from a newer version than your application; this problem manifests itself as an “Unable to resolve symbol” runtime error. While this is a general issue with transitive dependencies, I've found that third-party libraries tend to be at the latest version, while corporate applications tend to use whatever was current when they were begun.
  • As far as I can tell, the Clojure runtime doesn't ever unload the classes that it creates. This means that — on pre-1.8 JVMs — you can fill the permgen space. Not a big problem in development, but be careful if you use a REPL when connected to a production instance.
  • Every script that you load adds to the global state of the runtime. Be aware that the behavior of your scripts may be dependent on the order that they're loaded.

Saturday, April 30, 2016

Taming Maven: Transitive Dependency Pitfalls

Like much of Maven, transitive dependencies are a huge benefit that brings with them the potential for pain. And while I titled this piece “Taming Maven,” the same issues apply to any build tool that uses the Maven dependency mechanism, including Gradle and Leiningen.

Let's start with definitions: direct dependencies are those listed in the <dependencies> section of your POM. Transitive dependencies are the dependencies needed to support those direct dependencies, recursively. You can display the entire dependency tree with mvn dependency:tree; here's the output for a simple Spring servlet:

[INFO] com.kdgregory.pathfinder:pathfinder-testdata-spring-dispatch-1:war:1.0-SNAPSHOT
[INFO] +- javax.servlet:servlet-api:jar:2.4:provided
[INFO] +- javax.servlet:jstl:jar:1.1.1:compile
[INFO] +- taglibs:standard:jar:1.1.1:compile
[INFO] +- org.springframework:spring-core:jar:3.1.1.RELEASE:compile
[INFO] |  +- org.springframework:spring-asm:jar:3.1.1.RELEASE:compile
[INFO] |  \- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] +- org.springframework:spring-beans:jar:3.1.1.RELEASE:compile
[INFO] +- org.springframework:spring-context:jar:3.1.1.RELEASE:compile
[INFO] |  +- org.springframework:spring-aop:jar:3.1.1.RELEASE:compile
[INFO] |  |  \- aopalliance:aopalliance:jar:1.0:compile
[INFO] |  \- org.springframework:spring-expression:jar:3.1.1.RELEASE:compile
[INFO] +- org.springframework:spring-webmvc:jar:3.1.1.RELEASE:compile
[INFO] |  +- org.springframework:spring-context-support:jar:3.1.1.RELEASE:compile
[INFO] |  \- org.springframework:spring-web:jar:3.1.1.RELEASE:compile
[INFO] \- junit:junit:jar:4.10:test
[INFO]    \- org.hamcrest:hamcrest-core:jar:1.1:test

The direct dependencies of this project include servlet-api version 2.4 and spring-core version 3.1.1.RELEASE. The latter has its own dependencies on spring-asm and commons-logging, which become transitive dependencies of the project.

In a real-world application, the dependency tree may include hundreds of JAR files with many levels of transitive dependencies. And it's not a simple tree, but a directed acyclic graph: many JARs will share the same dependencies, although possibly with differing versions.

So, how does this cause you pain?

The first (and easiest to resolve) pain is that you might end up with dependencies that you don't want. For example, commons-logging. I don't subscribe to the fear that commons-logging causes memory leaks, but I also use SLF4J, and don't want two logging facades in my application. Fortunately, it's (relatively) easy to exclude individual dependencies, as I described in a previous “Taming Maven” post.

The second pain point, harder to resolve, is what, exactly, is the classpath?

A project's dependency tree is the project's classpath. Actually, “the” classpath is a bit misleading: there are separate classpaths for build, test, and runtime, depending on the <scope> specifications in the POM(s). Each plugin can define its own classpath, and some provide a goal that lets you see the classpath they use; mvn dependency:build-classpath will show you the classpath used to compile your code.

This tool lists dependencies in alphabetical order. But if you look at a generated WAR, they're in a different order (which seems to bear no relationship to how they're listed in the POM). If you're using a “shaded” JAR, you'll get a different order. Worse, since a shaded JAR flattens all classes into a single tree, you might end up with one JAR that overwrites classes from another (for example, SLF4J provides the jcl-over-slf4j artifact, which contains re-implemented classes from commons-logging).

Compounding classpath ordering, there is the possibility of version conflicts. This isn't an issue for the simple example above, but for real-world applications that have deep dependency trees, there are bound to be cases where dependencies-of-dependencies have different versions. For example, the Jenkins CI server has four different versions of commons-collections in its dependency tree, ranging from 2.1 to 3.2.1 — along with 20 other version conflicts.

Maven has rules for resolving such conflicts. The only one that matters is that direct dependencies take precedence over transitive. Yes, there are other rules regarding depth of transitive dependencies and ordering, but those are only valid to discover why you're getting the wrong version; they won't help you fix the problem.
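Finding the conflicts in the first place is easy to script. The sketch below is in Clojure; it takes lines containing five-part coordinates in the group:artifact:type:version:scope form shown in the tree above and reports every artifact that appears with more than one version. The function names and the sample lines are invented, and coordinates that include a classifier are not handled.

```clojure
;; Matches group:artifact:type:version:scope at the end of a line.
(def coordinate-pattern
  #"([\w.\-]+):([\w.\-]+):[\w.\-]+:([\w.\-]+):\w+\s*$")

(defn parse-coordinate
  "Returns {:id \"group:artifact\" :version \"...\"}, or nil if the
  line does not end with a five-part coordinate."
  [line]
  (when-let [[_ group artifact version] (re-find coordinate-pattern line)]
    {:id (str group ":" artifact) :version version}))

(defn version-conflicts
  "Maps each artifact id to its set of versions, keeping only those
  artifacts that appear with more than one version."
  [lines]
  (->> lines
       (keep parse-coordinate)
       (group-by :id)
       (keep (fn [[id entries]]
               (let [versions (set (map :version entries))]
                 (when (> (count versions) 1)
                   [id versions]))))
       (into {})))

;; Made-up input, loosely modeled on the tree format above.
(def sample-lines
  ["[INFO] +- commons-collections:commons-collections:jar:2.1:compile"
   "[INFO] +- commons-collections:commons-collections:jar:3.2.1:compile"
   "[INFO] +- junit:junit:jar:4.10:test"])

(println (version-conflicts sample-lines))
```

This only tells you that a conflict exists; deciding which version to lock down is still up to you.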

The only sure fix is to lock down the version, either via a direct dependency, or a dependency-management section. This, however, carries its own risk: if one of your transitive dependencies requires a newer version than the one you've chosen, you'll have to update your POM. And, let's be honest, the whole point of transitive dependencies was to keep you from explicitly tracking every dependency that your app needs, so this solution is decidedly sub-optimal.

A final problem — and the one that I consider the most insidious — is directly relying on a transitive dependency.

As an example, I'm going to use the excellent XML manipulation library known as Practical XML. This library makes use of the equally excellent utility library KDGCommons. Having discovered the former, you might also start using the latter — deciding, for example, that its implementation of parallel map is far superior to others.

However, if you never updated your POM with a direct reference to KDGCommons, then when the author of PracticalXML decides that he can use functions from Jakarta commons-lang rather than KDGCommons, you've got a problem. Specifically, your build breaks, because the transitive dependency has disappeared.

You might think that this is an uncommon situation, but it was actually what prompted this post: a colleague changed one of his application's direct dependencies, and his build started failing. After comparing dependencies between the old and new versions we discovered a transitive dependency that disappeared. Adding it back as a direct dependency fixed the build.

To wrap up, here are the important take-aways:

  • Pay attention to transitive dependency versions: whenever you change your direct dependencies, you should run mvn dependency:tree to see what's changed with your transitives. Pay particular attention to transitives that are omitted due to version conflicts.
  • If your code calls it, it should be a direct dependency. Plugging another of my creations, the PomUtil dependency tool can help you discover those.