
Friday, November 20, 2015

When Immutability Gets In The Way

Here's a recent hack that I perpetrated:

(defn- redact-headers-from-exception-info
  [ei]
  (if-not (get-in (ex-data ei) [:request :headers])
    ei  ; nothing to redact; return the original exception
    (let [orig-trace (.getStackTrace ei)]
      (doto (ex-info (.getMessage ei)
                     (update-in (ex-data ei) [:request :headers] dissoc "Authorization")
                     (.getCause ei))
            (.setStackTrace orig-trace)))))

Clojure provides the ex-info function, which creates a RuntimeException that has an attached map of data. This is quite useful, as you can provide detailed information at the exception site without creating your own exception subclass.

However, it can provide too much information. In this case, the exception was being thrown from a web-service call, and, to assist debugging, contained the entire request. Including the Authorization header, which contains client credentials. Not something that you want to write to a logfile.

And here's how immutability got in our way. Clojure maps are immutable: you can't update them in-place. And the ExceptionInfo object doesn't provide any way to replace the map. So we were faced with a choice: either pass the exception to the logger and have sensitive data end up in the log; or extract the map from the exception, redact the sensitive data, but lose the stack trace. I don't know about you, but I consider stack traces to be one of the most important clues to debugging errors; we make web-service calls all over the place, and need to know which one is failing.

Fortunately, java.lang.Throwable is not immutable: it provides methods to get and set the stack trace. And thus the hack: I extract the stack trace and data map from the existing exception, remove the fields that I don't want to log, and create a new exception from those components. Then I explicitly replace the stack trace in the new exception. It's ugly, but it works: we log the information we need to debug the problem, but not the sensitive information that was present in the original info map.

This post is not intended as a slam on Clojure, or on immutability in general. Instead, it's a reminder that a rigid adherence to immutability can have unintended consequences.

Monday, June 16, 2014

Finalizers

I recently saw a comment on my reference objects article that said, in effect, “he mentioned finalizers but didn't say DON'T USE THEM EVER, so I stopped reading.” The capitalized words are taken exactly from the original comment; the rest is a paraphrase to tone down the rhetoric. The comment didn't bother me — if you stop reading so easily, you probably won't gain much from my articles anyway — but the attitude did. Because finalizers do indeed have a role to play.

They compensate for sloppy programming.

Good Java programmers know that non-memory resource allocation has to take place in a try / catch / finally construct. Just like good C programmers know that every malloc() must be matched by a free(). Sloppy programmers either don't know or don't care. And good programmers sometimes forget.
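For concreteness, here's the pattern in question, as a minimal sketch; the filename and the process() method are placeholders:

import java.io.FileInputStream;
import java.io.IOException;

public class CleanupExample
{
    public static void main(String[] argv) throws IOException
    {
        FileInputStream in = new FileInputStream("data.bin");   // placeholder filename
        try
        {
            process(in);
        }
        finally
        {
            in.close();     // runs whether or not process() throws
        }
    }

    private static void process(FileInputStream in) throws IOException
    {
        // stand-in for real work
    }
}

Since Java 7, a try-with-resources block expresses the same idea more compactly, but the principle is unchanged.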

The question then becomes: what should the program do?

One alternative is to just explode. And this is not necessarily a bad thing: not closing your resources is a bug, and it's better to expose bugs early in the development process. In this view, forgetting to close your resource is no different than dereferencing a null pointer.

The problem with this view is that null pointer exceptions tend to show up early, the first time through the affected code. By comparison, leaked resources tend to hide until production, because you never generate sufficient load during development. Leaked file handles are a great example: a typical developer environment allows 4,096 open files. It will take a long time to run out of them, especially if you constantly restart the app. A typical server environment might allow two or four times that many, but if you never shut down the app you'll eventually run out. Probably at your time of highest load.
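If you want to see this failure mode in action, a sketch like the following (assuming a Linux box, where /etc/hosts is a convenient file to open) will hit the per-process limit in seconds:

import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;

public class HandleLeak
{
    public static void main(String[] argv) throws Exception
    {
        List<FileInputStream> leaked = new ArrayList<FileInputStream>();
        while (true)
        {
            // opened but never closed; the list keeps each stream strongly
            // referenced, so the garbage collector can't save us
            leaked.add(new FileInputStream("/etc/hosts"));
            if (leaked.size() % 500 == 0)
                System.out.println("open handles: " + leaked.size());
        }
        // eventually dies with: java.io.FileNotFoundException: /etc/hosts
        // (Too many open files)
    }
}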

And that leads to an alternate solution: protect the programmers from themselves, by checking for open resources at the time the owning object is garbage collected. This isn't perfect: in a server with lots of memory, the garbage collection interval might exceed the time taken to exhaust whatever resource you're leaking.*

Once you accept the need for some alternate way to clean up non-memory resources, the only remaining question is how. Java provides two mechanisms: finalizers and (since 1.2) phantom references.

So which should you use? Before answering that question, I want to point out something to the DON'T USE THEM EVER crowd: finalizers and phantom references are invoked under exactly the same conditions: when the garbage collector decides that an object is eligible for collection. The difference is what happens afterward. With phantom references the object gets collected right away, and the reference is put on a queue for later processing by the application. With finalizers, the object is passed to a JVM-managed thread that runs the finalize() method; actual collection happens once the finalizer runs.
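Here's a minimal sketch of those mechanics, with illustrative names; note that the application, not the JVM, is responsible for servicing the queue:

import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomDemo
{
    public static void main(String[] argv) throws InterruptedException
    {
        ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
        Object resource = new Object();

        // the application must keep a strong reference to the PhantomReference
        // itself, or it becomes garbage along with its referent
        PhantomReference<Object> ref = new PhantomReference<Object>(resource, queue);

        resource = null;    // drop the last strong reference

        // the collector enqueues the reference once it decides the object is
        // phantom-reachable; polling the queue is the application's job
        Reference<?> dequeued = null;
        for (int ii = 0 ; ii < 10 && dequeued == null ; ii++)
        {
            System.gc();
            dequeued = queue.remove(500);
        }

        if (dequeued == ref)
            System.out.println("referent collected; cleanup code would run here");
    }
}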

Given the choice between a finalizer and a phantom reference, I will pick the finalizer 95% of the time. Phantom references require a lot of work to implement correctly. In the normal case, the only real benefit they provide is to avoid out-of-memory errors from large objects with long-running finalizers.**

A better answer is to avoid both. Far too many people think of finalizers as equivalent to C++ destructors, a place to put important cleanup code such as transaction commit/rollback. This is simply wrong. But it's just as wrong to invoke that code in a phantom reference.

Rather than dogmatically insist “no finalizers!”, I think the better approach is to adopt some development practices to prevent their invocation. One approach is to identify resources that are allocated but not released, using an analysis tool like FindBugs.

But if you want to recover from mistakes in the field, and can do so quickly and cleanly (so that you don't block the finalization thread), don't be afraid of finalizers.
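For illustration, here's a sketch of such a safety-net finalizer: it logs the leak so it can be fixed, cleans up quickly, and doesn't block the finalization thread. The class is hypothetical, and finalize() has since been deprecated; this was the idiom when this post was written.

import java.io.Closeable;
import java.io.FileInputStream;
import java.io.IOException;

public class TrackedStream implements Closeable
{
    private FileInputStream in;

    public TrackedStream(String path) throws IOException
    {
        in = new FileInputStream(path);
    }

    @Override
    public synchronized void close() throws IOException
    {
        if (in != null)
        {
            in.close();
            in = null;
        }
    }

    @Override
    protected void finalize() throws Throwable
    {
        // a backstop, not a replacement for close(): report the leak,
        // clean up quickly, and let the finalization thread move on
        if (in != null)
        {
            System.err.println("TrackedStream leaked; closing in finalizer");
            close();
        }
        super.finalize();
    }
}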


* This is unlikely: if you're running through resources quickly, the associated objects will almost certainly be collected while still in the young generation. At this point, a pedantic pinhead will say “but the JLS doesn't guarantee that the garbage collector will ever run.” My best response to that: “well, then you've got bigger problems.”

** A bigger benefit, and this is the 5% where I would use them, is to provide leaked-object tracking. For example, in a database connection pool. But I don't consider that the “normal” case.

Monday, June 2, 2014

Is that SSD Really Helping Your Build Times?

Update: I ran similar tests with full-disk encryption.

As developers, we want the biggest bad-ass machine that we can get, because waiting for the computer is so last century. And part of a bad-ass machine is having a solid-state drive, with sub-millisecond latency. Spinning platters covered in rust are not just last-century, they're reminiscent of the industrial age. But do we really benefit from an SSD?

This post emerged from a conversation with a co-worker: he was surprised that I encrypted my home directory, because of the penalty it caused to disk performance. My response was that I expected most of my files to be living in RAM, unencrypted, in the disk buffer. That led to a discussion about whether an SSD provided any significant benefit, given enough RAM to keep your workspace in the buffer cache. Turns out I was wrong about unencrypted data in the cache, but not about the SSD.

I was confident about the latter because a year ago, when I built my then-seriously-badass home computer (32 GB RAM — because I could), I ran some performance comparisons against my then-seriously-pathetic vintage 2002 machine. The new machine blew away the old, but much of the performance gain seemed to come from CPU-related items: faster clock speed, faster memory, huge L1 and L2 caches, and so on. Once CPU time was deducted, the difference between spinning rust and SSD wasn't that big.

I started to write a post at that time, but went down a rathole of trying to create a C program that could highlight the difference in L1/L2 cache. Then the old machine suffered an “accident,” and that was the end of the experiments.

Now, however, I have a far simpler task: quantify the difference that an SSD makes to a developer's workload. Which can be rephrased as “will buying an SSD speed up my compile times?” This is particularly important to me right now, because I'm on a project where single-module compiles take around a minute, and full builds are over 30.

Here's the experimental protocol:

Hardware:

  • Thinkpad W520: 4-core Intel Core i7-2860QM CPU @ 2.50GHz, 800MHz FSB. A bad-ass laptop when I got it (I just wish it wasn't so heavy).
  • 8 GB RAM, 8 MB L2 cache
  • Intel “320 Series” SSD, 160 GB, formatted as ext4. This is not a terribly fast drive, but with an average access time of 0.2 ms, and an average read rate of 270 MB/sec (as measured by the Gnome disk utility), it blows away anything with a platter.
  • Western Digital WD2500BMVU, 250 GB, 5400 RPM, formatted as ext4, accessed via USB 2.0. This is a spare backup drive; I don't think I own anything slower unless I were to reformat an old Apple SCSI drive and run it over a USB-SCSI connector (and yes, I have both). Average access time: 17.0 ms; average read rate: 35 MB/sec.
  • Xubuntu 12.04, 3.2.0-63-generic #95-Ubuntu SMP.

Workload:

  • Spring Framework 3.2.0.RELEASE. A large Java project with lots of dependencies, this should be the most disk-intensive of the three sample workloads. The build script is Gradle, which downloads and caches all dependencies.*
  • Scala 2.11.1. I'm currently working on a Scala project, and the Scala compiler itself seemed like a good sample. The main difference between Scala and Java (from a workload perspective) is that the Scala compiler does a lot more CPU-intensive work; in the office I can tell who's compiling because their CPU fan sounds like a jet engine spooling up. The build script is Ant, using Ivy to download and cache dependencies.**
  • GNU C Compiler 4.8.3. Added because not everyone uses the JVM. I didn't look closely at the makefile, but I'll assume that it has optimization turned up. Disk operations should be confined to reading source files, and repeated reads of header files.

Test conditions:

General configuration:

  • Each test is conducted as a distinct user, with its own home directory, to ensure that there aren't unexpected cross-filesystem accesses.
  • Each build is run once to configure (gcc) and/or download dependencies.
  • Timed builds are run from a normal desktop environment, but without any other user programs (e.g., browser) active.
  • Timed builds run with network (wifi and wired) disconnected.
  • The Spring and Scala times are an average of three runs. The gcc time is from a single run (I didn't have the patience to do repeated multi-hour builds, just to improve accuracy by a few seconds).

Per-test sequence:

  • Clean build directory (depends on build tool).
  • Sync any dirty blocks to disk (sync).
  • SSD TRIM (fstrim -v /)
  • Clear buffer cache (echo 3 > /proc/sys/vm/drop_caches)
  • Execute build, using time.

And now, the results. Each entry in the table contains the output from the Unix time command, formatted real / user / sys. I've converted all times to seconds, and rounded to the nearest second. The only number that really matters is the first, “real”: it's the time that you have to wait until the build is done. “User” is user-mode CPU time; it's primarily of interest as a measure of how parallel your build is (note that the JVM-based builds are parallel, the gcc build isn't). “Sys” is kernel-mode CPU time; it's included mostly for completeness, but notice the difference between encrypted and non-encrypted builds.

                        Spring Framework     Scala                GCC
Unencrypted SSD         273 / 527 / 10       471 / 1039 / 13      6355 / 5608 / 311
Encrypted SSD           303 / 534 / 38       491 / 1039 / 29      6558 / 5682 / 400
USB Hard Drive          304 / 525 / 11       477 / 1035 / 14      6462 / 5612 / 311
Encryption Penalty      11 %                 4 %                  3 %
Spinning Rust Penalty   11 %                 1 %                  2 %

Do the numbers surprise you? I have to admit, they surprised me: I didn't realize that the penalty for encryption was quite so high. I haven't investigated, but it appears that ecryptfs, as a FUSE filesystem, does not maintain decrypted block buffers. Instead, the buffered data is encrypted and has to be decrypted on access. This explains the significantly higher sys numbers. Of course, losing my laptop with unencrypted client data has its own penalty, so I'm willing to pay the encryption tax.

As for the difference between the SSD and hard drive: if you look at your drive indicator light while compiling, you'll see that it really doesn't flash much. Most of a compiler's work is manipulating data in-memory, not reading and writing. So the benefit that you'll get from those sub-millisecond access times is just noise.

On the other hand, if you're doing data analysis with large datasets, I expect the numbers would look very different. I would have killed for an SSD 3-4 years ago, when I was working with files that were tens of gigabytes in length (and using a 32 GB Java heap to process them).

Finally, to borrow an adage from drag racers: there's no substitute for RAM. With 8 GB, my machine can spare a lot of memory for the buffer cache: free indicated 750 MB after the Scala build, and several gigabytes after the gcc build. Each block in the cache is a block that doesn't have to be read from the disk, and developers tend to hit the same blocks over and over again: source code, the compiler executable, and libraries. If you have enough RAM, you could conceivably load your entire development environment with the first build Monday morning, and not have to reload it all week.

At least, that's what I told myself to justify 32 GB in my home computer.


* I picked this particular version tag because it mostly builds: it fails while building spring-context, due to a missing dependency. However, it spends enough time up to that point that I consider it a reasonable example of a “large Java app.” I also tried building the latest 3.X tag, but it fails right away due to a too-long classname. That may be due to the version of Groovy that I have installed, but this experience has shaken my faith in the Spring framework as a whole.

** Scala has issues with long classnames as well, which means that the build will crash if you run it as-is on an encrypted filesystem (because encryption makes filenames longer). Fortunately, there's an option to tell the compiler to use shorter names: ant -Dscalac.args='-Xmax-classfile-name 140'

What if you don't have a lot of memory to spare? I also tried running the Spring build on my anemic-when-I-bought-it netbook: an Intel Atom N450 @ 1.66 GHz with 1 GB of RAM and a 512 KB L2 cache. The stock hard drive is a Fujitsu MJA2250BH: 250 GB, 5400 RPM, an average read rate of 72 MB/sec, and an average access time of 18 ms. I also have a Samsung 840 SSD that I bought when I realized just how anemic this machine was, thinking that, if nothing else, it would act as a fast swap device. However, it doesn't help much: with the stock hard drive, Spring builds in 44 minutes; with the SSD, 39. An 11% improvement, but damn! If you look at the specs, that netbook is a more powerful machine than a Cray-1. But it's completely unusable as a modern development platform.

Monday, October 7, 2013

The Big Myth of Functional Programming: Immutability Solves All Problems

I'm not against immutability. Indeed, I think that many concurrency problems can be eliminated by making data structures immutable. Most of those problems, however, are caused because their developers never really thought about concurrent access. And while switching to immutable data structures may solve some problems, it creates others — and in my opinion, those others are much more difficult to debug.

The underlying problem is that the whole point of a program is to modify state. A program that takes no inputs and produces no outputs is worthless, except as a way to turn an expensive CPU into a space heater.

But if the purpose of a program is to modify state, how does it hold the state that's being modified? A “pure” functional program must use the arguments and local variables of its functions. There is a top-level function, which creates the program's initial state, passes that initial state to functions, and gets back the final state.
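In code, that shape looks something like this sketch (illustrative, not from any particular program):

public class PureProgram
{
    // immutable state: every "mutation" returns a new instance
    private static class State
    {
        final int counter;
        State(int counter) { this.counter = counter; }
        State increment()  { return new State(counter + 1); }
    }

    private static State step(State s)
    {
        return s.increment();
    }

    public static void main(String[] argv)
    {
        State initial = new State(0);                   // create the initial state
        State finalState = step(step(step(initial)));   // thread it through functions
        System.out.println(finalState.counter);         // get back the final state: 3
    }
}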

And if your program has a single thread, or multiple independent threads, that's all you need to do (I also like the idea of decomposition into a tree of independent threads). But if your program consists of communicating threads, the purely functional, immutable data approach is not sufficient: different threads will hold different representations of what they consider the same data. Also known as a race condition.

The easiest way to solve such problems is to introduce a well-designed concurrent data structure (such as Java's ConcurrentHashMap) that is shared between threads. This data structure holds the canonical view of shared state: each thread reads the map whenever it needs to know the latest data. However, a concurrent map by itself isn't sufficient to solve race conditions: it guarantees atomic puts and gets, but not updates.
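Here's a sketch of that update problem, using a shared counter map; incrementBroken() shows the race, and increment() shows the retry loop needed to close it (names are illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CounterExample
{
    private final ConcurrentMap<String,Integer> counts = new ConcurrentHashMap<String,Integer>();

    // BROKEN: get() and put() are each atomic, but the read-modify-write
    // sequence is not; two threads can read the same value and lose an update
    public void incrementBroken(String key)
    {
        Integer current = counts.get(key);
        counts.put(key, (current == null) ? 1 : current + 1);
    }

    // correct: retry until the map's atomic primitives accept the update
    public void increment(String key)
    {
        while (true)
        {
            Integer current = counts.get(key);
            if (current == null)
            {
                if (counts.putIfAbsent(key, 1) == null)
                    return;
            }
            else if (counts.replace(key, current, current + 1))
            {
                return;
            }
        }
    }
}

In Java 8 and later, counts.merge(key, 1, Integer::sum) performs this retry atomically, but the loop shows what the map has to do under the covers.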

A better alternative, in my opinion, is to follow the Go mantra of “share by communicating, rather than communicate by sharing.” In other words, wrap your shared state in a data structure that has a message queue between it and the rest of the program. The rest of your program appears to be fully functional: mutations are just function invocations, and you can choose to implement your shared data using immutable objects.
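A minimal sketch of that structure, with illustrative names: a single thread owns a private map, and every read or write arrives as a message on its queue.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class StateOwner implements Runnable
{
    private interface Message {}

    private static class Put implements Message
    {
        final String key;
        final String value;
        Put(String key, String value) { this.key = key; this.value = value; }
    }

    private static class Get implements Message
    {
        final String key;
        final SynchronousQueue<String> reply = new SynchronousQueue<String>();
        Get(String key) { this.key = key; }
    }

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<Message>();
    private final Map<String,String> state = new HashMap<String,String>();  // touched only by the owner thread

    public void run()
    {
        try
        {
            while (true)
            {
                Message message = mailbox.take();
                if (message instanceof Put)
                {
                    Put put = (Put)message;
                    state.put(put.key, put.value);
                }
                else if (message instanceof Get)
                {
                    Get get = (Get)message;
                    String value = state.get(get.key);
                    get.reply.put((value != null) ? value : "");    // queues reject null
                }
            }
        }
        catch (InterruptedException ex)
        {
            // shutdown requested
        }
    }

    // from the caller's perspective, these are ordinary function calls
    public void put(String key, String value) throws InterruptedException
    {
        mailbox.put(new Put(key, value));
    }

    public String get(String key) throws InterruptedException
    {
        Get message = new Get(key);
        mailbox.put(message);
        return message.reply.take();
    }

    public static void main(String[] argv) throws InterruptedException
    {
        StateOwner owner = new StateOwner();
        Thread thread = new Thread(owner);
        thread.setDaemon(true);
        thread.start();

        owner.put("greeting", "hello");
        System.out.println(owner.get("greeting"));      // hello
    }
}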

This approach doesn't completely eliminate races: there is still a race between writes and reads (also known as “stale data”). However, there is no way to eliminate that particular race. No matter how hard you try, you can never guarantee that multiple independent threads have a consistent view of the world.

But stale data in one part of your program is far better than missing data due to an unrealistic expectation of what immutable data structures give you.

Monday, October 8, 2012

Scriptlets Are Not (Inherently) Evil

JSP scriptlets have a bad reputation; everyone can point to their favorite example of scriptlet abuse. In my case, it was a 12,000 line (yes, three zeros) monstrosity that held a “mere” 1,000 lines or so of markup. The rest was a big if-else construct with plenty of embedded logic. But that example, and ones like it, are examples of bad programming, not indictments of scriptlets as a programming medium. They're no different than Java classes that have a single method with thousands of lines.

Monstrosities aside, I don't think there's a valid argument against scriptlets. Actually, when I Googled for “why are scriptlets bad,” most of the top-ranked pages defended them. As far as I can tell, the main arguments are that scriptlets encourage bad programmers to put business logic in the page, that they make the page untestable, that they limit reuse, and that web developers won't understand pages with embedded Java code. All of which seem to me like red, rotted herrings.

Don't get me wrong, I believe in the separation of concerns that underlies the MVC model. Actually, I believe in the separation of concerns found in what I call the “CSV” model: a lightweight Controller that interacts with business logic via a Service layer, and passes any returned data to a View for rendering. But after working with several alternative technologies and languages, I'm convinced that scriptlets are the best way to implement any programmatic constructs involved in view rendering.

And some amount of programmatic rendering resides in almost every view. One common example is populating a <select> element. On the surface this is an easy task: iterate over a list of values, and emit <option> elements for each. In the real world, it's more complex: the option value will come from a different field than the option text, you probably have a (possibly null) value that should be selected, and maybe you'll decorate different options with different classes or IDs. To handle this, you need a language that, if not Turing-complete, is very close.

I'm going to work through just such an example, comparing scriptlets and JSTL. Both examples use the same Spring controller, which stores two values in the request context: a list of employees in allEmployees, and the desired selection in selectedEmployee (which may be null). If you want to walk through the code, it's available here, and builds with Maven.

First up is the JSTL version. Nothing complex here: a forEach over the list of employees, and some EL to extract the bean values. Perhaps the worst thing about this code is that the HTML it generates looks ugly: line breaks carry through from the source JSP, so the <option> element gets split over three lines. You can always tell a JSP-generated page by the amount of incongruous whitespace it contains (if you're security-conscious, that might be an issue).

<select name="employeeSelector">
<c:forEach var="emp" items="${allEmployees}">
    <option value="${emp.id}"
        <c:if test="${!(empty selectedEmployee) and (selectedEmployee eq emp)}"> selected </c:if>
    >${emp}</option>
</c:forEach>
</select>

Now for the scriptlet version. It's a few lines longer than the JSTL version, partly because I have to retrieve the two variables from the request context (something EL does for me automatically). I also chose to create a temporary variable to hold the “selected” flag; I could have used an inline “if” that matched the JSTL version, or I could have changed the JSTL version to use a temporary variable; I find that I have different programming styles depending on language.

<select name="employeeSelector">
    <%
    List<Employee> allEmployees = (List<Employee>)request.getAttribute(Constants.PARAM_ALL_EMPLOYEES);
    Employee selectedEmployee  = (Employee)request.getAttribute(Constants.PARAM_SELECTED_EMPLOYEE);
    for (Employee employee : allEmployees)
    {
        String selected = ObjectUtils.equals(employee, selectedEmployee) ? "selected" : "";
        %>
        <option value="<%=employee.getId()%>" <%=selected%>> <%=employee%> </option>
        <%
    }
    %>
</select>

Even though the scriptlet version has more lines of code, it seems cleaner to me: the code and markup are clearly delineated. Perhaps there's a web developer who will be confused by a Java “for” statement in the middle of the page, but is the JSTL “forEach” any better? Assuming that your company separates the roles of web developer and Java developer, it seems like a better approach to say “anything with angle brackets is yours, everything else is mine.”

Putting subjective measures aside, I think there are a few objective reasons that the scriptlet is better than JSTL. The first of these is how I retrieve data from the request context: I use constants to identify the objects, the same constants that I use in the controller to store them. While I have no inherent opposition to duck typing, there's a certain comfort level from knowing that my page won't compile if I misspell a parameter. Sometimes those misspellings are obvious, but if you happen to write your JSTL with “selectedEmploye” it will take rigorous testing (either manual or automated) to find the problem.

Another benefit to using scriptlets is that you can transparently call out to Java. Here, for example, I use Jakarta Commons to do a null-safe equality check.

A better example would be formatting: here I rely on Employee.toString() to produce something reasonable (and if you look at the example you may question my choice). But let's say the users want to see a different format, like “Last, First”; or they want different formats in different places. In JSTL, you'll be required to manually extract and format the fields, and that code will be copy/pasted everywhere that you want to display the employee name. Then you'll have to track down all of that copy/paste code when the users inevitably change their mind about what they want to see. Or you could add different formatting functions to the Employee object itself, breaking the separation of model and view. With a scriptlet, you can call a method like EmployeeFormattingUtils.selectionName().
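Such a utility might look like the following sketch; the class and method names come from the paragraph above, while the Employee accessors are assumptions:

public class EmployeeFormattingUtils
{
    // one place to change when the users change their minds about format;
    // getLastName()/getFirstName() are assumed accessors, not from the example project
    public static String selectionName(Employee employee)
    {
        if (employee == null)
            return "";
        return employee.getLastName() + ", " + employee.getFirstName();
    }
}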

OK, before I get comments, I should note that there are easier ways to implement the JSTL version, using third-party tag libraries. The Spring Form Tags, for example, would reduce my code to a single line, assuming that my controller followed the Spring “command” pattern. But those libraries limit you to common cases, restricting your ability to customize your HTML. And although you can write your own taglibs, I haven't seen many projects that do that (in fact, the only ones I've seen are ones where I did it).

I want to finish with a punchline: everything I've just written is moot, because the web-app world is transitioning from server-side view rendering to client-side, using technologies such as Backbone and Spine. That change will come with its own perils, and history tends to repeat itself. I worry that five years from now we'll be dealing with multi-thousand-line JavaScript monstrosities, mixing business logic with view rendering.

Tuesday, September 29, 2009

Intern Isn't Forever ... And Maybe Never Was

Today I managed to hit myself over the head with a cluebat, proof that there's always something to learn and that you should never accept dogma. In this case, the dogma was that interned strings never get garbage-collected. Like all good dogma, it combined a few facts with a leap of faith, and was plausible enough that I never challenged it.

The facts first (actually, only one fact): interned strings are stored in the same pool as literal strings. This is explicitly stated in the JavaDoc for String.intern(), and can be demonstrated with the following code:

String a = "Are we having fun yet?";
String b = new String(a);
System.out.println(a == b);     // false: b is a distinct object with the same contents

String c = b.intern();
System.out.println(a == c);     // true: intern() returns the instance from the pool

And now the leap of faith: the JVM doesn't clean up the constant pool. Seems plausible: after all, two literal strings are guaranteed to be the same. And intern() is a native method, so it must be doing something tricky behind the scenes. And everybody else says you'll cause bugs if you intern too many strings, so …

A skeptic might ask “how can you tell that two string literals aren't the same if you don't have references to both?” I even said as much when I wrote about canonicalizing maps (an article that got some edits today). Once all references to a string go out of scope (including any references within a class definition), then there's no need to keep that string in the pool. And the JVM doesn't — at least, the Sun JVM doesn't.

This particular cluebat entered the picture because I'm currently writing an article on out-of-memory exceptions, and wanted a program to demonstrate permgen failures. So I wrote a loop that interned big, random-content strings … and nothing happened. I killed the program after I finally realized that it wasn't going to die on its own.

But the dogma must have some basis in fact, right? It happens that I have a machine with Sun JVMs from version 1.2 on up. So I ran my test program on each revision, and while the -verbose:gc output changed, the result did not: all of these versions appear to clean up the string pool. Is it possible that a 1.1 release is the source of this dogma? Perhaps, and if someone still has one installed, here's the program:

public class InternExhaustion
{
    public static void main(String[] argv)
    throws Exception
    {
        while (true)
        {
            // each interned string becomes garbage immediately; if the pool
            // were never collected, this loop would quickly exhaust the permgen
            String str = generateRandomString(65536);
            str.intern();
        }
    }    
    
    private static String generateRandomString(int length)
    {
        char[] chars = new char[length];
        for (int ii = 0 ; ii < length ; ii++)
            chars[ii] = (char)(96 * Math.random() + ' ');
        return new String(chars);
    }
}

For myself, I have some edits to make. And a lump on the head to remind me to question dogma.