
Friday, August 14, 2015

The Carburetor: Elegance and the Real World

The carburetor is a simple, elegant device for providing fuel to a gasoline engine. So simple and elegant that it was the primary mechanism for doing so for approximately 100 years, and remains the primary mechanism for small engines such as lawnmowers.

Here's a simplified explanation of how it works: all air entering the engine flows through the carburetor; the “jet,” a small tube connected to the supply of gasoline, is exposed to the airflow; as air flows through the carburetor body, it pulls gasoline out of the jet and vaporizes it. The more air flowing past the jet, the more fuel is drawn out of it. By properly sizing the jet, you can get a near-optimal fuel-air mixture — more on that later.

The airflow through a carburetor is caused by the intake stroke of the engine: as the piston moves down in the cylinder, it creates a vacuum that pulls air from the intake manifold; the faster the engine turns, the more air it wants (and the more fuel it needs). Now here's the interesting part: there's a valve inside the carburetor, attached to the accelerator pedal: when your foot is off the pedal, that valve is (almost) closed; the engine can only get enough air (and fuel) to run at idle speed. When you push the pedal to the floor, the valve opens wide, and the engine gets all the air and fuel that it wants.

But that leads to the first problem with a carbureted engine: the feedback loop (and lag) between opening the throttle and actually going faster. At a normal cruise speed, the butterfly valve is only partway open; the engine doesn't get all the air that it can handle. When you open the throttle, that lets more air into the system, and indirectly, more fuel. More fuel means more power generated by the engine, which (given constant load) means that the engine will speed up. Which increases the vacuum, which means that more air is drawn into the system, which means that more fuel is provided with it, which …

But, as I said, there's lag: when you first allow more air into the system, the flow of gasoline doesn't keep up. So at some point carburetor designers added an “accelerator pump”: an actual pump that squirts extra gasoline into the airflow. Most of the time it does nothing, except when you push the pedal to the floor; then it sends that extra gasoline to the engine, to compensate for the lag in flow through the main jet.

While the single carburetor works fairly well for small engines, it doesn't work so well for large engines that have widely differing airflow requirements between idle and full throttle. So, carburetor designers compensated with multiple barrels: separate paths for the air to follow. At idle the “primaries” provide all the air and fuel; under full acceleration, the “secondaries” open to provide vastly more air and fuel.

All of which works fairly well as long as the altitude doesn't change. A typical automotive carburetor is sized for its market, on the fairly reasonable assumption that drivers stay close to home. But take an east coast car, tuned for driving at sea level, up into the mountains of Colorado, and the mixture becomes excessively rich: the carburetor allows too much fuel to mix with the air, and (perhaps surprisingly) performance suffers. Eventually the spark plugs will get coated with a layer of soot. A worse fate awaits the Colorado car driven to the shore: its carburetor doesn't provide enough fuel, which causes the engine to run hot, which eventually causes extensive (and expensive) damage to the engine's valves and pistons.

Light aircraft, which cover a similar altitude range on every flight, have a solution: the mixture control. After every significant change in altitude, the pilot has to re-adjust the mixture to match the air density at that altitude. In a typical long-distance flight, the pilot might change throttle settings three times (takeoff, cruise, landing) but adjust mixture a half-dozen times or more. Not something that the typical automobile driver would want to do (or do particularly well; all teenagers' cars would be overly rich “for more power”).

The simple, elegant carburetor is no longer so simple or elegant: to meet the needs of the real world, it's grown a bunch of features. These features expand what I'll call the “conceptual model” of the carburetor. A simple function relating airflow velocity and fuel delivery is not sufficient to implement the real-world model.

Today we use fuel injection in almost every car (and in most high-performance light aircraft). Fuel injection systems are certainly not simple: a computer decides how much fuel to inject based on inputs from a plethora of sensors, ranging from ambient temperature to position of the accelerator pedal. But fuel injectors do a much better job of providing exactly the right amount of fuel at exactly the right time.

And, although a fuel injection system is more complex than a carburetor, its conceptual model is actually simpler: there's a single sensor that measures the amount of residual oxygen in the exhaust, and the computer attempts to optimize this value.

Wednesday, July 17, 2013

Things that Java Got Right

Java turns 18 this year. I've been using it for 14 of those years, and as I've said before, I like it, both as a language and as a platform. Recently, I've been thinking more deeply about why I made the switch from C++ and haven't looked back.

One well-known if biased commentator said that Java's popularity was due to “the most intense marketing campaign ever mounted for a programming language,” but I don't think that's the whole story. I think there were some compelling reasons to use Java in 1995 (or, for me, 1999). And I'm wondering whether these reasons remain compelling in 2013.

Before going further, I realize that this post mixes Java the language with Java the platform. For the purposes of this post, I don't think they're different: Java-the-platform was created to support Java-the-language, and Java programs reflect this fact. Superficially, Java-the-language is quite similar to C or C++. Idiomatically, it is very different, and that is due to the platform.

Dynamic Dispatch
Both Java and C++ support polymorphism via dynamic dispatch: your code is written in terms of a base class, while the actual object instances are of different descendant classes. The language provides the mechanism by which the correct class's method is invoked, depending on the actual instance.

However, this is not the default mechanism for C++; static dispatch is. The compiler creates a mangled name for each method, combining class name, method name, and parameters, and writes a hard reference to that name into the generated code. If you want dynamic dispatch, you must explicitly mark a method as virtual. For those methods alone, the compiler invokes the method via a reference stored in the object's “V-table.”

This made sense given the performance goals of C++: a V-table reference adds (a tiny amount of) memory to each object instance, and the indirect method lookup adds (a tiny amount of) time to each invocation. As a result, while C++ supported polymorphic objects, it was a rare C++ application that actually relied on polymorphism (at least in my experience in the 1990s). It wasn't until I saw dozens of Java classes created by a parser generator that I truly understood the benefits of class-based polymorphism.
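
To make that concrete, here's a minimal sketch in the spirit of that parser output (the classes are invented for illustration): an expression tree in which every node type drives the same method with its own data.

abstract class Node {
    abstract double evaluate();
}

class Constant extends Node {
    private final double value;
    Constant(double value) { this.value = value; }
    @Override double evaluate() { return value; }
}

class Sum extends Node {
    private final Node left, right;
    Sum(Node left, Node right) { this.left = left; this.right = right; }
    @Override double evaluate() { return left.evaluate() + right.evaluate(); }
}

public class Calculator {
    public static void main(String[] args) {
        // the caller is written entirely against Node; the JVM picks the
        // Constant or Sum implementation based on the actual instance
        Node expr = new Sum(new Constant(1.5), new Sum(new Constant(2.0), new Constant(3.0)));
        System.out.println(expr.evaluate());
    }
}

A real parser generator emits dozens of such node classes, but the calling code never grows: it just keeps invoking evaluate().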

Today, the prevalence of “interpreted” languages means that dynamic dispatch is the norm — and in a far more flexible form than Java provides.

Late Binding
At the time Java appeared, most applications were statically linked: the compiler produced an object file filled with symbolic references, then the linker replaced these symbolic references with physical addresses when it produced the executable. Shared libraries existed, and were used heavily in the Windows world (qv “DLL hell”), but their primary purpose was to conserve precious RAM (because they could be shared between processes).

In addition to saving RAM, shared libraries had another benefit: since you linked them into the program at runtime, you could change your program's behavior simply by changing the libraries that you used. For example, if you had to support multiple databases you could write a configuration file that selected a specific data access library. At the time, I was working on an application that had to do just that, but we found it far easier to just build separate executables and statically link the libraries. From talking with other people, I know this was a common opinion. The Apache web server was an exception, although in its case the goal was again to save RAM by not linking modules that you didn't want; you also had an option to rebuild with statically-linked libraries.

Java, by comparison, always loads classes on an as-needed basis when the program runs. Changing the libraries that you use is simply a matter of changing your classpath. If you want, you can change a single classfile.

This has two effects: first, your build times are dramatically reduced. If you change one file, you can recompile just that file. When working on large C and C++ codebases, I've spent a lot of time optimizing builds, trying to minimize time spent staring at a scrolling build output.

The second effect is that the entire meaning of a “release” changes. With my last C++ product, bug fixes got rolled into the six-month release cycle; unless you were a large customer who paid a lot in support, you had to wait. With my first Java project, we would ship bugfixes to the affected customers as soon as the bug was fixed. There was no effort expended to “cut a release”; it was simply a matter of emailing classfiles.

Classloaders
Late binding was present in multiple other languages at the time Java appeared, but to the best of my knowledge, the concept of a classloader — a mechanism for isolating different applications within a single process — was unique to Java. It represented late binding on steroids: with a little classloading magic, and some discipline in how you wrote your applications, you could update your applications while they were running. Or, as in the case of applets, applications could be loaded, run, and then be discarded and their resources reclaimed.
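
As a rough sketch of that classloading magic (the jar path and class name below are made up for illustration), a URLClassLoader can pull in code that was never on the original classpath, run it, and then be discarded:

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class PluginRunner {
    public static void main(String[] args) throws Exception {
        // load a "plugin" jar that is not on the application classpath
        URL pluginJar = new File("plugins/reporting.jar").toURI().toURL();
        URLClassLoader loader = new URLClassLoader(new URL[] { pluginJar },
                                                   PluginRunner.class.getClassLoader());

        Class<?> pluginClass = loader.loadClass("com.example.reporting.ReportPlugin");
        Runnable plugin = (Runnable) pluginClass.getDeclaredConstructor().newInstance();
        plugin.run();

        // once nothing references the loader or its classes, both can be reclaimed
        loader.close();
    }
}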

Classloaders do more than simply isolate an application: the classloader hierarchy controls the interaction of separately-loaded applications (or, at least, groups of classes). This is a hard problem, and Java classloaders are an imperfect solution. OSGi tries to do a better job, but adds complexity to an application's deployment.

Threads
Threads are another thing that predated Java but didn't see a lot of use. For one thing, “threads” gives the impression of a uniform interface, which wasn't the case in the mid-1990s. I've worked with Posix threads, Solaris threads, and one other whose name I can't remember. They all had the same basic idea, but subtly different APIs and implementations. And even if you knew the APIs, you'd find that vendor-supplied code wasn't threadsafe. Faced with these obstacles, most Unix-centric programmers turned to the tried-and-true multi-process model.

But that decision led to design compromises. I think the InConcert server was typical in that it relied on the database to manage consistency between multiple processes. We could have gotten a performance boost by creating a cache in shared memory — but we'd have to implement our own allocator and coherence mechanism to make that work. One of the things that drove me to prototype a replacement in Java was its ability to use threads and a simple front-end cache.

I could contemplate this implementation because Java provided threads as a core part of the language, along with synchronization primitives for coordinating them. My cache was a simple synchronized Map: retrieval operations could probe the map without worrying that another thread was updating it. Updates would clear the cache on their way out, meaning that the next read would reload it.
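
A minimal sketch of that kind of cache (the names are illustrative, not the original InConcert code):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class DocumentCache {
    // synchronizedMap() makes each individual get/put/clear call thread-safe
    private final Map<String,Document> cache =
            Collections.synchronizedMap(new HashMap<String,Document>());

    public Document lookup(String id) {
        Document doc = cache.get(id);
        if (doc == null) {
            doc = loadFromDatabase(id);   // cache miss: go back to the source of truth
            cache.put(id, doc);
        }
        return doc;
    }

    public void update(Document doc) {
        writeToDatabase(doc);
        cache.clear();                    // invalidate on the way out; the next read reloads
    }

    private Document loadFromDatabase(String id) { return new Document(id); }
    private void writeToDatabase(Document doc)   { /* persist via JDBC, for example */ }

    public static class Document {
        private final String id;
        public Document(String id) { this.id = id; }
    }
}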

That said, today I've come to the belief that most code should be thread-agnostic, written as if it were the only thing running, and not sharing resources with other threads. Concurrent programming is hard, even when the language provides primitives for coordination, and in a massively-parallel world contention is the enemy. A shared-nothing mentality — at least for mutable state — and a queue-based communication model makes for far simpler programs and (I believe) higher overall performance.

Library Support
All of the above were neat, but I think the truly compelling reason to use Java in the 1990s was that it came with a huge standard library. You wanted basic data structures? They were in there. Database access? In there. A way to make HTTP requests? In there. Object-oriented client-server communication? Yep. A GUI? Ugly, but there.

This was a time when the STL wasn't yet ubiquitous; indeed, some C++ compiler vendors were just figuring out how to implement templates (causing no end of problems for cross-platform applications that tried to use them). The RogueWave library was popular, but you had to work for a company willing to buy it. I think every C++ programmer of the 1990s had his or her own string class, with varying levels of functionality and correctness. It was a rite of passage, the first thing you did once you figured out how classes worked.

Java's large — and ever-growing — library has been both a blessing and a curse. It's nice to have one standard way to do most tasks, from parsing XML to computing a cryptographic hash. On the other hand, in JDK 1.6 there are 17,484 classes in 754 packages. Many were imported wholesale from third-party libraries. This is bloat for those who don't need the features. Worse, it creates friction and delay for updates: if you find a bug in the XML processor, should you file it with Oracle or Apache? And will it ever get fixed?

Those were the things that I found compelling about Java. Other people had different lists, but I think that everyone who adopted Java in the late 1990s and early 2000s did so for practical reasons. It wasn't simply Sun's marketing efforts; we actually believed that Java offered benefits that weren't available elsewhere.

The 2000s have been a time of change on the language scene: new languages have appeared, and some older languages have become more popular. What would make a compelling case for leaving Java behind?

Friday, February 1, 2013

Concurrency and Interviewing

Yesterday I answered a question on Stack Overflow (something I don't do very often any more, for reasons I won't go into here).

You are given a paragraph, which contain n number of words, you are given m threads. What you need to do is, each thread should print one word and give the control to next thread, this way each thread will keep on printing one word, in case last thread come, it should invoke the first thread. Printing will repeat until all the words are printed in paragraph. Finally all threads should exit gracefully. What kind of synchronization will use?

There were the usual rants about how this was a lousy interview question because it didn't solve a real-world problem, several answers with “Teh Codez” and nothing more, a “change your lifestyle” response, and an answer with code and short explanation that was accepted. Standard PSE fare, and I'm not sure what drove me to write an answer — especially such a long answer. I think it started out as an alternative way to implement the problem. It ended up capturing a big part of my philosophy on both multi-threading and interviewing. So here it is, for those few people who left me in their RSS feeds over the last few months (that's another story).


In my opinion, this is a fabulous interview question -- at least assuming (1) the candidate is expected to have deep knowledge of threading, and (2) the interviewer also has deep knowledge and is using the question to probe the candidate. It's always possible that the interviewer was looking for a specific, narrow answer, but a competent interviewer should be looking for the following:

  • Ability to differentiate abstract concepts from concrete implementation. I throw this one in primarily as a meta-comment on some of the comments. No, it doesn't make sense to process a single list of words this way. However, the abstract concept of a pipeline of operations, which may span multiple machines of differing capabilities, is important.
  • In my experience (nearly 30 years of distributed, multi-process, and multi-threaded applications), distributing the work is not the hard part. Gathering the results and coordinating independent processes are where most threading bugs occur. By distilling the problem down to a simple chain, the interviewer can see how well the candidate thinks about coordination. Plus, the interviewer has the opportunity to ask all sorts of follow-on questions, such as "OK, what if each thread has to send its word to another thread for reconstruction."
  • Does the candidate think about how the processor's memory model might affect implementation? If the results of one operation never get flushed from L1 cache, that's a bug even if there's no apparent concurrency.
  • Does the candidate separate threading from application logic?

This last point is, in my opinion, the most important. Again, based on my experience, it becomes exponentially more difficult to debug threaded code if the threading is mixed with the application logic (just look at all the Swing questions over on SO for examples). I believe that the best multi-threaded code is written as self-contained single-threaded code, with clearly-defined handoffs.

With this in mind, my approach would be to give each thread two queues: one for input, one for output. The thread blocks while reading the input queue, takes the first word off of the string, and passes the remainder of the string to its output queue. Some of the features of this approach (a Java sketch follows the list):

  • The application code is responsible for reading a queue, doing something to the data, and writing the queue. It doesn't care whether it is multi-threaded or not, or whether the queue is an in-memory queue on one machine or a TCP-based queue between machines that live on opposite sides of the world.
  • Because the application code is written as-if single-threaded, it's testable in a deterministic manner without the need for a lot of scaffolding.
  • During its phase of execution, the application code owns the string being processed. It doesn't have to care about synchronization with concurrently-executing threads.
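
Here's a rough sketch of that structure in Java (the choice of BlockingQueue and the empty-string shutdown signal are mine, not part of the original question): each worker blocks on its input queue, prints the first word, and passes the remainder downstream; an empty string circulates once around the ring so that every thread exits gracefully.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class WordPipeline {
    static class Worker implements Runnable {
        private final BlockingQueue<String> in;
        private final BlockingQueue<String> out;
        Worker(BlockingQueue<String> in, BlockingQueue<String> out) { this.in = in; this.out = out; }

        @Override public void run() {
            try {
                while (true) {
                    String text = in.take();
                    if (text.isEmpty()) {              // empty string is the shutdown signal
                        out.put(text);                 // propagate it so the next thread exits too
                        return;
                    }
                    String[] parts = text.split("\\s+", 2);
                    System.out.println(parts[0]);
                    out.put(parts.length > 1 ? parts[1] : "");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int m = 3;   // number of threads
        // m queues in a ring: thread i reads queue i and writes queue (i + 1) % m
        @SuppressWarnings("unchecked")
        BlockingQueue<String>[] queues = new BlockingQueue[m];
        for (int i = 0; i < m; i++) queues[i] = new LinkedBlockingQueue<String>();
        for (int i = 0; i < m; i++) new Thread(new Worker(queues[i], queues[(i + 1) % m])).start();

        queues[0].put("the quick brown fox jumps over the lazy dog");
    }
}

Note that the Worker class knows nothing about how many threads exist or where its queues come from; that wiring lives entirely in main().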

That said, there are still a lot of grey areas that a competent interviewer can probe:

  • "OK, but we're looking to see your knowledge of concurrency primitives; can you implement a blocking queue?" Your first answer, of course, should be that you'd use a pre-built blocking queue from your platform of choice. However, if you do understand threads, you can create a queue implementation in under a dozen lines of code, using whatever synchronization primitives your platform supports.
  • "What if one step in the process takes a very long time?" You should think about whether you want a bounded or unbounded output queue, how you might handle errors, and effects on overall throughput if you have a delay.
  • How to efficiently enqueue the source string. Not necessarily a problem if you're dealing with in-memory queues, but could be an issue if you're moving between machines. You might also explore read-only wrappers on top of an underlying immutable byte array.
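
For that first follow-up, here's roughly what I'd consider an acceptable whiteboard answer using Java's intrinsic locks (a sketch, not production code):

import java.util.LinkedList;

public class SimpleBlockingQueue<T> {
    private final LinkedList<T> items = new LinkedList<T>();

    public synchronized void put(T item) {
        items.addLast(item);
        notifyAll();                      // wake any thread blocked in take()
    }

    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {         // loop guards against spurious wakeups
            wait();
        }
        return items.removeFirst();
    }
}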

Finally, if you have experience in concurrent programming, you might talk about some frameworks (eg, Akka for Java/Scala) that already follow this model.

Monday, October 8, 2012

Scriptlets Are Not (Inherently) Evil

JSP scriptlets have a bad reputation; everyone can point to their favorite example of scriptlet abuse. In my case, it was a 12,000 line (yes, three zeros) monstrosity that held a “mere” 1,000 lines or so of markup. The rest was a big if-else construct with plenty of embedded logic. But that example, and ones like it, are examples of bad programming, not indictments of scriptlets as a programming medium. They're no different than Java classes that have a single method with thousands of lines.

Monstrosities aside, I don't think there's a valid argument against scriptlets. Actually, when I Googled for “why are scriptlets bad,” most of the top-ranked pages defended them. As far as I can tell, the main arguments are that scriptlets encourage bad programmers to put business logic in the page, that they make the page untestable, that they limit reuse, and that web developers won't understand pages with embedded Java code. All of which seem to me like red, rotted herrings.

Don't get me wrong, I believe in the separation of concerns that underlies the MVC model. Actually, I believe in the separation of concerns found in what I call the “CSV” model: a lightweight Controller that interacts with business logic via a Service layer, and passes any returned data to a View for rendering. But after working with several alternative technologies and languages, I'm convinced that scriptlets are the best way to implement any programmatic constructs involved in view rendering.

And some amount of programmatic rendering resides in almost every view. One common example is populating a <select> element. On the surface this is an easy task: iterate over a list of values, and emit <option> elements for each. In the real world, it's more complex: the option value will come from a different field than the option text, you probably have a (possibly null) value that should be selected, and maybe you'll decorate different options with different classes or IDs. To handle this, you need a language that, if not Turing-complete, is very close.

I'm going to work through just such an example, comparing scriptlets and JSTL. Both examples use the same Spring controller, which stores two values in the request context: a list of employees in allEmployees, and the desired selection in selectedEmployee (which may be null). If you want to walk through the code, it's available here, and builds with Maven.

First up is the JSTL version. Nothing complex here: a forEach over the list of employees, and some EL to extract the bean values. Perhaps the worst thing about this code is that the HTML it generates looks ugly: line breaks carry through from the source JSP, so the <option> element gets split over three lines. You can always tell a JSP-generated page by the amount of incongruous whitespace it contains (if you're security-conscious, that might be an issue).

<select name="employeeSelector">
<c:forEach var="emp" items="${employees}">
    <option value="${emp.id}"
        <c:if test="${!(empty selectedEmployee) and (selectedEmployee eq emp)}"> selected </c:if>
    >${emp}</option>
</c:forEach>
</select>

Now for the scriptlet version. It's a few lines longer than the JSTL version, partly because I have to retrieve the two variables from the request context (something EL does for me automatically). I also chose to create a temporary variable to hold the “selected” flag; I could have used an inline “if” that matched the JSTL version, or I could have changed the JSTL version to use a temporary variable; I find that I have different programming styles depending on language.

<select name="employeeSelector">
    <%
    List<Employee> allEmployees = (List<Employee>)request.getAttribute(Constants.PARAM_ALL_EMPLOYEES);
    Employee selectedEmployee  = (Employee)request.getAttribute(Constants.PARAM_SELECTED_EMPLOYEE);
    for (Employee employee : allEmployees)
    {
        String selected = ObjectUtils.equals(employee, selectedEmployee) ? "selected" : "";
        %>
        <option value="<%=employee.getId()%>" <%=selected%>> <%=employee%> </option>
        <%
    }
    %>
</select>

Even though the scriptlet version has more lines of code, it seems cleaner to me: the code and markup are clearly delineated. Perhaps there's a web developer who will be confused by a Java “for” statement in the middle of the page, but is the JSTL “forEach” any better? Assuming that your company separates the roles of web developer and Java developer, it seems like a better approach to say “anything with angle brackets is yours, everything else is mine.”

Putting subjective measures aside, I think there are a few objective reasons that the scriptlet is better than JSTL. The first of these is how I retrieve data from the request context: I use constants to identify the objects, the same constants that I use in the controller to store them. While I have no inherent opposition to duck typing, there's a certain comfort level from knowing that my page won't compile if I misspell a parameter. Sometimes those misspellings are obvious, but if you happen to write your JSTL with “selectedEmploye” it will take rigorous testing (either manual or automated) to find the problem.

Another benefit to using scriptlets is that you can transparently call out to Java. Here, for example, I use Jakarta Commons to do a null-safe equality check.

A better example would be formatting: here I rely on Employee.toString() to produce something reasonable (and if you look at the example you may question my choice). But let's say the users want to see a different format, like “Last, First”; or they want different formats in different places. In JSTL, you'll be required to manually extract and format the fields, and that code will be copy/pasted everywhere that you want to display the employee name. Then you'll have to track down all of that copy/paste code when the users inevitably change their mind about what they want to see. Or you could add different formatting functions to the Employee object itself, breaking the separation of model and view. With a scriptlet, you can call a method like EmployeeFormattingUtils.selectionName().
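
A minimal sketch of what that utility might look like (the Employee getters are assumptions, not taken from the example project):

public class EmployeeFormattingUtils {
    private EmployeeFormattingUtils() { /* no instances */ }

    // one place to change when the users decide how a name should appear in a selection list
    public static String selectionName(Employee emp) {
        return emp.getLastName() + ", " + emp.getFirstName();
    }
}

The scriptlet then emits <%=EmployeeFormattingUtils.selectionName(employee)%>, and every page that displays an employee name picks up the change.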

OK, before I get comments, I should note that there are easier ways to implement the JSTL version, using 3rd-party tag libraries. The Spring Form Tags, for example, would reduce my code to a single line, assuming that my controller followed the Spring “command” pattern. But those libraries limit you to common cases, restricting your ability to customize your HTML. And although you can write your own taglibs, I haven't seen many projects that do that (in fact, the only ones I've seen are ones where I did it).

I want to finish with a punchline: everything I've just written is moot, because the web-app world is transitioning from server-side view rendering to client-side, using technologies such as Backbone and Spine. That change will come with its own perils, and history tends to repeat itself. I worry that five years from now we'll be dealing with multi-thousand-line JavaScript monstrosities, mixing business logic with view rendering.

Monday, August 13, 2012

How Annotations are Stored in the Classfile ... WTF?!?

This weekend I made some changes to BCELX, my library of enhancements for Apache BCEL. These changes were prompted not by a need to access different types of annotations, but because I'm currently working on a tool to find hidden and unnecessary dependency references in Maven projects. Why does this have anything to do with annotation processing? Read on.

Prior to JDK 1.5, the Java class file was a rather simple beast. Every referenced class had a CONSTANT_Class_info entry in the constant pool. This structure actually references another entry in the constant pool, which holds the actual class name, but BCEL provides the ConstantClass object so you don't have to chase this reference. It's very easy to find all the external classes that your program references: walk the constant pool and pull out the ConstantClass values.
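
A rough sketch of that walk using BCEL (written from memory against the 5.x API, so treat the details as approximate):

import java.io.IOException;
import org.apache.bcel.classfile.ClassParser;
import org.apache.bcel.classfile.Constant;
import org.apache.bcel.classfile.ConstantClass;
import org.apache.bcel.classfile.ConstantPool;
import org.apache.bcel.classfile.JavaClass;

public class ClassReferenceDumper {
    public static void main(String[] args) throws IOException {
        JavaClass parsed = new ClassParser(args[0]).parse();
        ConstantPool cp = parsed.getConstantPool();
        for (Constant c : cp.getConstantPool()) {          // entry 0 is unused and may be null
            if (c instanceof ConstantClass) {
                // getBytes() chases the name_index for us and returns the internal class name
                String internalName = ((ConstantClass) c).getBytes(cp);
                System.out.println(internalName.replace('/', '.'));
            }
        }
    }
}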

That functionality is exactly what I needed to cross-check project dependencies. But when I wrote a testcase to check my dependency-extraction method, it failed. I had used the test class itself as my target, and just by chance I picked the @Test annotation as one of my assertions. As far as my dependency-extraction code was concerned, I didn't have a reference to the annotation class.

I figured that there must be some flag in the CONSTANT_Class_info structure that was confusing BCEL — its released version hasn't been updated for JDK 1.5. So I turned to the JDK 1.5 classfile doc, and it slowly dawned on me: annotation classes aren't referenced in the constant pool. Instead, you have to walk through all of the annotation attributes, and get their names out of the constant pool. OK, I should have realized this sooner; after all, it wasn't so long ago that I'd written the annotation parsing code in BCELX.

Of course, this meant that I now had to add support for parameter and field-level annotations to BCELX (I was going to have to handle parameters anyway, to support another project). While doing this, I discovered something else interesting: the API docs say that you can apply annotations to packages and local variables, but the classfile docs give no indication that this is actually supported.

There are a couple of things that I take from this experience. The first is that it's another piece of evidence that JDK 1.5 represented a changing of the guard at Sun. Annotations have often had a “tacked on” feel to me — right down to the @interface keyword (they broke backwards compatibility for enum; would it have been so bad to add annotation?). I'm sure there was a reason for not treating an annotation reference as just another class, but quite frankly I can't see it.

The other thing I learned is to beware testcases built around spot checks. If I had written my testcase to look for org.junit.Assert rather than org.junit.Test, I never would have found the issue — until it turned up when using the utility. But there are lots of cases where exhaustive checks aren't cost-effective. Including this one: should I write a test that verifies every possible annotation reference? I'll need to, if I want 100% coverage, but really: it's a tiny part of the overall project.

One that could have been far easier if the JVM team had cleanly integrated their changes, and followed the existing model. I suppose that's the real take-away: if you're evolving a design, don't simply tack on the changes.

Wednesday, July 25, 2012

Business Logic and Interactive Web Applications

By the mid-2000s, the structure of an MVC web-app had gelled: business logic belonged in the model (which was usually divided into a service layer and persistent objects), a thin controller would invoke model methods and select data to be shown to the user, and a view held markup and the minimal amount of code needed to generate repeating or optional content. There might be some client-side validation written in JavaScript, but it was confined to verifying that fields were filled in with something that looked like the right value. All “real” validation took place on the server, to ensure that only “good” data got persisted.

Then along came client-side JavaScript frameworks like jQuery. Not only could you make AJAX calls, but you could easily access field values and modify the rendered markup. A few lines of JavaScript, and you have an intelligent form: if option “foo” is selected, hide fields argle and bargle, and make fields biff and bizzle read-only.

The problem with intelligent front-ends is that they almost always duplicate the logic found in the back end. Which means that, inevitably, the two will get out of sync and the users will get upset: bargle was cleared but they expected it to hold the concatenated values of biff and bizzle.

There's no good solution to this problem, although it's been faced over and over. The standard solution with a “thick client” application was layered MVC: each component had its own model, which would advertise its changes to the rest of the app via events. These events would be picked up by an application-level controller, which would initiate changes in an application-level model, which would in turn send out events that could be processed by the components. If you were fastidious, you could completely separate the business logic of both models from the GUI code that rendered those models.

I don't think that approach would work with a web-app. The main reason is that the front-end and back-end code are maintained separately, using different languages. There's simply no way to look one place and see all the logic that applies to bizzle.

Another problem is validation. The layered approach assumes that each component sends data that's already been validated; there's no need for re-validation at the lower levels. That may be acceptable for internal applications, but certainly not for something that's Internet-facing.

One alternative is that every operation returns the annotated state of the application model: every field, its value, and a status code — which might be as simple as used/not-used. The front-end code can walk that list and determine how to change the rendered view. But this means contacting the server after every field change; again, maybe not a problem on an internal network, but not something for the Internet.

Another alternative is to write all your code in one language and translate for the front end. I think the popularity of GWT says enough about this approach.

I don't have an answer, but I'm seeing enough twisted code that I think it's an important topic to think about.

Tuesday, December 6, 2011

Actron CP9580: How Not To Do An Update

The Actron CP9580 is an automotive scantool. For those who aren't DIY mechanics, it connects to your car's on-board computer and reports on engine operation and trouble codes (ie, why your “check engine” light is on). My car has passed 100,000 miles, and received its first (hopefully spurious) trouble code a few weeks ago; the $200 investment seemed worthwhile.

Except that right now, the tool is an expensive doorstop, sitting in the manufacturer's repair shop, and I wasted a couple of hours last week. All because I ran the manufacturer-supplied update, which failed catastrophically. As I look back on the experience, I see several problems with their update process, some of which are rooted in a 1990-vintage design mentality, but all of which represent some fundamental failing that every developer should avoid.

#1: They used their own protocol to communicate with the device

In 1990, most embedded devices used an RS-232 serial port to communicate with the outside world. Manufacturers had no choice but to develop their own communications protocol, using something like X-Modem for file transfers.

But the CP9580 has a USB port. And I'm betting that it has flash memory to store its data files. Both of which mean that a custom protocol doesn't make sense. Instead, expose the flash memory as a removable drive and let the operating system — any operating system — manage the movement of data back and forth. Doing so should actually reduce development costs, because it would leverage existing components. And it would make user-level debugging possible: simply look at what files are present.

#2: They deleted the old firmware before installing the new

Again, a vestige of 1990, when devices used limited-size EEPROMs for their internal storage. Not only was the amount of space severely limited, but so were the number of times you could rewrite the chip before it failed. Better to simply clear the whole thing and start fresh.

This is another case where flash memory and a filesystem-based design change the game entirely. Consumer demand for memory cards has pushed the price of flash memory to the point where it's not cost-effective to use anything less than a gigabyte. And with a filesystem, version management is as simple as creating a new directory.

It's also a case where the game changed and the system designers half-changed. In the old days, communications code was in permanent ROM. If an update failed, no problem: you could try again (or reload the previous version). However, it seems that the CP9580 stores everything in flash memory, including the loader program (at least, that's what I interpret from the tech support person's comments, but maybe he was just being lazy).

The iPhone is a great example of how to do updates right: you can install a new revision of iOS, run it for months, and then decide that you want to roll back; the old version is still there. But it's not alone; even an Internet radio is smart enough to hold onto its old software while installing an update.

#3: They kept the update on their website, even though they'd had reports of similar failures

The previous two failings can be attributed to engineers doing things the same way they always have, even when the technology has moved forward. This last failure runs a little deeper. After the update failed, the second time I called tech support I was told that “we've had several cases where this happened.” Yet the software was still on the website, without a warning that it might cause problems. And it's still there today.

One of the best-known parts of the Hippocratic Oath is the exhortation to “do no harm.” Programmers don't have to swear a similar oath, but I think they should — if only to themselves. Too often we look at the technical side of a problem, forgetting that there's a human side. Sometimes the result ends up on The Daily WTF, but more often it ends up quietly causing pain to the people who use our software.

Thursday, October 20, 2011

Defensive Copies are a Code Smell

This is another posting prompted by a Stack Overflow question. The idea of a defensive copy is simple: you have a method that returns some piece of your object's state, but don't want the caller to be able to mutate it. For example, String.toCharArray():

public char[] toCharArray() {
    char result[] = new char[count];
    getChars(0, count, result, 0);
    return result;
}

If you simply returned the string's internal array, then the caller could change the contents of that array and violate String's guarantee of immutability. Creating a new array preserves the guarantee.

This technique seems to be a good idea in general: it ensures that the only way to change an object's state is via the methods that the object exposes. This in turn allows you to reason about the places where an object can change, and will make it easier to identify bugs caused by changing data. There's even a FindBugs check for code that exposes its internal state this way (along with a related case, where an object maintains a reference to mutable data that was passed to it).

But are defensive copies really useful in practice?

The core argument in favor seems to be that you can't trust your fellow programmers. In some cases, this is reasonable: security-related classes, for example, should never blindly accept or hand out pieces of their internal state. And in a large organization (or open-source library), it's unlikely that other programmers will understand or care about your intended use of an object — especially if they can save a few lines of code by using it in an unexpected way.

As an argument against, every defensive copy consumes memory and CPU time. String.toCharArray() is a perfect example of this, particularly with large strings, which may be copied directly into the tenured generation. If a programmer blindly calls this method within a loop, it's quite possible for the garbage collector to eat up most of your CPU.

Moreover, there's almost always a better solution. Again using String.toCharArray() as an example, why do you need the character array? I would guess that 99% of the time, the reason is to iterate over the characters. However, String.charAt() will do the same thing without a copy (and Hotspot should be smart enough to inline the array reference). And you should be calling String.codePointAt() anyway, to properly handle Unicode characters outside the Basic Multilingual Plane.
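
A hedged sketch of that iteration, stepping by code point instead of copying the character array:

public static int countLetters(String s) {
    int letters = 0;
    for (int i = 0; i < s.length(); ) {
        int cp = s.codePointAt(i);            // handles surrogate pairs correctly
        if (Character.isLetter(cp)) {
            letters++;
        }
        i += Character.charCount(cp);         // advance by one or two chars
    }
    return letters;
}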

That's all well and good for strings, but what about your application objects? Continuing the theme of “there's a better way,” I ask: why are your objects providing access to their internal state?

One of the principles of object-oriented programming is the Law of Demeter, which holds that collaborating objects should not know anything about each other's internal state. The goal of Demeter — just like defensive copies — is to allow you to reason about your objects and their interactions within the application. But it also drives your design toward action: rather than simply holding data, an object should do something with that data. To me, this is what separates object-oriented programming from procedural programming.

Of course, as with any law, there are times when Demeter can and should be broken (for example, data transfer objects). But before breaking the law, think about the consequences.

Wednesday, August 17, 2011

Meta Content-Type is a Bad Idea

Following last week's posting about “text files,” I wanted to look at one of the most common ways to deliver text: the web. The HTTP protocol defines a Content-Type header, which specifies how a user agent (read: browser) should interpret the response body. The content type of an HTML document is text/html; breaking from other “text” types, its default character set is ISO-8859-1. However, you can specify the document's encoding as part of the Content-Type, and most websites do.

All well and good, except that an HTML document can specify its own encoding, using the http-equiv meta tag:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="fr" dir="ltr" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Wikipédia, l'encyclopédie libre</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Wikipedia does “meta” Content-Type about as well as you can: the page is delivered with a Content-Type header specifying UTF-8, and it's an XHTML document (which implies UTF-8 encoding in the absence of a prologue). The only questionable practice with this page is the location of the <title> tag: it contains UTF-8 content, but appears before the in-document Content-Type. But in this case the in-document content type specification is superfluous.

Not all non-English pages do as well. The Montreal Craigslist page, for example, specifies ISO-8859-1 in the HTTP response, but UTF-8 in the meta tag.* It is a testament to browser developers adhering to Postel's Law that you can read the site at all.

From a “layered architecture” perspective, the embedded content-type declaration is ugly. You could argue that it self-describes a stand-alone document, much like the prologue in an XML document. But there's an important difference: the bytes of an XML prologue are rigidly specified; the parser doesn't need to know the encoding to read them. The <meta> tag can appear anywhere in the <head> of an HTML document. Including, as shown by Wikipedia, after content that requires knowledge of the encoding.

While writing this post, I did a quick search for a history of the embedded Content-Type specification. I turned up a W3C page that recommended always using it, but did not give a rationale. And I found a page that claimed specifying a character set in the HTTP response would “break older browsers.” As the page did not list those browsers, and did not appear to be written by someone involved in browser development, I'm not sure that I believe it.

For my personal website, I rely on the HTTP header, and don't use the meta tag. But I also limit myself to US-ASCII text, with HTML or numeric entities for anything that isn't ASCII. I'm not going to suggest that you remove the tag from your website (who knows, your biggest customer might have an “older browser”). But if you do use it, it should be the first thing in your <head>.

More important than whether the <meta> tag is present is that you actually get the encoding right, both in the page and in the HTTP headers.

With servlets, it's easy: the first line of your service method should be a call to ServletResponse.setContentType().

response.setContentType("text/html;charset=UTF-8");

This will set the Content-Type header and also configure the object returned by ServletResponse.getWriter(). Don't, under any circumstances, write HTML data via the object returned by ServletResponse.getOutputStream(); it exists for servlets that produce binary content.

With JSP, put the following two directives at the top of each page.

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8"%>

These are translated into a call to ServletResponse.setContentType(), and are also used by the JSP container itself to parse the page. If, after reading this posting, you don't feel comfortable writing self-describing files, you can also use a JSP property group in your web.xml.

One final thing: if you do choose to specify content type via http-equiv, make sure that it matches what your server is putting in the HTTP response. Otherwise, you risk having your site used as an example by someone writing about encodings.


* The Paris Craigslist omits the <meta> declaration, but retains ISO-8859-1 in the HTTP response. Which explains why all of the ads say “EUR” rather than €.

Friday, August 12, 2011

"Text File" is an Oxymoron

Back in the early 1990s, life was easy. If you worked in the United States, “text” meant ASCII. If you worked in Canada or Europe, it might mean ISO-8859-1 or windows-1252, but they were almost the same thing … unless you dealt with currency and needed to display the new Euro symbol. There were a few specialists that thought of text as wchar_t, but they were rare. Companies hired them as contractors rather than full-time employees.

This US-centric view of text is pervasive: any MIME Content-Type that begins with “text” is presumed to be US-ASCII unless it has an explicit character set specifier. Which often trips up people who create XML, which presumes UTF-8 in the absence of an explicit encoding (solution: use application/xml rather than text/xml).

This was the world that Java entered, and it left an indelible imprint. Internally, Java looked to the future, managing strings as Unicode (UCS-2 then, UTF-16 now). But in the IO package, it was firmly rooted in the past, relying on “default encoding” when converting those two-byte Unicode characters into bytes. Even today, in JDK 7, FileReader and FileWriter don't support explicit encodings.

The trouble with a default encoding is that it changes from machine to machine. On my Linux machines, it's UTF-8; on my Windows XP machine at home, it's windows-1252; on my Windows XP machine from work, it's iso-8859-1. Which means that I can only move “text” files between these boxes if they're limited to US-ASCII characters. Not a problem for me, personally, but I work with people from all over the world.

At this point in time, I think the whole idea of “text” is obsolete. There's just a stream of bytes with some encoding applied. To read that stream in Java, use InputStreamReader with an explicit encoding; to write it, use OutputStreamWriter. Or, if you have a library that manages encoding for you, stick with streams.
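
Concretely, a minimal sketch of what I mean (the file names are placeholders):

import java.io.*;

public class ExplicitEncoding {
    public static void main(String[] args) throws IOException {
        // decode and encode with an explicit charset, never the platform default
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream("input.txt"), "UTF-8"));
        BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream("output.txt"), "UTF-8"));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
            }
        } finally {
            in.close();
            out.close();
        }
    }
}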

If you're not doing that, you're doing it wrong. And if you aren't using UTF-8 as the encoding, in my opinion you're doing it poorly.

Saturday, August 6, 2011

The Horizontal Slice

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

That's one of the basic principles of the Agile Manifesto, and a common approach to satisfying it is the “horizontal slice”: a complete application, which takes its inputs from real sources and produces outputs that are consumed by real destinations. The application starts life as a bare skeleton, and each release cycle adds functionality.

In theory, at least, there are a lot of benefits to this approach. First and foremost is the “for tomorrow we ship” ethos that a partially-functioning application is better than no application at all. Second, it allows the team to work out the internal structure of the application, avoiding the “oops!” that usually accompanies integration of components developed in isolation. And not least, it keeps the entire team engaged: there's enough work for everyone, without stepping on each others' toes.

But after two recent green-field projects that used this approach, I think there are some drawbacks that outweigh these benefits.

The first is an over-reliance on those “real” sources and sinks; the development team is stuck if they become unavailable. And this happens a lot in a typical development or integration environment, because other teams are doing the same thing. Developing mock implementations is one way to avoid this problem, but convincing a product owner to spend time on mocks when real data is available is an exercise in futility.

The second problem is that software development proceeds in a quantum fashion. I've written about this with regards to unit testing, but it applies even more to complete projects. There's a lot of groundwork that's needed to make a real-world application. Days, perhaps weeks, go by without anything that could be called “functional”; everything is run from JUnit. And then, suddenly, there's a main(), and the application itself exists. Forcing this process into a two-week sprint cycle encourages programmers to hack together whatever is needed to make a demo, without concern for the long term.

And that results in the third problem — and in my opinion the worst: high coupling between components. When you develop a horizontal slice, I think there's less incentive to focus on unit tests, and more to focus on end-to-end tests. After all, that's how you're being judged, and if you get the same level of coverage what does it matter?

On the surface, that's a reasonable argument, but unit tests and integration tests have different goals: the latter test functionality, the former lead you to a better design. If you don't have to test your classes in isolation, it's all too easy to rely on services provided by other parts of the application. The result is a barrier to long-term maintenance, which is where most of a team's development effort is spent.

So is there a solution? The best that I can think of is working backwards: creating a module at a time, that produces real, consumable outputs from mock inputs. These modules don't have to be full-featured, and in fact shouldn't be: the goal is to get something that is well-designed. I think that working backwards gives you a much better design than working forwards because at every stage you know what the downstream stage needs, even if those needs change.

I want to say again that this approach is only for building on the green field. To maintain the building metaphor, it's establishing a foundation for the complete system, on which you add stories (pun intended).

Friday, November 6, 2009

Objectificated

I have an admission to make: it wasn't until October 1999 that I truly “got” object-oriented programming. Perhaps that doesn't seem like a big admission, but consider this: I had been programming in C++ since 1990, and had been reading articles on object-oriented programming since the mid-1980s.

Oh, sure, at one level I understood why objects were better than procedural code. When I read Stroustrup's introduction to The C++ Programming Language, I understood and agreed with his ideas of modularization via objects. By then, structured programming and layered design were in the mainstream, and objects seemed a simple way to accomplish those goals. And, working in the financial industry, I had many examples of polymorphic behavior (all securities have a price, but that price means something very different if you're talking about stocks, bonds, or options).

But it never gelled. Although I replaced struct with class, carefully considered the public/private divide, and compiled using a C++ compiler, I continued to refer to myself as a “C programmer.” In fact, my resume doesn't mention C++ (although that's more because C++ has come a long way since I stopped using it).

In October 1999, I joined an existing Java project — my first real use of the language. And one day, looking at the output from a parser generator, it clicked: polymorphism wasn't about three classes that exposed the same set of methods, it was about 150 classes, each of which used different data to drive the same method. Since that time, I've had a clear division in my mind between classes that provide behavior, and classes that hold (and expose) data.

Most “business” applications, however, revolve around data, not behavior. If you're writing online trading software, your use of polymorphism is limited to calculating the value of a position; something else displays that value. And that something else tends to be written in a procedural style: get this piece of data, do something to it, get the next piece of data.

There have been a lot of approaches to making such processing “more object-oriented,” from the Visitor pattern to mandates that all data be accessed via getters and setters. And to be sure, there are benefits from these techniques, particularly if you come up with a clean way to encapsulate business rules.

I think that “clean” is the real problem. Business rules tend to be more exception than rule, having evolved over time, changing to suit the needs of the moment, and involving inter-locking relationships between different sources of data. In that world, “tell, don't ask” is a lot more difficult than “ask and decide.”

(which, as I'm proofing this, strikes me as an example of Conway's law in action — but that's a topic for another post)

Monday, October 12, 2009

Building a Wishlist Service: HTML Forms

Back to the wishlist service, and it's time to look at the client side. In particular, the mechanism that the client uses to submit requests. XML on the browser is, quite simply, a pain in the neck. While E4X is supposed to be a standard, support for it is limited. Microsoft, as usual, provides its own alternative. Since XML is a text format, you could always construct strings yourself, but there are enough quirks that this often results in unparseable XML.

Against the pain of XML, we have HTML forms. They've been around forever, work the same way in all browsers, and don't require JavaScript. They're not, however, very “Web 2.0 friendly”: when you submit a form, it reloads the entire page. Filling this gap, the popular JavaScript libraries provide methods to serialize form contents and turn them into an AJAX request. As long as you're sending simple data (ie, no file uploads), these libraries get the job done.

To simplify form creation, I created some JSP tags. This provides several benefits, not least of which is that I can specify required parameters such as the user ID and wishlist name. I also get to use introspection and intelligent enums to build the form: you specify an “Operation” value in the JSP, and the tag implementation can figure out whether it needs a GET or a POST, what parameters need to go on the URL, and what fields need to go in the body.
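
A hedged sketch of such an enum (the operation names and parameters are invented for illustration, not the actual wishlist code):

public enum Operation {
    RETRIEVE_LIST ("GET",  "userId", "listName"),
    ADD_ITEM      ("POST", "userId", "listName", "productId"),
    DELETE_ITEM   ("POST", "userId", "listName", "productId");

    private final String httpMethod;
    private final String[] requiredParams;

    Operation(String httpMethod, String... requiredParams) {
        this.httpMethod = httpMethod;
        this.requiredParams = requiredParams;
    }

    public String getHttpMethod()       { return httpMethod; }
    public String[] getRequiredParams() { return requiredParams.clone(); }
}

The tag implementation can then ask the enum for the HTTP method and the required parameter names, rather than hard-coding either in the JSP.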

One of the more interesting “learning experiences” was the difference between GET and POST forms. With the former, the browser will throw away any query string provided in the form's action attribute, and build a new string from the form's fields. With the latter, the query string is passed untouched. In my initial implementation I punted, and simply emitted everything as input: the server didn't care, because getParameter() doesn't differentiate between URL and body. This offended my sense of aesthetics however, so I refactored the code into a class that would manage both the action URL and a set of body fields. Doing so had the side benefit that I could write out-of-container unit tests for all of the form generation code.

The other problem came from separating the form's markup from the JavaScript that makes it work. This is current “best practice,” and I understand the rationale behind it, but it makes me uncomfortable. In fact, it makes me think that we're throwing away lessons learned about packaging over the last 30 years. But that's another post.

Thursday, September 10, 2009

Building a Wishlist Service: External Libraries

One of the great things about developing Java applications is the wealth of open-source libraries available. The Jakarta Commons libraries have been part of every project that I've done over the past five years; you can find StringUtils.isEmpty() in almost all text manipulation code that I write.

However, the wealth of open-source libraries presents a paradox of choice: which libraries do you use for a project? Each library adds to the memory footprint of your project, either directly as classes are loaded, or indirectly as the JVM memory-maps the library's JAR. External libraries also make dependency management in your build more complex, in some cases forcing you to build the library locally.

More important, every library represents a form of lock-in: once your code is written to conform to the library, it will be expensive to change. And if you discover a bug or missing feature, you'll need to develop a remediation plan. Even if you can code a patch, it will take time to integrate with the mainline code — assuming that it is accepted. In some cases you may find yourself maintaining a private fork of the library over several public releases.

All of which is to say: use open source, but pick your libraries carefully.

In the case of the wishlist service, one of the places where I considered external libraries was XML management, in particular conversion between XML and Java beans. There are lots of libraries that handle this: XMLBeans and XStream are two that are commonly used, and the JDK provides its own serialization and deserialization classes as part of the java.beans package.

Of these, XStream seemed to be the best choice: XMLBeans requires a separate pre-compile step, while the JDK's serialization format would require a lot of work on the part of any non-Java client. However, I had another alternative: I am the administrator and main developer on Practical XML, an open-source library for XML manipulation. It didn't support XML-object conversion, but I also had some converter classes that I'd written before XStream became popular. I figured that it would take a minimal amount of work to flesh out those classes and integrate them into the library.
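
For comparison, here's roughly what the XStream route looks like, assuming ProductEntry is a simple bean with the usual getters and setters:

import com.thoughtworks.xstream.XStream;

public class XStreamExample
{
    public static void main(String[] args)
    {
        XStream xstream = new XStream();
        xstream.alias("product", ProductEntry.class);   // friendly element name instead of the FQCN

        ProductEntry entry = new ProductEntry();
        String xml = xstream.toXML(entry);                          // bean -> XML
        ProductEntry copy = (ProductEntry) xstream.fromXML(xml);    // XML -> bean
    }
}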

I have an incentive to evolve the Practical XML library, and to use it in all of my projects. However, adding this functionality introduced a two-week diversion into my project. In this case the delay didn't matter: I have no hard deadlines on this project. And since I was already using the library in other places, I had the benefit of consistency and reduced footprint. Had I been facing an immovable ship date, my decision would have been different.

Friday, September 4, 2009

Building a Wishlist Service: Template Method

When I'm interviewing a candidate, one of my favorite questions is “Are you familiar with Design Patterns? Please describe your favorite pattern, tell me why you like it, and give an example where you've recently used it.” If I were asked that question myself, historically my answer would have been Strategy; today, I think it would be Template Method. Which is interesting, because those two patterns basically do the same thing from different directions.

A description first: Template Method is a pattern for building class hierarchies in which the base class of the hierarchy imposes logic on its subclasses. The base class defines public methods, then calls into abstract protected methods that must be implemented by the subclass. For example, the base class might define a validate() method:

// final: subclasses can't skip or reorder the common checks
public final void validate()
{
    // implemented once, in the base class
    validateEncryptedParam();
    validateExpiration();
    validateRequiredParams();

    // the hook for subclass-specific checks
    subclassValidation();
}

Historically I've liked Template Method because it avoids bugs caused by subclasses that don't invoke their super's implementation of a method (I take the attitude that “subclasses are like ogres,” which is a topic for a future post). However, as I'm realizing with this service, it also highlights duplicated code, and allows that code to be moved into the superclass. In this example, there's only one method that's actually implemented by the subclass (and its name should make it obvious).
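
To make that concrete, here's a sketch of the subclass side of such a hierarchy; the class names are illustrative, and only the shape matters.

public abstract class RequestHandler
{
    // ... the final validate() shown above, plus the common private checks ...

    // the single hook left for each operation to implement
    protected abstract void subclassValidation();
}

// in its own source file
class AddItemHandler extends RequestHandler
{
    @Override
    protected void subclassValidation()
    {
        // checks that only make sense for AddItem, e.g. that the posted
        // ProductEntry is well-formed
    }
}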

I think it also pushes you toward moving responsibilities out of the class hierarchy entirely. In this example, validateRequiredParams() was originally intended for the subclass, which knew what parameters it needed. But the actual validation code was common to all subclasses, so I changed the method to simply ask the subclass for a list of its required parameters. A little more thought, and I realized that this isn't really an attribute of the subclass, but of the operation itself. So I added the list of parameters to the enum defining the service's operations (and this spurred writing an article for my website):

public enum Operation
{
    // only the third example has a required parameter

    RetrieveList(HttpMethod.GET, null),
    AddItem(HttpMethod.POST, ProductEntry.class),
    DeleteItem(HttpMethod.POST, null, RequestParameter.itemId),
    // ...
}
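
The enum's body isn't shown above; here's a sketch of what the constructor and accessors might look like, using a varargs parameter list (the accessor names are my own):

public enum Operation
{
    RetrieveList(HttpMethod.GET, null),
    AddItem(HttpMethod.POST, ProductEntry.class),
    DeleteItem(HttpMethod.POST, null, RequestParameter.itemId);

    private final HttpMethod httpMethod;
    private final Class<?> bodyClass;
    private final RequestParameter[] requiredParams;

    private Operation(HttpMethod httpMethod, Class<?> bodyClass, RequestParameter... requiredParams)
    {
        this.httpMethod = httpMethod;
        this.bodyClass = bodyClass;
        this.requiredParams = requiredParams;
    }

    public HttpMethod getHttpMethod()                  { return httpMethod; }
    public Class<?> getBodyClass()                     { return bodyClass; }
    public RequestParameter[] getRequiredParams()      { return requiredParams; }
}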

This kind of thinking, taken to an extreme, will take you out of Template Method and into objects constructed according to a configuration file. And that's not a bad thing: if your subclasses don't have behavior, they don't deserve to exist.

Wednesday, September 2, 2009

Building a Wishlist Service: Testing

There are a lot of ways to test a web application: in-container testing using tools such as Cactus, full client-server testing using tools such as HttpUnit, out-of-container unit tests, and even manual tests.

I'm not a fan of any test approach that requires starting the application, because you won't want to do that on every build. So most of the tests for my service are unit tests, meant to run outside of the container. And although I knew that this would happen, I'm still amazed at how such tests push me toward decoupled code, even when I write the tests after the code.

The servlet's dispatch mechanism is a prime example. Going in, I knew that I wanted a RequestHandler class for each operation. My first pass used an interface, with a getRequestHandler() method inside the servlet class. However, the only way to test that method was to make it public or protected, and I'm not willing to violate encapsulation for testing. So RequestHandler became an abstract class, and getRequestHandler() became a static factory method. At the same time, I decided to instantiate a handler object per request, rather than reuse a table of singleton objects: the latter was easier to manage, but the former was easier to test.
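
A sketch of that shape, with invented handler names and a guessed-at handle() signature:

import java.io.IOException;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public abstract class RequestHandler
{
    // a new handler instance per request; this static factory can be
    // exercised by a plain unit test, with no servlet container involved
    public static RequestHandler getRequestHandler(Operation operation)
    {
        switch (operation)
        {
            case RetrieveList: return new RetrieveListHandler();
            case AddItem:      return new AddItemHandler();
            case DeleteItem:   return new DeleteItemHandler();
            default:           throw new IllegalArgumentException("unsupported: " + operation);
        }
    }

    public abstract void handle(HttpServletRequest request, HttpServletResponse response)
        throws IOException;
}

// stubs so the sketch hangs together; the real handlers hold the logic
class RetrieveListHandler extends RequestHandler
{
    public void handle(HttpServletRequest request, HttpServletResponse response) { }
}

class AddItemHandler extends RequestHandler
{
    public void handle(HttpServletRequest request, HttpServletResponse response) { }
}

class DeleteItemHandler extends RequestHandler
{
    public void handle(HttpServletRequest request, HttpServletResponse response) { }
}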

Unit-testing servlet components means a lot of stubbing and mocking, and I decided to create a reflection proxy for the servlet request and response objects. I realize that there are existing mock implementations for these objects, but I figured that I wasn't using many of their methods and wanted to limit the use of third-party libraries (that's the topic of another post).

And that led to another observation: writing your own mock objects tends to reduce the number of places you touch those objects. If I have to mock both getParameter() and getParameterValues(), I'm going to think about why I call both methods, and probably use just one of them. This should translate into a reduced chance of errors, in this case because I'll be cognizant of cases where there may be more than one parameter with the same name.
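
Here's a minimal sketch of the reflection-proxy idea using java.lang.reflect.Proxy; it handles only the two parameter methods and fails loudly on everything else, and it's an illustration rather than the project's actual mock.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;

import javax.servlet.http.HttpServletRequest;

public class MockRequestFactory
{
    // returns a request whose only behavior is parameter lookup; any other
    // method call throws, which is exactly what you want from a mock
    public static HttpServletRequest create(final Map<String,String[]> params)
    {
        InvocationHandler handler = new InvocationHandler()
        {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable
            {
                if (method.getName().equals("getParameter"))
                {
                    String[] values = params.get(args[0]);
                    return (values == null) ? null : values[0];
                }
                if (method.getName().equals("getParameterValues"))
                {
                    return params.get(args[0]);
                }
                throw new UnsupportedOperationException(method.getName());
            }
        };

        return (HttpServletRequest) Proxy.newProxyInstance(
                HttpServletRequest.class.getClassLoader(),
                new Class<?>[] { HttpServletRequest.class },
                handler);
    }
}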

There's another effect that I'm seeing from writing for testability: I tend to use the Template Method pattern. A lot. Or perhaps it's simply a natural pattern for web services. I'll look at that more closely in the next post.

Monday, August 31, 2009

Building a Wishlist Service: The Development Environment

Working software is the primary measure of progress

There's a great feedback loop from seeing your code run in situ. Even if you're religious about writing tests, the green bar doesn't match up to pressing a “Submit” button. I think it's one of the reasons that interactive languages like Python have such a devoted following. And a big part of this feeling is the thought that “I could demo this to someone who doesn't know how to code.” Or, as you get closer to the end of the project, “I could ship this.”

This feedback loop works even if you're a one-person team, which is why I started my wishlist project with the file demo.jsp. At first this page held a hard-coded form, which allowed me to invoke the skeleton of a servlet class. From there to documentation: writing up my thoughts on the service API. And then implementing those thoughts, in data transfer objects, JSP taglibs, and more servlet code. It's a circular process: update the JSP, update the servlet, update the docs. Of course, there are tests supporting all the servlet changes, but that web page stays on the screen, ready to submit whatever form data I need.

My development tools help with this process. Eclipse is my primary IDE: I like its editor, debugger, and refactoring tools. But I don't like its support for web applications (to be fair, the last time I tried it out was four years ago; things may have gotten better). For the web-app side, I use NetBeans 5.5, which comes bundled with a Tomcat server (NetBeans 6.5 gives pretty much the same environment but uses an external server; there's a level of comfort from using tools released together). I can build and deploy in a few seconds, then switch right back to either editor to make changes.

For production packaging, of course, I'll need to move to a Maven build. NetBeans is a nice rapid prototyping tool, but I don't like the way that it forces you to use its own Ant scripts. But that's a ways down the road, and Subversion should help me keep my sanity as I break the pieces into three separate build projects. For now, my prototyping environment is rapid enough that I don't feel the need for an interactive language.

Friday, August 28, 2009

Design versus Implementation

My last few postings have discussed the design of my wishlist service in very broad strokes: here's what I want to accomplish, here are a few of the issues that I've thought about. They have no detail; they don't even look at choice of platform. To me, this is the essence of the divide between design and implementation.

I see it as analogous to traditional building architecture, where the lead architect sketches broad forms, the rest of the architecture studio fleshes those out as models, and the mechanical engineers make it all work. It's also similar to a corporate mission statement — at least, a mission statement for a corporation that isn't floundering in indecision.

If the design is complete, then the implementation should flow naturally. My design specifies a RESTful interface using either XML or form-encoded data in the request; this implies that my implementation will have a front-end layer to transform these inputs into a common form. The design calls for separation between base product data and any application specific data; the data model will reflect that, and the data access code won't need to perform joins or multiple queries to load objects.
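
To illustrate, that front-end layer could be as small as an interface with one implementation per input format; the names here are hypothetical.

import javax.servlet.http.HttpServletRequest;

// the common form that every handler consumes, however the request arrived
class WishlistRequest
{
    Operation operation;
    String userId;
    String wishlistName;
    // ... plus whatever the specific operation needs
}

// one implementation would parse an XML body, another read form-encoded parameters
interface RequestExtractor
{
    WishlistRequest extract(HttpServletRequest request) throws Exception;
}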

There are decisions that don't derive from the design: for example, my choice of Java and Servlets. This decision is one of convenience: it minimizes the amount of stuff that I have to learn. Other choices include Java/Spring, Python/Django, and Ruby/Rails. While I have some familiarity with Python, and could learn Ruby in short order, any code that I write with those would not be idiomatic — it would probably look like Java with a different syntax. Eventually, I plan to use the service as a basis for exploring Python and Ruby in an idiomatic way, but the first pass should leverage what I know.

This dividing line still leaves a lot of implementation decisions to be made, and the next several postings will look at those decisions and the implementation process itself.

Thursday, August 27, 2009

Designing a Wishlist Service: Scalability

One of the repeated themes in Founders at Work is how startups were unprepared for scalability issues, particularly in the cases where the company experienced viral growth. In some companies, the founders worked around the clock adding capacity. Viral growth is by its nature unpredictable: you have no idea how fast — or how long — your company will keep growing, and going in you have no idea who is going to like the product. The rest of us, however, should have an intended audience for our product, and can make design decisions based on that audience.

There's a lot of conventional wisdom on designing for scalability, most of which boils down to “cache everything” and “don't maintain session state.” The trouble with conventional wisdom is that it rarely applies to the real world. Caching, for example, is useless if you're constantly changing the data, or if a particular data item is accessed infrequently. And a RESTful web service doesn't store session state by its very nature.

So what usage pattern should we expect from a wishlist service? I'm designing for three types of user:

Wishlist Owner
This person will add and remove items, and perhaps change their rankings. These interactions all happen on human timescales (ie, several seconds or minutes between changes, then long periods of inactivity), and should involve more writes than reads. For this user, caching is not going to help performance: the list will probably age out of the cache before it's read again, and would need to be constantly purged as items change.
Friends and Family
Wishlist owners may choose to share their wishlist, in the hopes that other people will do the buying. On the surface, it appears that caching might be a good idea: after all, there may be a lot of people accessing the service. But again, we're looking at human timescales: the list will be shared by passing a link, probably in an email. While you may send such an email to a hundred people, they're not going to open it at the same time, so there's no point in caching.
Collaborative Shoppers
One of the original triggers for this project was a collaborative shopping tool, where several people would share a list and make updates (typically comments), using a chat service to broadcast changes. This is one situation where caching would be useful, since there will be a group of people hitting the same data at the same time. They'll also be updating the data constantly, meaning the lifetime of a cached object would be very short.

However, I think that the solution for this scenario is to use caching in front of the application rather than as part of it: either Apache's mod_cache or some sort of caching filter in the application server. To make this work, we need to use URLs that identify the version of the list along with other key data — but this is the RESTful approach anyway.

By coming up with this picture, I can implement the service as a simple database front-end; I don't need to waste time writing explicit code for scalability. And if it goes viral … well, let's just say I don't expect that, particularly with an eCommerce server gating access.

Monday, August 17, 2009

Designing a Wishlist Service: Security

Security is the red-headed stepchild of the web-service world. Sure, there's WS-Security, which may be useful if you're using SOAP in server-server communication. But most services, in particular browser-accessed services, rely on the security mechanisms provided by HTTP and HTTPS. If you want authentication, you use a 401 response and expect the requester to provide credentials in subsequent requests. If you're paranoid, you use a client certificate.

The wishlist service doesn't quite fit into this model. There's a clear need for some form of authorization and control, if only to protect the service from script kiddies with too much time on their hands. Yet authentication has to be transparent for usability: we don't want to pop up login dialogs, particularly if the customer has already logged in to their eCommerce account.

Perhaps the biggest constraint on any security mechanism is that requests will come from a browser. This means that the service can't expect the requester to provide any secret information, because anybody can view the page or script (I'm amazed by just how many web services require a “secret key” in their URL). If a page needs to provide secret information, that information must be encrypted before it is delivered to the user.

And secret information is required, because there are different levels of authorization: some people can update a list, others can only read it. The solution that I chose is to encrypt the wishlist key and user's authorization level, and pass that to the service as a URL parameter. This does represent another departure from pure REST, in that there may be multiple URLs that refer to the same underlying resource, but it seems the most reasonable compromise.

It also has the limitation that anybody who has a URL has the access defined by that URL. In some web services, this would be a problem. In the case of the wishlist service, it's mitigated by several factors.

First among these is that the value of this data just isn't that high. Sure, if you plan to be nominated for the Supreme Court, you probably don't want to keep a wishlist of X-rated videos. But for most people, a wishlist of clothing just isn't that important (and an opt-in warning can scare away those who feel differently). A related factor is that, in normal usage, people won't be handing out these URLs — at least, not URLs that can make changes. There's simply no reason to do so.

A second mitigating factor is that each URL also has an encrypted timeout: the service will reject any requests that arrive after that timeout. While this can be used to exert control over shared lists, it is primarily intended to defeat script kiddies who might try a denial-of-service attack via constant updates.
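
Pulling those pieces together, here's a minimal sketch of producing such a parameter, assuming a server-side AES key. It uses the modern JDK's Base64 for brevity and the default cipher mode; a production version would want an IV or an authenticated mode, and this is not the service's actual code.

import java.util.Base64;

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class AccessTokenCodec
{
    private final SecretKeySpec key;

    public AccessTokenCodec(byte[] serverSecret)
    {
        this.key = new SecretKeySpec(serverSecret, "AES");   // 16/24/32-byte key
    }

    // packs list key, authorization level, and expiration into one opaque,
    // URL-safe parameter value
    public String encode(String wishlistKey, String authLevel, long expiresAtMillis)
        throws Exception
    {
        String plaintext = wishlistKey + "|" + authLevel + "|" + expiresAtMillis;

        Cipher cipher = Cipher.getInstance("AES");   // sketch only: default ECB, no IV
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal(plaintext.getBytes("UTF-8"));

        return Base64.getUrlEncoder().withoutPadding().encodeToString(ciphertext);
    }
}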

A third mitigating factor is that we keep close control over secrets, particularly those that can be used to break the encryption — in other words, security through obscurity. Bletchley Park would not have been nearly so successful at breaking Enigma if they hadn't learned to recognize radio operators. Taking heed of that, none of the information used by the encrypted parameter is provided in plaintext, meaning that any attempts at discovering the key will require heuristic analysis rather than a simple String.contains().