Sunday, December 16, 2018

Database Connection Pools Need To Evolve

I never thought very deeply about connection pools, other than as a good example of how to use phantom references. Most of the projects that I worked on had already picked a pool, or defaulted to whatever their framework provided. It is, after all, a rather mundane component, usually hidden behind the object-relational mapper that you use to interact with your database.

Then I answered this Stack Overflow question. And, after playing around a bit and writing a much longer answer, I realized that connection pools — even the newest — have been completely left behind by the current best practices in database management. Two of those practices, in particular:

  • Database credentials should be handled with care

    All of the pools that I've used expect you to provide database credentials in their configuration. Which means that you, as a security-conscious developer, need to retrieve those credentials from somewhere and manually configure the pool. Or, as a not-so-security-conscious developer, store them in plain text in a configuration file. In either case, you're doing this once, when the application starts. If there's ever a need to change credentials, you restart your application.

    That makes it difficult to practice credential rotation, where your database credentials change on a frequent basis to minimize the impact of losing those credentials. At the extreme, Amazon's RDS databases support generation of credentials that last only 15 minutes. But even if you rotate credentials on a monthly basis, the need to restart all of your applications turns this simple practice into a major undertaking, almost certainly manual, and one that may require application downtime.

  • Failover isn't a big deal

    Failover from primary database to replica has traditionally been a Big Event, performed manually, and often involving several people on a conference call. At a minimum you need to bring up a new replica, and with asynchronous, log-based replication there is always the chance of lost data. But with modern cluster-based database servers like Amazon Aurora, failover might happen during unattended maintenance. If the application can't recognize that it's in the middle of a short-duration failover, then failover is still a Big Deal.

One solution to both of these problems is for the pool to provide hooks into its behavior: points where it calls out to user code to get the information it needs. For example, rather than read a static configuration variable for username and password, it would call a user-defined function for these values.
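
To make that concrete, here's a minimal sketch of what such a hook might look like. Every name in it is hypothetical (no existing pool exposes exactly this interface); the point is simply that the pool asks for credentials every time it opens a physical connection, so rotated credentials are picked up without a restart.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Hypothetical hook: instead of reading a static username/password from its
    // configuration, the pool asks this provider each time it opens a new
    // physical connection, so rotated credentials are picked up automatically.
    interface CredentialsProvider {
        String username();
        String password();
    }

    // One possible implementation: re-read environment variables (or a secrets
    // file, or an IAM token generator) on every call.
    class EnvironmentCredentialsProvider implements CredentialsProvider {
        @Override public String username() { return System.getenv("DB_USERNAME"); }
        @Override public String password() { return System.getenv("DB_PASSWORD"); }
    }

    // Sketch of how a pool might use the hook when it creates a connection.
    class ConnectionCreator {
        private final String jdbcUrl;
        private final CredentialsProvider credentials;

        ConnectionCreator(String jdbcUrl, CredentialsProvider credentials) {
            this.jdbcUrl = jdbcUrl;
            this.credentials = credentials;
        }

        Connection create() throws SQLException {
            return DriverManager.getConnection(
                    jdbcUrl, credentials.username(), credentials.password());
        }
    }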

And that's fine, but it made me realize something: the real problem is that current connection pools try to do two things. The first is managing the pool: creating connections, holding them until needed, handing them out, collecting them again when the application code forgets to close them, and trying to do all of this with minimal overhead. The second is establishing a connection, which has subtle variations depending on the database in use.

I think that the next evolution in connection pools is to separate those behaviors, and turn the connection management code into a composable stack of behaviors that can be mixed and matched as needed. One person might need a MySQL connection factory that uses an IAM credentials provider and a post-connection check that throws if the database is read-only. Another person might want a Postgres connection factory that uses an environment-based credentials provider and is fine with a basic connection check.
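
Here's a sketch of what that decomposition might look like, reusing the hypothetical CredentialsProvider from above. The interface names, the example JDBC URL, and the read-only query are all illustrative, not taken from any existing pool.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Hypothetical building blocks; the pool itself would only ever see the
    // composed ConnectionFactory, and know nothing about credentials or checks.
    interface ConnectionFactory {
        Connection newConnection() throws SQLException;
    }

    interface ConnectionCheck {
        void verify(Connection cnxn) throws SQLException;
    }

    // A factory composed from a JDBC URL, a credentials provider (sketched in
    // the previous example), and a post-connection check.
    class CheckedConnectionFactory implements ConnectionFactory {
        private final String jdbcUrl;
        private final CredentialsProvider credentials;
        private final ConnectionCheck check;

        CheckedConnectionFactory(String jdbcUrl, CredentialsProvider credentials, ConnectionCheck check) {
            this.jdbcUrl = jdbcUrl;
            this.credentials = credentials;
            this.check = check;
        }

        @Override
        public Connection newConnection() throws SQLException {
            Connection cnxn = DriverManager.getConnection(
                    jdbcUrl, credentials.username(), credentials.password());
            check.verify(cnxn);
            return cnxn;
        }
    }

    class FactoryExamples {
        // One way to detect a read-only MySQL server: examine its read_only flag.
        static final ConnectionCheck REJECT_READ_ONLY = cnxn -> {
            try (Statement stmt = cnxn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT @@global.read_only")) {
                if (rs.next() && rs.getInt(1) != 0) {
                    throw new SQLException("connected to a read-only server");
                }
            }
        };

        // Mix and match: the same factory class, with different pieces plugged in.
        static ConnectionFactory mysqlOrders() {
            return new CheckedConnectionFactory(
                    "jdbc:mysql://example.com/orders",
                    new EnvironmentCredentialsProvider(),
                    REJECT_READ_ONLY);
        }
    }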

The tricky part is deciding on the right interfaces. I don't know that the three-part stack that I just described is the right way to go. I'm pretty sure that overriding javax.sql.DataSource isn't. And after creating an API, there's the hard work of implementing it, although I think that you'll find a lot of people willing to contribute small pieces.

The real question: is there a connection pool maintainer out there who thinks the same way and is willing to break her pool into pieces?

Monday, October 7, 2013

The Big Myth of Functional Programming: Immutability Solves All Problems

I'm not against immutability. Indeed, I think that many concurrency problems can be eliminated by making data structures immutable. Most of those problems, however, exist because the developers never really thought about concurrent access. And while switching to immutable data structures may solve some problems, it creates others — and in my opinion, those others are much more difficult to debug.

The underlying problem is that the whole point of a program is to modify state. A program that takes no inputs and produces no outputs is worthless, except as a way to turn an expensive CPU into a space heater.

But if the purpose of a program is to modify state, how does it hold the state that's being modified? A “pure” functional program must use the arguments and local variables of its functions. There is a top-level function, which creates the program's initial state, passes that initial state to functions, and gets back the final state.
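
Here's a toy Java illustration of that shape (all names are mine): the state is an immutable value, each step is a function from old state to new state, and only the top-level function holds the current value.

    import java.util.Arrays;

    // A toy "pure" program: state is an immutable value, and every step is a
    // function that takes the old state and returns a new one.
    public class PureCounter {

        // Immutable state: the only way to "change" it is to build a new instance.
        static final class State {
            final int total;
            State(int total) { this.total = total; }
        }

        // A pure function: no mutation, just old state in, new state out.
        static State add(State state, int value) {
            return new State(state.total + value);
        }

        // The top-level function: create the initial state, thread it through
        // the functions, and receive the final state.
        public static void main(String[] args) {
            State state = new State(0);
            for (int value : Arrays.asList(1, 2, 3)) {
                state = add(state, value);
            }
            System.out.println(state.total);   // prints 6
        }
    }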

And if your program has a single thread, or multiple independent threads, that's all you need to do (I also like the idea of decomposition into a tree of independent threads). But if your program consists of communicating threads, the purely functional, immutable data approach is not sufficient: different threads will hold different representations of what they consider the same data. Also known as a race condition.

The easiest way to solve such problems is to introduce a well-designed concurrent data structure (such as Java's ConcurrentHashMap) that is shared between threads. This data structure holds the canonical view of shared state: each thread reads the map whenever it needs to know the latest data. However, a concurrent map by itself isn't sufficient to solve race conditions: it guarantees atomic puts and gets, but not updates.
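
Here's a short sketch of that update problem (class and method names are mine): each get() and put() is atomic on its own, but the read-modify-write sequence is not, so you need a retry loop (or, from Java 8 on, merge()) to make the update atomic.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class CounterExample {

        private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

        // NOT safe: get() and put() are each atomic, but two threads can read the
        // same old value and both write oldValue + 1, silently losing an update.
        public void incrementRacy(String key) {
            Integer current = counts.get(key);
            counts.put(key, (current == null) ? 1 : current + 1);
        }

        // Safe: retry until the map accepts our update against the value we read.
        // (From Java 8 on, counts.merge(key, 1, Integer::sum) does the same thing.)
        public void incrementAtomic(String key) {
            while (true) {
                Integer current = counts.get(key);
                if (current == null) {
                    if (counts.putIfAbsent(key, 1) == null) {
                        return;
                    }
                } else if (counts.replace(key, current, current + 1)) {
                    return;
                }
            }
        }
    }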

A better alternative, in my opinion, is to follow the Go mantra of “share by communicating, rather than communicate by sharing.” In other words, wrap your shared state in a data structure that has a message queue between it and the rest of the program. The rest of your program appears to be fully functional: mutations are just function invocations, and you can choose to implement your shared data using immutable objects.
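
A rough sketch of that structure, with a single owner thread and a queue as the only path to the shared data; all of the names are mine, and reads would be handled the same way, with a request message that carries a place to send the reply.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // The shared state lives inside one object and is touched by one thread.
    // Other threads never see the map; they send messages through the queue.
    public class SharedStateOwner implements Runnable {

        // The messages; a real program would have more of them, including reads
        // that carry a reply queue.
        interface Message {}

        static final class Put implements Message {
            final String key;
            final String value;
            Put(String key, String value) { this.key = key; this.value = value; }
        }

        private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
        private final Map<String, String> state = new HashMap<>();  // owner thread only

        // Called from any thread: a "mutation" is just enqueueing a message.
        public void put(String key, String value) {
            mailbox.add(new Put(key, value));
        }

        // The owner thread applies messages one at a time, so no locking is needed.
        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Message msg = mailbox.take();
                    if (msg instanceof Put) {
                        Put put = (Put) msg;
                        state.put(put.key, put.value);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }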

This approach doesn't completely eliminate races: there is still a race between writes and reads (also known as “stale data”). However, there is no way to eliminate that particular race. No matter how hard you try, you can never guarantee that multiple independent threads have a consistent view of the world.

But stale data in one part of your program is far better than missing data due to an unrealistic expectation of what immutable data structures give you.

Monday, September 20, 2010

BigMemory

Last week Terracotta released BigMemory, which stores objects outside of the Java heap and “defuses the GC time bomb.” Since I've written the top-ranked relevant article when you Google for “non-heap memory” (it's on page 2), my website has seen a spike in volume. Unlike previous spikes, people have been poking around, which is nice to see.

I did some poking around myself, both to Terracotta's blogs, and to some of the Twitter posts and articles that linked to my site. And I saw a lot of comments that disturbed me. They ranged from (paraphrased) “this is just hype, it isn't anything new,” to “I wonder if Oracle will file a patent lawsuit on this,” to “the JVM should be doing this itself,” to the ever-popular “people should just learn to write efficient code.” I'll take on that last one first.

But first some caveats: I've never used EHCache outside of Hibernate, and even then I simply followed a cookbook. My closest contact with EHCache development was attending a Java Users Group meeting with a badly jetlagged Greg Luck, followed by beer with him and a bunch of other people. I can't remember what was discussed around the table, but I don't think it was cache internals. But I do know something about caches, and about the way that the JVM manages memory, so I feel at least qualified to comment on the comments.

And my first comment is: a cache is efficient code … when appropriately used. You can spend days micro-optimizing your code, only to find it all wasted with a single trip to the database.* You use a cache when it's expensive to retrieve or create often-used data; the L1 and L2 caches in your processor exist because memory is relatively slow, so it's expensive to retrieve. Similarly, it's expensive to create a web page (and load its data from the database), so you would use EHCache (or Akamai) to cache the pre-built page (EHCache is cheaper).

The interface to a cache is pretty straightforward: you give it a key, it gives you a value. In the case of the processor's L1 cache, the key is an address, the value is the bytes around that address. In the case of memcached, the key is a string and the value is a blob. In Java, this corresponds to a Map, and you can create a simple cache using a LinkedHashMap.

The reason to use a LinkedHashMap, versus a regular HashMap, is eviction: removing old entries from the map. Although memory is cheap, it's not infinite, and there's no good reason to keep every piece of data ever accessed. For example, consider Stack Overflow: when a question is on the home page, it might get hundreds of views; clearly you don't want to keep reading it from the database. But after it drops off the home page, it might get viewed once a week, so you don't want to keep it in memory.

With LinkedHashMap, you can implement a simple eviction strategy: once the map reaches a certain size, you remove the head entry on each new put(). More sophisticated caches use more sophisticated strategies: for example, weighting the life of a cached object by the amount of work that it takes to load from source. In my opinion, this is why you go with a pre-built solution like EHCache: they've thought through the eviction strategies and neatly packaged them.
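
For reference, here's roughly what that LinkedHashMap version looks like (the class name is mine): the constructor's third argument orders entries by access rather than insertion, and removeEldestEntry() implements the size-based eviction.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // A minimal LRU cache built on LinkedHashMap.
    public class SimpleCache<K, V> extends LinkedHashMap<K, V> {

        private final int maxSize;

        public SimpleCache(int maxSize) {
            super(16, 0.75f, true);   // true = order entries by access, not insertion
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;  // drop the least-recently-used entry
        }
    }

Note that LinkedHashMap isn't thread-safe, so even this toy cache needs to be wrapped with Collections.synchronizedMap() before sharing it between request threads; handling that, along with smarter eviction, is part of the packaging that EHCache provides.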

So that's how a cache works. What's interesting is how it interacts with the garbage collector. The point of a cache is to hold content in memory for as long as it's valuable; for a generational garbage collector like Sun's, this means it will end up in the tenured generation … but eventually get removed. And this causes two issues.

First is the cost to find the garbage. The Hotspot GC is a “mark-sweep” collector: it starts at “root” references and traces through the entire object graph to find live objects; everything else is garbage (this is all covered in my article on reference objects; go back and click the link). If you have tens or hundreds of millions of objects in your heap (and for an eCommerce site, that's not an unreasonable number), it's going to take some time to find them all: a few milliseconds. This is, as I understand it, the price that you pay for any collection, major or minor.

A major collection, however, performs one additional step: it compacts the heap after collecting the garbage. This is a big deal. No matter how fast your CPU and memory, it takes time to move gigabytes of memory around, particularly since the JVM must also update every reference to the moved objects. But if you've ever written a C application that fragmented its heap, you'll thank the Hotspot team every time you see a major collection take place.

Or maybe you won't. Heap compaction is an artifact of tuning the JVM for turn-of-the-century computers, where 1Gb of main memory was “server class,” and swap (the pagefile) was an important part of making your computer usable for multiple applications. Compacting the heap was a way to reduce the resident set size, and improve the overall performance of an application.**

Today, of course, we have desktop-class machines with multiple gigabytes, running an OS and JVM that allow direct access to all of it (and more). And there's been a lot of work in the past few decades to reduce the risk of fragmentation with C-style memory allocation. So maybe it is time to rethink garbage collection in the JVM, at least for the tenured generation, and introduce freelists (of course, that will inflame the “Java is a memory hog” crowd). Or maybe it's time to introduce a new generation, for large tenured objects. Maybe in JDK 1.8. I'm not holding my breath, and neither, it seems, were the EHCache developers.

So to summarize: EHCache has taken techniques that are well-known, and applied them to the Java world. If I were still working in the eCommerce field, I would consider this a Good Thing, and be glad that they invested the time and effort to make it work. Oh, and as far as patents go, I suspect that IBM invented memory-mapped files sometime in the 1960s, so no need to worry.


* Or, in the words of one of my favorite authors: “No matter how subtle the wizard, a knife between the shoulder blades will seriously cramp his style.”

** If you work with Swing apps, you'll often find a minimize handler that explicitly calls System.gc(); this assumes that the app's pages will be swapped out while minimized, and is an attempt to reduce restore time.

Thursday, August 27, 2009

Designing a Wishlist Service: Scalability

One of the repeated themes in Founders at Work is how startups were unprepared for scalability issues, particularly in the cases where the company experienced viral growth. In some companies, the founders worked around the clock adding capacity. Viral growth is by its nature unpredictable: you have no idea how fast — or how long — your company will keep growing, and going in you have no idea who is going to like the product. The rest of us, however, should have an intended audience for our product, and can make design decisions based on that audience.

There's a lot of conventional wisdom on designing for scalability, most of which boils down to “cache everything” and “don't maintain session state.” The trouble with conventional wisdom is that it rarely applies to the real world. Caching, for example, is useless if you're constantly changing the data, or if a particular data item is accessed infrequently. And a RESTful web service doesn't store session state by its very nature.

So what usage pattern should we expect from a wishlist service? I'm designing for three types of user:

Wishlist Owner
This person will add and remove items, and perhaps change their rankings. These interactions all happen on human timescales (ie, several seconds or minutes between changes, then long periods of inactivity), and should involve more writes than reads. For this user, caching is not going to help performance: the list will probably age out of the cache before it's read again, and would need to be constantly purged as items change.
Friends and Family
Wishlist owners may choose to share their wishlist, in the hopes that other people will do the buying. On the surface, it appears that caching might be a good idea: after all, there may be a lot of people accessing the service. But again, we're looking at human timescales: the list will be shared by passing a link, probably in an email. While you may send such an email to a hundred people, they're not going to open it at the same time, so there's no point in caching.
Collaborative Shoppers
One of the original triggers for this project was a collaborative shopping tool, where several people would share a list and make updates (typically comments), using a chat service to broadcast changes. This is one situation where caching would be useful, since there will be a group of people hitting the same data at the same time. They'll also be updating the data constantly, meaning the lifetime of a cached object would be very short.

However, I think that the solution for this scenario is to use caching in front of the application rather than as part of it: either Apache's mod_cache or some sort of caching filter in the application server. To make this work, we need to use URLs that identify the version of the list along with other key data — but this is the RESTful approach anyway.
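
As one possible sketch of the application-server side (the filter class and URL pattern here are hypothetical), a servlet filter could mark responses for versioned URLs as cacheable and everything else as not:

    import java.io.IOException;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical filter: responses for URLs that name a specific version of a
    // list (eg, /wishlist/1234/version/7) can never change, so any cache sitting
    // in front of the application is free to hold onto them.
    public class WishlistCacheFilter implements Filter {

        @Override
        public void init(FilterConfig config) throws ServletException { /* nothing to configure */ }

        @Override
        public void destroy() { /* nothing to clean up */ }

        @Override
        public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest request = (HttpServletRequest) req;
            HttpServletResponse response = (HttpServletResponse) resp;

            if (request.getRequestURI().contains("/version/")) {
                response.setHeader("Cache-Control", "public, max-age=3600");
            } else {
                response.setHeader("Cache-Control", "no-cache");
            }
            chain.doFilter(req, resp);
        }
    }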

By coming up with this picture, I can implement the service as a simple database front-end; I don't need to waste time writing explicit code for scalability. And if it goes viral … well, let's just say I don't expect that, particularly with an eCommerce server gating access.