Saturday, December 18, 2021

My take on the Log4J 2.x vulnerability

A week ago, a lot of Java application developers learned that their applications harbored a severe vulnerability, courtesy of the Log4J 2.x logging framework. The vulnerability, CVE-2021-44228 allowed execution of arbitrary code on any system that logged unsanitized user input using the Log4J 2.x library. This prompted a lot of people to scramble to protect their systems, and a lot of people to take to Twitter with opinions of What It All Means.

As the maintainer of an open-source library that integrates with Log4J, I spent my Saturday morning understanding the bug and updating my examples to use versions of the library that mitigated the problem. Fortunately, my library is not directly affected, as long as its consumers don't use unsanitized data to configure the logging framework.

Having done that, and reading as much as I could on the issue (the LunaSec writeups are excellent), I've decided to voice my own thoughts on the matter. Some of which I haven't seen other people say, so they may be similarly interesting to you.

First, I think that it's important to recognize that this vulnerability — like most things that end up in the news — is not the result of a single issue. Instead, it's a combination of features, some of which make perfect sense:

  1. Log4J 2.x provides string interpolation for configuration (aka “lookups”)

    This is a great idea; I implemented it for my own logging library. The idea is that you don't want your configuration files to contain hardcoded values, such as the address of your central syslog daemon (or, in my case, the name of a CloudWatch log group). Log4J 2.x provides a long list of lookups, ranging from environment variables to information from the Docker container running your application.

  2. One of the lookups retrieves data from the Java Naming and Directory Interface (JNDI) API

    In large deployments, it's nice to be able to centralize your configuration. There are lots of tools to do this, but Java Enterprise Edition settled on JNDI, also known as the javax.naming package. JNDI is an umbrella API for retrieving data from different sources, such as LDAP.

  3. javax.naming.spi.ObjectFactory supports loading code from a remote server

    This is a somewhat dubious idea, but the JNDI SPI (service provider interface) spec justifies it: it lets you defined resources (such as printer drivers) that can be retrieved directly from JNDI. They use the example of a printer driver, and it should be a familiar example to anyone who has installed a printer on their home computer (do you know that the driver you installed is trustworthy?).

    Note: this condition is necessary for remote-code execution, but not for data exfiltration.

  4. The Log4J 2.x PatternLayout also supports string interpolation, with the same set of lookups

    Here we're getting into questionable features in the library. I had no idea that this behavior existed, even though I dug deeply into the documentation and source code when implementing my appenders. The documentation for this layout class is quite long, and prior to the vulnerability, you had to infer the behavior based on the presence of the nolookups pattern option.

  5. Users log unsanitized data

    If users passed all of their logging messages through a sanitizer that looked for and removed the sequence ${, then the vulnerability wouldn't exist. Except nobody does that, because why would you have to? I call it out because I think it's important to consider what you're logging, and whether it might contain information that you don't want to log. As I say in my “effective logging” talk, passwords can hide in the darndest places.

These are the things that need to happen to make this vulnerability affect you. If you can prevent one of them from happening, you're not vulnerable … to this specific issue. And there are ways to do this, although they aren't things that you can easily implement once the vulnerability is discovered. I'm writing a separate post on how Cloud deployments can mitigate similar vulnerabilities.

For now, however, I want to focus on two of the responses that I saw online: pay for open-source maintainers, and inspecting code before using it. In my opinion, neither of these are valid, and they distract from preparing for the next serious vulnerability.

Response #1: inspect code before you use it.

For any non-trivial application, this is simply impossible.

The current “trunk” revision of log4j-core has 89,778 lines of code, excluding test classes (measured using find and wc). That doesn't count any add-on libraries that you might use to write to your preferred destination, such as my log4j-aws-appenders (which has over 10,000 lines of mainline code). And logging is a tiny part of a modern application, which is typically built using a framework such as Spring, and runs on an application server such as Tomcat or Jetty.

And even if your company is willing to pay for the staff to read all of that source code, what is the chance that they will discover a vulnerability? What is the chance that they will even learn all of its behaviors? I've spent quite a bit of time with the Log4J 2.x code and documentation, both to write my library and on a post describing how to write Log4J plugins, and I never realized that it applied string interpolation to the raw log messages. After I knew about the vulnerability, of course, it was easy to find the code responsible.

Response #2: we — or at least large companies — should be paying the maintainers of open-source projects.

I don't know the situation of the core Log4J developers, and whether or not they get paid for their work on the project. But I don't believe that this vulnerability was the result of an overworked, unpaid, solo developer. True, there was no discussion on the commit that introduced JNDI lookups (ignoring the tacky post-hoc commenters). But the code that applies string substitution to logging events has been in the library since the 2.0 release, and has been rewritten multiple times since then.

In fact, I think direct corporate sponsorship would lead to more unexpected behavior, because the sponsoring corporations will all have their own desires, and an expectation that those desires will be met. And if my experience in the corporate world is any guide, a developer that feels their paycheck is in jeopardy is much more likely to do something without giving it full thought.

So where does that leave us?

Unfortunately, I haven't seen very many practical responses (although, as I said, I'm writing a separate post to this effect). And I think that the reason for that is that the real answer puts the onus for vulnerabilities on us, the consumers of open-source software.

Linus's Law, coined by Eric S Raymond, is that “given enough eyeballs, all bugs are shallow.” And this played out in full view with this vulnerability: the Log4J team quickly found and patched the problem, and has been releasing additional patches since.² There have also been multiple third-parties that have written detailed evaluations of the vulnerability.³

But still, we, the consumers of this package, needed to update our builds to use the latest version. And then deploy those updated builds. If you didn't already have a process that allows you to quickly build and deploy a hotfix, then your weekend was shot. Even if you did get a hotfix out, you needed to spend time evaluating your systems to ensure that they weren't compromised (and beware that the most likely compromise was exfiltration of secrets!).

It's natural, in such circumstances, to feel resentment, and to look for external change to make such problems go away. Or maybe even to think about reducing your reliance on open-source projects.

But in my opinion, this is the wrong outcome. Instead, look at this event as a wake-up call to make your systems more robust. Be prepared to do a hotfix at a moment's notice. Utilize tools such as web application firewalls, which can be quickly updated to block malicious traffic. And improve your forensic logging, so that you can identify the effects of vulnerabilities after they appear (just don't log unsanitized input!).

Because this is not the last vulnerability that you will see.

1: You might be interested to learn that the Apache Software Foundation receives approximately $3 million a year (annual reports here). However, to the best of my knowledge, they do not pay stipends to core maintainers. I do not know how the core Log4J 2.x maintainers earn their living.

2: Log4J 2.x change report.

3: I particularly like the CloudFlare posts, especially the one that describes how attackers changed their payloads to avoid simple firewall rules. This post informed my belief that the goal of most attacks was to exfiltrate secrets rather than take over systems.

Wednesday, July 14, 2021

My take on "How to re:Invent"

It's back. AWS is taking over Las Vegas for a week filled with information, sales pitches, and corporate-friendly activities. And while COVID and the possibility of a “fourth wave” hang over the conference, I decided to sign up. Having been to re:Invent once, I now consider myself an expert on how to survive the week. Here are five of my suggestions.

#1: Wear comfortable shoes.
OK, everybody says this, but it bears repeating: you're going to do a lot of walking. It might take ten minutes to navigate from the front door of a hotel to the meeting rooms, following a labyrinthine path through the casino. To give you some numbers: my watch recorded 87,000 steps, or 44.6 miles, over the five days of the conference. That may be higher than average: I often walked between venues rather than find my way to the shuttle buses. But even if you “only” walk 30 miles, you'll still be thankful for doing it in a pair of running shoes.
#2: Stay in a “venue” hotel.
These are the hotels that host sessions and other sponsored content, as opposed to the “sleeping” hotels that just have rooms for attendees. There are several reasons to stay at a venue hotel, but in my opinion the most important is that it cuts down on the amount of walking that you have to do. Of my 87,000 steps, I estimate that 10,000 or more were taken up in walking from my room at the Park MGM to the Aria so that I could pick up a shuttle bus.
#3: Attend workshops, not sessions.
There are some great sessions at re:Invent, conducted by people who are intimately familiar with the service. If you have specific questions it's worth attending one of the “deep dives” and then walking up to the speaker afterward to ask those questions.

But, all of these sessions will be recorded, and you can watch them at your leisure. So if you don't have specific questions there's no reason to attend in-person. What you can't do after December 3rd is learn with an AWS instructor by your side (well, not for free anyway). Unfortunately, space for these workshops is limited, so sign up early for the ones you want (that said, scheduling at re:Invent is extremely fluid; at least half the sessions I attended were marked as full but then had spots open up an hour before they started).

#4: Fly out on Saturday.
If you're not from the United States, you may not realize that re:Invent takes place during the week after the Thanksgiving holiday. Thanksgiving is a time when people return home to visit family and friends, and then all of them get on a plane the following Sunday to return home. It's historically the busiest travel day of the year, and US airports are crowded and frantic with people who only fly on that weekend. Plus, airlines charge the highest rates of the year, because they know people will pay. Even if you have TSA/Pre, it's not fun.

If you're willing to fly out a day early, you avoid the crowds. Plus, you can save significantly on the airfare (right now, it would save me over $300, or nearly 50%). Against that, you'll be paying for an extra night in Vegas. For me, with the conference rate for the hotel, the numbers worked.

#5: Get out of Vegas.
For some people, Vegas is a destination: they love the lights, the noise, and the constant activity. For me, it's overwhelming. Fortunately, you can find thousands of square miles of absolute desolation just outside city limits.

Last time, I rented a motorcycle for a day and explored nearby attractions: Valley of Fire, Hoover Dam, and Red Rock Canyon. This year, I'm planning to take three days and explore southern Utah and northern Arizona. If you're not a motorcyclist, Vegas also has plenty of rental cars, including exotics. And at the far end of the scale, you can spend a day in a high-performance driving class at the Las Vegas Motor Speedway.

Well, that's it. Now it's time to cross my fingers and hope the US COVID situation remains under control.

Tuesday, June 8, 2021

An open letter to the AWS Training organization

You don't have a Feedback link on your site, but it seems that Amazon keeps close tabs on the Blogosphere, so hopefully this reaches you.

I don't know whether you're an actual sub-division of Amazon, but the website URL certainly didn't give me a warm fuzzy feeling when it came up in Google. In fact, my first thought was that it was some unaffiliated company that had better SEO.

So, since it was asking me for login credentials, I did what any reasonably cautious technologist would do, and ran whois. And this is what I got back:

Domain Name:
Registry Domain ID: 8d519b3def254d2f980a08f62416a5b9-DONUTS
Registrar WHOIS Server:
Registrar URL:
Updated Date: 2019-05-19T19:54:24Z
Creation Date: 2014-03-19T00:32:11Z
Registry Expiry Date: 2024-03-19T00:32:11Z
Registrar: Nom-iq Ltd. dba COM LAUDE
Registrar IANA ID: 470
Registrar Abuse Contact Email:
Registrar Abuse Contact Phone: +44.2074218250
Registrant Organization: Amazon Technologies, Inc.
Registrant State/Province: NV
Registrant Postal Code: REDACTED FOR PRIVACY
Registrant Country: US
Registrant Phone Ext: REDACTED FOR PRIVACY
Registrant Email: Please query the RDDS service of the Registrar of Record identified in this output for information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.

That's the sort of whois entry that you get for an individual using a shared hosting service. In fact, it provides less information than you'll see with my domain, which runs on a shared hosting service, and I pay extra for privacy.

By comparison, the whois entry for Amazon itself looks like this (and note that it's a different registrar, another red flag):

Domain Name:
Registry Domain ID: 281209_DOMAIN_COM-VRSN
Registrar WHOIS Server:
Registrar URL:
Updated Date: 2019-08-26T12:19:56-0700
Creation Date: 1994-10-31T21:00:00-0800
Registrar Registration Expiration Date: 2024-10-30T00:00:00-0700
Registrar: MarkMonitor, Inc.
Registrar IANA ID: 292
Registrar Abuse Contact Email:
Registrar Abuse Contact Phone: +1.2083895770
Registrant Name: Hostmaster, Amazon Legal Dept.
Registrant Organization: Amazon Technologies, Inc.
Registrant Street: P.O. Box 8102
Registrant City: Reno
Registrant State/Province: NV
Registrant Postal Code: 89507
Registrant Country: US
Registrant Phone: +1.2062664064
Registrant Phone Ext: 
Registrant Fax: +1.2062667010
Registrant Fax Ext: 
Registrant Email:

While I'm a little surprised by the Reno address, rather than Seattle, this at least looks like the sort of registration information used by a business rather than somebody who pays $10/month for hosting.

I ended up getting to the training site via a link on the AWS Console, so was able to achieve my goal.

But I think there's a general lesson: don't forsake your brand without good reason.

And at the very least, ask your network administrators to update your whois data.