Friday, July 24, 2015

Have I Been Hacked?

Twenty-five years later I can still remember how I felt, returning home that day. Being burgled is a surrealistic experience: you notice a progression of little things that don't belong, and it may be quite a long time before you realize that your world has changed. For me, the first anomaly was that the front door of our two-family house was unlocked. OK, maybe my neighbor was out in the back yard. The second was that I could hear the music that I left on for the bird. As I walked up the stairway (unchanged), I saw that my front door was open. I didn't notice that the wood around the lock was splintered. I walked into what seemed a perfectly normal front room, and finally realized that something was wrong when I saw the VCR hanging by its antenna cable (+1 for tightening the cable with a wrench).

Now imagine a different scenario: the front door locked, the upper door standing open but unlocked, lights on that shouldn't be, and a few dirty dishes on the counter. All things that could be explained by me being particularly forgetful that morning. But still the sense that something was not quite right.

That was my feeling this week, as several of my friends reported receiving spam emails from me, warning me that my Yahoo account might have been hacked. The first was from my neighbor, and I discounted his report, thinking that the spammer might have hit another neighbor, gotten the neighborhood mailing list, and paired up random people. After all, the “From” address wasn't even mine! But then I got reports from friends who weren't neighbors, and even found a couple of the emails in my own Gmail spam folder.

OK, so my Yahoo account got hacked, big deal. Maybe next time I'll be more careful with sharing passwords.

Except … the Yahoo account has its own password, one that's not saved anywhere but my head, so I have a hard time accepting that the account was actually broken into. And Yahoo's “activity report” claims that all logins in the past 30 days came from my home IP (a nice feature, it's one of the tabs on the profile page). And, I can still log into the account. I've never heard of anyone breaking into an account and just leaving it there, untouched.

And when I looked at the message, my email address wasn't to be found anywhere. The sender was “kdgregory,” but at some server in a .cx domain. Different people reported different domains. Nor was a Yahoo server to be found in the headers. OK, headers can be forged, but I would have expected a forgery that at least attempted to look credible. According to the IP addresses in the headers, this email originated somewhere in India, went through a Japanese server, and then to its destination.

So I'm left wondering what happened. Clearly these emails were based on information from my account, either headers from old messages (likely) or a contact list (less likely). But how? Googling for “yahoo data breach” turns up quite a few news stories, but nothing from this year. Did whoever acquired these addresses just sit on them for a year? And if yes, what other information did they glean from my account?

It's disquieting, this sense of perhaps being compromised. I think I would have been happier if they had changed my password and locked me out of the account. At least then I'd know I was being sloppy about security. As it is, I have no idea what (if anything) I did wrong. Or whether it will happen again.

Friday, July 17, 2015

The Achilles Heel of Try-With-Resources

Here's some code of the sort that you might have seen since Java 7 appeared. Can you spot the bug?

public List<String> extractData(File file, String encoding)
throws IOException
{
    List<String> results = new ArrayList<String>();
    try (BufferedReader rdr = new BufferedReader(new InputStreamReader(new FileInputStream(file), encoding)))
    {
        // read each line and extract data
    }
    return results;
}

OK, here's a hint: what if encoding is invalid?

The answer is that the InputStreamReader constructor will throw UnsupportedEncodingException (which is a subclass of IOException).

This exception is thrown after the FileInputStream has been created, but before the body of the try-with-resources statement is entered and before anything has been assigned to the resource variable. That means the implicit finally in that statement has nothing to close, and the file will remain open until the stream's finalizer executes.
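
The leak can be observed without touching the filesystem. The following is a minimal sketch, not the original code: TrackingInputStream is a hypothetical stand-in for FileInputStream that records whether close() was called, and "no-such-encoding" is just an arbitrary unsupported charset name.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

class NestedConstructorLeakDemo {

    // Hypothetical stand-in for FileInputStream: remembers whether close() was called.
    static class TrackingInputStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInputStream() {
            super("line one\n".getBytes());
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Uses the nested-constructor form from the post and reports
    // whether the underlying stream ended up closed.
    static boolean streamClosedAfterRead(String encoding) {
        TrackingInputStream in = new TrackingInputStream();
        try (BufferedReader rdr = new BufferedReader(new InputStreamReader(in, encoding))) {
            rdr.readLine();
        } catch (IOException e) {
            // UnsupportedEncodingException lands here; rdr was never assigned,
            // so try-with-resources had nothing to close
        }
        return in.closed;
    }

    public static void main(String[] args) {
        System.out.println("valid encoding, closed:   " + streamClosedAfterRead("UTF-8"));
        System.out.println("invalid encoding, closed: " + streamClosedAfterRead("no-such-encoding"));
    }
}
```

With a valid encoding the reader is closed normally, which closes the underlying stream; with the invalid one the stream should be reported as still open.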

To solve this problem, you need to break apart the constructor chain, to ensure that the actual resource is assigned to its own variable in the resource clause, and will therefore be closed. There are two ways to approach this: either use multiple resource clauses, or push non-resource constructors into the body of the statement.

In this case, the InputStreamReader and BufferedReader are simple decorator objects: if you neglect to close them, it doesn't change the behavior of the program. So we can construct them inside the body of the statement:

try (InputStream in = new FileInputStream(file))
{
    BufferedReader rdr = new BufferedReader(new InputStreamReader(in, encoding));
    // ...

Bug solved: regardless of what happens when constructing rdr, the underlying stream will be closed; we don't have to worry about a “too many open files” exception.
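
For reference, here is one way the whole method might look with this fix applied. It is a sketch under assumptions: the original elides the "extract data" step, so the loop body below (keep each non-blank line, trimmed) is only a placeholder, and the class name SafeReaderExample is made up for the example.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class SafeReaderExample {

    public static List<String> extractData(File file, String encoding)
    throws IOException
    {
        List<String> results = new ArrayList<String>();
        // only the real resource goes in the resource clause
        try (InputStream in = new FileInputStream(file))
        {
            // if this constructor throws, "in" has already been assigned and will be closed
            BufferedReader rdr = new BufferedReader(new InputStreamReader(in, encoding));
            String line;
            while ((line = rdr.readLine()) != null)
            {
                // placeholder for the real extraction logic (an assumption)
                String trimmed = line.trim();
                if (!trimmed.isEmpty())
                    results.add(trimmed);
            }
        }
        return results;
    }
}
```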

However, if we try to do something similar with output streams, we introduce a new bug:

try (OutputStream out = new FileOutputStream(file))
{
    // DON'T DO THIS!
    BufferedWriter wtr = new BufferedWriter(new OutputStreamWriter(out, encoding));
    // ...

The problem with this code is that decorators are important when writing. Unless you close the BufferedWriter, the data in its buffer won't be written to the file. On the positive side, you'll see this bug the first time you try to write a file. To solve it and still use the try-with-resources construct, you need to use multiple resource clauses:

try (OutputStream out = new FileOutputStream(file) ;
        OutputStreamWriter osw = new OutputStreamWriter(out, encoding) ;
        BufferedWriter wtr = new BufferedWriter(osw))
{
    // ...
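
To see the difference between the two forms, here is a small sketch that compares them. It substitutes a ByteArrayOutputStream for the FileOutputStream (my assumption, so that the number of bytes that actually arrived can be counted), and the class and method names are invented for the example.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

class WriterCloseDemo {

    // Decorators built in the body and never closed: short text stays in the
    // BufferedWriter's buffer and never reaches the underlying stream.
    static int bytesWrittenWithoutClose(String text, String encoding) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = sink) {
            BufferedWriter wtr = new BufferedWriter(new OutputStreamWriter(out, encoding));
            wtr.write(text);
        }
        return sink.size();
    }

    // One resource clause per object: resources are closed in reverse order,
    // so the BufferedWriter is flushed before the stream beneath it is closed.
    static int bytesWrittenWithResourceClauses(String text, String encoding) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = sink;
             OutputStreamWriter osw = new OutputStreamWriter(out, encoding);
             BufferedWriter wtr = new BufferedWriter(osw)) {
            wtr.write(text);
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("without close:         " + bytesWrittenWithoutClose("hello", "UTF-8"));
        System.out.println("with resource clauses: " + bytesWrittenWithResourceClauses("hello", "UTF-8"));
    }
}
```

For a string this short, the first method should report zero bytes and the second the full encoded length. Text longer than the writer's buffer would be partly written rather than lost entirely, which is arguably worse because the truncation is easier to miss.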

There you have it, the only safe way to use try-with-resources. If you currently have code with nested constructors, I strongly recommend that you change it. And please change any public documentation that has nested constructors; explaining the bug to someone who is “just following the docs” wastes time that could be used for better things.

Monday, June 29, 2015

The Encryption Tax

Last year I wrote a post comparing build times using an SSD and a traditional hard disk. One of the conditions that I tested was using an encrypted home directory, and I was surprised that encryption added 11% to my build times, about the same as using a “spinning rust” drive.

That was Xubuntu 12.04, which used eCryptfs to handle encryption. eCryptfs is a stacked filesystem that encrypts individual files on top of the regular filesystem. At the time, I posited that the buffer cache was holding encrypted data blocks, so you paid a “decryption tax” every time you accessed one. Xubuntu 14.04, however, provides an option to enable full-disk encryption as part of the installation. Since this encrypts at the block-device level, beneath the filesystem, I wondered if the decryption tax was still present. This called for another day of testing.

For this series of tests, I'm using my HP ProBook 640-G1:

  • Intel i5-4300M @ 2.6 GHz, 2 cores, hyperthreaded
  • 800 MHz FSB
  • 8 GB RAM, 3 MB L2 cache
  • Samsung 840 SSD
  • Xubuntu 14.04 LTS 64-bit

And as a test subject, I'm building the Spring Framework 3.2.0.RELEASE. In my last test, it showed the biggest difference between SSD and platter, as well as the biggest difference between encrypted and non-encrypted access.

I tested three scenarios:

  • Standard Xubuntu 14.04 install, no encryption.
  • Home directory encryption (eCryptfs, a stacked filesystem that encrypts file by file).
  • Full-disk encryption (dm-crypt, which runs — I believe — in the kernel).

As before, I pre-fetched all dependencies and disconnected from the network. Each timing is the average of two runs. Between runs I cleaned the build directory, TRIMed the filesystem, and rebooted to clear the buffer cache.

Here, then, are the numbers. Execution time is wall-clock time in seconds, as reported by Gradle, because that's what we really care about.

                            Execution Time (s)    Difference from baseline
Baseline                    288
Encrypted home directory    300                   +4.2%
Encrypted filesystem        295                   +2.5%
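
The percentages are simply each time's increase over the 288-second baseline. Here is a sketch of that arithmetic; the class and method names are mine. The inputs are the rounded whole-second values from the table, so the last digit may not match the figures above exactly.

```java
class EncryptionTax {

    // Percentage by which a measured time exceeds (or undercuts) the baseline.
    static double percentOverBaseline(double baselineSeconds, double measuredSeconds) {
        return (measuredSeconds - baselineSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 288;  // unencrypted build, in seconds
        System.out.printf("Encrypted home directory: %+.1f%%%n", percentOverBaseline(baseline, 300));
        System.out.printf("Encrypted filesystem:     %+.1f%%%n", percentOverBaseline(baseline, 295));
    }
}
```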

I'm not sure why the penalty of an encrypted home directory was lower this time around, but it's still there. And full-disk encryption has a lower penalty, as expected, but again it's still there. So you still have to make a tradeoff between performance and protection. But given the numbers, protection should win.

At the end of my earlier post, I suggested that there's no substitute for RAM, and the ability to maintain a large buffer cache. I'm not sure that argument applies with a project as big as Spring. After each build, I cleaned the directory and ran a second build, figuring that 8 GB should be enough RAM to maintain everything in the buffer cache. The numbers, however, tell a different story:

                            Execution Time (s)    Difference from first-run baseline (288 s)
Baseline                    284                   -1.4%
Encrypted home directory    294                   +2.1%
Encrypted filesystem        289                   +0.3%

Yes, a slight improvement over the previous runs, but not much. I think this might be due to the fact that compilers produce disk blocks, rather than consume them — reading existing blocks is a small part of the overall execution time.