Tuesday, July 19, 2011

Remaining Relevant

Yesterday I republished one of the articles on my website, a how-to guide on dealing with memory problems. A few weeks ago I'd been working on a memory leak in a large app, and decided that the section on heap histograms should be expanded. Once I started editing, however, the changes just kept coming. I found some places that were unclear, some that were too 32-bit-centric, and some things that were just plain wrong. I think the only section that remained unchanged was the one on permgen. The structure of the rest of the article remained the same, but almost all of the text is different, and the article doubled in size.

There are a couple of ways that I could look at this. The first, more positive way, is that I've learned a lot about debugging memory errors in the two years since I first published the article. Except … I really haven't. And the tools haven't changed that much in the interim either (although 64-bit machines are becoming ubiquitous). And after twenty-five or so years of writing professionally, I'm not convinced that I've suddenly become better at explaining technical topics.

I think that the answer is that there's always more depth, more niches to explore. But most of them are pointless. I could spend pages on “bugs that I have known,” and the only result would be that the 241 people who Google Analytics says read this blog regularly would stop reading. So plumbing the depths isn't the right approach.

And yet … the articles on my website are supposed to be compendiums of everything that I know on a topic. Seeing how much that one article changed has me worried that I should go through all the others and at least review them. After all, even Knuth had second editions. And I know there are some things that I'd like to change.

But against that is the philosophy of “good enough,” also known as “ship it!” I could spend hours wordsmithing, trying to get each sentence just right. But I don't think that time would make the underlying message any different. Once you've reached the point of proper grammar and logical sentence structure, you've reached a point of diminishing returns. Taking the next step may be valid if you're looking for a Pulitzer, but they don't give out Pulitzers for technical writing.

Plus, there's a whole backlog of new things to write about.

2 comments:

Ashwin Jayaprakash said...

I read your ByteBuffer article. It's nicely written.

I think you should mention that creating too many ByteBuffers can cause a leak, because you cannot forcibly clear them.

But there is a way to force-discard the buffers - http://javaforu.blogspot.com/2011/04/dsls-oodbs-system-internals-and-some.html

Regards,
Ashwin.

kdgregory said...

I'm glad you liked the article.

Regarding out-of-memory errors due to creating too many ByteBuffers: I don't think this is actually a problem -- at least not with a current JVM.

I've verified on both the 1.5 and 1.6 Sun JVMs that a full GC can be triggered when allocating direct buffers, regardless of how much free space is in the heap. If you look at the file Bits.java (available in the OpenJDK source if not in src.jar), you'll see that it keeps track of the "reserved" space for direct buffers, and triggers a GC if an allocation would exceed that limit. And the bug report that you referenced shows this happening: in the 1.5 run there's a full GC triggered when the heap only has 150k.
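That behavior is easy to observe for yourself. The sketch below (class name, buffer counts, and sizes are my own choices, not from the bug report) allocates far more direct memory in total than it ever retains, dropping each buffer as soon as it's touched; if dead direct buffers were never reclaimed, the cumulative reservation would blow past the direct-memory limit and throw OutOfMemoryError, but in practice the JVM collects as needed and the loop runs to completion:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectBufferGcDemo {

    // Allocates `count` direct buffers of `size` bytes each, discarding
    // every buffer immediately. The cumulative allocation (count * size)
    // may far exceed the direct-memory limit; the runtime reclaims dead
    // buffers (triggering GC when the reservation would be exceeded), so
    // the loop finishes rather than throwing OutOfMemoryError.
    static int allocateAndDiscard(int count, int size) {
        int allocated = 0;
        for (int i = 0; i < count; i++) {
            ByteBuffer buf = ByteBuffer.allocateDirect(size);
            buf.put(0, (byte) 1);  // touch the buffer so it's really used
            allocated++;
        }                          // buf becomes garbage each iteration
        return allocated;
    }

    // Sums collection counts across all collectors, so a before/after
    // comparison shows whether any GC ran during the allocation loop.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean bean
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += bean.getCollectionCount();
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcCount();
        int n = allocateAndDiscard(200, 1024 * 1024);  // ~200 MB in total
        System.out.println("allocated " + n + " buffers; collections seen: "
                + (totalGcCount() - before));
    }
}
```

Whether a collection actually shows up in the counts depends on your heap and direct-memory settings, so the demo reports the number rather than asserting it.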

That still leaves open the possibility of an out-of-memory error if the virtual address space becomes fragmented by allocating and deallocating direct buffers of varying sizes. But that's unavoidable, in both Java and C, unless you adopt a "bucketed" memory allocation strategy. And with 64-bit virtual address spaces, it shouldn't be an issue anymore.
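For what a "bucketed" strategy means in practice, here's a minimal sketch (the class and method names are mine, purely illustrative): every request is rounded up to a power of two, and freed buffers go on a per-bucket free list. Because only a handful of distinct sizes ever exist, a freed buffer is always reusable by a later request in the same bucket, and the address space can't fragment into odd-sized holes:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// A toy bucketed pool for direct buffers: sizes are quantized to powers
// of two, and released buffers are recycled rather than re-allocated.
public class BucketedBufferPool {

    // One free list per bucket size (capacity in bytes -> idle buffers).
    private final Map<Integer, Deque<ByteBuffer>> freeLists = new HashMap<>();

    public synchronized ByteBuffer acquire(int size) {
        int bucket = nextPowerOfTwo(size);
        Deque<ByteBuffer> free = freeLists.get(bucket);
        if (free != null && !free.isEmpty()) {
            ByteBuffer buf = free.pop();
            buf.clear();               // reset position/limit for reuse
            return buf;
        }
        return ByteBuffer.allocateDirect(bucket);
    }

    public synchronized void release(ByteBuffer buf) {
        // Capacity is already a bucket size, since acquire() rounded up.
        freeLists.computeIfAbsent(buf.capacity(), k -> new ArrayDeque<>())
                 .push(buf);
    }

    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }
}
```

A real pool would also cap how many idle buffers it retains per bucket, but even this toy version shows the trade-off: you waste up to half of each buffer's capacity in exchange for predictable, fragmentation-free reuse.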