Monday, August 9, 2010

Ant, Taskdef, and running out of PermGen

Although I've switched to Maven for building Java projects (convention over configuration ftw), I still keep Ant in my toolbox. It excels at the sort of free-form non-Java projects that most people implement using shell scripts.

One reason that Ant excels at these types of projects is that you can easily implement project-specific tasks such as a database extract, and mix those tasks with the large library of built-in tasks like filter or mkdir. And the easiest way to add your tasks to a build file is with a taskdef:

    <taskdef name="example"
             classname="com.kdgregory.example.ant.ExampleTask"
             classpath="${basedir}/lib/mytasks.jar"/>

Last week I was working on a custom task that would retrieve data by US state. I invoked it with the foreach task from the ant-contrib library, so that I could build a file from all 50 states. Since I expected the build to take several hours to run, I kicked it off before leaving work for the day.

The next morning, I saw that it had failed about 15 minutes in, having run out of permgen space. And the error happened when it was loading a class. At first I suspected the foreach task, or more likely, the antcall that it invoked. After all, it creates a new project, so what better place to create a new classloader? Plus, it was in the stack trace.

But as I looked through the source code for these tasks, I couldn't see any place where a new classloader was created (another reason that I like Ant is that its source is generally easy to follow). That left the taskdef; after all, I knew that my code wasn't creating a new classloader. To test, I created a task that printed out its classloader, and used the following build file:

<project default="default" basedir="..">

    <taskdef name="example1"
             classname="com.kdgregory.example.ant.ExampleTask"
             classpath="${basedir}/classes"/>
    <taskdef name="example2"
             classname="com.kdgregory.example.ant.ExampleTask"
             classpath="${basedir}/classes"/>

    <target name="default">
        <example1 />
        <example2 />
    </target>

</project>

Sure enough, each taskdef loads its task class with its own classloader. The antcall simply exacerbates the problem, because it executes the taskdefs over again in the new project that it creates.
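For reference, a probe task along these lines might look like the following sketch. It is not the original ExampleTask, whose source isn't shown here; the class name `LoaderReportTask` and the `describe` helper are made up for illustration. It assumes Ant's task adapter, which accepts a class that doesn't extend Task as long as it has a public no-argument `execute()` method, so the sketch compiles without Ant on the classpath. It prints the identity hash rather than relying on `toString()`, since a loader's `toString()` doesn't necessarily distinguish two instances.

```java
// Hypothetical stand-in for the post's ExampleTask (not the author's code).
public class LoaderReportTask {

    /** Formats a loader as ClassName@identityHash so two instances can be told apart. */
    static String describe(ClassLoader loader) {
        if (loader == null) {
            return "bootstrap"; // classes defined by the JVM's bootstrap loader report null
        }
        return loader.getClass().getName() + "@"
                + Integer.toHexString(System.identityHashCode(loader));
    }

    /** Entry point that Ant invokes when the task runs. */
    public void execute() {
        System.out.println("classloader: " + describe(getClass().getClassLoader()));
    }

    /** Lets the class be run outside Ant as a quick sanity check. */
    public static void main(String[] args) {
        new LoaderReportTask().execute();
    }
}
```

Registered under two taskdef names, a task like this would print a different loader identity for each name if each taskdef really does get its own classloader.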

It makes sense that Ant would create a new classloader for each project, and even for each taskdef within a project (they can, after all, have unique classpaths). And as long as the classloader is referenced only from the project, it — and the classes it loads — will get collected at the same time as the project. And when I looked in the Project class, I found the member variable coreLoader.

But when I fired up my debugger, I found that that variable was explicitly set to null and never updated. Then I put a breakpoint in ClasspathUtils, and saw that it was being invoked with a “reuse” flag set to false. The result: each taskdef gets its own classloader, and they're never collected.
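Why this eats permgen: a class is identified by its name plus the loader that defined it, so every new loader produces its own copy of the class, and on the HotSpot JVMs of this era that class data lives in permgen. The standalone sketch below illustrates the effect outside of Ant. `IsolatedLoader` and `Payload` are hypothetical names, and the loader only loosely imitates what happens per taskdef; it is not Ant's code. It assumes Java 9 or later for `readAllBytes`, and that the demo is run from compiled classes on the normal classpath.

```java
import java.io.IOException;
import java.io.InputStream;

public class SeparateLoadersDemo {

    /** Trivial class to load; it depends only on java.lang.Object. */
    public static class Payload {}

    /** Defines classes itself rather than delegating to the application loader. */
    static class IsolatedLoader extends ClassLoader {
        IsolatedLoader() {
            super((ClassLoader) null); // null parent: only bootstrap classes are delegated
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            // Borrow the class bytes from the loader that loaded this demo.
            try (InputStream in =
                     SeparateLoadersDemo.class.getClassLoader().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Payload through a brand-new loader each time it is called. */
    static Class<?> loadInFreshLoader() {
        try {
            return new IsolatedLoader().loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> first = loadInFreshLoader();
        Class<?> second = loadInFreshLoader();
        System.out.println("same name:   " + first.getName().equals(second.getName()));
        System.out.println("same class:  " + (first == second));
        System.out.println("same loader: " + (first.getClassLoader() == second.getClassLoader()));
    }
}
```

The two classes share a name but are distinct objects with distinct loaders, so the run should report `true` for the name and `false` for the other two comparisons. Neither copy can be unloaded while its loader is still reachable, which is the situation described above.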

I think there's a bug here: not only is the classloader not tied to the project object, it uses the J2EE delegation model, in which a classloader attempts to load classes from its own classpath before asking its parent for the class. However, the code makes me think that this is intentional. And I don't understand project life cycles well enough to know what would break with what I feel is the “correct” implementation.
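For readers who haven't met that delegation model, here is a minimal sketch of a child-first loader. It illustrates the general technique only and is not taken from Ant's source; `ChildFirstClassLoader` is a hypothetical name. The guard for `java.` names is an assumption about what a practical implementation needs, since a loader can't define classes in those packages.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] classpath, ClassLoader parent) {
        super(classpath, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                // Core classes always come from the parent chain.
                loaded = name.startsWith("java.")
                        ? super.loadClass(name, false)
                        : loadChildFirst(name);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Tries this loader's own classpath first, then falls back to the parent. */
    private Class<?> loadChildFirst(String name) throws ClassNotFoundException {
        try {
            return findClass(name);
        } catch (ClassNotFoundException notOnOwnPath) {
            return super.loadClass(name, false);
        }
    }

    public static void main(String[] args) throws Exception {
        // With an empty classpath every lookup falls through to the parent.
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ChildFirstClassLoader.class.getClassLoader())) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
            System.out.println(loader.loadClass(ChildFirstClassLoader.class.getName())
                    == ChildFirstClassLoader.class);
        }
    }
}
```

The consequence for a build is that any class found on the loader's own classpath is defined again by that loader, even when the parent already has it loaded.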

Fortunately, there's a work-around.

As I was reading the documentation for taskdef, I saw a reference to antlibs. I remembered using antlibs several years ago, when I was building a library of a dozen or so tasks, and didn't want to copy-and-paste the taskdefs for them. And then a lightbulb lit: antlibs must be available on Ant's classpath. And that means that they don't need their own classloader.

To use an antlib, you create the file antlib.xml, and package it with the tasks themselves:

<antlib>
    <taskdef name="example1" classname="com.kdgregory.example.ant.ExampleTask"/>
    <taskdef name="example2" classname="com.kdgregory.example.ant.ExampleTask"/>
</antlib>

Then you define an “antlib” namespace in your project file, and refer to your tasks using that namespace. The namespace specifies the package where antlib.xml can be found (by convention, the top-level package of your task library).

<project default="default" 
    xmlns:ex="antlib:com.kdgregory.example.ant">

    <target name="default">
        <ex:example1 />
        <ex:example2 />
        <antcall target="example"/>
    </target>

    <target name="example">
        <ex:example1 />
        <ex:example2 />
    </target>   

</project>

It's extra effort, but the output makes the effort worthwhile:

ant-classloader-example, 528> ant -f -lib bin build2.xml
Buildfile: /home/kgregory/tmp/ant-classloader-example/build2.xml

default:
[ex:example1] project:     org.apache.tools.ant.Project@110b053
[ex:example1] classloader: java.net.URLClassLoader@a90653
[ex:example2] project:     org.apache.tools.ant.Project@110b053
[ex:example2] classloader: java.net.URLClassLoader@a90653

example:
[ex:example1] project:     org.apache.tools.ant.Project@167d940
[ex:example1] classloader: java.net.URLClassLoader@a90653
[ex:example2] project:     org.apache.tools.ant.Project@167d940
[ex:example2] classloader: java.net.URLClassLoader@a90653

BUILD SUCCESSFUL
Total time: 0 seconds

Bottom line: if you're running out of permgen while running Ant, take a look at your use of taskdef, and see if you can replace it with an antlib. (At least one other person has run into similar problems; if you're interested in the sample code, you can find it here.)
