Wednesday, October 27, 2010

Living Among Primitives, part 3: Collections

The last post “turned an object inside-out” to eliminate the overhead of a list of discrete objects. While that technique allows you to pack a lot of objects into a relatively small amount of space, your programs will be a lot cleaner if you can pass around Point objects rather than a PointHolder and its index. The java.util.AbstractList class can help you reach a middle ground.

public class PointHolder2
extends AbstractList<Point>
{
    private int _size;
    // ...

    @Override
    public int size()
    {
        return _size;
    }

    @Override
    public boolean add(Point point)
    {
        _x[_size] = point.getX();
        _y[_size] = point.getY();
        _z[_size] = point.getZ();
        _size++;
        return true;
    }

    @Override
    public Point get(int index)
    {
        return new Point(_x[index], _y[index], _z[index]);
    }

    @Override
    public Point set(int index, Point point)
    {
        _x[index] = point.getX();
        _y[index] = point.getY();
        _z[index] = point.getZ();
        return point;
    }

AbstractList is a tool for implementing lists with an application-specific backing store. Out of the box, you have to implement the get() and size() methods, and the setter methods all throw UnsupportedOperationException. But as you can see, it doesn't take a lot of work to implement these, and the result is nearly identical in terms of performance to an ArrayList<Point>.

And the Point class has reappeared, making your application code cleaner. But now they're created as-needed, and will be collected as soon as the program is done with them; you won't pay the memory overhead to keep them around. And assuming that they're relatively short-lived once created, they should rarely if ever make their way out of the heap's young generation, so will be cheap to collect.

There is one important difference between PointHolder and ArrayList<Point>, not immediately apparent from this example: since the Point instances do not actually exist within the list, changes to them won't be reflected in the PointHolder data. As implemented, each Point instance contains its own copies of the primitive values. To make the Point mutable, you'll need to design them with a reference back to the PointHolder (which I won't go into because I don't like mutable data-only objects).

Now that you have a space-efficient List<Point>, you may be thinking of how you'd access it — for example, how to search for a specific point. And you might think of calling Collections.sort(), followed by Collections.binarySearch(). And while this will work, it's not very efficient (because you're constantly creating Point instances), and has hidden memory costs. Those are topics for the next post.

No comments: