// Object Pooling - Determinism vs. Throughput

Object pooling in Java is often seen as an anti-pattern and/or wasted effort - but there are still valid reasons to think about pooling for certain kinds of applications.

The JVM allocates objects from the managed heap (young generation; contiguous and defragmented) much faster than you could ever recycle them from a hand-written pool running on top of the VM. A well-configured garbage collector is also able to get rid of unused objects quickly. In fact, GCs don't delete objects explicitly; they evacuate the surviving objects and then sweep whole memory regions in one very efficient operation - and only when it's necessary, to keep runtime overhead low.

Object allocation (of small objects) on modern JVMs is so fast that making a copy of an immutable object sometimes outperforms modifying a mutable (and often old) object. JVM languages like Scala or Clojure make heavy use of this observation. One of the reasons for this apparent anomaly is that generational JVMs are designed to deal with loads of short-lived objects, which makes them inexpensive compared to long-lived objects in the old generation.
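
To illustrate the style: an immutable vector type whose operations return fresh, short-lived copies instead of mutating state (a minimal sketch; the class name and API are made up for this example):

/** Immutable: operations return new short-lived instances. */
public final class ImmutableVec {
    public final float x, y, z;

    public ImmutableVec(float x, float y, float z) {
        this.x = x; this.y = y; this.z = z;
    }

    // Allocates a copy instead of mutating; on a generational JVM the
    // copy is cheap because it typically dies in the young generation
    // and is never touched by a full collection.
    public ImmutableVec plus(ImmutableVec o) {
        return new ImmutableVec(x + o.x, y + o.y, z + o.z);
    }
}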

Performance does not always mean Throughput

Rendering a game at 60 fps might be optimal throughput for a renderer, but the perceived performance can still be unacceptable if all frames are rendered in the first half of each second and the second half is spent on GC ;). Even if object pools may not increase system throughput, they can still increase the determinism of your application. Here are some observations and tips which might help:

When should I consider Object Pools?

  • GC tuning did not help - you want to try something else
  • The application creates a lot of objects which die in the old generation
  • Your objects are expensive to create but easy to recycle
  • Determinism, e.g. response time (soft real-time requirements), is more important to you than throughput

Pro Pooling:

  • pools reduce GC activity in peak times (worst-case scenarios)
  • are easy to implement and test (it's basically an array ;))
  • are easy to disable (inject a fake pool which simply returns new objects)

Con Pooling:

  • more (old) objects are referenced when a GC kicks in (increases GC overhead)
  • memory leaks (don't forget to reclaim your objects!)
  • can cause additional problems in multi-threaded scenarios (new Object() is thread-safe!)
  • may decrease throughput
  • cumbersome, repetitive client code

Once you have decided to use pools, you have to make sure to reclaim all objects as soon as they are no longer used. One way of doing this is to apply the static factory method pattern for object allocation and a per-object dispose method for deallocation.


/** not thread-safe! */
public class Vector3f {

    private static final ObjectPool<Vector3f> pool;
    public float x, y, z;
    private boolean disposed;

    static {
        // pre-fill the pool so no allocation is needed at runtime
        pool = new ObjectPool<Vector3f>(1024);
        for (int i = 0; i < 1024; i++) {
            pool.reclaim(new Vector3f());
        }
    }

    private Vector3f() {}

    public static Vector3f create(float x, float y, float z) {
        // fall back to plain allocation when the pool runs dry
        Vector3f v = pool.isEmpty() ? new Vector3f() : pool.get();
        v.x = x;
        v.y = y;
        v.z = z;
        v.disposed = false;
        return v;
    }

    public void dispose() {
        // guard against reclaiming the same instance twice
        if (!disposed) {
            disposed = true;
            pool.reclaim(this);
        }
    }
}
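
The ObjectPool itself is not shown above; here is a minimal sketch matching the isEmpty()/get()/reclaim() calls from the example (everything beyond those three methods is an assumption). It really is just an array used as a stack:

/** not thread-safe! A minimal, array-backed pool. */
public class ObjectPool<T> {

    private final Object[] items; // recycled instances, used as a stack
    private int size;             // number of instances currently pooled

    public ObjectPool(int capacity) {
        items = new Object[capacity];
    }

    public boolean isEmpty() {
        return size == 0;
    }

    @SuppressWarnings("unchecked")
    public T get() {
        T item = (T) items[--size]; // caller must check isEmpty() first
        items[size] = null;         // don't keep a stale reference alive
        return item;
    }

    public void reclaim(T item) {
        if (size < items.length) { // silently drop overflow
            items[size++] = item;
        }
    }
}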

To demonstrate the perceived performance difference I captured two flyovers of my old 3D engine. The second flyover was captured with object pools disabled. The terrain engine triangulates the ground depending on the position and view direction of the observer, which makes object allocation hard to predict. The triangulation runs in parallel to the rendering thread, which made the pool implementations a bit more complex than the example above.
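
For the curious: a pool shared between a triangulation thread and a rendering thread needs a thread-safe free list. A minimal sketch of one way to do it (an assumption for illustration, not the engine's actual code):

import java.util.concurrent.ConcurrentLinkedQueue;

/** Thread-safe pool: get() and reclaim() may run on different threads. */
public class ConcurrentObjectPool<T> {

    private final ConcurrentLinkedQueue<T> items =
            new ConcurrentLinkedQueue<T>();

    /** Returns a pooled instance, or null if the pool is empty. */
    public T get() {
        return items.poll(); // lock-free dequeue
    }

    public void reclaim(T item) {
        items.offer(item); // lock-free enqueue
    }
}

Note the irony: ConcurrentLinkedQueue allocates a queue node per reclaim(), so a serious implementation would rather use a pre-allocated ring buffer guarded by atomic indices.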

Every vertex, normal, triangle and quad-tree node is a pooled object.

On the left: flyover with pre-allocated object pools; on the right: dynamic object allocation (new Object()).

Notice the pauses at 7, 17 and 26s on the flyover with disabled pools (right video).

Note on the videos: the quality is very bad since the tool I used created 700 MB files for the 30 s videos, so a lot of frames got skipped. I even sampled them down from 1600x1200 to 1024x768 and limited the fps to 30, but the bottleneck was still the hard disk. This is the main reason why even the left video does not look smooth. (I even had to boot Windows for the first time in 2 years to use the tool!) I'll try to capture better vids next time.

Conclusion

Using pools requires discipline, is error-prone, is not good for system throughput and does not play very well with threads. However, there are some attempts to make pools more usable in case you think you need them. The physics engine JBullet, for example, uses JStackAlloc to avoid repetitive and cumbersome code through automatic bytecode instrumentation in the build process. Type annotations (JSR 308, targeted for OpenJDK 7) in combination with Project Lombok and/or the automatic resource management proposal might provide further possibilities for simplifying the usage of object pools in Java and reduce the risk of memory leaks.
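
To sketch the last idea (purely hypothetical - assuming the ARM proposal lands roughly as drafted, with a resource interface the compiler understands, and reusing the ObjectPool sketch from above): hooking dispose() into an ARM block would make it impossible to forget a reclaim, even on the exceptional path.

// Hypothetical: a pooled type implementing the proposed resource interface.
public class PooledBuffer implements AutoCloseable {

    private static final ObjectPool<PooledBuffer> pool =
            new ObjectPool<PooledBuffer>(64);

    public static PooledBuffer create() {
        return pool.isEmpty() ? new PooledBuffer() : pool.get();
    }

    @Override
    public void close() {
        pool.reclaim(this); // runs even if the block below throws
    }
}

// usage: the buffer is reclaimed automatically at the end of the block
try (PooledBuffer buf = PooledBuffer.create()) {
    // ... work with buf ...
}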


// Garbage First - It has never been so exciting to collect garbage :)

If you are reading this entry, you probably already know about G1, the new Garbage First concurrent collector currently in development for Java 7.

Jon Masamitsu recently gave a great overview of all the GCs currently integrated into the Java SE 6 JVM and announced the new G1 collector on his weblog.

I asked him some questions about G1 in the comments, and a very interesting discussion started. Tony Printezis, an expert from the HotSpot GC group, joined the discussion and answered all the questions in great detail.

(I have aggregated the discussion here because I think it is much easier to read when each answer directly follows its question, without the noise in between.)

me: I just recently thought about stack allocation for special kinds of objects. Couldn't the HotSpot compiler provide enough information to determine points in the code where it's safe to delete certain objects? For example, many methods use temporary objects. Is it really worth putting them into the young generation?

Tony: Regarding stack allocation: I believe (and I've seen data in papers that support this) that stack allocation can pay off for GCs that (a) do not compact or (b) are not generational (or both, of course).

In the case of (a), a non-compacting GC has an inherently slower allocation mechanism (e.g., free-list look-ups) than a compacting GC (e.g., "bump-the-pointer"). So, stack allocation can allow some objects to be allocated and reclaimed more cheaply (and, maybe, reduce fragmentation given that you cut down on the number of objects allocated / de-allocated from the free lists).

In the case of (b), typically objects that are stack allocated would also be short-lived (not always, but I'd guess this holds for the majority). So, effectively, you add the equivalent of a young generation to a non-generational GC.

For generational GCs, results show that stack allocation might not pay off that much, given that compaction (I assume that most generational GCs would compact the young generation through copying) allows generational GCs to allocate and reclaim short-lived objects very cheaply. Additionally, escape analysis (the mechanism that statically discovers which objects do not "escape" a thread and can hence be safely stack allocated, as no other thread will access them) might only prove a small proportion of the objects allocated by the application safe to stack allocate, so the benefit would be quite small overall.
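
(To illustrate what escape analysis looks for, here is a made-up example: the temporary object below never leaves the method, so it could in principle be stack allocated or scalar-replaced instead of being allocated in the young generation.)

static final class Point { // made-up helper type for the illustration
    final float x, y;
    Point(float x, float y) { this.x = x; this.y = y; }
}

static float length(float x, float y) {
    Point p = new Point(x, y); // never stored, returned or passed on:
                               // p does not "escape" this method
    return (float) Math.sqrt(p.x * p.x + p.y * p.y);
}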

(BTW, your 3D engine in Java shots on your blog look really cool!)

me: thank you! :)
