Garbage Collection in LÖVE

Blog Terms:

I recently got a message on the forums asking me about the way LÖVE does garbage collection on its objects, and I figured they probably aren't the only ones interested, so here's an expanded answer.

As some of you may know, LÖVE is written in C++, which features manual memory management, and Lua has garbage collection. How do we solve this mismatch? Why, with another way to manage memory, of course! I'll start off by explaining the Lua GC (Garbage Collector), then quickly discuss C++'s model, followed by LÖVE's solution.

Lua's GC

As you may have realised if you've ever used another language, in Lua you don't need to worry about allocating and freeing memory. You create an object, and it stays available as long as you can refer to it. If you remove all references, it gets freed transparently. So how does this work?

I'll try to give a high-level overview of the Lua GC, to spare you reading a few books' worth of academic papers. First of all, the GC can be tweaked, the manual has some more information about the subject, but for the most part we can assume the GC runs automatically, at regular intervals. It stores a list of all objects, and by finding all reachable objects, it can then delete the rest, the unreachable objects. To determine which objects are reachable, it has a couple of starting points: the global environment, and any running functions. And yes, I mean functions, or really, coroutines.

Let's consider a simple example, we have some globals, a suspended coroutine and, of course, the main "coroutine". This means we have 3 places to start looking for values. Now, we can obviously reach all globals, we can reach the locals of the function we're currently running, and we can reach all values they contain. So if we can reach the math table, we can also reach math.sqrt. Similarly, if we can reach the suspended coroutine, we can also reach anything its locals can reach. Additionally, if we are currently in a function g, that was called from f, then we can also reach anything f can reach, and so on. Needless to say, there are usually quite a lot of values you can reach.

Thankfully, the GC doesn't need to run it all at once, since that would introduce significant delays. Instead, it's an incremental garbage collector, which means it can divide the work into multiple smaller steps.

As a final piece of the puzzle, in Lua all external objects are classified as userdata. There is both lightuserdata, and what is sometimes referred to as "full userdata", but Lua calls it userdata. As LÖVE uses an unmodified Lua (or LuaJIT), all of LÖVE's objects, in Lua, are userdata. Whereas in Lua you can't do cleanup when an object gets collected, the Lua developers know that this would cause all kinds of issues if the same were to happen with userdata. To circumvent this, Lua calls the __gc metamethod on userdata when it is garbage collected. As you can imagine, LÖVE then uses this metamethod to determine when to free its objects, but we'll see that later.

As an aside, if you've ever watched the memory usage of LÖVE whilst running the GC manually, you may have noticed something peculiar: The memory usage only goes down 2 cycles after userdata became unreachable. The reason for this is the aforementioned metamethod. If the metamethod somehow creates a new reference to the object, it can't be collected straight away. Rather than trying to solve this by trying to keep track of what it does, the GC does something simpler. On the first pass, it calls the metamethod, then "detaches" it. The next time it comes across the userdata, it knows the metamethod has already been called, and only then is it actually removed.

C++ memory management

There's very little to say on this point, since C++ memory management consists of the programmer doing it all by themselves. Specifically, if you want to allocate a new Image, you could write Image *image = new Image(file);, and when you're done with it, you delete it, using delete image;. Between those points the value of image is valid, after delete is isn't. So, if you refer to image after deleting it, your program could crash (or worse, it could not). Similarly, you are responsible for making sure you delete it exactly once, if you don't delete it, you won't notice you're leaking memory. If you delete it twice, you'll crash. Fun.

As may be obvious, this isn't the most programmer-friendly paradigm. The upside is that it is fast, and it allows you to do all sorts of crazy things. Even though that may include shooting yourself in the foot. Another benefit is that it's predictable. When dealing with Lua, if you remove all references to an object, it may live on for a very long time before it is finally removed. If you delete something in C++, it gets removed right then. Since removing objects takes time, you can thus plan when this happens. You could spread it out or you could delete batches of objects at a time. Even though Lua's GC can be tuned, you'll never get precise timing control. If you're creating lots of garbage, you could see lag spikes, and there's very little you can do about it, except create less garbage.

LÖVE's reference counting

We've seen two completely different ways to manage memory, Lua's fully automatic GC, and C++'s fully manual new/delete. This leaves LÖVE somehow having to reconcile these two worlds. The answer is clearly to introduce yet another model!

Now, not all LÖVE objects are the same, but we'll focus on those that you can use from Lua, which all derive from the love::Object base class. This base class supports (manual) reference counting. In our case, we use two methods for this, retain, which increases the reference count, and release, which decreases it. The basic concept behind these is simple: Once the reference count reaches 0, the object deletes itself! This also means that when the object is created (using new) it has a reference count of 1.

Unfortunately, this is still a manual system, if we ever forget to retain an object we store, or release an object when we stop storing it, we will still run into issues. Thankfully, it's usually fairly easy to know when to call retain, and when to call release, and C++ allows us to write helpers to manage it for us in a lot of cases.

I've now explained how LÖVE deals with those objects internally, and how it relates to C++ model of new and delete, but that still doesn't explain how it cooperates with Lua's GC. This is where the __gc metamethod comes into play again. Whenever an object reaches Lua, which is roughly whenever a function returns it, the LÖVE wrapper code calls retain. Then, when the __gc metamethod gets called, it calls release. What this means is that Lua holds 1 reference for every "copy" of the object it has. When one becomes unreachable, the GC will collect it, and the reference will be released.

If you're now wondering why it has to be this complex, that's because object lifetimes are complex in LÖVE! Say you create a blank ImageData in your code. At that point Lua holds 1 reference, and nothing else does. Then you fill it with some data and call love.graphics.newImage on it. At that point Lua holds 1 reference, and the newly created Image holds another. If you then remove all references to the ImageData, Lua will no longer hold its reference, and the only reference will be the one from the Image. If you then remove all references to the Image, both the Image and the associated ImageData will get deleted, which is exactly what we'd expect. If we were to simply trust the Lua GC, we'd have deleted the ImageData whilst the Image was still using it. If we had "trusted" C++, we'd have deleted it when we LÖVE longer held it, but when Lua still did. This reference counting solution allows both models to work together, without leaking memory, nor deleting it prematurely.

Similarly, this also works across threads. Since love.thread gives every thread its own separate Lua state, they also run independent GCs. Using this reference counting mechanism, this means each thread can have their own reference, and an object only gets deleted when all threads are done with it.

Want more?

In this article I've tried to give a glimpse into the LÖVE internals, if you want to know more about memory management, if anything was unclear, or if you would like to see more posts on other subjects, let us know!

Of course, since LÖVE is open-source software, you can look at the implementation yourself, if you're interested. The 0.9.0 version of love::Object is a simple, straight-forward implementation of the retain and release primitives. The current, 0.10.0 version is more thread-safe, at the cost of readability.

Comments

jlpt's picture

This is a stupid question but im trying to login to the other fourms but i don't know what the "second fourm" on the main page is i tried putting this one but it didn't work

 

francismoy's picture

Hi! Very interesting post. I was wondering: say I want to deactivate any kind of garbage collection before entering a level of my game. Can I use the standard Lua function collectgarbage("stop")? Of course, I'm aware that I should call "restart" after finishing the level (or at some point where FPS are not important). I ask this because I have experimented some random love.exe crashes (in Windows) after calling collectgarbage("stop")...