I recently got a message on the forums asking me about the way LÖVE does garbage collection on its objects, and I figured they probably aren't the only ones interested, so here's an expanded answer.
As some of you may know, LÖVE is written in C++, which features manual memory management, and Lua has garbage collection. How do we solve this mismatch? Why, with another way to manage memory, of course! I'll start off by explaining the Lua GC (Garbage Collector), then quickly discuss C++'s model, followed by LÖVE's solution.
As you may have realised if you've ever used another language, in Lua you don't need to worry about allocating and freeing memory. You create an object, and it stays available as long as you can refer to it. If you remove all references, it gets freed transparently. So how does this work?
I'll try to give a high-level overview of the Lua GC, to spare you reading a few books' worth of academic papers. First of all, the GC can be tweaked, the manual has some more information about the subject, but for the most part we can assume the GC runs automatically, at regular intervals. It stores a list of all objects, and by finding all reachable objects, it can then delete the rest, the unreachable objects. To determine which objects are reachable, it has a couple of starting points: the global environment, and any running functions. And yes, I mean functions, or really, coroutines.
Let's consider a simple example, we have some globals, a suspended coroutine
and, of course, the main "coroutine". This means we have 3 places to start
looking for values. Now, we can obviously reach all globals, we can reach the
locals of the function we're currently running, and we can reach all values they
contain. So if we can reach the
math table, we can also reach
Similarly, if we can reach the suspended coroutine, we can also reach anything
its locals can reach. Additionally, if we are currently in a function
was called from
f, then we can also reach anything
f can reach, and so on.
Needless to say, there are usually quite a lot of values you can reach.
Thankfully, the GC doesn't need to run it all at once, since that would introduce significant delays. Instead, it's an incremental garbage collector, which means it can divide the work into multiple smaller steps.
As a final piece of the puzzle, in Lua all external objects are classified as
userdata. There is both
lightuserdata, and what is sometimes referred to as
"full userdata", but Lua calls it
userdata. As LÖVE uses an unmodified Lua (or
LuaJIT), all of LÖVE's objects, in Lua, are
userdata. Whereas in Lua you can't
do cleanup when an object gets collected, the Lua developers know that this
would cause all kinds of issues if the same were to happen with userdata. To
circumvent this, Lua calls the
__gc metamethod on userdata when it is
garbage collected. As you can imagine, LÖVE then uses this metamethod to
determine when to free its objects, but we'll see that later.
As an aside, if you've ever watched the memory usage of LÖVE whilst running the GC manually, you may have noticed something peculiar: The memory usage only goes down 2 cycles after userdata became unreachable. The reason for this is the aforementioned metamethod. If the metamethod somehow creates a new reference to the object, it can't be collected straight away. Rather than trying to solve this by trying to keep track of what it does, the GC does something simpler. On the first pass, it calls the metamethod, then "detaches" it. The next time it comes across the userdata, it knows the metamethod has already been called, and only then is it actually removed.
There's very little to say on this point, since C++ memory management consists
of the programmer doing it all by themselves. Specifically, if you want to
allocate a new Image, you could write
Image *image = new Image(file);, and
when you're done with it, you delete it, using
delete image;. Between those
points the value of
image is valid, after delete is isn't. So, if you refer to
image after deleting it, your program could crash (or worse, it could not).
Similarly, you are responsible for making sure you delete it exactly once, if
you don't delete it, you won't notice you're leaking memory. If you delete it
twice, you'll crash. Fun.
As may be obvious, this isn't the most programmer-friendly paradigm. The upside is that it is fast, and it allows you to do all sorts of crazy things. Even though that may include shooting yourself in the foot. Another benefit is that it's predictable. When dealing with Lua, if you remove all references to an object, it may live on for a very long time before it is finally removed. If you delete something in C++, it gets removed right then. Since removing objects takes time, you can thus plan when this happens. You could spread it out or you could delete batches of objects at a time. Even though Lua's GC can be tuned, you'll never get precise timing control. If you're creating lots of garbage, you could see lag spikes, and there's very little you can do about it, except create less garbage.
We've seen two completely different ways to manage memory, Lua's fully automatic GC, and C++'s fully manual new/delete. This leaves LÖVE somehow having to reconcile these two worlds. The answer is clearly to introduce yet another model!
Now, not all LÖVE objects are the same, but we'll focus on those that you can
use from Lua, which all derive from the
love::Object base class. This base
class supports (manual) reference counting. In our case, we use two methods for
retain, which increases the reference count, and
decreases it. The basic concept behind these is simple: Once the reference count
reaches 0, the object deletes itself! This also means that when the object is
created (using new) it has a reference count of 1.
Unfortunately, this is still a manual system, if we ever forget to retain an object we store, or release an object when we stop storing it, we will still run into issues. Thankfully, it's usually fairly easy to know when to call retain, and when to call release, and C++ allows us to write helpers to manage it for us in a lot of cases.
I've now explained how LÖVE deals with those objects internally, and how it
relates to C++ model of new and delete, but that still doesn't explain how it
cooperates with Lua's GC. This is where the
__gc metamethod comes into play
again. Whenever an object reaches Lua, which is roughly whenever a function
returns it, the LÖVE wrapper code calls
retain. Then, when the
metamethod gets called, it calls
release. What this means is that Lua holds 1
reference for every "copy" of the object it has. When one becomes unreachable,
the GC will collect it, and the reference will be released.
If you're now wondering why it has to be this complex, that's because object
lifetimes are complex in LÖVE! Say you create a blank ImageData in your code. At
that point Lua holds 1 reference, and nothing else does. Then you fill it with
some data and call
love.graphics.newImage on it. At that point Lua holds 1
reference, and the newly created Image holds another. If you then remove all
references to the ImageData, Lua will no longer hold its reference, and the only
reference will be the one from the Image. If you then remove all references to
the Image, both the Image and the associated ImageData will get deleted, which
is exactly what we'd expect. If we were to simply trust the Lua GC, we'd have
deleted the ImageData whilst the Image was still using it. If we had "trusted"
C++, we'd have deleted it when we LÖVE longer held it, but when Lua still did.
This reference counting solution allows both models to work together, without
leaking memory, nor deleting it prematurely.
Similarly, this also works across threads. Since love.thread gives every thread its own separate Lua state, they also run independent GCs. Using this reference counting mechanism, this means each thread can have their own reference, and an object only gets deleted when all threads are done with it.
In this article I've tried to give a glimpse into the LÖVE internals, if you want to know more about memory management, if anything was unclear, or if you would like to see more posts on other subjects, let us know!
Of course, since LÖVE is open-source software, you can look at the
implementation yourself, if you're interested. The 0.9.0 version of
love::Object is a simple, straight-forward implementation of the
release primitives. The current, 0.10.0 version is
more thread-safe, at the cost of readability.