First-class Functions

This is part one of a two-part blog post on taking advantage of first-class functions in Lua. Part one explains the inner workings of first-class functions, both from a general computer science perspective and in terms of how they work in Lua. Part two puts all the theory together to create a lean version of an Entity Manager.

First-class boarding, now available!

Beyond the simplest of problems, a program needs variables in order to store, manipulate, and maintain some important value. Typically, we think of variables as some number value, or a string of characters. When dealing with programs of greater complexity, we begin to think about collections of these atomic units (in Lua, this is accomplished through the monolithic table). Part of being a variable usually requires that you must have some guarantees:

  • You must be able to be stored;
  • You must be able to be passed as a parameter to a subroutine (function);
  • You must be returnable by a subroutine;
  • You must be able to be created at runtime (during the actual execution of your program);
  • You must have an intrinsic identity (This is a bit harder to understand, but, philosophically, this means that you have an identity beyond some identifying characteristics. For example, am I, Michael Kosler, definable beyond my name, age, weight, etc.?).

If you are able to provide these guarantees, then you are considered a first-class citizen in programming parlance. All the common value types in Lua, e.g. number, string, and table, are first-class citizens. A number, for instance, can be stored in a variable, passed as an argument to a function, and returned from a function.

Functions themselves, however, often did not have this same status. In C, I cannot directly pass a function as a parameter to another function (I can fake it through a function pointer, but I am just passing through something that points to the function, rather than the function itself). In Java (until Java 8 comes out), I cannot store a function inside a variable.

Lua, and many other scripting languages, elevate functions to first-class citizen status. This means that anything I could do with a number, I can do with a function.

Storing a function in a variable

In Lua, we often create functions like this:

local function foo(a)
  print(a)
end

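It turns out that, when Lua interprets this code, it treats it as roughly, but not exactly, equivalent to this (strictly, Lua declares local foo first and then assigns the function to it, so that the function body can refer to foo):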

local foo = function (a)
  print(a)
end

I can even re-assign foo to now be a new function.

local foo = function(a)
  print(a)
end
 
foo = function(a, b)
  print(a, b)
end
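
One place where the two ways of writing a function differ is recursion. The following is a minimal sketch, with fact and fact2 as made-up names for illustration: local function puts the name in scope inside the function body, which is the same as declaring the local first and assigning the function to it afterwards.

```lua
-- With `local function`, the name `fact` is visible inside the body,
-- so the function can call itself.
local function fact(n)
  if n <= 1 then return 1 end
  return n * fact(n - 1)
end

-- The equivalent long form: declare the local first, then assign to it.
local fact2
fact2 = function(n)
  if n <= 1 then return 1 end
  return n * fact2(n - 1)
end

print(fact(5))  -- 120
print(fact2(5)) -- 120
```

Writing local fact = function(n) ... end in a single statement would not work for this recursive case, because the local name is not yet in scope while the function body is being compiled, so the inner call would look for a global named fact instead.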

Passing a function as a parameter

Since I can store a function in a variable, I can just as easily pass a function as a parameter. Suppose I have a table of strings that I want to make all uppercase, print, and then make all lowercase.

local function map(t, f)
  for i = 1, #t do
    t[i] = f(t[i])
  end
end
 
local text = { 'The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog.' }
 
print(table.concat(text, ' '))
map(text, string.upper)
print(table.concat(text, ' '))
map(text, string.lower)
print(table.concat(text, ' '))

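A map function like this one is a powerful and widely used tool in functional programming languages such as Haskell, so much so that it is built into Haskell's core library (there, map returns a new list rather than modifying its argument in place, as this version does).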
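
Because the second parameter can be any function, map also accepts one written on the spot. Here is a small sketch that repeats the map function from above so that it runs on its own; the numbers table and the squaring function are made up for illustration.

```lua
-- Same in-place map as above: replaces each element with f(element).
local function map(t, f)
  for i = 1, #t do
    t[i] = f(t[i])
  end
end

local numbers = { 1, 2, 3, 4 }

-- Pass an anonymous function directly as the argument.
map(numbers, function(n) return n * n end)

print(table.concat(numbers, ', ')) -- 1, 4, 9, 16
```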

Returning a function from another function

Since functions are first-class citizens, I have no problem returning them from other functions. Suppose I have an adder function, that rather than adding a bunch of numbers together, creates specialized adder functions.

local function add(n)
  return function (m)
    return m + n
  end
end
 
local addThree = add(3)
print(addThree(5)) -- 8

Closuring us out

The first two examples are pretty straightforward. The adder factory function is a bit weirder if you really pay attention to it. Try tracing the execution path, keeping in mind how scope works:

  1. Lua identifies the function add and the variable addThree.
  2. Lua calls the right-hand side of the assignment statement, add(3).
  3. Lua creates a new, anonymous function that takes in an argument m and returns the expression m + n, and then returns that anonymous function from add.
  4. Lua calls print(addThree(5)), which calls addThree(5), which returns 8.

But why does it return 8? Once I have left add, the value of n (in our case 3) is out of scope. Furthermore, it looks like I have no guarantee that, if I later call add(4), addThree(5) will not suddenly give me 9.

What Lua and other languages that have first-class functions do is create something known as a closure, which captures the variables the inner function uses from its enclosing function's scope (i.e. n in our example), and carries them with it for later use. Each call to add creates a fresh n, so a later call to add(4) does not touch the n that addThree captured. Within addThree, n will always and forever remain 3.
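
To check this, here is a short sketch that repeats the add function from above and creates a second adder next to the first; addFour is a name made up for illustration.

```lua
local function add(n)
  return function(m)
    return m + n
  end
end

local addThree = add(3)
local addFour = add(4) -- a second, independent closure with its own n

print(addThree(5)) -- 8
print(addFour(5))  -- 9
print(addThree(5)) -- 8, unaffected by the call to add(4)
```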

Comments

Roland_Y

That is an awesome post, I just went straight through it.

A suggestion, though: in the part "storing a function in a variable", maybe you could add a few lines about how functions being first-class values (in Lua) makes it easy to use them anonymously. As in:

<pre class="brush: lua">
  print((function(n) return n*n end)(3)) --> 9
  print(6 + (function(n) return n*2 end)(2)) --> 10
</pre>
 
Also, to show that functions are treated as references when they are passed around, you can use this little snippet as an example:
 
<pre class="brush: lua">
  local sqr = function(k) return k*k end
  local foo = sqr
  assert(foo==sqr, 'Assertion failed!')
  print(foo, sqr)

</pre>

Just some random thoughts that came across my mind. Excellent post, though.
Looking forward to the second part about the Entity Manager class.
MarekkPie

I've never found much use for anonymous functions in that manner, Roland, so it just slipped my mind that they can be used in that way.

As far as functions being reference copies, since they share that trait with tables, and since that is not a requirement to be considered a first-class function, I omitted that.

Thanks for the comment.