Perl/Closures

Achieving Closure
by Simon Cozens
May 29, 2002

Maybe you've heard about closures; they're one of those aspects of Perl -- like object-oriented programming -- that everyone raves about and you can't really see the big deal until you play around with them and then they just click. In this article, we're going to play around with some closures, in the hope that they'll just click for you.

The nice thing about playing around with closures is that you often don't realize you're doing it. Don't believe me? OK, here's an ordinary piece of Perl:

my $print_hello = sub { print "Hello, world!"; };
$print_hello->();

We create a subroutine reference in $print_hello, and then we dereference it, calling the subroutine. I suppose we could put that into a subroutine:

sub make_hello_printer {
   return sub { print "Hello, world!"; }
}
my $print_hello = make_hello_printer();
$print_hello->()

Still nothing magical going on here. And it shouldn't be any surprise to you that we can move the "message" to a separate variable, like this:

sub make_hello_printer {
    my $message = "Hello, world!";
    return sub { print $message; }
}
my $print_hello = make_hello_printer();
$print_hello->()

As you'd expect, that prints out the Hello, world! message. Nothing special going on here, is there? Well, actually, there is. This is a closure. Did you notice?

What's special is that the subroutine reference we created refers to a lexical variable called $message. The lexical is defined in make_hello_printer, so by rights, it shouldn't be visible outside of make_hello_printer, right? We call make_hello_printer, $message gets created, we return the subroutine reference, and then $message goes away, out of scope.

Except it doesn't. When we call our subroutine reference, outside of make_hello_printer, it can still see and retrieve the correct value of $message. The subroutine reference forms a closure, "enclosing" the lexical variables it refers to.
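To convince yourself that the closure holds on to the variable itself rather than a snapshot of its value, here's a small sketch. The make_changed_printer name is made up for this illustration; it changes $message after the anonymous subroutine has already been created.

```perl
use strict;
use warnings;

# Hypothetical variation on make_hello_printer: the lexical is
# modified after the anonymous subroutine has been created.
sub make_changed_printer {
    my $message = "Hello, world!";
    my $printer = sub { return $message };   # closes over $message itself
    $message = "Goodbye, world!";            # same variable, new value
    return $printer;
}

my $printer = make_changed_printer();
my $text    = $printer->();
print "$text\n";   # prints "Goodbye, world!"
```

The closure sees the later assignment because it shares the one $message that make_changed_printer created, not a copy made when the sub expression was evaluated.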

Here's the canonical example of closures, that you'll find in practically every Perl book:

sub make_counter {
    my $start = shift;
    return sub { $start++ }
}
my $from_ten = make_counter(10);
my $from_three = make_counter(3);
print $from_ten->();       # 10
print $from_ten->();       # 11
print $from_three->();     # 3
print $from_ten->();       # 12
print $from_three->();     # 4

We've created two "counter" subroutines, which have completely independent values. This happens because each time we call make_counter, Perl creates a new lexical for $start, which gets wrapped up in the closure we return. So $from_ten encloses one $start which is initialized to 10, and $from_three encloses a totally different $start, which starts at 3.

It's because of this property that Barrie Slaymaker calls closures "inside-out objects": objects are data that have some subroutines attached to them, and closures are subroutines that have some data attached to them.
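One way to see the "inside-out object" idea is to hand back several closures that share a single lexical. This is only a sketch; make_counter_object and the increment, peek and set names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical constructor: returns a hash of closures that all
# share one private $count.
sub make_counter_object {
    my $count = shift;
    my %methods = (
        increment => sub { return $count++ },
        peek      => sub { return $count },
        set       => sub { $count = shift; return },
    );
    return \%methods;
}

my $counter = make_counter_object(10);
my @seen    = ($counter->{increment}->(), $counter->{increment}->());   # 10, 11
my $peek    = $counter->{peek}->();                                     # 12
$counter->{set}->(3);
my $after_set = $counter->{increment}->();                              # 3
print "@seen $peek $after_set\n";   # prints "10 11 12 3"
```

All three subroutines were created during the same call to make_counter_object, so they enclose the same $count; a second call would produce a separate set with its own counter.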

Now, I said that's used in practically every Perl book, because authors try to put off discussing closures until there's little time left and they run out of imagination. (Well, at least that's my excuse ...) However, it's not an entirely practical example, to say the least. So let's try and find a better one.

This example is a bit more complex, but it demonstrates more clearly one extremely useful feature of closures: They can be used to bridge the gap between event-driven programs, which use callbacks extensively, and ordinary procedural code. I recently had to convert a bunch of XML files into an SQL database. Each file constituted a training course, so I wanted to build a data structure that contained the filename plus some of the details I'd parsed from the XML. Here's what I ended up with:

use XML::Twig;
my %courses;
for (<??.xml>) {
    my $name = $_; $name =~ s/\.xml$//;   # strip the literal ".xml" suffix
    my $t= XML::Twig->new( 
        TwigHandlers => {
            need => sub { 
                push @{$courses{$name}{prereqs}}, $_->{'att'}->{course};
            },
            # ...
        }
    );
    $t->parsefile($_);
}

What's going on here? XML::Twig is a handy module that can be used to create an XML parser -- these parsers will call "TwigHandlers" when they meet various tags. We go through all the two-letter XML files in the current directory, and create a parser to parse the file. When we see something like this:

<need course="AA"/>

our need handler is called to store the fact that the current course has a prerequisite of the course coded "AA." ($_->{'att'}->{...} is XML::Twig-speak for "retrieve the value of the attribute called ...")

And that need handler is a closure -- it wraps up the name of the current file we're parsing, $name, so that it can be referred to whenever XML::Twig decides to use it.
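If you don't have XML::Twig to hand, the same pattern can be sketched with nothing but core Perl. The @handlers list below is a stand-in for whatever event-driven library ends up calling you back, and the file names and course codes are made up for the illustration.

```perl
use strict;
use warnings;

my %prereqs;    # course name => list of prerequisite codes
my @handlers;   # callbacks registered for a later "parse"

for my $file ('AA.xml', 'BB.xml') {
    (my $name = $file) =~ s/\.xml$//;
    # Each iteration gets a fresh $name, and each handler encloses its own.
    push @handlers, sub {
        my $code = shift;
        push @{ $prereqs{$name} }, $code;
    };
}

# Later, the event-driven side calls the handlers; it knows nothing of $name.
$handlers[0]->('ZZ');
$handlers[1]->('AA');
$handlers[1]->('ZZ');

print "$_: @{ $prereqs{$_} }\n" for sort keys %prereqs;
# prints "AA: ZZ" and then "BB: AA ZZ"
```

The caller only ever passes the prerequisite code; the course name travels along inside each closure.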

There are many other things you can do with closures -- Tom Christiansen once recommended using them for "data hiding" in object-oriented code, since they rely on lexical variables that nothing outside of the closure can see. In fact, some of the most esoteric and advanced applications of Perl make heavy use of closures.
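As a rough idea of what closure-based data hiding can look like, here's a sketch in which the object is a blessed subroutine reference and the real data lives in a lexical hash that only that subroutine can reach. The Course class and its name accessor are hypothetical; this shows the general technique, not any particular author's code.

```perl
use strict;
use warnings;

package Course;   # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    my %data = (name => $args{name});   # visible only to the closure below
    my $self = sub {
        my ($field, @value) = @_;
        die "No such field: $field\n" unless exists $data{$field};
        $data{$field} = $value[0] if @value;
        return $data{$field};
    };
    return bless $self, $class;
}

# Accessor: all access goes through the closure.
sub name { my $self = shift; return $self->('name', @_) }

package main;

my $course = Course->new(name => 'AA');
my $old    = $course->name;
$course->name('BB');
print $old, ' ', $course->name, "\n";   # prints "AA BB"

my $ok = eval { $course->('secret'); 1 };   # unknown fields are rejected
print "rejected\n" unless $ok;              # prints "rejected"
```

Because %data is a lexical inside new, code outside the class has no hash to poke at; the only way in is through the subroutine.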

But as we've seen, some of the most useful uses of closures can happen without you noticing them at all ...

Closures are often used for less esoteric purposes. For example, when you want to pass a bit of code into a function:

my $line;
timeout( 30, sub { $line = <STDIN> } );

If the code to execute had been passed in as a string, '$line = <STDIN>', there would have been no way for the hypothetical timeout() function to access the lexical variable $line back in its caller's scope.

Perl creates closures only over lexical (my) variables, not over global or localized (tagged with local) variables.
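The difference between lexical and localized variables is easy to check for yourself. In this sketch (the subroutine and variable names are invented for the example), one anonymous subroutine refers to a package variable that was temporarily changed with local, and the other refers to a my variable.

```perl
use strict;
use warnings;

our $greeting = "global value";

sub make_global_reader {
    local $greeting = "localized value";   # restored when this sub returns
    return sub { return $greeting };       # refers to the package variable by name
}

sub make_lexical_reader {
    my $greeting = "lexical value";
    return sub { return $greeting };       # encloses this particular lexical
}

my $global_reader  = make_global_reader();
my $lexical_reader = make_lexical_reader();

print $global_reader->(), "\n";    # prints "global value"
print $lexical_reader->(), "\n";   # prints "lexical value"
```

The localized value is gone by the time the first subroutine runs, because local only changes the global temporarily; the lexical one survives because the closure keeps it alive.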

To summarize, a closure is the special case of an anonymous subroutine holding onto data that used to belong to its scope at the time of its creation.