Day 7 – Lexical variables

Programming is tough going. Well, stringing together lines of code isn’t that difficult, and prototyping an idea can be pleasant and easy. But as the size of the program scales up, and the maintenance time lengthens, things tend to get tricky. Eventually, if we’re unlucky, we’re overcome by the complexity — not necessarily the complexity of the problem we started out solving, but the complexity of the program itself. We get gray hairs from debugging, or we’re simply at a loss for how to extend the program to do what we want.

So we turn to the history of programming, seeking advice as to how to combat complexity. And the answer sits there, clear as day: limit extent. If you’re architecting programs with hundreds or thousands of types of components, you’ll want those components to interact through only a very small set of surfaces… otherwise, you’ll simply lose control. Otherwise, combinatorics will defeat you.

We see this principle at every single level of programming, simply because it’s such a primary thing: Separation of Concerns, Do One Thing And Do It Well, BCNF, monads, routines, classes, roles, modules, packages. All of them urge us or guide us to limit the extent of things, so we don’t lose to combinatorics. Perhaps the simplest example of this is the lexical variable.

{
    my $var;
    # $var is visible in here
}
# $var is not visible out here

Yeah — that is today’s “cool feature”. :-) Here’s what makes it interesting:

Perl got this one wrong from version 1 onwards. The default variable scope in Perl 5 is the “package variable”, a kind of global variable. Define something inside a block; still see it outside.

$ perl -v
This is perl 5, version 12, subversion 1 (v5.12.1)

$ perl -E '{ $var = 42 }; say $var'
42

$ perl -wE '{ my $var = 42 }; say $var'
Name "main::var" used only once: possible typo at -e line 1.
Use of uninitialized value $var in say at -e line 1.

In Perl 6, the lexical variable is the default. You won’t get past compilation if you try to pull off the above trick in Rakudo:

$ perl6 -e '{ $var = 42 }; say $var'    # gotta declare with 'my'
Symbol '$var' not predeclared in <anonymous>

$ perl6 -e '{ my $var = 42 }; say $var' # still won't work! not visible outside
Symbol '$var' not predeclared in <anonymous>

You might say “okay, this is great for catching a typo now and then”. Yes, sure, but the big advantage is that this keeps you honest about variable scoping. And that helps you manage complexity.

Now let me just rush to the defense of Perl 5 by saying a variety of things at the same time. Perl 5 does try to steer you in the right way by having you use strict and use warnings by reflex; Perl 5 is bound by its promise of backwards compatibility, which is very good and noble; Perl 1 certainly was not about writing large applications and managing the resulting complexity; and global variables do make a lot of sense in a one-line script.

Perl 6 is designed to help you start small, and then to help you put in more strictures and architectural underpinnings as your application scales up. In the case of variables, this means that in scripts and modules, lexical variables (à la strict) are the default, but in those -e one-liners the default is package variables. (Rakudo doesn’t implement this distinction yet, and one has to use lexical variables even at the command line. After it’s been implemented in Rakudo, I’d expect the perl6 invocations above to get past compilation, and produce outputs similar to the perl invocations.)

Moving along. At this point you might consider all that’s worth saying about lexical variables to have been said — not so. You see, the result of designing things right is that surprising and awesome bonuses keep falling out. Consider this subroutine:

sub counter($start_value) {
    my $count = $start_value;
    return { $count++ };
}

What’s returned at return { $count++ } is a block of code. So each time we call counter, what we get back is a little disconnected piece of code that can be called, as many times as we want.

Now look what happens when we create two such pieces of code and play around with them:

my $c1 = counter(5);
say $c1();           # 5
say $c1();           # 6

my $c2 = counter(42);
say $c2();           # 42
say $c1();           # 7
say $c2();           # 43

See that? The vital observation here is that $c1 and $c2 are acting entirely independently of each other. Both keep their own state, in the form of the $count variable. Although this might look like the same variable to us, to the two invocations of counter it looks like two different storage locations, because each time we enter a block of code, we start out afresh. The little block of code returned from some run of counter retains a relation to that particular storage location. It “closes over” the storage location, protecting it from the grasp of the Grim Garbage Collector; thus this kind of block is called a closure.
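
The “start out afresh” part isn’t special to subroutines. Here’s a small sketch of the same effect with a loop body, which is just another block that gets entered several times. The names @closures and $seen are made up for this illustration.

```perl6
my @closures;

for 1..3 -> $n {
    my $seen = $n * 10;            # a fresh $seen each time the block is entered
    @closures.push({ $seen++ });   # each closure holds on to its own $seen
}

say @closures[0]();   # 10
say @closures[0]();   # 11
say @closures[1]();   # 20
say @closures[2]();   # 30
say @closures[0]();   # 12
```

Three trips through the loop body give three separate storage locations, and bumping one of them leaves the other two alone.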

If closures look a lot like lightweight objects to you, congratulations; they are. The principle behind closures, regulating the way values are accessed, is the same as the principle behind encapsulation and information hiding in OO. It’s all about limiting the extent of things, so that they can wreak as little havoc as possible when things turn ugly.
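
To make the “lightweight object” idea a bit more concrete, here’s a minimal sketch in which two closures share one lexical variable, much like two methods sharing a private attribute. Everything in it (make_account, $balance, the deposit and balance keys) is invented for this example, and it’s just one way of arranging things.

```perl6
sub make_account($opening_balance) {
    my $balance = $opening_balance;   # reachable only through the two subs below

    my %account =
        deposit => sub ($amount) { $balance += $amount },
        balance => sub ()        { $balance };
    return %account;
}

my %savings = make_account(100);
my %holiday = make_account(10);

%savings<deposit>(50);
say %savings<balance>();   # 150
say %holiday<balance>();   # 10
```

Nothing outside make_account can get at $balance except by going through deposit and balance, which is the information hiding part. Each call to make_account produces its own $balance, which is the “separate instances” part.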

You can do nifty things like closures with lexical variables. You can’t with package variables. Lexical variables are cooler. QED.
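
For the skeptical, here’s a sketch of what goes wrong when the counter from before is rewritten to use a package variable, declared with our. The name shared_counter is made up for this comparison.

```perl6
our $count;    # a package variable: one storage location, full stop

sub shared_counter($start_value) {
    $count = $start_value;    # every call overwrites the same $count
    return { $count++ };
}

my $s1 = shared_counter(5);
say $s1();           # 5

my $s2 = shared_counter(42);
say $s1();           # 42
say $s2();           # 43
```

Creating the second counter resets the first one, and after that the two of them step on each other’s toes, since both blocks refer to the one and only $count.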