Author Archive

Day 25 – Merry Christmas!

December 25, 2011

The kind elves who spend the rest of the year working in Santa’s shop to bring you more of Perl 6 each year would like to wish you a very warm and fuzzy Christmas vacation. December is always a special time for us, because we get to interact with you all through the interface of the advent calendar. We think that’s wonderful.

Be sure to check out this year’s Perl 6 coding contest, where you can win €100 worth of books!

Merry Christmas!

Day 22 – Operator overloading, revisited

December 22, 2011

Today’s post is a follow-up. Exactly two years ago, Matthew Walton wrote on this blog about overloading operators:

You can exercise further control over the operator’s parsing by adding traits to the definition, such as tighter, equiv and looser, which let you specify the operator’s precedence in relationship to operators which have already been defined. Unfortunately, at the time of writing this is not supported in Rakudo so we will not consider it further today.

Rakudo is still lagging in precedence support (though at this point there are no blockers that I know about to simply going ahead and implementing it). But there’s a new implementation on the block, one that didn’t exist two years ago: Niecza.

Let’s try out operator precedence in Niecza.

$ niecza -e 'sub infix:<mean>($a, $b) { ($a + $b) / 2 }; say 10 mean 4 * 5'
15

By default, an operator gets the same precedence as infix:<+>. This is per spec. (How do we know it got the same precedence as infix:<+> above? Well, we know it’s not tighter than multiplication, otherwise we’d have gotten the result 35.)

That’s all well and good, but what if we want to make our mean little operator evaluate tighter than multiplication? Nothing could be simpler:

$ niecza -e 'sub infix:<mean>($a, $b) is tighter(&infix:<*>) { ($a + $b) / 2 }; say 10 mean 4 * 5'
35

See what we did there? is tighter is a trait that we apply to the operator definition. The trait accepts an argument, in this case the language-provided multiplication operator. It all reads quite well, too: “infix mean is tighter [than] infix multiplication”.

Note the explicit use of intuitive naming for the precedence levels. Rather than the inherently confusing terms “higher/lower”, Perl 6 talks about “tighter/looser”, as in “multiplication binds tighter than addition”. Easier to think about precedence that way.
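To see the default, the tighter and the looser case side by side, here is a small sketch. The operator names avg, tavg and lavg are made up for the occasion; only the is tighter and is looser traits come from the language. I'm assuming a current Rakudo here, where these traits are implemented.

```raku
# Hypothetical operator names; all three compute the same average.
sub infix:<avg>($a, $b)                         { ($a + $b) / 2 }   # default: same level as +
sub infix:<tavg>($a, $b) is tighter(&infix:<*>) { ($a + $b) / 2 }   # binds tighter than *
sub infix:<lavg>($a, $b) is looser(&infix:<+>)  { ($a + $b) / 2 }   # binds looser than +

say 10 avg  4 * 5;   # 10 avg (4 * 5)   = 15
say 10 tavg 4 * 5;   # (10 tavg 4) * 5  = 35
say 10 avg  4 + 6;   # (10 avg 4) + 6   = 13
say 10 lavg 4 + 6;   # 10 lavg (4 + 6)  = 10
```

The body of the operator never changes; only the trait decides how the surrounding expression gets grouped.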

Internally, the precedence levels are stored not as numbers but as strings. Each original precedence level gets a letter of the alphabet and an equals sign (=). Subsequent added precedence levels append either a less-than sign (<) or a greater-than sign (>) to an existing precedence level representation. Using this system, we never “run out” of levels between existing ones (as we could if we were using integers, for example), and tighter levels always come lexicographically before looser ones. Language designers, take heed.

A few last passing notes about operators in Perl 6, while we’re on the subject:

  • In Perl 6, operators are subroutines. They just happen to have funny names, like prefix:<-> or postfix:<++> or infix:<?? !!>. This actually takes a lot of the hand-wavey magic out of defining them. The traits that we’ve seen applied to operators are really subroutine traits… these just happen to be relevant to operator definitions.
  • As a consequence, just like subroutines, operators are lexically scoped by default. Lexical scoping is something we like in Perl 6; it keeps popping up in unexpected places as a solid, sound design principle in the language. In practice, this means that if you declare an operator within a given scope, the operator will be visible and usable within that scope. You’re modifying the parser, but you’re doing it locally, within some block or other. (Or within the whole file, of course.)
  • Likewise, if you want to export your operators, you just use the same exporting mechanism used with subroutines. See how this unification between operators and subroutines keeps making sense? (In Perl 6-land, we say “operators are just funny-looking subroutines”.)
  • Multiple dispatch in operators works just as with ordinary subroutines. Great if you want to dispatch your operators on different types. As with all other routines in the core library in Perl 6, all operators are declared multi to be able to co-exist peacefully with module extensions to the language.
  • Operators can be macros, too. This is not an exception to the rule that operators are subroutines, because in Perl 6, macros are subroutines. In other words, if you want some syntactic sugar to execute at parse time (which is what a macro does), you can dress it up either as a normal-looking sub, or as an operator.
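The multiple dispatch point in the list above is perhaps easiest to appreciate with a sketch. Vec2 is a class I made up for illustration; the only real ingredient is that infix:<+> is a multi, so we can add a candidate without disturbing the existing ones.

```raku
# Vec2 is a hypothetical class, used only to have something to dispatch on.
class Vec2 {
    has $.x;
    has $.y;
}

# A new candidate for the built-in + operator, chosen when both operands are Vec2.
multi sub infix:<+>(Vec2 $a, Vec2 $b) {
    Vec2.new(x => $a.x + $b.x, y => $a.y + $b.y)
}

my $sum = Vec2.new(x => 1, y => 2) + Vec2.new(x => 3, y => 4);
say "$sum.x(), $sum.y()";   # 4, 6
say 1 + 2;                  # 3 (ordinary addition is untouched)
```

Since the new candidate is declared with sub, it is lexically scoped like any other subroutine, and only code in the same scope sees it.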

That’s it for today. Now, go forth and multiply, or even define your own operator that’s either tighter or looser than multiplication.

Day 19 – Abstraction and why it’s good

December 19, 2011

Some people are a bit afraid of the word “abstract”, because they’ve heard math teachers say it, and also, abstract art freaks them out. But abstraction is a fine and useful thing, and not so complicated. As programmers, we use it every day in different forms. The term is from Latin and means “to withdraw from” or “to pull away from”, and what we’re pulling away from is the specifics so we can focus on the big picture. That’s often mighty useful.

Here are a few examples:

Variables

If your computer only knew how to handle one specific number at a time, it’d be an abacus. Pretty early on, the programmer guild figured out it made a lot of sense to talk about the memory address of a value, and let that address contain whatever it pleased. They abstracted away from the value, and thus made the program more general.

As time passed, addresses were replaced by names, mostly as a convenience. Some people found it a good idea to give their variables descriptive names, as opposed to things like $grbldf.

Subroutines

Code re-use. We hear so much about it in the OO circles, but it holds equally well for subroutines. You write your code once, and then call it from all over the place. Convenient.

But, as I point out in an announcement pretending to be a computer science professor from an alternate timeline, there’s also the secondary benefit of giving your chunk of code a good mnemonic name, because that act in a sense improves the programming language itself. You’re giving yourself new verbs to program with.

This is especially true in Perl 6, because subroutines are lexically scoped (as opposed to Perl 5) and thus you can easily put a subroutine inside another routine. I use it when writing a Connect 4 game, for example.

Packages and modules

In Perl, packages don’t do much. They pull things together and keep them there. In a sense, what they abstract away is a set of subroutines from the rest of the world.

Perl 5 delivers its whole OO functionality through packages and a bit of dispatch magic on the side. It’s quite a feat, actually, but sometimes a bit too minimal. Moose fixes many of those early issues by providing a full-featured object system. Perl 6 lets packages go back to just being collections of subroutines, but provides a few dedicated abstractions for OO, a kind of built-in Moose. Which brings us to…

Classes

Object-orientation means a lot of different things to different people. To some, it’s the notion of an object, a piece of memory with a given set of operations and a given set of states. In a sense, we’re again in the business of extending the language like we did with subroutines. But this time we’re building new nouns rather than new verbs. One moment the language doesn’t know about a Customer object type; the next, it does.

To others, object-orientation means keeping the operations public and the states private. They refer to this division as encapsulation, because the object is like a little capsule, protecting your data from the big bad world. This is also a kind of abstraction, built on the idea that the rest of the world shouldn’t need to care about the internals of your objects, because some day you may want to refactor them. Don’t talk to the brain, talk to the hand; do your thing through the published operations of the object.

Roles

But class-based OO with inheritance will take you only so far. In the past 10 years or so, people have become increasingly aware of the limitations of inheritance-based class hierarchies. Often there are concerns which cut completely across a conventional inheritance hierarchy.

This is where roles come in; they allow you to apply behaviors in little cute packages here and there, without being tied up by a tree-like structure. In a post about roles I explore how this helps write better programs. But really the best example nowadays is probably the Rakudo compiler and its extensive use of roles; jnthn has been writing about that in an earlier advent post.

If classes abstract away complete sets of behaviors, roles abstract away partial sets of behaviors, or responsibilities.

You can even do so at runtime, using mixins, which are roles that you add to an object as the program executes. Objects changing type during runtime sounds magic almost to the point of recklessness; but it’s all done in a very straightforward manner using anonymous subclasses.
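A minimal sketch of both ways to apply a role, assuming a made-up Greeter role and two throwaway classes: composition into a class with does, and a runtime mixin into a single object with but.

```raku
# Greeter, Dog and Cat are hypothetical names for illustration.
role Greeter {
    method greet { "Hello!" }
}

class Dog does Greeter { }   # role composed into the class
class Cat { }                # knows nothing about Greeter

say Dog.new.greet;           # Hello!

my $cat = Cat.new but Greeter;   # mixed in at runtime, for this one object
say $cat.greet;              # Hello!
say $cat ~~ Cat;             # True  (it is still a Cat)
say Cat.new ~~ Greeter;      # False (other Cats are unaffected)
```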

Metaobjects

Sometimes you want extra control over how the object system itself works. The object system in Perl 6, through one of those neat bite-your-own-tail tricks, is written using itself, and is completely modifiable in terms of itself. Basically, a bunch of the complexity has been removed by not having a separate hidden, unreachable system to handle the intricacies of the object system. Instead, there’s a visible API for interacting with the object system.

And, when we feel like it, we can invent new and exotic varieties of object systems. Or just tweak the existing one to our fancy.
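The visible API is reached with the .^ method call syntax. Here is a small sketch of asking the metaobject a few questions about a class; Animal and Dog are made-up classes, and I'm only using introspection calls, nothing that changes the object system.

```raku
# Animal and Dog are hypothetical classes for illustration.
class Animal { method speak { "..." } }
class Dog is Animal { method speak { "Woof" } }

say Dog.^name;                # Dog
say Dog.^mro.map(*.^name);    # (Dog Animal Any Mu)
say so Dog.^can('speak');     # True
say so Dog.^can('fly');       # False
```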

Macros

On the way up the abstraction ladder, we’ve abstracted away bigger and bigger chunks of code: values, code, routines, behaviors, responsibilities and object systems. Now we reach the top, and there we find macros. Ah, macros, these magical, inscrutable beasts. What do macros abstract away?

Code.

Well, that’s rather disappointing, isn’t it? Didn’t we already abstract away code with subroutines? Yes, we did. But it turns out there’s so much code in a program that sometimes, it needs to be abstracted away on several levels!

Subroutines abstract away code that can then run in several different ways. You call the routine with other values, and it behaves differently. Macros abstract away code that can then be compiled in several different ways. You write a macro with other values, and it gets compiled into different code, which can then in turn run differently.

Essentially, macros give you a hook into the compiler to help you shape and guide what code it emits during the compilation itself. In a sense, you’re abstracting certain parts of the compilation process, the parsing and the syntax manipulation and the code generation. Again, you’re shaping the language — but this time not inventing new nouns or verbs, but whole ways of expressing yourself.

Macros come in two broad types: textual (a la C) and syntax tree (a la Lisp). The textual ones have a number of known issues stemming from the fact that they’re essentially a big imprecise search-and-replace on your code. The syntax tree ones are hailed as the best thing about Lisp, because they allow Lisp programs to grow and adapt to the needs of the programmer by letting you invent new ways of expressing yourself.

Perl 6, being Perl 6, specifies both textual macros and syntax tree macros. I’m currently working on a grant to bring syntax macros to Rakudo Perl 6. There’s a branch where I’m hammering out the syntax and semantics of macros. It’s fun work, and made much more feasible by the past year’s improvements to Rakudo itself.

In conclusion

As an application grows and becomes more complex, it needs more rungs of the abstraction ladder to rest on. It needs more levels of abstraction with which to draw away the specifics and focus on the generalities.

Perl 6 is a new Perl, distinct from Perl 5. Its most distinguishing trait is perhaps that it has more rungs on the abstraction ladder to help you write code that’s more to the point. I like that.

Merry Christmas!

December 25, 2010

The people who brought you this year’s Advent Calendar had a blast doing so — it’s exciting to get to present new and old Perl 6 features to new and old readers. Thanks everyone! And Merry Christmas!

Day 1 – Reaching the Stars
Day 2 – Interacting with the command line with MAIN subs
Day 3 – File operations
Day 4 – The Sequence Operators
Day 5 – Why Perl syntax does what you want
Day 6 – The X and Z metaoperators
Day 7 – Lexical variables
Day 8 – Different Names of Different Things
Day 9 – The module ecosystem
Day 10 – Feed operators
Day 11 – Markov Sequence
Day 12 – Smart matching
Day 13 – The Perl6 Community
Day 14 – nextsame and its cousins
Day 15 – Calling native libraries from Perl 6
Day 16: Time in Perl6
Day 17 – Rosetta Code
Day 18 – ABC Module
Day 19 – False truth
Day 20 – The Perl 6 Synopses
Day 21 – transliteration and beyond
Day 22 – The Meta-Object Protocol
Day 23 – It’s some .sort of wonderful.
Day 24 – Yule the Ancient Troll-tide Carol
Day 25 – Merry Christmas!

Day 21 – transliteration and beyond

December 21, 2010

Transliteration sounds like it has Latin roots and means a changing of letters. And that’s what the Str.trans method does.

say "GATTACA".trans( "TCAG" => "0123" );  # prints "3200212\n"

Perl 5 people (and Unix shell folk) immediately recognize this as tr/TCAG/0123/, but here’s a quick explanation for the rest of you out there: for every instance of T we find in the string, we replace it by 0, we replace every instance of C by 1, and so on. The two strings TCAG and 0123 supply alphabets to be translated from and to, respectively.

This can be used for any number of time-saving ends. Here, for example, is a simple subroutine that “encrypts” a text with ROT-13:

sub rot13($text) { $text.trans( "A..Za..z" => "N..ZA..Mn..za..m" ) }

When .trans sees those .. ranges, it expands them internally (so "n..z" really means "nopqrstuvwxyz"). Thus, the ultimate effect of the rot13 sub is to map certain parts of the ASCII alphabet to certain other parts.
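A quick way to convince yourself that the ranges expand as described is to run the sub on some text, and then run it again on the result. This is just a usage sketch of the rot13 sub from above; the sample string is mine.

```raku
sub rot13($text) { $text.trans( "A..Za..z" => "N..ZA..Mn..za..m" ) }

say rot13("Hello, World!");          # Uryyb, Jbeyq!
say rot13(rot13("Hello, World!"));   # Hello, World!
```

Punctuation and spaces aren't mentioned in either alphabet, so they pass through unchanged.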

In Perl 5, the two dots (..) are a dash (-), but we’ve tried in Perl 6 to have those two dots stand for the concept “range”; in the main language, in regexes, and here in transliterations.

Note also that the .trans method is non-mutating; it doesn’t change $text, but just returns a new value. This is also a general theme in Perl 6; in the core language we prefer to offer the side-effect-free variants of methods. You can easily get the mutating behavior by doing .=trans:

$kabbala.=trans("A..Ia..i" => "1..91..9");

(And that goes not only for .trans, but for all methods. It’s a silent encouragement to you as a programmer to write your libraries with non-mutating methods, making the world a happier, more composable place.)

But Perl 6 wouldn’t be Perl 6 if .trans didn’t also contain a hidden weapon which takes the Perl 5 tr/// and just completely blows it out of the water. Here’s what it also does:

Let’s say we want to escape some HTML, that is, replace things according to this table:

    & => &amp;
    < => &lt;
    > => &gt;

(By the way, I hope if you ever need to escape HTML, that there will be a library routine for you ready that does it for you. But the general principle is important; and in the few instances when you do need to do something like this, it’s good to know the tools are there, built into the language.)

This is nothing that a few well-placed regexes can’t handle. So what’s the big deal? Well, a naive in-place per-match replacement of the above three characters might be unlucky enough to get stuck in an infinite loop. (& => &amp; => &amp;amp; => ...) So you need to do various sordid trickery to avoid that.

But that’s not even the fundamental problem, which is that you end up stitching together pieces of strings, rather than thinking of the problem in a more high-level manner. Generally, we wouldn’t want a solution that depends on the order of the substitutions. That would also affect something like this:

    foo         => bar
    foolishness => folly

If the former substitution is attempted first each time, there won’t ever be an occasion to perform the latter one, which is probably not what was intended. Generally, we want to try and match the longer substrings before shorter ones.

So, it seems we want a longest-token substitution matcher that avoids infinite cycles due to accidental re-substitution.

That’s what .trans in Perl 6 provides. That’s its hidden weapon: sending in a pair of arrays rather than strings. For the HTML escaping, all we need to do is this:

my $escaped = $html.trans(
    [ '&',     '<',    '>'    ] =>
    [ '&amp;', '&lt;', '&gt;' ]
);

…and the non-trivial problems of replacing things in the right order and avoiding cyclical replacement are taken care of for us.
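Here is a sketch that exercises both properties at once, with sample strings of my own choosing. The first call checks that an already-inserted &amp; isn't escaped again; the second checks that the longer key wins even though the shorter one is listed first.

```raku
my $html = '<b>Tom & Jerry</b>';
my $escaped = $html.trans(
    [ '&',     '<',    '>'    ] =>
    [ '&amp;', '&lt;', '&gt;' ]
);
say $escaped;   # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;

# The shorter key comes first in the list, but the longest match is tried first.
my $swapped = "foolishness, foo".trans(
    [ 'foo', 'foolishness' ] =>
    [ 'bar', 'folly'       ]
);
say $swapped;   # folly, bar
```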

Day 19 – False truth

December 19, 2010

Today’s advent gift teaches us how to use mixins for nefarious and confusing purposes. In fact, this feature will probably appear partly insane, but it turns out to be quite useful. Enter the but operator:

my $value = 42 but role { method Bool  { False } };
say $value;    # 42
say ?$value;   # False

So you see, we overload the .Bool method on our $value. It doesn’t affect other integers in the program, not even other 42s in the program, just this one. Normally, for Ints, the .Bool method (and therefore the prefix:<?> operator) returns whether the number is non-zero, but here we make it always return False.

In fact, there’s a shorter way to write this for enum values, of which False is one.

my $value = 42 but False;

Since False is a value of the Bool type, it will automatically overload the .Bool method, which by convention is a kind of conversion method in Perl 6. Values of other types will of course overload their corresponding conversion method.
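A short sketch of the shorthand in both directions. The variable names are mine; the point is that only the boolean view of the value changes, while arithmetic still sees the original number.

```raku
my $answer = 42 but False;
say $answer;        # 42
say $answer + 1;    # 43
say ?$answer;       # False

my $zero = 0 but True;
say $zero + 1;      # 1
say ?$zero;         # True
```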

Here’s the part that turns out to be quite useful: in Perl 5 when you put a &system call in an if statement wanting to check for success, you have to remember to negate the result of the call, since in bash only zero means success:

if ( system($cmd) == 0 ) {  # alternatively, !system($cmd) 
    # ...
}

But in Perl 6, the corresponding &run routine returns the above kind of overloaded integers; these boolify to True if and only if the return value is zero, which is the opposite of the default Int behavior, and just what we need.

if run($cmd) {  # we don't negate
    # ...
}

Oh, and here’s the part that appears insane. :-) We can overload the .Bool method of boolean values!

my $value = True but False;
say $value;    # True
say ?$value;   # False

Yes, Perl 6 allows you to shoot yourself in the foot in this particular way. Though I don’t see why anyone would want to do this except for obfuscatory purposes, I’m kinda glad Perl 6 has the presence of mind to keep track of the subtleties of that type of overloading. I know I almost don’t. :-)

Day 14 – nextsame and its cousins

December 14, 2010

Maybe you’re familiar with the way the keyword super in Java allows you to delegate to the method (or constructor) of a base class. Perl 6 has something similar… but in a world with multiple inheritance and mixins it makes less sense to call it super. So it’s called nextsame. Here’s an example of its use:

class A {
    method sing {
        say "life is but a dream.";
    }
}

class B is A {
    method sing {
        say ("merrily," xx 4).join(" ");
        nextsame;
    }
}

class C is B {
    method sing {
        say "row, row, row your boat,";
        say "gently down the stream.";
        nextsame;
    }
}

Now, when we call C.new.sing, our class hierarchy will output this:

row, row, row your boat,
gently down the stream.
merrily, merrily, merrily, merrily,
life is but a dream.

You’ll note how the call finds its way from C.sing via B.sing over to A.sing. Those transitions are (of course) mediated by the nextsame calls. You’ll note the similarity to, for example, Java’s super.

But calling along the inheritance chain is not the only place where nextsame proves useful. Here’s an example not involving object orientation:

sub bray {
    say "EE-I-EE-I-OO.";
}

# Oh right, forgot to add the first line of the song...
&bray.wrap( {
    say "Old MacDonald had a farm,";
    nextsame;
} );

bray(); # Old MacDonald had a farm,
        # EE-I-EE-I-OO.

So that’s another reason nextsame is not called super: it’s not necessarily related to the base class, because there might not be a base class. Instead, there’s some more general phenomenon involved. What might that be?

Every time we do a call to something, there’s a part of the language runtime making sure that the call ends up in the right routine. Such a part is called a dispatcher. A dispatcher makes sure that the following multi call ends up in the appropriate routine:

multi foo(    $x) { say "Any argument" }
multi foo(Int $x) { say "Int argument" }

foo(42) # Int argument

(And a nextsame in the second multi foo would re-dispatch to the first. But that doesn’t work in Rakudo yet.)

Dispatchers are everywhere in Perl 6. They’re involved in method calls, so that a method can defer along the inheritance chain, as we did in the beginning of the post. They’re in wrapped routines, so that the code doing the wrapping can call into the code being wrapped. And they participate in multi dispatch, so that multi variants can defer to each other. It’s all the same principle, but in different guises.

And nextsame is just a way to talk directly to your friendly neighborhood dispatcher. By the way, the keyword is called nextsame because it instructs the dispatcher to defer to the next candidate with the same signature. There are variants, as you’ll see below.
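For the multi dispatch case, here is a sketch of what the re-dispatch looks like, assuming an implementation where nextsame works inside multis (as far as I know, current Rakudo releases qualify). The sub name describe and the @calls array are only there so we can see the order of events.

```raku
my @calls;   # records which candidates ran, in order

multi describe(    $x) { @calls.push("Any argument") }
multi describe(Int $x) {
    @calls.push("Int argument");
    nextsame;            # hand over to the next-best candidate
}

describe(42);
say @calls;   # [Int argument Any argument]
```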

You can use nextsame in mixins, too:

class A {
    method foo { "OH HAI" }
}

role LogFoo {
    method foo {
        note ".foo was called";
        nextsame;
    }
}

my $logged_A = A.new but LogFoo;

say $logged_A.foo; # .foo was called
                   # OH HAI

I like this way to use mixins to inject behavior. I once wrote a post about it, and jnthn has written a Perl 6 module that exploits it.

Though pretty cool, this use of nextsame isn’t really anything new; in fact it’s just another example of the defer-along-the-OO-callchain dispatcher. That’s because mixing in the role LogFoo with but causes an anonymous subclass to be created, one that also does LogFoo. So role mixin nextsame boils down to just inheritance nextsame. (But we don’t need to actually grok that to use it, and it still feels slightly magical and very nice to use.)

In summary, nextsame works in a lot of places you’d expect it to work, and it works the way you expect it to. It defers to the next thing.

Oh, and nextsame has three closely related cousin keywords:

nextsame             Defer with the same arguments, don't return
callsame             Defer with the same arguments, then  return

nextwith($p1, $p2, ...) Defer with these arguments, don't return
callwith($p1, $p2, ...) Defer with these arguments, then  return

Naturally, the other three can be used in the same situations nextsame can.
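The difference between the "don't return" and "then return" flavors is easiest to see in a sketch. The classes below are made up; WithTax uses callsame and keeps working with the result, while Doubled uses nextwith and never gets control back.

```raku
# Base, WithTax and Doubled are hypothetical classes for illustration.
class Base {
    method price($amount) { $amount }
}

class WithTax is Base {
    # callsame: defer with the same arguments, then continue with the result
    method price($amount) { callsame() * 1.25 }
}

class Doubled is Base {
    # nextwith: defer with new arguments; control does not come back here
    method price($amount) {
        nextwith($amount * 2);
        die "never reached";
    }
}

say WithTax.new.price(100);   # 125
say Doubled.new.price(100);   # 200
```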

Day 7 – Lexical variables

December 7, 2010

Programming is tough going. Well, stringing together lines of code isn’t that difficult, and prototyping an idea can be pleasant and easy. But as the size of the program scales up, and the maintenance time lengthens, things tend to get tricky. Eventually, if we’re unlucky, we’re overcome by the complexity — not necessarily the complexity of the problem we started out solving, but the complexity of the program itself. We get gray hairs from debugging, or we’re simply at a loss for how to extend the program to do what we want.

So we turn to the history of programming, seeking advice as to how to combat complexity. And the answer sits there, clear as day: limit extent. If you’re architecting programs with hundreds or thousands of types of components, you’ll want those components to interact through only a very small set of surfaces… otherwise, you’ll simply lose control. Otherwise, combinatorics will defeat you.

We see this principle at every single level of programming, simply because it’s such a primary thing: Separation of Concerns, Do One Thing And Do It Well, BCNF, monads, routines, classes, roles, modules, packages. All of them urge us or guide us to limit the extent of things, so we don’t lose to combinatorics. Perhaps the simplest example of this is the lexical variable.

{
    my $var;
    # $var is visible in here
}
# $var is not visible out here

Yeah — that is today’s “cool feature”. :-) Here’s what makes it interesting:

Perl got this one wrong from version 1 and onwards. The default variable scope in Perl 5 is the “package variable”, a kind of global variable. Define something inside a block; still see it outside.

$ perl -v
This is perl 5, version 12, subversion 1 (v5.12.1)

$ perl -E '{ $var = 42 }; say $var'
42

$ perl -wE '{ my $var= 42 }; say $var'
Name "main::var" used only once: possible typo at -e line 1.
Use of uninitialized value $var in say at -e line 1.

In Perl 6, the lexical variable is the default. You won’t get past compilation if you try to pull off the above trick in Rakudo:

$ perl6 -e '{ $var = 42 }; say $var'    # gotta initialize with 'my'
===SORRY!===
Symbol '$var' not predeclared in <anonymous>

$ perl6 -e '{ my $var = 42 }; say $var' # still won't work! not visible outside
===SORRY!===
Symbol '$var' not predeclared in <anonymous>

You might say “okay, this is great for catching a typo now and then”. Yes, sure, but the big advantage is that this keeps you honest about variable scoping. And that helps you manage complexity.

Now let me just rush to the defense of Perl 5 by saying a variety of things at the same time. Perl 5 does try to steer you in the right way by having you use strict and use warnings by reflex; Perl 5 is bound by its promise of backwards compatibility, which is very good and noble; Perl 1 certainly was not about writing large applications and managing the resulting complexity; and global variables do make a lot of sense in a one-line script.

Perl 6 has an inherent focus to help you start small, and then help you put in more strictures and architectural underpinnings as your application scales up. In the case of variables, this means that in scripts and modules, lexical variables (à la strict) are the default, but in those -e one-liners the default is package variables. (Rakudo doesn’t implement this distinction yet, and one has to use lexical variables even at the command line. After it’s been implemented in Rakudo, I’d expect the perl6 invocations above to get past compilation, and produce outputs similar to the perl invocations.)

Moving along. At this point you might consider all that’s worth saying about lexical variables to have been said — not so. You see, the result of designing things right is that surprising and awesome bonuses keep falling out. Consider this subroutine:

sub counter($start_value) {
    my $count = $start_value;
    return { $count++ };
}

What’s returned at return { $count++ } is a block of code. So each time we call counter, what we get back is a little disconnected piece of code that can be called, as many times as we want.

Now look what happens when we create two such pieces of code and play around with them:

my $c1 = counter(5);
say $c1();           # 5
say $c1();           # 6

my $c2 = counter(42);
say $c2();           # 42
say $c1();           # 7
say $c2();           # 43

See that? The vital observation here is that $c1 and $c2 are acting entirely independently of each other. Both keep their own state, in the form of the $count variable, and although this might look like the same variable to us, to the two invocations of counter it looks like two different storage locations — because each time we enter a block of code, we start out afresh. The little block of code returned from some run of counter retains a relation to that particular storage location (it “closes over” the storage location, protecting it from the grasp of the Grim Garbage Collector; thus this kind of block is called a closure.)

If closures look a lot like lightweight objects to you, congratulations; they are. The principle behind closures, regulating the way values are accessed, is the same as the principle behind encapsulation and information hiding in OO. It’s all about limiting the extent of things, so that they can wreak as little havoc as possible when things turn ugly.

You can do nifty things like closures with lexical variables. You can’t with package variables. Lexical variables are cooler. QED.

Day 23: Lazy fruits from the gather of Eden

December 23, 2009

Today’s gift is a construct not often seen in other languages. It’s an iterator builder! And it’s called gather.

But first, let’s try a bit of historical perspective. Many Perl people know their map, grep and sort, convenient functions to carry out simple list transformations, filterings and rankings without having to resort to for loops.

my @squares = map { $_ * $_ }, @numbers;
my @primes  = grep { is-prime($_) }, @numbers;

The map and grep constructs are especially powerful once we get comfortable with chaining them:

my @children-of-single-moms =
    map  {  .children },
    grep { !.is-married },
    grep {  .gender == FEMALE },
         @citizens;

(Note that .children may return one child, a list of several children or an empty list. map has a flattening influence on the lists thus produced, so the final result is a flat list of children.)

The chaining of map and sort gave rise to the famous Schwartzian transform, a Perl 5 caching idiom for when the thing being sorted on is computationally expensive:

my @files-by-modification-date =
    map  { .[0] },                # deconstruct
    sort { $^a[1] <=> $^b[1] },
    map  { [$_, $_ ~~ :M] },      # compute and construct
         @files;

It’s unfortunate that the functional paradigm puts the steps in reverse order of processing. The proposed pipe syntax, notably the ==>, would solve that. But it’s not implemented in Rakudo yet, only described in S03.
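For comparison, here is a sketch of the same small pipeline written both ways, assuming an implementation that does support the feed operator (to my knowledge, current Rakudo releases do). The data and variable names are mine.

```raku
my @numbers = 1..6;

# Nested calls: read from the bottom up to follow the data.
my @nested =
    map  { $_ * $_ },
    grep { $_ %% 2 },
         @numbers;

# Feed operator: the steps appear in processing order.
@numbers
    ==> grep({ $_ %% 2 })
    ==> map({ $_ * $_ })
    ==> my @fed;

say @nested;   # [4 16 36]
say @fed;      # [4 16 36]
```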

Anyway, if you’ve read the post from day 20, you know that the Schwartzian transform is built into sort nowadays:

my @files-by-modification-date =
    sort { $_ ~~ :M },
    @files;

So that’s, you know, coming along.
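Since file modification times are awkward to show in a blog post, here is the same idea on a list of words. When the block takes one argument, sort treats it as a key extractor; when it takes two, it's a comparator as usual.

```raku
my @words = <banana fig pear>;

say sort { .chars }, @words;                    # (fig pear banana)
say sort { $^a.chars <=> $^b.chars }, @words;   # (fig pear banana)
```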

Now, what about this gather construct? Well, it’s a kind of generalization of map and grep.

sub mymap(&transform, @list) {
    gather for @list {
        take transform($_);
    }
};

sub mygrep(&condition, @list) {
    gather for @list {
        take $_ if condition($_);
    }
};

(The real map can swallow several arguments at a time, making it more powerful than the &mymap above.)

Just to be clear about what happens: gather signals that within the subsequent block, we’ll be building a list. Each take adds an element to the list. You could think of it as pushing to an anonymous array if you want:

my @result = gather { take $_ for 5..7 }; # this...

my @result;
push @result, $_ for 5..7; # ...is the same as this

Which brings us to the first property of gather: it’s the construct you can use for building lists when map, grep and sort aren’t sufficient. Of course, there’s no need to reinvent those constructions… but the fact that you can do that, or roll your own special variants, is kinda nice.

sub incremental-concat(@list) {
  my $string-accumulator = "";
  gather for @list {
    # RAKUDO: The ~() is a workaround for [perl #62178]
    take ~($string-accumulator ~= $_);
  }
};

say incremental-concat(<a b c>).perl; # ["a", "ab", "abc"]

The above is nicer than using map, since we need to manage the $string-accumulator between iterations.

(Implementing &incremental-concat by hand is silly in an implementation which implements the [\~] operator. Just a short announcement to people who want to keep track of the extent to which Perl 6 is channeling APL. Rakudo doesn’t yet, though.)
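
For the curious, and assuming an implementation that does support the triangular reduction metaoperator, the whole subroutine shrinks to something like this:

```raku
# [\~] keeps every intermediate result of the ~ (concatenation) reduction.
my @prefixes = [\~] <a b c>;

say @prefixes;   # [a ab abc]
```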

The second property of gather is that while the take calls (of course) have to occur within the scope of a gather block, they do not necessarily have to occur in the lexical scope, only the dynamic scope. For those unfamiliar with the distinction, I think an example explains it best:

sub traverse-tree-inorder(Tree $t) {
  traverse-tree-inorder($t.left) if $t.left;
  take transform($t);
  traverse-tree-inorder($t.right) if $t.right;
}

my $tree = ...;
my @all-nodes = gather traverse-tree-inorder($tree);

See what’s happening here? We wrap the call to &traverse-tree-inorder in a gather statement. The statement itself doesn’t lexically contain any take calls, but the called subroutine does, and the take in there remembers that it’s in a gather context. That’s what makes the gather context dynamic rather than lexical.

Just to hammer in the point, traverse-tree-inorder does lexical recursion on a tree structure, and no matter how far down the call stack we find ourselves, the values passed to take find their way back into the same anonymous array, rooted in the gather around the original call. It’s as if the anonymous array were implicitly passed around for us automatically as invisible arguments. Another way to view it is that the gather mode works orthogonally to the call stack, essentially not caring about how many calls down it is.
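
A self-contained variant of the same idea might look as follows. The Node class and the three-node tree are invented for this sketch (the post never defines its Tree type), and the node's value is taken directly instead of going through a transform.

```raku
# Hypothetical minimal tree type, just for illustration.
class Node {
    has $.value;
    has Node $.left;
    has Node $.right;
}

sub walk-inorder(Node $n) {
    walk-inorder($n.left) if $n.left;    # unset children are false
    take $n.value;                       # no gather in lexical sight
    walk-inorder($n.right) if $n.right;
}

my $tree = Node.new(
    value => 2,
    left  => Node.new(value => 1),
    right => Node.new(value => 3),
);

# The gather here catches every take made further down the call stack.
my @values = gather walk-inorder($tree);
say @values;   # [1 2 3]
```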

Should we be unfortunate enough to do a take outside of any gather block, we’ll get a warning at runtime.

I’ve saved the best till last. The third property of gather: it’s lazy.

What does “lazy” mean here? Well, to take the above tree-traversing code as an example: when the assignment to @all-nodes has executed, the tree hasn’t yet been traversed. Only when you access the first element of the array, @all-nodes[0], does the traversal start. And it stops right after it finds the leftmost leaf node. Access @all-nodes[1], and the traversal will resume from where it left off, run just enough to find the second node in the traversal, and then stop.

In short, the code within the gather block starts and stops in such a way that it never does more work than you’ve asked it to do. That’s lazy.

It’s essentially a model of delayed execution. Perl 6 promises to run the code inside your gather block, but only if it turns out that you actually need the information. Operationally, you can think of it as a separate thread that starts and stops, always doing the smallest possible amount of work to keep the main thread satisfied. But under the hood in a given implementation, it’s likely implemented by continuations — or, failing that, by painful, complicated cheating.
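
One way to watch this happen, assuming an implementation where gather is already lazy, is to count how often the block body runs. The counter and the squares generator are made up for this sketch.

```raku
my $runs = 0;

# An infinite generator: the squares of 1, 2, 3, ...
my $squares = gather for 1..Inf -> $n {
    $runs++;
    take $n * $n;
};

say $runs;                       # nothing has been asked for yet

my @first = $squares.head(3);    # ask for just three values
say @first;                      # [1 4 9]
say $runs;                       # iterations that actually ran
```

The generator never terminates on its own, so the fact that the program finishes at all is the laziness at work.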

Now here’s the thing: most any array in Perl 6 has the lazy behavior by default, and things like reading all lines from a file are lazy by default, and not only can map and grep be implemented using gather; it turns out that they actually are, too. So map and grep are also lazy.

Now, it’s nice to know that values aren’t unnecessarily generated when you’re doing calculations with arrays… but the really nice thing is that lazy arrays open up the door for stream-based programming and, by extension, infinite arrays.

Unfortunately, laziness hasn’t landed in Rakudo yet. We’re nearly there though, so I don’t feel too bad dangling these examples in front of you, even though they will currently cause Rakudo to spin the fans of your computer and nothing more:

my @natural-numbers = 0 .. Inf;

my @even-numbers  = 0, 2 ... *;    # arithmetic seq
my @odd-numbers   = 1, 3 ... *;
my @powers-of-two = 1, 2, 4 ... *; # geometric seq

my @squares-of-odd-numbers = map { $_ * $_ }, @odd-numbers;

sub enumerate-positive-rationals() { # with duplicates, but still
  gather { # the takes below need a gather somewhere up the call stack
    take 1;
    for 1..Inf -> $total {
      for 1..^$total Z reverse(1..^$total) -> $numerator, $denominator {
        take $numerator / $denominator;
      }
    }
  }
}

sub enumerate-all-rationals() {
  map { $_, -$_ }, enumerate-positive-rationals();
}

sub fibonacci() {
  gather {
    take 0;
    my ($last, $this) = 0, 1;
    loop { # infinitely!
      take $this;
      ($last, $this) = $this, $last + $this;
    }
  }
}
say fibonacci()[10]; # 55

# Merge two sorted (potentially infinite) arrays
sub merge(@a, @b) {
  !@a && !@b ?? () !!
  !@a        ?? @b !!
         !@b ?? @a !!
  (@a[0] < @b[0] ?? @a.shift !! @b.shift, merge(@a, @b))
}

sub hamming-sequence() { # 2**a * 3**b * 5**c, where { all(a,b,c) >= 0 }
  gather {
    take 1;
    take $_ for
        merge( (map { 2 * $_ }, hamming-sequence()),
               merge( (map { 3 * $_ }, hamming-sequence()),
                      (map { 5 * $_ }, hamming-sequence()) ));
  }
}

(That last subroutine is a Perl 6 solution to the Hamming problem, described in section 6.4 of mjd++’s Higher Order Perl. A seriously cool book, by the way. It builds iterators from scratch; we just use gather for the same result.)
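
Once laziness is available, a few of the definitions above can be probed by asking for only a handful of elements. This is a sketch under that assumption; the variable names follow the examples above, and the Fibonacci generator is repeated so the snippet stands alone.

```raku
my @even-numbers  = 0, 2 ... *;      # arithmetic sequence
my @powers-of-two = 1, 2, 4 ... *;   # geometric sequence

say @even-numbers[^5];    # (0 2 4 6 8)
say @powers-of-two[^5];   # (1 2 4 8 16)

sub fibonacci() {
    gather {
        take 0;
        my ($last, $this) = 0, 1;
        loop {                       # infinitely, but only on demand
            take $this;
            ($last, $this) = $this, $last + $this;
        }
    }
}

my $tenth = fibonacci()[10];
say $tenth;               # 55
```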

Today’s obfu award goes to David Brunton, who has written a Perl 6 tweet which draws a rule #30 cellular automaton, which also happens to look like a Christmas tree.

$ perl6 -e 'my %r=[^8]>>.fmt("%03b") Z (0,1,1,1,1,0,0,0);\
say <. X>[my@i=0 xx 9,1,0 xx 9];\
for ^9 {say <. X>[@i=map {%r{@i[($_-1)%19,$_,($_+1)%19].join}},^19]};'
.........X.........
........XXX........
.......XX..X.......
......XX.XXXX......
.....XX..X...X.....
....XX.XXXX.XXX....
...XX..X....X..X...
..XX.XXXX..XXXXXX..
.XX..X...XXX.....X.
XX.XXXX.XX..X...XXX

Day 16: We call it ‘the old switcheroo’

December 16, 2009

Another glorious day in Advent; another gift awaits us. It’s switch statements!

Well, the term for them is still “switch statement” in Perl 6, but the keyword has changed for linguistic reasons. It’s now given, as in “given today’s weather”:

given $weather {
  when 'sunny'  { say 'Aah! ☀'                    }
  when 'cloudy' { say 'Meh. ☁'                    }
  when 'rainy'  { say 'Where is my umbrella? ☂'   }
  when 'snowy'  { say 'Yippie! ☃'                 }
  default       { say 'Looks like any other day.' }
}

Here’s a minimal explanation of the semantics, just to get us started: in the above example, the contents of the variable $weather are tested against the strings 'sunny', 'cloudy', 'rainy', and 'snowy', one after the other. If any of them matches, the corresponding block runs. If none matches, the default block triggers instead.

Not so different from switch statements in other languages, in other words. (But wait!) We’ll note in passing that the when blocks don’t automatically fall through, so if you have several conditions which would match, only the first one will run:

given $probability {
  when     1.00 { say 'A certainty'   }
  when * > 0.75 { say 'Quite likely'  }
  when * > 0.50 { say 'Likely'        }
  when * > 0.25 { say 'Unlikely'      }
  when * > 0.00 { say 'Very unlikely' }
  when     0.00 { say 'Fat chance'    }
}

So if you have a $probability of 0.80, the above code will print Quite likely, but not Likely, Unlikely etc. (In the cases when you want to “fall through” from a when block, you can end it with the keyword continue.) (Update: after spec discussions that originated in the comments of this post, break/continue were renamed to succeed/proceed.)
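
To make the difference concrete, here is a small sketch using the post-rename keyword proceed. The labels subroutine and its @hits array are invented for illustration; it collects every label that fires for a given probability.

```raku
sub labels($probability) {
    my @hits;
    given $probability {
        # proceed leaves this block and carries on with the next when
        when * > 0.75 { @hits.push('Quite likely'); proceed }
        when * > 0.50 { @hits.push('Likely') }    # leaves the given
        when * > 0.25 { @hits.push('Unlikely') }
    }
    @hits;
}

say labels(0.80);   # [Quite likely Likely]
say labels(0.60);   # [Likely]
```

Without the proceed, the first call would stop after 'Quite likely'.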

Note that in the above code, strings and decimal numbers and comparisons are used as the when expression. How does Perl 6 know how to match the given value against the when value, when both can be of wildly varying types?

The answer is that the two values enter a negotiation process called smartmatching, mentioned briefly in Day 13. To summarize, smartmatching (written as $a ~~ $b) is a kind of “regex matching on steroids”, where the $b doesn’t have to be a regex, but can be of any type. For ranges, the smartmatch will check if the value we want to match is within the range. If $b is a class or a role or a subtype, the smartmatch will perform a type check. And so on. For values like Num and Str which represent themselves, some appropriate equivalence check is made.

The “whatever star” (*) has the peculiar property that it smartmatches on anything. Oh, and default is just sugar for when *.

To summarize the summary, smartmatching is DWIM in operator form. And the given/when construct runs on top of it.
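
The cases just described can be tried directly with the ~~ operator. This is only a sketch; the values and variable names are made up.

```raku
my $in-range   = 5 ~~ (1..10);          # range: is the value within it?
my $is-string  = 'abc' ~~ Str;          # type: a type check
my $not-string = 42 ~~ Str;             # an Int is not a Str
my $same-num   = 3 ~~ 3.0;              # plain numbers: an equivalence check
my $has-ell    = so 'hello' ~~ /ell/;   # regex: ordinary pattern matching

say $in-range;     # True
say $is-string;    # True
say $not-string;   # False
say $same-num;     # True
say $has-ell;      # True
```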

Now for something slightly head-spinning: the given and when features are actually independent! While you complete the syllable “WHAT?”, let me explain how.

Given is actually a sort of once-only for loop.

given $punch-card {
  .bend;
  .fold;
  .mutilate;
}

See what happened there? All given does is set the topic, also known to Perl people as $_. The cute .method is actually short for $_.method.

Now it’s easier to see how when can be used without a given, too. when can be used inside any block which sets $_, implicitly or explicitly:

my $scanning;
for $*IN.lines {
  when /start/ { $scanning = True }
  when /stop/  { $scanning = False }

  if $scanning {
    # Do something which only happens between the
    # lines containing 'start' and 'stop'
  }
}

Note that those when blocks exhibit the same behaviour as the ones in a given block: they skip the rest of the surrounding block after executing, which in the above code means they go directly to the next line in the input.
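
Here is a runnable miniature of that loop, with an in-memory list standing in for $*IN.lines. The input and the @between array are invented for this sketch.

```raku
# Hypothetical input instead of reading from standard input.
my @lines = <a start b c stop d>;

my $scanning = False;
my @between;

for @lines {
    when /start/ { $scanning = True }    # then on to the next element
    when /stop/  { $scanning = False }   # likewise

    @between.push($_) if $scanning;
}

say @between;   # [b c]
```

The 'start' and 'stop' elements themselves never reach the push, because their when blocks skip the rest of the loop body.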

Here’s another example, this time with $_ explicitly set:

sub fib(Int $_) {
  when * < 2 { 1 }
  default { fib($_ - 1) + fib($_ - 2) }
}

(This independence between given and when plays out in other ways too. For example, the way to handle exceptions is with a CATCH block, a variant of given which topicalizes on $!, the variable holding the most recent exception.)

To top it all off, both given and when come in statement-ending varieties, just as for, if and the others:

  say .[0] + .[1] + .[2] given @list;
  say "My God, it's full of vowels!" when /^ <[aeiou]>+ $/;

You can even nest a when inside a given:

  say 'Boo!' when /phantom/ given $castle;

As given and when represent another striking blow against the Perl obfuscation track record, I hereby present you with the parting gift of an obfu DNA helix, knowing full well that it doesn’t quite make up for the damage caused. :)

$ perl6 -e 'for ^20 {my ($a,$b)=<AT CG>.pick.comb.pick(*);\
  my ($c,$d)=sort map {6+4*sin($_/2)},$_,$_+4;\
  printf "%{$c}s%{$d-$c}s\n",$a,$b}'
     G  C
      TA
     C G
   G    C
 C     G
 G     C
 T   A
  CG
 CG
 C   G
 T     A
  T     A
   T    A
     C G
      TA
    T   A
  T     A
 A     T
 C    G
 G  C
