Day 17 – Compiling our way to happiness

Our mission, should we choose to accept it, is to solve the SEND + MORE = MONEY problem in code. No, hold on, let me put it like this instead:

    S E N D
+   M O R E
-----------
  M O N E Y

It means the same, but putting it up like this is more visually evocative, especially since many of us did it this way in school.

The ground rules are simple.

  • Each letter represents a digits between 0 and 9.
  • The letters represent distinct digits; two letters may not share the same digit.
  • Leading digits (in our puzzle, S and M) can’t be zero. Then they wouldn’t be leading digits!

Given these constraints, there’s a unique solution to the puzzle above.

I encourage you to find the solution. Write a bit of code, live a little! In this post, we’ll do that, but then (crucially) not be satisfied with that, and end up in a nested-doll situation where code writes code until something really neat emerges. The conclusion will spell out the ultimate vision — hold on, I’m being informed in real-time by the Plurality Committee that the correct term is “an ultimate vision” — for Perl 6.

Let’s do this.

Marcus Junius Brute Force (The Younger)

Our first language of the day, with its corresponding solution, is Perl 6 itself. There’s no finesse here; we just charge right through the solution space like an enraged bull, trying everything. In fact, we make sure not to attempt any cleverness with this one, just try to express the solution as straightforwardly as possible.

for 0..9 -> int $d {
    for 0..9 -> int $e {
        next if $e == $d;

        my int $y = ($d + $e) % 10;
        my int $_c1 = ($d + $e) div 10;

        for 0..9 -> int $n {
            next if $n == $d;
            next if $n == $e;
            next if $n == $y;

            for 0..9 -> int $r {
                next if $r == $d;
                next if $r == $e;
                next if $r == $y;
                next if $r == $n;

                next unless ($_c1 + $n + $r) % 10 == $e;
                my int $_c2 = ($_c1 + $n + $r) div 10;

                for 0..9 -> int $o {
                    next if $o == $d;
                    next if $o == $e;
                    next if $o == $y;
                    next if $o == $n;
                    next if $o == $r;

                    next unless ($_c2 + $e + $o) % 10 == $n;
                    my int $_c3 = ($_c2 + $e + $o) div 10;

                    for 1..9 -> int $s {
                        next if $s == $d;
                        next if $s == $e;
                        next if $s == $y;
                        next if $s == $n;
                        next if $s == $r;
                        next if $s == $o;

                        for 1..9 -> int $m {
                            next if $m == $d;
                            next if $m == $e;
                            next if $m == $y;
                            next if $m == $n;
                            next if $m == $r;
                            next if $m == $o;
                            next if $m == $s;

                            next unless ($_c3 + $s + $m) % 10 == $o;
                            my int $_c4 = ($_c3 + $s + $m) div 10;

                            next unless $_c4 % 10 == $m;

                            say "$s$e$n$d + $m$o$r$e == $m$o$n$e$y";
                        }
                    }
                }
            }
        }
    }
}

Again, it’s not pretty, but it works. This is the kind of indentation level your mother warned you about. If you ask me, though, I’m more annoyed about the indentation being there at all. We have one for every variable whose search space we need to scan through. (Only with Y do we get to take a shortcut.)

Though it’s a detour for today’s buffet, MJD once blogged about this and then I blogged about it too. Those blog posts were very much about “removing the indentation”, in a sense. Today’s post is where my thinking has taken me, three years later.

I took the path less traveled (and all the other paths, too)

Our second language is still mostly Perl 6, but with a neat hypothetical extension called amb, but spelled (evocatively) <-. It gets rid of all the explicit for loops and levels of indentation.

my $d <- 0..9;
my $e <- 0..9;
guard $e != any($d);
my $y = ($d + $e) % 10;
my $_c1 = ($d + $e) div 10;

my $n <- 0..9;
guard $n != any($d, $e, $y);
my $r <- 0..9;
guard $r != any($d, $e, $y, $n);
guard ($_c1 + $n + $r) % 10 == $e;
my $_c2 = ($_c1 + $n + $r) div 10;

my $o <- 0..9;
guard $o != any($d, $e, $y, $n, $r);
guard ($_c2 + $e + $o) % 10 == $n;
my $_c3 = ($_c2 + $e + $o) div 10;

my $s <- 1..9;
guard $s != any($d, $e, $y, $n, $r, $o);
my $m <- 1..9;
guard $m != any($d, $e, $y, $n, $r, $o, $s);
guard ($_c3 + $s + $m) % 10 == $o;
my $_c4 = ($_c3 + $s + $m) div 10;

guard $_c4 % 10 == $m;

say "$s$e$n$d + $m$o$r$e == $m$o$n$e$y";

This solution is shorter, more compact, and feels less “noisy” and aggravating just by ridding us of the for loops. (I suspect this has something to do with that imperative↔declarative spectrum people mention sometimes. We’re not so interested in looping as such, only seeing it get done.)

I know it won’t completely make up for the fact that Perl 6 doesn’t have the amb operator and guard implemented in core (or even in module space), but here’s a short script that will convert the above program to today’s first version:

my $indent = 0;
constant SPACE = chr(0x20);
sub indent { SPACE x 4 * $indent }

for lines() {
    when /^ my \h+ ('$' \w) \h* '<-' \h* (\d+ \h* '..' \h* \d+) ';' $/ {
        say indent, "for $1 -> int $0 \{";
        $indent++;
    }

    when /^ guard \h+ ('$' \w) \h* '!=' \h* 'any(' ('$' \w)+ % [\h* ',' \h*] ')' \h* ';' $/ {
        say indent, "next if $0 == $_;"
            for $1;
        say "";
    }

    when /^ guard \h+ ([<!before '=='> .]+ '==' <-[;]>+) ';' $/ {
        say indent, "next unless $0;";
    }

    when /^ my \h+ ('$' \w+) \h* '=' \h* (<-[;]>+) ';' $/ {
        say indent, "my int $0 = $1;";
    }

    when /^ \h* $/ {
        say "";
    }

    when /^ say \h+ (<-[;]>+) ';' $/ {
        say indent, $_;
    }

    default {
        die "Couldn't match $_";
    }
}

while $indent-- {
    say indent, "\}";
}

But we’ll not be satisfied here either. Oh no.

Thinking in equations

The third language takes us even further into the declarative, getting rid of all the guard clauses that simply state that the variables should be distinct.

ALL_DISTINCT

$d in 0..9
$e in 0..9
$n in 0..9
$r in 0..9
$o in 0..9
$s in 1..9
$m in 1..9

$y = ($d + $e) % 10
$_c1 = ($d + $e) div 10

($_c1 + $n + $r) % 10 == $e
$_c2 = ($_c1 + $n + $r) div 10

($_c2 + $e + $o) % 10 == $n
$_c3 = ($_c2 + $e + $o) div 10

($_c3 + $s + $m) % 10 == $o
$_c4 = ($_c3 + $s + $m) div 10

$_c4 % 10 == $m

We’re completely in the domain of constraint programming now, and it would be disingenuous not to mention this. We’ve left the imperative aspects of Perl 6 behind, and we’re focusing solely on describing the constraints of the problem we’re solving.

The most imperative aspect of the above program is when we do an assignment. Even this is mostly an optimization, in the cases when we know we can compute the value of a variable directly instead of searching for it.

Even in this case, we could translate back to the previous solution. I’ll leave out such a translator for now, though.

I’m going to come back to this language in the conclusion, because it turns out in many ways, it’s the most interesting one.

The fourth language

Having gotten this far, what more imperative complexity can we peel off? Specifically, where do those equations come from that are specified in the previous solution? How can we express them more succinctly?

You’ll like this, I think. The fourth language just expresses the search like this:

    S E N D
+   M O R E
-----------
  M O N E Y

Hang on, what again? Yes, you read that right. The most declarative solution to this problem is just an ASCII layout of the problem specification itself! Don’t you just love it when the problem space and the solution space meet up like that?

From this layout, we can again translate back to the constraint programming solution, weaving equations out of the manual algorithm for addition that we learn in school.

So, not only don’t we have to write those aggravating for loops; if we’re tenacious enough, we can have code generation all the way from the problem to the solution. We just need to find the appropriate languages to land on in-between.

Conclusion

My exploration with 007 has led me to think about things like the above: translating programs. Perl 6 already exposes one part of the compilation process very well: parsing. We can use grammars both in userland and within the Perl 6 toolchain itself.

I’ve come to believe we need to do that to all aspects of the compilation pipeline. Here, let me put it as a slogan or a declaration of sorts:

Perl 6 will have reached its full potential when all features we bring to bear manipulating text/data can also be turned inwards, to the compilation process itself.

Those translators I wrote (or imagined) between my different languages, they work in a pinch but they’re also fragile and a bit of a waste. The problem is to a large extent that we drop down to text all the time. We should be doing this at the AST level, where all the structure is readily available.

The gains from such a mind shift cannot be overstated. This is where we will find Lispy enlightenment in Perl 6.

For example, the third language with the equations doesn’t have to be blindly translated into code. It can be optimized, the equations massaged into narrower and more precise ones. As can be seen on Wikipedia, it’s possible to do such a good job of optimizing that there’s no searching left once the program runs.

My dream: to be able to do the above transformations, not between text files but between slangs within Perl 6. And to be able to do the optimization step as well. All without leaving the comfort of the language.

Day 5 – Variables

Such a simple thing, isn’t it? A variable is a name that holds a value.

Occasionally, the value it’s holding might be replaced by a different one — hence the name. (According to the surgeon general, variables that don’t experience change on a regular basis should see their physician, and ask to be diagnosed as constants.)

Even though they’re very easy to grasp, and basically every language has them, my goal today is to convince you that variables are actually wonderfully tricky. In a good way! I aim to make you stumble out of this blog post, dazed, mumbling “I thought I knew variables, but I really had no idea…”.

Towards the end, the experimental language 007 will also figure, because it’s completely and utterly this language’s fault that I’m thinking so much about variables.

Continue reading “Day 5 – Variables”

Day 23 – macros and secret agents

Let’s talk about 007.

$ perl6 bin/007 -e='say("OH HAI")'
OH HAI

007 is a small language, implemented in Perl 6. Its reason for existing is macros, a topic that has become quite dear to me.

I want to talk a little bit about macros, and what's happened with them in 2015. But first let's just get out of the way any doubts about 007 being a real language. Here, have a Christmas tree.

for [1, 2, 3, 4, 5, 2, 2] -> n {
    my indent = " " x (5 - n);
    my tree = "#" x (2 * n - 1);
    say(indent ~ tree);
}

Which gives this output:

    #
   ###
  #####
 #######
#########
   ###
   ###

If you want to describe 007 real quick, you could say it has a syntax very much like Perl 6, but it borrows things from Python (and JavaScript). For example, there are no sigils. (Awww!) There's no $_, so if you're interested in the elements you're iterating over, you need to declare a loop variable like we did. There's no flattening, no list context, and no conflation between a single item and a list-of-one. (Yay!)

007 does have objects:

my agent = {
    name: "James Bond",
    number: 007
};

say(agent.name);             # James Bond
say(agent.has("number"));    # 1

And, as it happens, all the types of statements, expressions, and other program elements are also expressible in the object system.

my code = quasi {
    say("secret agent!")
};

say(code.statementlist.statements[0]
        .expr.argumentlist.arguments[0]
        .value);             # secret agent!

The quasi there "freezes" the program code into objects, incidentally the exact representation that the 007 compiler deals with. We can then dig into that structure, passing down through a block, a statement, a (function invocation) expression, and finally a string value.

Being able to take this object-oriented view of the program structure is very powerful. It opens up for all kinds of nice manipulations. Perhaps the most cogent thing I've written about such manipulations is a gist about three types of macros. But the way ahead is still a bit uncertain, and 007 is there exactly to scout out that way.

I recently blogged about 007 over at strangelyconsistent, telling a little bit about what's happened with the language in the past year. Here, today, I want to tell about some things that are more directly relevant to Perl 6 macros.

But first...

What are all those Q types, then?

Giving a list of the Q types is still dangerous business, since the list itself is still in mild flux. But with that in mind, let's see what we have.

There's a Q type for each type of statement. Q::Statement::If is a representative example. There's also while loops, for loops, declarations of my variables, constants, subs, macros, etc. Return statements.

A common type of statement is Q::Statement::Expr, which covers anything from a function call to doing some arithmetic. A Q::Statement::Expr is simply a statement which contains a single expression.

So what qualifies as an expression? All the operators: prefix, infix, postfix. Various literal forms: ints, strings. Bigger types of terms like arrays and objects. Identifiers — that is, something like a variable, whether it's being declared or being used. Quasi terms, like we saw above.

Some constructs have some "extra" moving parts that are neither statements nor expressions. For example, blocks have a parameter list with parameters — the parameter list is a Q type, and a parameter is a Q type. Function calls have argument lists. Subs and macros have trait lists. Quasis have "unquotes", a kind of parametric hole in the quasi where a dynamic Qtree can be inserted.

And that's it. For a detailed current list, see Q.pm.

First insight: the magic is in the identifiers

Way back when we started to think about macros in Perl 6, we came to realize that Qtrees are never quite free of their context. You always keep doing variable lookups, calling functions, etc. Because this happens in two vastly different environments (the macro's environment, and the macro user's environment), quite a bit of effort goes to keeping these lookups straight and not producing surprising or unsafe code. This concern is usually referred to as "hygiene", but all we really need to know is that the macro environment is supposed to be able to contain any variable bindings and still not mess up the macro user's environment... and vice versa.

macro moo() {
    my x = "inside macro";
    
    return quasi { say(x) };
}

my x = "outside macro";
moo();    # inside macro

The prevailing solution to this (and the one that I started to code into Rakudo as part of my macros grant) was to achieve hygiene by the clever use of scopes. See this github issue comment for a compelling example of how code blocks would be injected in such a way that the variable lookup would just happen to find the right declaration, even if that means jumping all over the program.

With the recent work of 007, it's evident that the "clever use of scopes" solution won't work all the way. I'm glad I discovered this now, in a toy language implementation, and not after investing lots more time writing up a real solution of this in Rakudo.

The fundamental problem is this: injecting blocks is all good and well in either a statement or an expression. Both of these situations can be made to work. But we also saw in the previous section that there are "extra" Q types which fall outside of the statement/expression hegemony. We don't always think about those, but they are just as important. And they sometimes contain identifiers, which would be unhygienic if there wasn't a solution to cover them. Example: a trait like is looser(infix:<+>). The hygiene consists of infix:<+> being guaranteed to mean what it means in the defining macro environment, not what it means in the using mainline environment.

The new solution is deceptively simple, and will hopefully take us all the way: equip identifiers with a knowledge of what block they were defined in. If we exclude all macro magic, all identifiers will always start their lookup from "here", the position in the code that the runtime is in. (Note that this is already strong enough to account for things like closures, because "here" is exactly the context in which the closure was defined.)

The magic thing about macros is that you will inject a lot of code, and all the identifiers in that code will remember where they were born: in a quasi in a macro somewhere. That's the context in which they will do the lookup, not the "here" of the mainline code. Et voilà, hygiene.

We still allow identifiers to be synthetically created (using object construction syntax) without such a context. This makes them "detached", and the idea is that this will provide a little bit of a safety vent for when people explicitly want to opt out of hygiene. The Common Lisp and Scheme people seem to agree that there are plenty of such cases.

Second insight: there are two ways to create a program

So we know that Qtrees are formed by the parser going over a program, analyzing it, and spitting out Q objects that hang together in neat ways. We know that this process is always "safe", because we trust the parser and the parser is nice.

But in 007 where Q objects are in some sense regular objects, and they can be plugged together in all kinds of way by the user, anything could happen. The user is not necessarily nice — as programmers we take this as a self-evident truth. Someone could try to cram a return statement into a trait! Or drop a parameter list into a constant declaration. Many other horrors could be imagined.

So there have to be rules. Kind of like a HTML DOM tree is not allowed to look any which way it wants. We haven't set down those rules in stone yet in 007, but I suspect we will, and I suspect it will be informative.

(The challenging thing is when we try to combine this idea of restriction with the idea of extending the language. Someone wants to introduce optional parameters. Ok, let's say there's a process for doing that. Now the first obstacle is the rule that says that a parameter consists of just an identifier and nothing else. Once you've negotiated with that rule and convinced it to loosen up, you still have various downstream tools such as linters and code formatters you need to talk to. The problem is not unsolvable as such, just... the right kind of "interesting".)

But it doesn't end with just which nesting relationships — there's a certain sense in which a completely synthetically constructed Qtree isn't yet part of the program as such. For example, it contains variable declarations, but those variable haven't been declared yet. It contains macro calls, but those macros haven't been applied yet. And so on. All of those little things need to happen as the synthetic Qtree is introduced into the bigger program. We currently call this process checking, but the name may change.

Parsing and checking are like two sides of the same coin. Parsing is, in the end, just a way to convince the compiler that a piece of program text is actually a 007 program. Checking is just a way to convince the compiler that a Qtree is actually a 007 program (fragment). We've grown increasingly confident in the idea of checking, partly because we're using it quite heavily in our test suite.

But why stop there? Having two code paths is no fun, and in a sense parsing is just a special case of checking! Or, more exactly, parsing is a very optimized form of checking, where we make use of the textual medium having certain strengths. Why are identifiers limited to the character set of alphanumerics and underscores (and, in Perl 6, hyphens and apostrophes)? Mostly because that's a way for the parser to recognize an identifier. There's nothing at all that prevents us from creating a synthetic Q::Identifier with the name , and there's nothing at all that prevents the checker from processing that correctly and generating a correct program.

It seems to me that the work forwards is to explore the exact relationship between text and Qtrees, between parsing and checking, and to set up sensible rules within which people can write useful macros.

This is exciting work, and will eventually culminate in very nice macros in Perl 6. Expect to hear more about 007 in 2016. See you on the other side!

Day 17 – A loop that bears repeating

I don’t believe anyone has ever Advent blogged about this topic. But it’s getting hard to tell, what with almost six years of old posts. 😁

This story starts in Perl 5, where we implement a countdown with a simple while loop:

$ perl -wE'my $c = 5; while ($c--) { say $c }'
4
3
2
1
0

And of course, if we want to exit early from the loop, we use the trusty last:

$ perl -wE'my $c = 5; while ($c--) { say $c; last if $c == 2 }'
4
3
2

So far, so good. But notice that $c is decremented before each run of the loop body. Under some circumstances it makes more sense to do it afterwards. In Perl 5, we’d use do ... while:

$ perl -wle'my $c = 5; do { print $c; } while $c--'
5
4
3
2
1
0

Great. Now we just need to combine this with the early exit:

$ perl -wle'my $c = 5; do { print $c; last if $c == 2 } while $c--'
5
4
3
2
Can't "last" outside a loop block at -e line 1.

Eh? Oops.

Experienced Perl people will know what’s going on here. The documentation, in this case perldoc perlsyn, explains it clearly:

The “while” and “until” modifiers have the usual “”while” loop” semantics (conditional evaluated first), except when applied to a “do”-BLOCK […], in which case the block executes once before the conditional is evaluated.

…and then…

Note also that the loop control statements described later will NOT work in this construct, because modifiers don’t take loop labels. Sorry.

In other words, the do ... while construct is a little bit of a cheat in Perl 5, made to work as we expect, but really just a do block with reindeer horns glued on top of its head. The cheat is exposed when we try to next and last from the do block.

perldoc perlsyn goes on to explain how you can mitigate this with more blocks and loop labels, but in a way, the damage is already done. We’ve lost a little bit of that natural belief in goodness, and Santa, and predictable language semantics.

It goes without saying that this is a situation up with which Perl 6 will not put. Let’s see what Perl 6 provides us with.

Instead of painstakingly lining up differences, let’s just first replay our Perl 5 session with the corresponding Perl 6:

$ perl6 -e'my $c = 5; while $c-- { say $c }'
4
3
2
1
0
$ perl6 -e'my $c = 5; while $c-- { say $c; last if $c == 2 }'
4
3
2
$ perl6 -e'my $c = 5; repeat { say $c } while $c--'
5
4
3
2
1
0
$ perl6 -e'my $c = 5; repeat { say $c; last if $c == 2 } while $c--'
5
4
3
2

We’ve gotten rid of a few parentheses, as is usually the case when changing to Perl 6. But the biggest difference is that do has now been replaced by another keyword repeat.

In Perl 5, when we saw the do keyword and a block, we didn’t know if there would come a while or until statement modifier after the block. In Perl 6, when we see the repeat, that’s a promise that this is a repeat ... while or repeat ... until loop. So it’s a real loop in Perl 6, not a fake.

Oh, and the last statement works! As it does in real loops.


The story might have ended there, but Perl 6 has a little bit more of a gift to give with this type of loop.

Let’s say we’re waiting for an input that has to have exactly five letters.

sub valid($s) {
    $s ~~ /^ <:Letter> ** 5 $/;
}

This is the type of thing that we’d tend to express with a repeat-style loop, because we have to read the input first, and then find out if we’re done or not:

my $input;
repeat {
    $input = prompt "Input five letters: ";
} until valid($input);

The $input variable has to be declared separately outside of the loop block, because we’re testing it in the until condition outside of the loop block.

Even though we now know to expect a while or until after the block, having that information there makes the end-weight of the loop a bit problematic: we’re delaying some of the most pertinent information until last.

(A similar problem happens in Perl 5 regexes, where a lot of modifiers can show up after the final /, changing the meaning of the whole regex body. And of course, even in spoken and written language, you might have a sentence which goes on and on and just won’t stop, even though it should have long ago, and finally when you think you’ve got the hang of it, it surprises you unduly by ending with altogether the wrong hamster.)

For this reason, Perl 6 allows the following variant of the above repeat loop:

my $input;
repeat until valid($input) {
    $input = prompt "Input five letters: ";
}

That looks nicer!

I hasten to underline that even after moving the condition to the top of the loop like this, the code still has the previous behavior — that is, the condition valid($input) is still evaluated after each iteration of the loop body. (That is, after all, the difference between a repeat until loop and just an until loop.) In other words, we get to place the condition where it’s more prominent and harder to miss, but we retain the expected eval-condition-afterwards semantics.

As a final nice bonus, we can now inline the declaration of $input into the loop condition itself.

repeat until valid(my $input) {
    $input = prompt "Input five letters: ";
}

This clearly shows the difference between order of elaboration (in which variables are declared before they are used) and order of execution (in which statements and expressions evaluate in the order they damn well please).

That’s repeat, folks. Solves a slew of problems, and lets us write nice, idiomatic code.

Day 12 – I just felt a disturbance in the switch

So I said “I’m going to advent post about the spec change to given/when earlier this year, OK?”

And more than one person said “What spec change?”

And I said, “Exactly.”


We speak quite proudly about “whirlpool development” in Perl 6-land. Many forces push on a particular feature and force it to converge to an ideal point: specification, implementation, bugs, corner cases, actual real-world usage…

Perl 6 is not alone in this. By my count, the people behind the HTML specification have now realized at least twice that if you try to lead by specifying, you will find to your dismay that users do something different than you expected them to, and that in the resulting competition between reality and specification, reality wins out by virtue of being real.

I guess my point is: if you have a specification and a user community, often the right thing is to specify stuff that eventually ends up in implementations that the user community benefits from. But information can also come flowing back from actual real-world usage, and in some of those cases, it’s good to go adapting the spec.


Alright. What has happened to given/when since (say) some clown described them in an advent post six years ago?

To answer that question, first let me say that everything in that post is still true, and still works. (And not because I went back and changed it. I promise!)

There are two small changes, and they are both of the kind that enable new behavior, not restrict existing behavior.

First small change: we already knew (see the old post) that the switching behavior of when works not just in a given block, but in any “topicalizer” block such as a for loop or subroutine that takes $_ as a parameter.

given $answer {
    when "Atlantis" { say "that is CORRECT" }
    default { say "BZZZZZZZZZT!" }
}
for 1..100 {
    when * %% 15 { say "Fizzbuzz" }
    when * %% 3 { say "Fizz" }
    when * %% 5 { say "Buzz" }
    default { say $_ }
}
sub expand($_) {
    when "R" { say "Red Mars" }
    when "G" { say "Green Mars" }
    when "B" { say "Blue Mars" }
    default { die "Unknown contraction '$_'" }
}

But even subroutines that don’t take $_ as a parameter get their own lexical $_ to modify. So the rule is actually less about special topicalizer blocks and more about “is there something nice in $_ right now that I might want to switch on?”. We can even set $_ ourselves if we want.

sub seek-the-answer() {
    $_ = (^100).pick;
    when 42 { say "The answer!" }
    default { say "A number" }
}

In other words, we already knew that when (and default) were pretty separate from given. But this shows that the rift is bigger than previously thought. The switch statement logic is all in the when statements. In this light, given is just a handy topicalizer block for when we temporarily want to set $_ to something.

Second small change: you can nest when statements!

I didn’t see that one coming. I’m pretty sure I’ve never used it in the wild. But apparently people have! And yes, I can see it being very useful sometimes.

when * > 2 {
    when 4 { say 'four!' }
    default { say 'huge' }
}
default {
    say 'little'
}

You might remember that a when block has an implicit succeed statement at the end which makes the surrounding topicalizer block exit. (Meaning you don’t have to remember to break out of the switch manually, because it’s done for you by default.) If you want to override the succeed statement and continue past the when block, then you write an explicit proceed at the end of the when block. Fall-through, as it were, is opt-in.

given $verse-number {
    when * >= 12 { say "Twelve drummers drumming"; proceed }
    when * >= 11 { say "Eleven pipers piping"; proceed }
    # ...
    when * >= 5 { say "FIIIIIIVE GOLDEN RINGS!"; proceed }
    when * >= 4 { say "Four calling birds"; proceed }
    when * >= 3 { say "Three French hens"; proceed }
    when * >= 2 {
        say "Two turtle doves";
        say "and a partridge in a pear tree";
    }
    say "a partridge in a pear tree";
}

All that is still true, save for the word “topicalizer”, since we now realize that when blocks show up basically anywhere. The new rule is this: when makes the surrounding block exit. If you’re in a nested-when-type situation, “the surrounding block” is taken to be the innermost block that isn’t a when block. (So, usually a given block or similar.)


It’s nice to see spec changes happening due to how things are being used in practice. This particular spec change came about because jnthn tried to implement a patch in Rakudo to detect topicalizer blocks more strictly, and he came across userland cases such as the above two, and decided (correctly) to align the spec with those cases instead of the other way around.

The given/when spec has been very stable, and so it’s rare to see a change happen in that part of the synopses. I find the change to be an improvement, and even something of a simplification. I’m glad Perl 6 is the type of language that adapts the spec to the users instead of vice versa.

Just felt like sharing this. Enjoy your Advent.

Day 5 – Identifiers have hyphens in them

One day on the #perl6 channel, back in 2009, I stumbled into a conversation where Larry said “it didn’t break any spectests, and that convinced me” or something like that. Maybe it broke a couple of spectests, but they apparently needed breaking anyway.

The change in question was adding hyphens and apostrophes to identifiers. Normally in languages, these are valid identifier names:

my $foo;
my $please_do;

But Perl 6 also allows these:

my $foo-with-hyphens;
my $please-don't;

As usual, I was conservative, and slow to pick up these changes. I didn’t like what it did to my vim highlighting of variables. Whatever.

These days I kind of like the hyphens in identifiers. Mostly I just decide on a project-by-project basis whether I want to use hyphens, but I notice myself deciding to use them more and more often. They just look nicer, somehow. More balanced.

Damian Conway, on the Perl 6 Language mailing list, tried to institute the convention that hyphens and underscores should be used interchangeably — hyphens where you’d use hyphens in a sentence, and underscore where you’d use a space between words. I haven’t seen anyone pick up on that practice. I suspect it is because many of us are not native speakers, but rather speakers of some nebulous Goodenuf English, and we would hesitate between what’s two words and what’s a hyphen-separated compound. Actually, I’m pretty sure native speakers hesitate too sometimes.

Anyway, there’s an obvious parser conflict which you may have spotted: the hyphen is already used for infix minus. Perl 6 disambiguates this by saying that you’re allowed to use a hyphen (or an apostrophe) if it’s then followed by an alphabetic character. Thanks to sigils, this is enough to make it clear what you mean.

my $foo-2;     # variable minus 2 
my $foo-bar;   # a single variable;

Now I want to say two things at once. The first thing is that the apostrophe is an old vestigial thing in Perl 5 identifiers, too. It means the same as the package separator ::. Which is how Acme::Don't can exist, for example. (Look at the actual package name of the module.

The second thing is that Lisp people and Haskell people seem particularly saddened that because of this rule, you’re not allowed to put an apostrophe at the end of an identifier.

my $foo';     # not allowed

Ah, well. There will be slangs. I’m surprised there isn’t already a slang for apostrophes at the end of identifiers. ☺

Day 19 – Snow white and the seven conditionals

Perl 6 is a big language. It’s rather unapologetic about it. The vast, frightening, beautiful (and slightly dated) periodic table of operators hammers this point home. The argument for having a plethora of operators is to cover a lot of semantic ground, to let you freely express yourself with More Than One Way To Do It. The argument against is fear: “aaargh, so many!”

But ’tis the season, and love conquers fear. Today I went searching for conditionals in the language; ways to say “if p then q” in Perl 6. All this in order to take you on a guided tour of conditionals.

Here’s what we are looking for: a language construct with an antecedent (“the ‘if’ part”) and a consequent (“the ‘then’ part”). (They may also have an optional ‘else’ part, but under my arbitrary search criteria, I don’t care.)

I went looking, and I found seven of them. Now for the promised tour.

if

An oldie but a goodie.

if prompt("Should I exult? ") eq "yes" {
    say "yay!";
}

Unlike in Perl 5, you don't have to put parentheses around the antecedent expression. (Cue Python programmers rolling their eyes, unimpressed.) There are elsif and else clauses that you can add if you need it.

I won't count unless separately, as all it does is reverse the logical value of the antecedent.

when

The when statement also qualifies. It's a conditional statement, but with a preference to matching against the topic variable, $_.

given prompt("What is your age? ") {
    when * < 15 { do-not-let-into-bar() }
    when * < 25 { ask-for-ID() }
    when * > 80 { suggest-bingo-around-the-corner() }
}

The semantic "bonus" with this statement is that the end of a when block implicitly triggers a succeed (known as break to traditionalists), which will leave the surrounding contextualizer ($_-binding) block, if any. Or, in plain language, you don't have to put in explicit breaks in your Perl 6 switch statements.

The when statement doesn't have an opposite version. I have sometimes campaigned for whenn't, but with surprisingly (or rather, predictably) little uptake. Ah well; there will be modules.

&&

"Now hold your horses!", I hear you say. "That ain't no conditional!" And you would be right. But, according to the criteria of the search, it is, you see. It has an antecedent and a consequent. The consequent only gets evaluated if the antecedent is truthy.

if @stuff && @stuff[0] eq "interesting" {
    # ...
}

This behavior is called "short-circuiting", and Perl 6 is only one of a long list of languages that work this way.

I won't count ||, because it's the dual of &&. It also short-circuits, but it bails early on truthy values instead of falsy ones.

and

If we count &&, surely we should count its low-precedence cousin, and.

open("file.txt")
    and say "Yup, it opened";

Some people use && and and interchangeable, or prefer only one of them. I'm a little bit more restrictive, and tend to use && inside of expressions, and and only between statements.

I won't count or, again because it's the dual of and.

not ?&

A non-entrant in our tour which nontheless deserves an honorable mention is the ?& operator. It's a "boolifying &&".

But besides always returning a boolean value (rather than the rightmost truthy value, as && and and do), it also doesn't short-circuit. Which makes it useless as a conditional.

sub f1 { say "first"; False }
sub f2 { say "second" }

f1() && f2();   # says "first"
f1() ?& f2();   # says "first", "second"

Therefore, there are two reasons I'm not counting ?|.

not & either

We have another honorable mention. In Perl 5, the & operator deals in bitmasks (just as in a lot of C-derived languages), but in Perl 6 this operator has been stolen for a higher purpose: junctions.

f1() & f2();    # says "first", "second" in some order

The old bitmask operator (now spelled +&, by the way) doesn't short-circuit, and never did in any language I'm aware of. This new junctional operator also doesn't short-circuit. In fact, the semantics it conveys is one of quantum computer "simultaneous evaluation". Actual threading semantics is not guaranteed, currently does not happen in Rakudo — and may not be worth the overhead for most small calculations anyway — but the message is clear: hie the hence, short-circuiting.

//

The most questionable conditional on my list, this one nevertheless made the cut. It's a version of ||, but instead of testing for truthiness, it tests for undefinedness. Other than that, it's very conditional.

my $name = %names{$id} // "<no name given>";

This operator is notable for sharing a first place (with say) as "most liked feature side-ported to Perl 5". (Just don't mention smartmatch to a Perl 5 person.)

The dual of //, for evaluating a consequent only if the antecedent is a defined value, is spelled ⅋⅋, and of course, it does not exist and never did. *bright neuralizer flash*

?? !!

Loved and known by many names — ternary operator, trinary operator, conditional operator — this operator defies grammatical categories but is clearly a conditional, too. It even has an else part.

my $opponent = $player == White ?? Black !! White;

Almost every language out there spells this operator as ? : and it takes a while to get used to the new spelling in Perl 6. (The reason was that both ? and : were far too short and valuable to waste on this operator, so it got "demoted" to a longer spelling.)

If you ask me, ?? !! is an improvement once you see past the historical baggage: all of the other pure boolean operators already identify themselves by doubling a symbolic character like that. It blends in better.

You didn't hear it from me, but if anyone ever feels like creating an unless/else version of this operator, you could do a lot worse than choosing ¿¿ ¡¡ for the job. Ah, Unicode.

andthen

Finally, a late entry, andthen also works as a conditional. Just when you thought Perl 6 had you semantically covered with && and and for ordinary boolean values, and // for undefined values...

...it opens up a third door. The andthen operator only evaluates its right-hand statement if the left-hand statement succeeded.

shut-down-reactor()
    andthen don-hazmat-suit()
    andthen enter-chamber()
;

Succeeding means returning something defined and not failing along the way. Structually, the andthen operator has been compared to Haskell's monadic bind: success of earlier statements guards execution of later ones. Though it's not implemented yet in Rakudo, you're even supposed to be able to pass the successful value from the left-hand statement to the right-hand one. (It ends up in $_.)

I won't count orelse, even though it also short-circuits. Though it's worth noting that the right-hand side of orelse ends up with the $! value from the left-hand side.

Signoff

Thank you for taking the tour. As you can see, there really is More Than One Way to write a condition in Perl 6. Some of them will turn up more often in everyday code, and some less often. I use and only rarely, and I've yet to use an andthen in real code. The others all crop up fairly regularly, and all in their own favorite contexts.

Some languages give you a sparse diet of the bare necessities, claiming that it's good for you to have fewer choices. Perl is not one of those languages. Instead, you get lots of choice, lots of freedom, and it's up to you to do great things with it. Of course, we also try to make sure that the constructs are there make sense, are in good taste, and feel consistent with the rest of the language. The end result is a pleasant language with plenty of rope to shoot yourself in the foot with.

Merry conditional Christmas!