Day 23 – Webscale sorting of the sleepy kind

Day 23 – Webscale sorting of the sleepy kind

In the emerging tradition of writing about sort on the 23rd, let me elaborate on a particularly silly kind of sorting algorithm: sleepsort.

The idea is that to sort a list of numbers, you spawn a separate process (or thread), sleep for as long as the number specifies, and then print the result. That way the numbers are printed in ascending order.

Rakudo on the MoarVM backend has threads, so let's put them to good use:

 use v6;
 
 my @unsorted = (1..10).pick(5);
 
 say "Unsorted: @unsorted[]";
 
 await @unsorted.map: -> $t {
     start {
         sleep $t;
         say $t;
     }
 };

If you run it, you should get output like this:

 $ ./perl6-m sleepsort-01.p6
 Unsorted: 1 7 5 8 3
 1
 3
 5
 7
 8

How did we get there? (1..10).pick(5) randomly picks (without replacing) five numbers from 1 to 10. start { ... } starts a thread and returns a Promise object. We do that for all the numbers, and await blocks until all promises that have passed to it are fullfilled (or broken, but that won't happen here). Or in other words: until all the newly spawned threads have finished.

Now let's make it web scale! Well, sort (sic) of.

First an obvious improvement is to not sleep for the whole second. Let's just scale it with a constant factor: sleep $t / 10, and we're about ten times as fast!

But there's another scaling problem: If you want to sort hundreds of thousands of numbers with sleep sort (you'd do that, right?), our program would spawn as many threads. So lots of memory wasted. Also sleep maps pretty much directly to the underlying libc call, and thus blocks the thread. We can do better:

 use v6;
 
 my @unsorted = (1..10).pick(5);
 
 await @unsorted.map: -> $t {
     Promise.in($t / 10 ).then({ say $t });
 };

Promise.in($s) creates a Promise that will be kept in $s seconds. .then creates a chained promise whose block is executed when the first one is kept.

This also removes the need to spawn processes explicitly. The standard scheduler takes care of that. We're basically web scale!

Now it's also time to address an issue of correctness. The sleepsort algorithm is a kind of divide-and-conquer approach, but there's no joining of the results. There is no single point in the program that can access the entire the sorted list. To fix that, we use a Channel, a thread-safe queue:

 use v6;
 my @unsorted = (1..10).pick(5);
 
 my $channel = Channel.new;
 
 await @unsorted.map: -> $t {
     Promise.in($t / 10 ).then({ $channel.send($t) });
 };
 $channel.close;
 
 say $channel.list;

This prints the sorted list all from the main thread.

So now it can be encapsulated in a nice subroutine.

 sub sleepsort(*@values, :$factor) {
     my $channel = Channel.new;
 
     await @values.map: -> $t {
         Promise.in($t * $factor ).then({ $channel.send($t) });
     };
     $channel.close;
 
     $channel.list;
 }

Another issue of correctness remains: Both sleep and Promise.in only specify a minimal time to be slept; implementation-dependent, longer sleep times are possible. If $factor is too small, the promises might executed out of the desired order.

So let's find the minimal factor with which it still works:

 my $previous = Inf;
 for 0.1, */1.5 ... * -> $factor {
     say "Trying $factor";
     my $success = [<=] sleepsort(:$factor, @unsorted);
     say $success ?? 'Success' !! 'Failed';
     unless $success {
         say "Last good factor: ", $previous;
         last;
     }
     $previous = $factor;
 }

On my machine, this produces the following output:

 Trying 0.1
 Success
 Trying 0.066667
 Success
 Trying 0.044444
 Success
 Trying 0.029630
 Success
 Trying 0.019753
 Success
 Trying 0.013169
 Success
 Trying 0.008779
 Success
 Trying 0.005853
 Failed
 Last good factor: 0.008779

So a :$factor of around 0.01 or 0.09 seems to work on my setup.

Your output/mileage may vary.

Over and out, sleeping until Christmas Eve.

Day 22 – The Cool subset of MAIN.

Day 22 – The Cool subset of MAIN.

In the Unix environment, many scripts take arguments and options from the command line. With Perl 6 it’s very easy to accept those. At least, that’s what Our 2010 advent post on MAIN says.

But today, we’re interested in going further, and we’re going to try to gain more expressiveness from it.
The first Perl 6 feature we’re going to look at here is subsets. Subsets have a very broad range of applications, but they’re even better when used in conjunction with MAIN.

Subsets are a declarative way to specify conditions on your type. If you wanted to translate “A QuiteSmallInt is an Int with a value lesser than 10”, you can write

subset QuiteSmallInt where * < 10;

In case you don’t remember from our 2010 advent post about smart matching or our 2011 advent post about multi-method dispatch, here’s a quick reminder of how you can use subsets with smart matching, multi-method dispatch or typed variables.

say 9 ~~ QuiteSmallInt; # True
say 11 ~~ QuiteSmallInt; # False

multi sub small-enough(QuiteSmallInt) { say "Yes!" }
multi sub small-enough($) { say "No, too big."; }

small-enough(9); # Prints "Yes!"
small-enough(11); # Prints "No, too big."

# You can also use it to type variables
my QuiteSmallInt $n = 9; # Works!
my QuiteSmallInt $n = 11; # Errors out with "Type check failed in assignment to '$n'; ..."

Alright; now that we got this out of the way, what does this buy us?
Well, as we just demonstrated, we can dispatch to different subroutines using those “where” conditions.
That goes for MAIN as well.

Say we wanted to take a filepath as the first argument of a MAIN. We’ll also write two “companions” that’ll give a descriptive error message in case we pass bad arguments.

# a File is a string containing an existing filename
subset File of Str where *.IO.e; # .IO returns an IO::Path object, and .e is for "exists"
subset NonFile of Str where !*.IO.e;

multi sub MAIN(File $move-from, NonFile $move-to) {
  rename $move-from, $move-to;
  say "Moved!";
}
multi sub MAIN($, File $) {
  say "The destination already exists";
}
multi sub MAIN(NonFile $, $) {
  say "The source file doesn't exist";
}

And now, if we try to use it (after saving it as “main.p6”)…

$ touch file1.txt
$ perl6 ./main.p6 file1.txt destination.txt
Moved!
$ perl6 ./main.p6 non-existant.p6 non-existant.txt
The source file doesn't exist!
$ perl6 ./main.p6 destination.txt destination.txt
The destination already exists

Alright, looks good. There’s a little catch, however: the –help will display 3 usages, when there’s only one.

$ perl6 ./main.p6 --help
Usage:
tmp.p6 <move-from> <move-to>
tmp.p6 <Any> (File)
tmp.p6 (NonFile) <Any>

Whoops! We need to find a way to hide them (if you’re wondering, the “<Any>” is because the arguments both are unnamed and untyped).
Luckily enough, Perl6 provides a trait for that: hidden-from-USAGE, added by moritz++ for this advent post (USAGE is the –help message).

Let’s add those to our MAINs.

multi sub MAIN(File $move-from, NonFile $move-to) {
  rename $move-from, $move-to;
  say "Moved!";
}
multi sub MAIN($, File $) is hidden-from-USAGE {
  say "The destination already exists";
}
multi sub MAIN(NonFile $, $) is hidden-from-USAGE {
  say "The source file doesn't exist";
}

And now, the USAGE is correct:

$ perl6 ./main.p6 --help
Usage:
tmp.p6 <move-from> <move-to>

Okay, now, we just need to add a description.

The second Perl6 feature we’re going to use is Pod.
Pod is an easy-to-use markup language, which you can embed to document your code.

#= Rename a file
multi sub MAIN(File $move-from, NonFile $move-to) {
  rename $move-from, $move-to;
  say "Moved!";
}

# You can write it in multiple lines, though in USAGE they'll be collapsed
# (joined by a space)

#={Rename
a
file}
multi sub MAIN(File $move-from, NonFile $move-to) {
  rename $move-from, $move-to;
  say "Moved!";
}

Both versions above will print the same –help:

$ perl6 ./main.p6 --help
Usage:
tmp.p6 <move-from> <move-to> -- Rename a file

And now you have everything you need to create a nice CLI interface :-).

Day 21 – Community smoke testing

Day 21 – Community smoke testing

So after all we are good programmers who know their stuff, right? I mean, I know that I should better test the code that I write, and I tend to do that for every change I do. At work I often just run a particular script that tests the new feature or the absence of a bug, but eventually these test codes are added to a test suite at the end of the day. But even when you are a better programmer than me, and you write tests beforehand, there is one problem left:

Assumptions

We assume quite a lot when we write or test our programs. We know how the program is meant to work and that is our problem, which is so hard to escape from. We assume that someone will enter a valid date in a date field, we expect positive numbers when it comes to quantities. And we usually test our programs on a given set of platforms, often just the one PC or Mac we’re sitting at.

One important rule about test suites is that you should try to forget about what Certainly Should Work™. No, test built-ins, because these can change over time. Also query the infrastructure if other tests rely on specific things. Test the environment as detailed as you can, because if something fails early, you save time debugging the problem. This is even much more worth if you get reports from systems you don’t have access to. Debugging a very high level and late problem in a test suite on a strange or just foreign platform can be frustrating.

So in case we want to minimize the assumptions we talked about… What’s the opposite? – It is knowledge made out of experience, which itself is made of input. That’s the approach we will take. I come back to that a little bit later.

Back to what we can do for you. We cannot directly do much about your expectations and assumptions when it comes to your test suite, sadly. We don’t have a framework that pushes garbage input data to your scripts, but we can test your program in different ways. And we can hopefully provide feedback to recent changes to your code base very quickly, but that will also only work out when there are testers that test your distribution often.

Personally I code using the most bleeding edge rakudo compiler with bleeding edge of the MoarVM backend, on an Ubuntu linux box. That is a quite specific scenario. And there are a lot of different setups out there, and even when your code is not platform specific, one library, a dependency of a dependency surely is.

Generate a lot of input, to be turned into experience and knowledge

How are we going to achieve that? Of course we have to intercept the build and test stages that happen when a random user installs our distributions. That user can help us getting reports be doing:

# on unixes:
PANDA_SUBMIT_TESTREPORTS=1 panda install Foo

# on windows
set PANDA_SUBMIT_TESTREPORTS=1
panda install Foo

On can also set and export this environment variable permanently, so that every attempt of installing a distribution will be reported and the dist author is able to care about upcoming problems.

The great benefit of this way of gathering information in contrast to smoke testing the ecosystem on a central box is of course that we get reports from a wide variety of operating systems, compiler versions, locales and dependencies like other Perl 6 distributions but also C libraries.

A test report made by panda will look somewhat like this:

Sample test report

Though the last section contains the entire report in fact. You’ll see information about the operating system, the kernel, the backend used by rakudo, flags how that backend was built, and you also see some information about the tested distribution. I think it would be worthwile to extend that information so that certain meaningful environment variables are included as well, I am thinking of vars like MVM_JIT_DISABLE that have an impact on how the test programs are executed.

Once we received reports like this we build stats to highlight how well the distribution is doing. One of them is a matrix that shows the pass rate across compiler version, operating system and backend.

Platform/version matrix

When you’d send a report for a platform that is not listed there yet (like GNU/Hurd), it would automatically expand the list when the matrix gets regenerated. As of today that is done every five minutes. The colored bars represent the three backends, MoarVM is the topmost, JVM in the middle and Parrot at the bottom. The color coding is:

  • green – The tests passed, that also means that the test stage has to exit with code 0.
  • orange – That usually means that we did not run any tests, but besides that everything ran cleanly.
  • red – Something went wrong. Either stage build or stage test.

You see that the list of compiler versions can grow pretty quickly, and I think the best solution is to collapse the development releases into a single line, and make it expand on click. On this matrix all shown releases are dev releases, recognizable by the trailing partial sha1.

So now imagine these stats are about your distribution. You see red flashing lights that should put you in action and you will quickly take a look at all negative test results. Now, some of the reports might reveal pretty quickly what is wrong. Maybe you’ve forgotten to add a dependeny to the META file shipped with the distribution. But there might be other failing tests where you don’t have a clue what went actually wrong, or where to start poking.

The good thing is that testers.perl6.org is not a tool that only works in one direction. Sure, the user runs your test and the result travels in your direction. But remember that you are the master of the tests in question. If you are in doubt about the cause of a bug, then extend the test suite, and wait for more reports to arrive that provide your recently added diagnostics.

That’s where the loop closes to the things I said earlier. Wipe everything from your mind that you *think* that applies to the box the tests failed. Test everything that is involved in the failing functionality. And if you have to test the built ins of the Perl 6 compiler, do it. If you are pedantic and suspicious enough, you’ll get a great test suite that offers many informations also for upcoming test failures.

Sidenote: For the Perl 5 community it is quite usual that there are volunteers that run tests for distributions every day around the clock. I hope that we’ll have such awesome volunteers too in near future. So if you have boxes that run 24/7, have a weird architecture or a rare operating system, please run tests to aid the other developers in the right direction.

Forecast

I have plenty of ideas how this tool can be improved. A connection to issue trackers like GitHub Issues is one of my favourite which is also listed here: TODO of testers.perl6.org
If you are interested in helping out, the repository of testers.perl6.org is here and I’d be pleased to accept pull request or hand out commit bits.

Day 20 – Helping Santa with Roles

Day 20 – Helping Santa with Roles

On Day 14 of the Advent calendar I stopped lurking and started learning Perl 6 by converting a Perl 5 Timer object into its Perl 6 equivalent. Day 14 ends with a ‘challenge’ to convert the Perl 6 Timer class into a role.

Perl 6 roles encapsulate functionality for ready reuse by classes and objects.

The Timer class is a good candidate for ‘role-ification’ – it’s simple and is potentially useful to all sorts of objects.

Santa wants to go Lean this year, don’t worry he’s still rotund – he just wants to apply Lean principles to the process of present giving. The Timer role will help identify bottlenecks so Santa Inc. can improve.

To turn the Timer class into a role we just swap class for role (i.e., s/class/role/):

role Timer {

    has $!start-time;   # private scalar storing the starting time
    has @!event-log;    # private list of events
                        # dashes-in-identifiers save keystrokes too

    method record ($event_description) {
        # initialise the start-time if this is the first time through
        $!start-time //= now;

        @!event-log.push({
            timestamp   => now - $!start-time, 
            description => $event_description
        });
    }

    method report {
        my @report_lines;

        for @!event-log -> %event {
            @report_lines.push("[%event<timestamp>.fmt("%.3f")] %event<description>");
        }
        return @report_lines.join("\n");
    }
}

So that’s the role declared. Now I need a class or object to compose it into – something that does the role. How about a Timer for Santa’s Helpers?

use Timer;

class SantasHelper does Timer {
    has @!presents = qw<train lego doll> ;

    method wrap-present($child's_name) {
        return @!presents.pick ~ " for " ~ $child's_name;            
    } 
}

A quick check in the REPL shows:

> use SantasHelper;
> my $bruce = SantasHelper.new;    # this helper has an Australian accent
> $bruce.^methods
record report wrap-present

Pointing to the Higher Order Workings (HOW) of the $bruce object via .^ shows the Timer role’s methods (record, report) are now combined with the methods for SantasHelper (wrap-present). Let’s start wrapping presents:

use SantasHelper;

my $helper = SantasHelper.new;

my @children = qw<Hugo Honey Willow>;

# using {@children} to interpolate the list
$helper.record("started wrapping presents for {@children}"); 

for @children -> $child {
    my $present = $helper.wrap-present($child);
    $helper.record("wrapped $present");
}
say $helper.report; 

The report shows:

[0.006] started wrapping presents for Hugo Honey Willow
[0.013] wrapped train for Hugo
[0.016] wrapped doll for Honey
[0.020] wrapped doll for Willow

Nice.

But now there’s a problem! Santa’s Lean approach has exposed a bottleneck. To wrap all the presents in time Santa will need to roll up his sleeves and start wrapping too. We need to refactor the wrap-present method into its own role so Santa can wrap-presents too:

role WrappingPresents {
    has @!presents = qw<doll train lego> ;

    method wrap-present($child's_name) {
        return @!presents.pick ~ " for " ~ $child's_name;            
    } 
}

It turns out we can dynamically add a role to an existing object with does:

> use Santa
> my $santa = Santa.new
Santa.new()
> use WrappingPresents
> $santa does WrappingPresents
Santa+{WrappingPresents}.new(name => Any)
> $santa.wrap-present('Emma')
lego for Emma
> $santa.wrap-present('Claire')
lego for Claire

Santa Inc is now back on track.

It’s been fun playing with Perl 6 roles. Coding in Perl 6 feels malleable and expressive – and I’ve only just started!

jnthn++ has more details about parameterised roles and what happens when you compose multiple roles with the same method signature.

For now I’ll leave the last word to Perl 6’s Santa:

> $santa.greeting();
> Happy Christmas!

Day 19 – Snow white and the seven conditionals

Day 19 – Snow white and the seven conditionals

Perl 6 is a big language. It’s rather unapologetic about it. The vast, frightening, beautiful (and slightly dated) periodic table of operators hammers this point home. The argument for having a plethora of operators is to cover a lot of semantic ground, to let you freely express yourself with More Than One Way To Do It. The argument against is fear: “aaargh, so many!”

But ’tis the season, and love conquers fear. Today I went searching for conditionals in the language; ways to say “if p then q” in Perl 6. All this in order to take you on a guided tour of conditionals.

Here’s what we are looking for: a language construct with an antecedent (“the ‘if’ part”) and a consequent (“the ‘then’ part”). (They may also have an optional ‘else’ part, but under my arbitrary search criteria, I don’t care.)

I went looking, and I found seven of them. Now for the promised tour.

if

An oldie but a goodie.

if prompt("Should I exult? ") eq "yes" {
    say "yay!";
}

Unlike in Perl 5, you don't have to put parentheses around the antecedent expression. (Cue Python programmers rolling their eyes, unimpressed.) There are elsif and else clauses that you can add if you need it.

I won't count unless separately, as all it does is reverse the logical value of the antecedent.

when

The when statement also qualifies. It's a conditional statement, but with a preference to matching against the topic variable, $_.

given prompt("What is your age? ") {
    when * < 15 { do-not-let-into-bar() }
    when * < 25 { ask-for-ID() }
    when * > 80 { suggest-bingo-around-the-corner() }
}

The semantic "bonus" with this statement is that the end of a when block implicitly triggers a succeed (known as break to traditionalists), which will leave the surrounding contextualizer ($_-binding) block, if any. Or, in plain language, you don't have to put in explicit breaks in your Perl 6 switch statements.

The when statement doesn't have an opposite version. I have sometimes campaigned for whenn't, but with surprisingly (or rather, predictably) little uptake. Ah well; there will be modules.

&&

"Now hold your horses!", I hear you say. "That ain't no conditional!" And you would be right. But, according to the criteria of the search, it is, you see. It has an antecedent and a consequent. The consequent only gets evaluated if the antecedent is truthy.

if @stuff && @stuff[0] eq "interesting" {
    # ...
}

This behavior is called "short-circuiting", and Perl 6 is only one of a long list of languages that work this way.

I won't count ||, because it's the dual of &&. It also short-circuits, but it bails early on truthy values instead of falsy ones.

and

If we count &&, surely we should count its low-precedence cousin, and.

open("file.txt")
    and say "Yup, it opened";

Some people use && and and interchangeable, or prefer only one of them. I'm a little bit more restrictive, and tend to use && inside of expressions, and and only between statements.

I won't count or, again because it's the dual of and.

not ?&

A non-entrant in our tour which nontheless deserves an honorable mention is the ?& operator. It's a "boolifying &&".

But besides always returning a boolean value (rather than the rightmost truthy value, as && and and do), it also doesn't short-circuit. Which makes it useless as a conditional.

sub f1 { say "first"; False }
sub f2 { say "second" }

f1() && f2();   # says "first"
f1() ?& f2();   # says "first", "second"

Therefore, there are two reasons I'm not counting ?|.

not & either

We have another honorable mention. In Perl 5, the & operator deals in bitmasks (just as in a lot of C-derived languages), but in Perl 6 this operator has been stolen for a higher purpose: junctions.

f1() & f2();    # says "first", "second" in some order

The old bitmask operator (now spelled +&, by the way) doesn't short-circuit, and never did in any language I'm aware of. This new junctional operator also doesn't short-circuit. In fact, the semantics it conveys is one of quantum computer "simultaneous evaluation". Actual threading semantics is not guaranteed, currently does not happen in Rakudo — and may not be worth the overhead for most small calculations anyway — but the message is clear: hie the hence, short-circuiting.

//

The most questionable conditional on my list, this one nevertheless made the cut. It's a version of ||, but instead of testing for truthiness, it tests for undefinedness. Other than that, it's very conditional.

my $name = %names{$id} // "<no name given>";

This operator is notable for sharing a first place (with say) as "most liked feature side-ported to Perl 5". (Just don't mention smartmatch to a Perl 5 person.)

The dual of //, for evaluating a consequent only if the antecedent is a defined value, is spelled ⅋⅋, and of course, it does not exist and never did. *bright neuralizer flash*

?? !!

Loved and known by many names — ternary operator, trinary operator, conditional operator — this operator defies grammatical categories but is clearly a conditional, too. It even has an else part.

my $opponent = $player == White ?? Black !! White;

Almost every language out there spells this operator as ? : and it takes a while to get used to the new spelling in Perl 6. (The reason was that both ? and : were far too short and valuable to waste on this operator, so it got "demoted" to a longer spelling.)

If you ask me, ?? !! is an improvement once you see past the historical baggage: all of the other pure boolean operators already identify themselves by doubling a symbolic character like that. It blends in better.

You didn't hear it from me, but if anyone ever feels like creating an unless/else version of this operator, you could do a lot worse than choosing ¿¿ ¡¡ for the job. Ah, Unicode.

andthen

Finally, a late entry, andthen also works as a conditional. Just when you thought Perl 6 had you semantically covered with && and and for ordinary boolean values, and // for undefined values...

...it opens up a third door. The andthen operator only evaluates its right-hand statement if the left-hand statement succeeded.

shut-down-reactor()
    andthen don-hazmat-suit()
    andthen enter-chamber()
;

Succeeding means returning something defined and not failing along the way. Structually, the andthen operator has been compared to Haskell's monadic bind: success of earlier statements guards execution of later ones. Though it's not implemented yet in Rakudo, you're even supposed to be able to pass the successful value from the left-hand statement to the right-hand one. (It ends up in $_.)

I won't count orelse, even though it also short-circuits. Though it's worth noting that the right-hand side of orelse ends up with the $! value from the left-hand side.

Signoff

Thank you for taking the tour. As you can see, there really is More Than One Way to write a condition in Perl 6. Some of them will turn up more often in everyday code, and some less often. I use and only rarely, and I've yet to use an andthen in real code. The others all crop up fairly regularly, and all in their own favorite contexts.

Some languages give you a sparse diet of the bare necessities, claiming that it's good for you to have fewer choices. Perl is not one of those languages. Instead, you get lots of choice, lots of freedom, and it's up to you to do great things with it. Of course, we also try to make sure that the constructs are there make sense, are in good taste, and feel consistent with the rest of the language. The end result is a pleasant language with plenty of rope to shoot yourself in the foot with.

Merry conditional Christmas!

Day 18 – MoarVM Internals for the Brave (and Curious)

Day 18 – MoarVM Internals for the Brave (and Curious)

Hi there! I’m the guy who developed the MoarVM x64 JIT compiler during
this year’s GSoC. Jonathan has already given a detailed (and clear)
overview of MoarVMs optimization subsystems, of which the JIT compiler
is but a small step. Today, I’m going to try and show you how you can
see this in action. But first, a sample program:

    use v6;
    
    sub fibonacci (int $n --> int) {
        my int $x = 1;
        my int $y = 1;
        while $n > 0 {
            $x = $x + $y;
            $y = $x - $y;
            $n = $n - 1;
        }
        return $y;
    }
    
    for ^40 -> int $x {
        say fibonacci $x;
    }


Careful readers will spot the native integer definitions sprinkled
throughout the code. You can write perl6 code as if it is C,
too. (After all, that’s just one more way to do it). I’m not entirely
sure about other backends, but on MoarVM these really work – no object
is ever allocated for them.

Rakudo perl 6 can do more than just run this code, it can also show
you how it understands it. I’ve saved the above program as ‘fib.p6’,
and you can get a (rather verbose) textual representation of the
syntax tree by using the special target command line option:

perl6-m --target=mast fib.p6

However, this is just a static analysis. What we’re really interested
in is what happens during runtime. MoarVM will show you all the gory
details if you specify a filename with the MVM_SPESH_LOG environment
variable, like so.

export MVM_SPESH_LOG=spesh-log.txt; perl6-m fib.p6

From the outside it looks as if nothing special has
happened. As they say, looks are deceiving, because MoarVM has
faithfully produced a text file that details the way your code has
been changed. It is larger even than the first analysis, but it also
contains more information. Let’s look at the our function:

    Inserting logging for specialization of 'fibonacci' 
(cuid: cuid_2_1418763286.32937)
    
    Before:
    Spesh of 'fibonacci' (cuid: cuid_2_1418763286.32937, file: fib.p6:3)

    BB 0 (0x4f77118):
      Instructions:
        no_op 
      Successors: 1, 6, 5, 7, 9
      Predeccessors: 
      Dominance children: 1, 5, 6, 7, 9, 10

    BB 1 (0x4f77188):
      Instructions:
        checkarity liti16(1), liti16(1)
        param_rp_i r2(1), liti16(0)
        bindlex lex(idx=5,outers=0,$n), r2(1)
        paramnamesused 
        const_i64_16 r0(1), liti16(0)
        bindlex lex(idx=1,outers=0,$x), r0(1)

    ....


This little snippet is not without jargon, so I’ll try to explain. (I
did warn you at the top). The ‘cuid’ is the compilation unit
identifier, and it serves to identify the source of any
function. Sometimes compilation units correspond to files, and
sometimes they don’t (like in a REPL loop). The indented blocks denote
Basic Blocks. A Basic Block is a sequence of instructions that is
uninterrupted by a (conditional) jump. They are important because
within a basic block it is always obvious where all the values are
coming from.

Further along in our spesh log, we can see how spesh has transformed
the first block:

    ...
      BB 1 (0x4f77188):
        Instructions:
          sp_getarg_i r2(1), liti16(0)
          bindlex lex(idx=5,outers=0,$n), r2(1)
          const_i64_16 r0(1), liti16(0)
     ...


This is the same start of the function as before. What has happened is
that the polymorphic instruction param_rp_i to the much simpler
instruction sp_getarg_i. As Jonathan explained, we can get away with
this because we know exactly how this function is called, which is
just at line 15 of our little script.

While the spesh log is certainly interesting, it is no good for light
reading. Which is why I wanted to show off a really cool tool that
Timo Paulssen (timotimo) made a while ago – a way to transform a piece
of the spesh log (the ‘spesh graph’ that represents the code of a
function) into a visual form (with a little help of
graphviz). Unfortunately, I couldn’t really get it to work. This
demonstrates something important about all the tools that I’m showing
today – they’ve all been designed not for language users but VM
developers, and they may all be broken this time next year.

[I was able to run the tool and get this piece of output. – timotimo]

Let’s wrap it up. If you’re interested in what kind of assembly code
the JIT will produce for you, there is also a way to get to that. Run
your script again as follows:

export MVM_JIT_BYTECODE_DIR=. # or /tmp, or some other directory
perl6-m fib.p6

If you’ve followed these instructions directly, what you’ll now see is
your working directory littered with binary files representing different
frames (functions). Every such file will contain the raw binary data
generated by the JIT compiler. Inspecting these files is easy enough
(as long as you do know x64 assembly):

objdump -D -b binary -m i386:x86-64 -M intel `ls *.fibonacci.bin`

If you’re on a mac, that’s gobjdump rather than objdump. And if you
prefer AT&T syntax over intel syntax, just drop the ‘-M intel’ part.
Looking at the output of this bytecode, you might also see the rather
wastefull way your code is compiled. After all, I’ve written this
function specifically for simplicity. Yet I count no fewer than 215
‘mov’ instructions, 18 conditional move instructions, and 16 function
calls. As much as MoarVM and perl6 have achieved this year, there is
still a lot left to do. And with that, I wish you a hacky christmas
:-)

Day 17 – Three strange loops

Day 17 – Three strange loops

For a long time, I’ve been fascinated by strange loops.

hands drawing each other

The concept “strange loop” revels in the circular definition. Something is the way it is because it’s defined in terms of itself. I think fascination for this concept is related to fascination with the infinite. Chicken-egg conundrums. Grandfather paradoxes. Catch-22s. (“Catches-22”?) Newcomb’s paradox. Gödel’s incompleteness theorems. Something about causal feedback loops tickle our brains.

The concept has taken root not just in art or in brainteasers. When we talk of “bootstrapping compilers” or “metacircular evaluators”, we mean exactly this type of activity. To make a compiler bootstrapping means to make it able to compile itself by (re-)writing it in the target language. An evaluator is metacircular if it implements features not in terms of a helper subsystem, but in terms of themselves. And people do this not just because it’s a cool trick, but because it opens up possibilities. Arbitrary limits in programming stop us from doing what we want, and the limit between “host system” and “guest system” is arbitrary.

In other words, if you’re seriously interested in extensibility, then you’re also likely interested in strange loops. That’s why you see these ideas floated around in the Lisp communities a lot. And that’s why compilers sometimes take the bootstrapping route. (Another happy example is JavaScript, where some language extension proposals happens through their metacircular interpreter, aptly named Narcissus.)

Perl 6 has three strange loops. Grammar, OO, and program elements. It’s in various stages of closing those.

Grammar

Status: not quite closed yet

Perl 6 grammars are famously powerful and allows the programmer to specify entire languages. It stands to reason that a Perl 6 implementation would want to eat its own dogfood and use a Perl 6 grammar to specify the syntax of the language itself. And behold, Rakudo and Niecza do exactly that. (Pugs doesn’t, but among its future versions is the milestone of being self-hosting.)

Having this bootstrapping in place gives us a high confidence in our grammar engine. We know that it’s powerful enough to parse all the Perl 6 we pass it every day. This is also what allows us to write most of Rakudo in Perl 6, making it easier for others to contribute.

I say “not quite closed yet” because the access from user code into the Perl 6 grammar has some ways to go still.

Benefits: tooling, slangs, macros

Something like PPI should be trivial once we close the loop completely. The compiler already has access to all that information, and we just need to get it out into user space. Slangs can either extend the existing Perl 6 grammar, or start from scratch and provide a completely different syntax. In either case, good communication between old and new grammars is required for e.g. variables to be visible across languages. Macros need to be able to effortlessly go from the textual form of something to AST form.

OO

Status: closed

The system underlying all objects in the Perl 6 runtime is called 6model. An object was created from a class, a class was created from a “class metaobject”. After just a few more steps you end up with something called a “knowhow” — this is the end of the line, because by all appearances the knowhow created itself.

6model is our object-oriented implementation of object orientation. It’s very nice and it allows us to extend OO when we need it.

Benefits: make your own OO rules

You will find most examples of the kind of OO extension that’s posssible through jnthn’s modules. Grammar::Debugger extends what it means to call a grammar rule. Test::Mock similarly keep track of how and when methods are called. The recent modules OO::Monitors and OO::Actors both provide new class metaobjects, basically giving us new rules for OO. They also go one step further and provide definitional keywords for monitors and actors, respectively.

Program elements

Status: just discovered

Macros can generate code, but in some cases they also analyze or extend code. That’s the idea anyway. What’s been stopping us so far in realizing this in Perl 6 is that there hasn’t been a standard way to talk about Perl 6 program elements. What do I mean by “program element” anyway? Let’s answer that by example.

Say you have a class definition in your program. Maybe you have a fancy editor, with refactoring capability. The editor is certainly aware of the class definition and can traverse/manipulate it according to the rules of the language. In order for it to do that, it needs to be able to represent the class definition as an object. That object is a program element. It’s different from the class metaobject; the former talks about its relation to the program text, and the latter talks about its relation to the OO system.

<PerlJam> masak: "program elements" reads like
          "DOM for Perl" to me.
<masak> yep.

Macros are headed in the same way as such a refactoring editor. By handling program elements, you can analyze and extend user code at compile time inside macros. The API of the program elements give the user the ability to extend the Perl 6 compiler itself from library code and user code.

The work on this has only just started. My progress report so far is this: Perl 6 has a lot of node types. :) Having a rich language that doesn't take the Lisp route of almost-no-syntax also means that the API of the program elements becomes quite extensive. But it will also open up some exciting doors.

Benefits: macros, slangs

Parting words

Part of the reason why I like Perl 6 is that it has these strange loops. Little by little, year by year, Perl 6 lifts itself up by the bootstraps. There's still some work left to close some of these loops. I've been hopefully waiting for a long time to be able to do Perl 6 parsing from user code. And I'm eager to provide more of the program element loop so that we can write Perl 6 that writes Perl 6 from inside of our macros.

Mostly I'm excited to see people come up with things that none of us have thought of yet, but that are made possible, even easy, by embedding these strange loops into the language.

Have a loopy Christmas.