Day 23 – Unary Sort

Most languages or libraries that provide a generic sort routine allow you to specify a comparator, that is a callback that tells the sort routine how two given elements compare. Perl is no exception.

For example in Perl 5, which defaults to lexicographic ordering, you can request numeric sorting like this:

 use v5;
 my @sorted = sort { $a <=> $b } @values;

Perl 6 offers a similar option:

 use v6;
 my @sorted = sort { $^a <=> $^b }, @values;

The main difference is that the arguments are not passed through the global variables $a and $b, but rather as arguments to the comparator. The comparator can be anything callable, that is a named or anonymous sub or a block. The { $^a <=> $^b} syntax is not special to sort, I have just used placeholder variables to show the similarity with Perl 5. Other ways to write the same thing are:

 my @sorted = sort -> $a, $b { $a <=> $b }, @values;
 my @sorted = sort * <=> *, @values;
 my @sorted = sort &infix:«<=>», @values;

The first one is just another syntax for writing blocks, * <=> * use * to automatically curry an argument, and the final one directly refers to the routine that implements the <=> "space ship" operator (which does numeric comparison).

But Perl strives not only to make hard things possible, but also to make simple things easy. Which is why Perl 6 offers more convenience. Looking at sorting code, one can often find that the comparator duplicates code. Here are two common examples:

 # sort words by a sort order defined in a hash:
 my %rank = a => 5, b => 2, c => 10, d => 3;
 say sort { %rank{$^a} <=> %rank{$^b} }, 'a'..'d';
 #          ^^^^^^^^^^     ^^^^^^^^^^  code duplication

 # sort case-insensitively
 say sort { $^a.lc cmp $^b.lc }, @words;
 #          ^^^^^^     ^^^^^^  code duplication

Since we love convenience and hate code duplication, Perl 6 offers a shorter solution:

 # sort words by a sort order defined in a hash:
 say sort { %rank{$_} }, 'a'..'d';

 # sort case-insensitively
 say sort { .lc }, @words;

sort is smart enough to recognize that the code object code now only takes a single argument, and now uses it to map each element of the input list to new values, which it then sorts with normal cmp sort semantics. But it returns the original list in the new order, not the transformed elements. This is similar to the Schwartzian Transform, but very convenient since it's built in.

So the code block now acts as a transformer, not a comparator.

Note that in Perl 6, cmp is smart enough to compare strings with string semantics and numbers with number semantics, so producing numbers in the transformation code generally does what you want. This implies that if you want to sort numerically, you can do that by forcing the elements into numeric context:

 my @sorted-numerically = sort +*, @list;

And if you want to sort in reverse numeric order, simply use -* instead.

The unary sort is very convenient, so you might wonder why the Perl 5 folks haven't adopted it yet. The answer is that since the sort routine needs to find out whether the callback takes one or two arguments, it relies on subroutine (or block) signatures, something not (yet?) present in Perl 5. Moreover the "smart" cmp operator, which compares number numerically and strings lexicographically, requires a type system which Perl 5 doesn't have.

I strongly encourage you to try it out. But be warned: Once you get used to it, you'll miss it whenever you work in a language or with a library that lacks this feature.

Day 06 – Parsing and generating recurring dates

There are a lot of events that are scheduled on particular days of the week each month, for example the regular Windows Patch Day on the second Tuesday of each month, or in Perl 6 land that Rakudo Perl 6 compiler release, which is scheduled for two days after the Parrot release day, which again is scheduled for the third Tuesday of the month.

So let's write something that calculates those dates.

The specification format I have chosen looks like 3rd tue + 2 for the Rakudo release date, that is, two days after the 3rd Tuesday of each month (note that this isn't always the same as the 3rd Thursday).

Parsing it isn't hard with a simple grammar:

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

As you can see, everything except the day of the week is optional, so sun would simply be the first Sunday of the month, and 2 sun - 1 the Saturday before the second Sunday of the month.

Now it's time to actually turn this specification into a data structure that does something useful. And for that, a class wouldn't be a bad choice:

my %dow = (mon => 1, tue => 2, wed => 3, thu => 4,
        fri => 5, sat => 6, sun => 7);

class DateSpec {
    has $.day-of-week;
    has $.count;
    has $.offset;

    multi method new(Str $s) {
        my $m = DateSpec::Grammar.parse($s);
        die "Invalid date specification '$s'\n" unless $m;
        self.bless(
            :day-of-week(%dow{lc $m<day-of-week>}),
            :count($m<count> ?? +$m<count>[0] !! 1),
            :offset( ($m<sign> eq '-' ?? -1 !! 1)
                    * ($m<offset> ?? +$m<offset> !! 0)),
        );
    }

We only need three pieces of data from those date specification strings: the day of the week, whether the 1st, 2nd, 3rd. etc is wanted (here named $.count), and the offset. Extracting them is a wee bit fiddly, mostly because so many pieces of the grammar are optional, and because the grammar allows a space between the sign and the offset, which means we can't use the Perl 6 string-to-number conversion directly.

There is a cleaner but longer method of extracting the relevant data using an actions class.

The closing } is missing, because the class doesn't do anything useful yet, and that should be added. The most basic operation is to find the specified date in a given month. Since Perl 6 has no built-in type for months, we use a Date object where the .day is one, that is, a Date object for the first day of the month.

    method based-on(Date $d is copy where { .day == 1}) {
        ++$d until $d.day-of-week == $.day-of-week;
        $d += 7 * ($.count - 1) + $.offset;
        return $d;
    }

The algorithm is quite simple: Proceed to the next date (++$d) until the day of week matches, then advance as many weeks as needed, plus as many days as needed for the offset. Date objects support addition and subtraction of integers, and the integers are interpreted as number of days to add or subtract. Handy, and exactly what we need here. (The API is blatantly copied from the Date::Simple Perl 5 module).

Another handy convenience method to implement is next, which returns the next date matching the specification, on or after a reference date.

    method next(Date $d = Date.today) {
        my $month-start = $d.truncated-to(month);
        my $candidate   = $.based-on($month-start);
        if $candidate ge $d {
            return $candidate;
        }
        else {
            return $.based-on($month-start + $month-start.days-in-month);
        }
    }
}

Again there's no rocket science involved: try the date based on the month of $d, and if that's before $d, try again, but with the next month as base.

Time to close the class :-).

So, when is the next Rakudo release? And the next Rakudo release after Christmas?

my $spec = DateSpec.new('3rd Tue + 2');
say $spec.next;
say $spec.next(Date.new(2013, 12, 25));

Output:

2013-12-19
2014-01-23

The code works fine on Rakudo with both the Parrot and the JVM backend.

Happy recurring hollidates!

Day 01 – The State of Perl 6 in 2013

Welcome to the 2013 Perl 6 advent calendar!

In Perl 6 land, 2013 will be remembered as the year that brought proper concurrency support.

But I'm getting ahead of myself.

There is also sad news. Niecza, the Perl 6 compiler on the CLR (.NET/Mono) platform, and the Perl 6 compiler with the best runtime characteristics, had its last release in March. Since then there were a few maintenance patches and new built-in types and routines, but little in terms of actual compiler features.

A little later, Rakudo gained support to run on the Java Virtual machine. There are still some bits missing, mostly notably support for the native call interface, but all in all it works quite well, passes more than 99.9% of the tests that Rakudo on Parrot passes, and has two key advantages: it is much faster at run time, and has proper concurrency/parallelism support.

Jonathan Worthington prototyped and implemented it, and later specified it in S17, which again led to lots of improvements. Stay tuned for more advent calendar posts on the JVM and concurrency/parallelism topics.

Another big news this year was the revelation of MoarVM, a virtual machine designed to run Perl 6. With the JVM's high startup time and Parrot being mostly unmaintained and having lots of unsolved problems, there is a niche to be filled. NQP, the "Not Quite Perl" Perl 6 compiler used to bootstrap Rakudo already runs on MoarVM; Rakudo support for MoarVM is on its way, and progressing well so far.

There was also lots of progress in terms of built-in types likes Set and Bag, and IO::Path for handling path and directory objects.

As a developer and early adopter, I find Perl 6 to be pleasant to work with. In 2013 it has gotten easier to use, due to better error reporting and improved IO.

Day 12 – Exceptions

Sometimes things go horribly wrong, and the only thing you can do is not to go on. Then you throw an exception.

But of course the story doesn’t end there. The caller (or the caller’s caller) must somehow deal with the exception. To do that in a sensible manner, the caller needs to have as much information as possible.

In Perl 6, exceptions should inherit from the type Exception, and by convention they go into the X:: namespace.

So for example if you write a HTTP client library, and you decide that an exception should be thrown when the server returns a status code starting with 4 or 5, you could declare your exception class as

 class X::HTTP is Exception {
     has $.request-method;
     has $.url;
     has $.status;
     has $.error-string;

     method message() {
         "Error during $.request-method request"
         ~ " to $.url: $.status $.error-string";
     }
 }

And throw an exception as

 die X::HTTP.new(
     request-method  => 'GET',
     url             => 'http://example.com/no-such-file',
     status          => 404,
     error-string    => 'Not found',
 );

The error message then looks like this:

 Error during GET request to
    http://example.com/no-such-file: 404 Not found

(line wrapped for the benefit of small browser windows).

If the exception is not caught, the program aborts and prints the error message, as well as a backtrace.

There are two ways to catch exceptions. The simple Pokemon style “gotta catch ’em all” method catches exception of any type with try:

 my $result = try do-operation-that-might-die();
 if ($!) {
     note "There was an error: $!";
     note "But I'm going to go on anyway";
 }

Or you can selectively catch some exception types and handle only them, and rethrow all other exceptions to the caller:

 my $result =  do-operation-that-might-die();
 CATCH {
     when X::HTTP {
         note "Got an HTTP error for URL $_.url()";
         # do some proper error handling
     }
     # exceptions not of type X::HTTP are rethrown
 }

Note that the CATCH block is inside the same scope as the one where the error might occur, so that by default you have access to all the interesting varibles from that scope, which makes it easy to generate better error messages.

Inside the CATCH block, the exception is available as $_, and is matched against all when blocks.

Even if you don’t need to selectively catch your exceptions, it still makes sense to declare specific classes, because that makes it very easy to write tests that checks for proper error reporting. You can check the type and the payload of the exceptions, without having to resort to checking the exact error message (which is always brittle).

But Perl 6 being Perl, it doesn’t force you to write your own exception types. If you pass a non-Exception objects to die(), it simply wraps them in an object of type X::AdHoc (which in turn inherits from Exception), and makes the argument available with the payload method:

    sub I-am-fatal() {
        die "Neat error message";
    }
    try I-am-fatal();
    say $!;             # Neat error message;
    say $!.perl;        # X::AdHoc.new(payload => "Neat error message")

To find out more about exception handling, you can read the documentation of class Exception and Backtrace.

Day 6 – Lexical Imports

Perl 6 is built on lexical scopes. Variables, subroutines, constants and even types are looked up lexically first, and subroutines are only looked up in lexical scopes.

So it is only fitting that importing symbols from modules is also done into lexical scopes. I often write code such as

    use v6;

    # the main functionality of the script
    sub deduplicate(Str $s) {
        my %seen;
        $s.comb.grep({!%seen{ .lc }++}).join;
    }

    # normal call
    multi MAIN($phrase) {
        say deduplicate($phrase)
    }

    # if you call the script with --test, it runs its unit tests
    multi MAIN(Bool :$test!) {
        # imports &plan, &is etc. only into the lexical scope
        use Test;
        plan 2;
        is deduplicate('just some words'),
            'just omewrd', 'basic deduplication';
        is deduplicate('Abcabd'),
            'Abcd', 'case insensitivity';
    }

This script removes all but the first occurrence of each character given on the command line:

    $ perl6 deduplicate 'Duplicate character removal'
    Duplicate hrmov

But if you call it with the --test option, it runs its own unit tests:

    $ perl6 deduplicate --test
    1..2
    ok 1 - basic deduplication
    ok 2 - case insensitivity

Since the testing functions are only necessary in a part of the program — in a lexical scope, to be more precise –, the use statement is inside that scope, and limits the visibility of the imported symbols to this scope. So if you try to use the is function outside the routine in which Test is used, you get a compile-time error.

Why, you might ask? From the programmer's perspective, it reduces risk of (possibly unintended and unnoticed) name clashes the same way that lexical variables are safer than global variables.

From the point of view of language design, the combination of lexical importing, runtime-immutable lexical scopes and lexical-only lookup of subroutines allows resolving subroutine names at compile time, which again allows neat stuff like detecting calls to undeclared functions, compile-time type checking of arguments, and other nice optimizations.

But subroutines are only the tip of the iceberg. Perl 6 has a very flexible syntax, which you can modify with custom operators and macros. Those too can be exported, and imported into lexical scopes. Which means that language modifications are also lexically by default. So you can safely load any language-modifying extension, without running into danger that a library you use can't cope with it — the library doesn't even see the language modification.

So ultimately, lexical importing is another facet of encapsulation.

Perl 6 Advent Calendar 2012: Table of Contents

This post serves as a table of contents for the 2012 Perl 6 advent calendar. Links to new posts will appear here during the course of this month.

See also: table of contents for 2011, 2010, 2009.

Day 1 – State of Perl 6 in 2012

Welcome to another edition of your annual Perl 6 advent calendar.

As is tradition on the first of December, you can read a short overview over what has changed in the past year, and where we are standing now.

The list of major changes to the specification is pretty short. The IO subsystem has undergone a rewrite, and now much better reflects the realities in implementations, and actually has a measure of common sense applied. S32::Exceptions has gone through lots of changes (mostly extensions), and now there is a decent core of exception classes in Perl 6.

Both Rakudo and Niecza, the two major Perl 6 compilers, have matured a great deal. Contrary to last year, chances are pretty good that if your program works on one of the compilers, it also works on the other. Niecza also temporarily overtook Rakudo on the count of passing tests.

Niecza had a revamp of the roles implementation, has gained constant folding, awesome Unicode support in regexes, list comprehensions and a no strict; mode. To name just a few of the major changes.

Rakudo now supports heredocs, all phasers (special blocks like BEGIN, END, FIRST, …), longest-token matching in regexes, typed exceptions, much nicer backtraces and operator adverbs. And it now has a debugger, which is shipped with the Rakudo Star distribution.

The module ecosystem has grown a lot, and there is much more documentation for Perl 6 than a year ago.

So, after all these changes, where are we now?

Reports from production uses of Perl 6 are slowly starting to trickle in, and these days if your Perl 6 code has bugs, the chances are much higher that your code is to blame than the compilers. Perl 6 has never been this much fun to use. It surely has been a good and productive year for Perl 6, and we’re sure that this last month will continue the tradition. Have fun!