Author Archive

Day 23 – Unary Sort

December 23, 2013

Most languages or libraries that provide a generic sort routine allow you to specify a comparator, that is a callback that tells the sort routine how two given elements compare. Perl is no exception.

For example in Perl 5, which defaults to lexicographic ordering, you can request numeric sorting like this:

 use v5;
 my @sorted = sort { $a <=> $b } @values;

Perl 6 offers a similar option:

 use v6;
 my @sorted = sort { $^a <=> $^b }, @values;

The main difference is that the arguments are not passed through the global variables $a and $b, but rather as arguments to the comparator. The comparator can be anything callable, that is a named or anonymous sub or a block. The { $^a <=> $^b} syntax is not special to sort, I have just used placeholder variables to show the similarity with Perl 5. Other ways to write the same thing are:

 my @sorted = sort -> $a, $b { $a <=> $b }, @values;
 my @sorted = sort * <=> *, @values;
 my @sorted = sort &infix:«<=>», @values;

The first one is just another syntax for writing blocks, * <=> * uses * to automatically curry an argument, and the final one directly refers to the routine that implements the <=> "spaceship" operator (which does numeric comparison).

But Perl strives not only to make hard things possible, but also to make simple things easy. Which is why Perl 6 offers more convenience. Looking at sorting code, one can often find that the comparator duplicates code. Here are two common examples:

 # sort words by a sort order defined in a hash:
 my %rank = a => 5, b => 2, c => 10, d => 3;
 say sort { %rank{$^a} <=> %rank{$^b} }, 'a'..'d';
 #          ^^^^^^^^^^     ^^^^^^^^^^  code duplication

 # sort case-insensitively
 say sort { $^a.lc cmp $^b.lc }, @words;
 #          ^^^^^^     ^^^^^^  code duplication

Since we love convenience and hate code duplication, Perl 6 offers a shorter solution:

 # sort words by a sort order defined in a hash:
 say sort { %rank{$_} }, 'a'..'d';

 # sort case-insensitively
 say sort { .lc }, @words;

sort is smart enough to recognize that the code object now takes only a single argument, and uses it to map each element of the input list to a new value, which it then sorts with normal cmp semantics. But it returns the original elements in the new order, not the transformed ones. This is similar to the Schwartzian Transform, but very convenient since it's built in.

So the code block now acts as a transformer, not a comparator.
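To make the transformer behaviour concrete, here is a small sketch that spells out by hand what the unary form does, and compares it with the built-in. The variable names (@decorated, @ordered, @manual, @builtin) are mine, and the hand-written version only illustrates the idea; it is not necessarily how sort works internally.

```raku
use v6;

my %rank = a => 5, b => 2, c => 10, d => 3;

# By hand: pair each element with its key, sort by key, then unwrap.
my @decorated = ('a'..'d').map(-> $w { [%rank{$w}, $w] });
my @ordered   = @decorated.sort(-> $x, $y { $x[0] <=> $y[0] });
my @manual    = @ordered.map(-> $pair { $pair[1] });

# Built-in: a one-argument block is used to compute the sort key.
my @builtin = sort { %rank{$_} }, 'a'..'d';

say @manual;     # [b d a c]
say @builtin;    # [b d a c]
```

Both versions return the original words, ordered by their rank.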

Note that in Perl 6, cmp is smart enough to compare strings with string semantics and numbers with number semantics, so producing numbers in the transformation code generally does what you want. This implies that if you want to sort numerically, you can do that by forcing the elements into numeric context:

 my @sorted-numerically = sort +*, @list;

And if you want to sort in reverse numeric order, simply use -* instead.
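A short sketch of the three variants side by side, assuming a list of numeric strings that I made up for the occasion:

```raku
use v6;

my @list = '10', '9', '100', '1';

my @default    = sort @list;        # cmp on strings
my @ascending  = sort +*, @list;    # keys are the numeric values
my @descending = sort -*, @list;    # keys are the negated values

say @default;       # [1 10 100 9]
say @ascending;     # [1 9 10 100]
say @descending;    # [100 10 9 1]
```

In all three cases the result contains the original strings; only the order changes.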

The unary sort is very convenient, so you might wonder why the Perl 5 folks haven't adopted it yet. The answer is that since the sort routine needs to find out whether the callback takes one or two arguments, it relies on subroutine (or block) signatures, something not (yet?) present in Perl 5. Moreover the "smart" cmp operator, which compares numbers numerically and strings lexicographically, requires a type system which Perl 5 doesn't have.

I strongly encourage you to try it out. But be warned: Once you get used to it, you'll miss it whenever you work in a language or with a library that lacks this feature.

Day 06 – Parsing and generating recurring dates

December 6, 2013

There are a lot of events that are scheduled on particular days of the week each month, for example the regular Windows Patch Day on the second Tuesday of each month, or in Perl 6 land the Rakudo Perl 6 compiler release, which is scheduled for two days after the Parrot release day, which again is scheduled for the third Tuesday of the month.

So let's write something that calculates those dates.

The specification format I have chosen looks like 3rd tue + 2 for the Rakudo release date, that is, two days after the 3rd Tuesday of each month (note that this isn't always the same as the 3rd Thursday).

Parsing it isn't hard with a simple grammar:

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

As you can see, everything except the day of the week is optional, so sun would simply be the first Sunday of the month, and 2 sun - 1 the Saturday before the second Sunday of the month.
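Here is a quick way to try the grammar on a few inputs. The grammar is repeated from above so that the snippet runs on its own; the test strings (including the deliberately invalid 'someday') and the variable names are my own choice.

```raku
use v6;

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

my @specs = '3rd tue + 2', 'sun', '2 sun - 1', 'someday';

for @specs -> $spec {
    my $m = DateSpec::Grammar.parse($spec);
    # .parse returns a false value if the whole string does not match
    say $spec, ' => ', ($m ?? 'day of week ' ~ $m<day-of-week> !! 'rejected');
}

# Keep only the specifications the grammar accepts.
my @accepted = @specs.grep({ so DateSpec::Grammar.parse($_) });
say @accepted.elems, ' of ', @specs.elems, ' accepted';
```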

Now it's time to actually turn this specification into a data structure that does something useful. And for that, a class wouldn't be a bad choice:

my %dow = (mon => 1, tue => 2, wed => 3, thu => 4,
        fri => 5, sat => 6, sun => 7);

class DateSpec {
    has $.day-of-week;
    has $.count;
    has $.offset;

    multi method new(Str $s) {
        my $m = DateSpec::Grammar.parse($s);
        die "Invalid date specification '$s'\n" unless $m;
        self.bless(
            :day-of-week(%dow{lc $m<day-of-week>}),
            :count($m<count> ?? +$m<count>[0] !! 1),
            :offset( ($m<sign> eq '-' ?? -1 !! 1)
                    * ($m<offset> ?? +$m<offset> !! 0)),
        );
    }

We only need three pieces of data from those date specification strings: the day of the week, whether the 1st, 2nd, 3rd, etc. is wanted (here named $.count), and the offset. Extracting them is a wee bit fiddly, mostly because so many pieces of the grammar are optional, and because the grammar allows a space between the sign and the offset, which means we can't use the Perl 6 string-to-number conversion directly.

There is a cleaner but longer method of extracting the relevant data using an actions class.
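As a rough sketch of what that could look like: the class name DateSpec::Actions and the hash it returns are my own invention, not part of the code in this post, and the snippet is written against a current Rakudo (make, .made and the actions argument to parse). The grammar and %dow are repeated so that the example is self-contained.

```raku
use v6;

my %dow = (mon => 1, tue => 2, wed => 3, thu => 4,
           fri => 5, sat => 6, sun => 7);

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

# Hypothetical actions class: turns the match into a plain hash.
class DateSpec::Actions {
    method TOP($/) {
        # a missing sign counts as '+'
        my $sign = ($<sign> // '') eq '-' ?? -1 !! 1;
        make %(
            day-of-week => %dow{ lc $<day-of-week> },
            count       => $<count>  ?? +~$<count>           !! 1,
            offset      => $<offset> ?? $sign * +~$<offset>  !! 0,
        );
    }
}

my $m    = DateSpec::Grammar.parse('3rd Tue + 2', actions => DateSpec::Actions);
my %spec = $m.made;
say %spec<day-of-week count offset>;    # (2 3 2)
```

The constructor could then pass the three values on to bless, and all the fiddly extraction logic would live in one place.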

The closing } of the class is still missing, because the class doesn't do anything useful yet; some methods need to be added first. The most basic operation is to find the specified date in a given month. Since Perl 6 has no built-in type for months, we use a Date object where the .day is one, that is, a Date object for the first day of the month.

    method based-on(Date $d is copy where { .day == 1}) {
        ++$d until $d.day-of-week == $.day-of-week;
        $d += 7 * ($.count - 1) + $.offset;
        return $d;
    }

The algorithm is quite simple: Proceed to the next date (++$d) until the day of week matches, then advance as many weeks as needed, plus as many days as needed for the offset. Date objects support addition and subtraction of integers, and the integers are interpreted as number of days to add or subtract. Handy, and exactly what we need here. (The API is blatantly copied from the Date::Simple Perl 5 module).
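To see the steps in action, here is the same calculation done by hand for the specification 3rd tue + 2 and December 2013, with the values of $.day-of-week, $.count and $.offset written out as literals:

```raku
use v6;

my $d = Date.new(2013, 12, 1);     # first day of the month
++$d until $d.day-of-week == 2;    # advance to the first Tuesday
say $d;                            # 2013-12-03

$d += 7 * (3 - 1) + 2;             # two more weeks, plus an offset of 2 days
say $d;                            # 2013-12-19
```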

Another handy convenience method to implement is next, which returns the next date matching the specification, on or after a reference date.

    method next(Date $d = Date.today) {
        my $month-start = $d.truncated-to(month);
        my $candidate   = $.based-on($month-start);
        if $candidate ge $d {
            return $candidate;
        }
        else {
            return $.based-on($month-start + $month-start.days-in-month);
        }
    }
}

Again there's no rocket science involved: try the date based on the month of $d, and if that's before $d, try again, but with the next month as base.

Time to close the class :-).

So, when is the next Rakudo release? And the next Rakudo release after Christmas?

my $spec = DateSpec.new('3rd Tue + 2');
say $spec.next;
say $spec.next(Date.new(2013, 12, 25));

Output:

2013-12-19
2014-01-23

The code works fine on Rakudo with both the Parrot and the JVM backend.

Happy recurring hollidates!

Day 01 – The State of Perl 6 in 2013

December 1, 2013

Welcome to the 2013 Perl 6 advent calendar!

In Perl 6 land, 2013 will be remembered as the year that brought proper concurrency support.

But I'm getting ahead of myself.

There is also sad news. Niecza, the Perl 6 compiler on the CLR (.NET/Mono) platform, and the Perl 6 compiler with the best runtime characteristics, had its last release in March. Since then there were a few maintenance patches and new built-in types and routines, but little in terms of actual compiler features.

A little later, Rakudo gained support to run on the Java Virtual Machine. There are still some bits missing, most notably support for the native call interface, but all in all it works quite well, passes more than 99.9% of the tests that Rakudo on Parrot passes, and has two key advantages: it is much faster at run time, and has proper concurrency/parallelism support.

Jonathan Worthington prototyped and implemented it, and later specified it in S17, which again led to lots of improvements. Stay tuned for more advent calendar posts on the JVM and concurrency/parallelism topics.

Another big piece of news this year was the revelation of MoarVM, a virtual machine designed to run Perl 6. With the JVM's high startup time and Parrot being mostly unmaintained and having lots of unsolved problems, there is a niche to be filled. NQP, the "Not Quite Perl" Perl 6 compiler used to bootstrap Rakudo, already runs on MoarVM; Rakudo support for MoarVM is on its way, and progressing well so far.

There was also lots of progress in terms of built-in types like Set and Bag, and IO::Path for handling path and directory objects.

As a developer and early adopter, I find Perl 6 to be pleasant to work with. In 2013 it has gotten easier to use, due to better error reporting and improved IO.

Day 12 – Exceptions

December 12, 2012

Sometimes things go horribly wrong, and the only thing you can do is not to go on. Then you throw an exception.

But of course the story doesn’t end there. The caller (or the caller’s caller) must somehow deal with the exception. To do that in a sensible manner, the caller needs to have as much information as possible.

In Perl 6, exceptions should inherit from the type Exception, and by convention they go into the X:: namespace.

So for example if you write an HTTP client library, and you decide that an exception should be thrown when the server returns a status code starting with 4 or 5, you could declare your exception class as

 class X::HTTP is Exception {
     has $.request-method;
     has $.url;
     has $.status;
     has $.error-string;

     method message() {
         "Error during $.request-method request"
         ~ " to $.url: $.status $.error-string";
     }
 }

And throw an exception as

 die X::HTTP.new(
     request-method  => 'GET',
     url             => 'http://example.com/no-such-file',
     status          => 404,
     error-string    => 'Not found',
 );

The error message then looks like this:

 Error during GET request to
    http://example.com/no-such-file: 404 Not found

(line wrapped for the benefit of small browser windows).

If the exception is not caught, the program aborts and prints the error message, as well as a backtrace.

There are two ways to catch exceptions. The simple Pokemon style “gotta catch ’em all” method catches exceptions of any type with try:

 my $result = try do-operation-that-might-die();
 if ($!) {
     note "There was an error: $!";
     note "But I'm going to go on anyway";
 }

Or you can selectively catch some exception types and handle only them, and rethrow all other exceptions to the caller:

 my $result =  do-operation-that-might-die();
 CATCH {
     when X::HTTP {
         note "Got an HTTP error for URL $_.url()";
         # do some proper error handling
     }
     # exceptions not of type X::HTTP are rethrown
 }

Note that the CATCH block is inside the same scope as the one where the error might occur, so that by default you have access to all the interesting variables from that scope, which makes it easy to generate better error messages.

Inside the CATCH block, the exception is available as $_, and is matched against all when blocks.

Even if you don’t need to selectively catch your exceptions, it still makes sense to declare specific classes, because that makes it very easy to write tests that check for proper error reporting. You can check the type and the payload of the exceptions, without having to resort to checking the exact error message (which is always brittle).
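A minimal sketch of such a check, without any test framework. The routine fetch-missing-page is a made-up stand-in for real library code, and the X::HTTP class is repeated (with the message written using explicit closures) so that the snippet runs on its own:

```raku
use v6;

class X::HTTP is Exception {
    has $.request-method;
    has $.url;
    has $.status;
    has $.error-string;

    method message() {
        "Error during {$.request-method} request"
        ~ " to {$.url}: {$.status} {$.error-string}";
    }
}

# Hypothetical routine that always fails with a typed exception.
sub fetch-missing-page() {
    die X::HTTP.new(
        request-method => 'GET',
        url            => 'http://example.com/no-such-file',
        status         => 404,
        error-string   => 'Not found',
    );
}

try fetch-missing-page();
my $error = $!;

# Check type and payload, not the wording of the message.
say $error ~~ X::HTTP;    # True
say $error.status;        # 404
```

If the wording of the message changes later, these checks keep passing.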

But Perl 6 being Perl, it doesn’t force you to write your own exception types. If you pass a non-Exception object to die(), it simply wraps it in an object of type X::AdHoc (which in turn inherits from Exception), and makes the argument available with the payload method:

    sub I-am-fatal() {
        die "Neat error message";
    }
    try I-am-fatal();
    say $!;             # Neat error message;
    say $!.perl;        # X::AdHoc.new(payload => "Neat error message")

To find out more about exception handling, you can read the documentation of class Exception and Backtrace.

Day 6 – Lexical Imports

December 6, 2012

Perl 6 is built on lexical scopes. Variables, subroutines, constants and even types are looked up lexically first, and subroutines are only looked up in lexical scopes.

So it is only fitting that importing symbols from modules is also done into lexical scopes. I often write code such as

    use v6;

    # the main functionality of the script
    sub deduplicate(Str $s) {
        my %seen;
        $s.comb.grep({!%seen{ .lc }++}).join;
    }

    # normal call
    multi MAIN($phrase) {
        say deduplicate($phrase)
    }

    # if you call the script with --test, it runs its unit tests
    multi MAIN(Bool :$test!) {
        # imports &plan, &is etc. only into the lexical scope
        use Test;
        plan 2;
        is deduplicate('just some words'),
            'just omewrd', 'basic deduplication';
        is deduplicate('Abcabd'),
            'Abcd', 'case insensitivity';
    }

This script removes all but the first occurrence of each character given on the command line:

    $ perl6 deduplicate 'Duplicate character removal'
    Duplicate hrmov

But if you call it with the --test option, it runs its own unit tests:

    $ perl6 deduplicate --test
    1..2
    ok 1 - basic deduplication
    ok 2 - case insensitivity

Since the testing functions are only necessary in a part of the program (in a lexical scope, to be more precise), the use statement is inside that scope, and limits the visibility of the imported symbols to this scope. So if you try to use the is function outside the routine in which Test is used, you get a compile-time error.

Why, you might ask? From the programmer's perspective, it reduces risk of (possibly unintended and unnoticed) name clashes the same way that lexical variables are safer than global variables.

From the point of view of language design, the combination of lexical importing, runtime-immutable lexical scopes and lexical-only lookup of subroutines allows resolving subroutine names at compile time, which again allows neat stuff like detecting calls to undeclared functions, compile-time type checking of arguments, and other nice optimizations.

But subroutines are only the tip of the iceberg. Perl 6 has a very flexible syntax, which you can modify with custom operators and macros. Those too can be exported, and imported into lexical scopes, which means that language modifications are also lexical by default. So you can safely load any language-modifying extension without running the risk that a library you use can't cope with it: the library doesn't even see the language modification.
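A tiny illustration of a lexically scoped language modification, using an operator defined in place rather than imported from a module (the factorial operator and the variable name are just my example):

```raku
use v6;

my $inside;
{
    # A user-defined postfix operator. Like any other sub it is
    # lexically scoped, so it only exists inside this block.
    sub postfix:<!>(Int $n) { [*] 1..$n }
    $inside = 5!;
}
say $inside;    # 120

# Outside the block the operator is unknown, so writing 5! here
# would be a compile-time error.
```

An operator imported with use inside a block is confined to that block in the same way.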

So ultimately, lexical importing is another facet of encapsulation.

Perl 6 Advent Calendar 2012: Table of Contents

December 1, 2012

This post serves as a table of contents for the 2012 Perl 6 advent calendar. Links to new posts will appear here during the course of this month.

See also: table of contents for 2011, 2010, 2009.

Day 1 – State of Perl 6 in 2012

December 1, 2012

Welcome to another edition of your annual Perl 6 advent calendar.

As is tradition on the first of December, you can read a short overview of what has changed in the past year, and where we are standing now.

The list of major changes to the specification is pretty short. The IO subsystem has undergone a rewrite, and now much better reflects the realities in implementations, and actually has a measure of common sense applied. S32::Exceptions has gone through lots of changes (mostly extensions), and now there is a decent core of exception classes in Perl 6.

Both Rakudo and Niecza, the two major Perl 6 compilers, have matured a great deal. Contrary to last year, chances are pretty good that if your program works on one of the compilers, it also works on the other. Niecza also temporarily overtook Rakudo on the count of passing tests.

Niecza had a revamp of the roles implementation, has gained constant folding, awesome Unicode support in regexes, list comprehensions and a no strict; mode. To name just a few of the major changes.

Rakudo now supports heredocs, all phasers (special blocks like BEGIN, END, FIRST, …), longest-token matching in regexes, typed exceptions, much nicer backtraces and operator adverbs. And it now has a debugger, which is shipped with the Rakudo Star distribution.

The module ecosystem has grown a lot, and there is much more documentation for Perl 6 than a year ago.

So, after all these changes, where are we now?

Reports from production uses of Perl 6 are slowly starting to trickle in, and these days if your Perl 6 code has bugs, the chances are much higher that your code is to blame than the compilers. Perl 6 has never been this much fun to use. It surely has been a good and productive year for Perl 6, and we’re sure that this last month will continue the tradition. Have fun!

Where Have All The References Gone?

December 16, 2011

Perl 5 programmers that start to learn Perl 6 often ask me how to take a reference to something, and my answers usually aren’t really helpful. In Perl 6, everything that can be held in a variable is an object, and objects are passed by reference everywhere (though you don’t always notice that, because objects like strings and numbers are immutable, so there’s no difference to passing by value). So, everything is already treated as a reference in some sense, and there’s no point in explicitly taking references.

But people aren’t happy with that answer, because it doesn’t explain how to get stuff done that involved references in Perl 5. So here are a few typical use cases of references, and how Perl 6 handles them.

Creating Objects

In Perl 5, an object is really just a reference to a blessed value (but people usually say "blessed reference", because you virtually never use the blessed value without going through a reference).

So, in Perl 5 you’d write

 package My::Class;
 # constructor
 sub new { bless {}, shift };
 # an accessor
 sub foo {
     my $self = shift; 
     # the ->{} dereferences $self as a hash
     $self->{foo} // 5;
 }
 # use the object:
 say My::Class->new->foo;

In Perl 6, you just don’t think about references; classes are much more declarative, and there’s no need for dereferencing anything anywhere:

 class My::Class {
     # attribute with accessor (indicated by the dot)
     # and default value
     has $.foo = 5;
 }
 # use it:
 say My::Class.new.foo

If you don’t like the default constructor, you can still use bless explicitly, but even then you don’t have to think about references:

 method new() {
     # the * specifies the storage, and means "default storage"
     self.bless(*);
 }

So, no explicit reference handling when dealing with OO. Great.

Nested Data Structures

In both Perl 5 and Perl 6, lists flatten automatically by default. So if you write


 my @a = (1, 2);
 my @b = (3, 4);
 push @a, @b

then @a ends up with the four elements 1, 2, 3, 4, not with three elements of which the third is an array.

In Perl 5, nesting the data structure happens by taking a reference to @b:

 push @a, \@b;

In Perl 6, item context replaces this use of references. It is best illustrated by a rather clumsy method to achieve the same thing:

 my $temp = @b;
 push @a, $temp;  # does not flatten the two items in $temp,
                 # because $temp is a scalar

Of course there are shortcuts; the following lines work too:

 push @a, $(@b);
 push @a, item @b;

(As a side note, push @a, $@b is currently not allowed, it tries to catch a p5ism; I will also try to persuade Larry and the other language designers to allow it, and have it mean the same thing as the other two).

On the flip side you need explicit dereferencing to get the values out of item context:

 my @a = 1, 2;
 my $scalar = @a;
 for @a { 
     # two iterations
 }
 for $scalar {
     # one iteration only
 }
 for @$scalar {
     # two iterations again
 }

This explicit use of scalar and list context is the closest analogy to Perl 5 references, because it requires explicit context annotations in the same places where referencing and dereferencing is used in Perl 5.

But it’s not really the same, because there are cases where Perl 5 needs references, but Perl 6 can deduce the item context all on its own:

 @a[3] = @b; # automatically puts @b in item context

Mutating Arguments

Another use of references in Perl 5 is for passing data to routines that should be modified inside the routine:

 sub set_five {
     my $x = shift;
     # explicit dereferencing with another $:
     $$x = 5;
 }
 my $var;
 # explicit taking of a reference
 set_five \$var;

In Perl 6, there is a separate mechanism for this use case:

 sub set_five($x is rw) {
     # no dereferencing
     $x = 5;
 }
 my $var;
 # no explicit reference taking
 set_five $var;

So again a use case of Perl 5 references is realized by another mechanism in Perl 6 (signature binding, or binding in general).

Summary

Nearly everything is a reference in Perl 6, but you still don’t see them, unless you take a very close look. The control of list flattening with item and list context is the one area where Perl 5’s referencing and dereferencing shines through the most.

Day 15 – Something Exceptional

December 15, 2011

The Perl 6 exception system is currently in development; here is a small example demonstrating a part of the current state:

 use v6;
 
 sub might_die(Real $x) {
     die "negative" if $x < 0;
     $x.sqrt;
 }
 
 for 5, 0, -3, 1+2i -> $n {
     say "The square root of $n is ", might_die($n);
 
     CATCH {
         # CATCH sets $_ to the error object,
         # and then checks the various cases:
         when 'negative' {
             # note that $n is still in scope,
             # since the CATCH block is *inside* the
             # to-be-handled block
             say "Cannot take square root of $n: negative"
         }
         default {
             say "Other error: $_";
         }
     }
 }

This produces the following output under rakudo:

 The square root of 5 is 2.23606797749979
 The square root of 0 is 0
 Cannot take square root of -3: negative
 Other error: Nominal type check failed for parameter '$x'; expected Real but got Complex instead

A few interesting points: the presence of a CATCH block automatically makes the surrounding block catch exceptions. Inside the CATCH block, all lexical variables from the outside are normally accessible, so all the interesting information is available for error processing.

Inside the CATCH block, the error object is available in the $_ variable, on the outside it is available in $!. If an exception is thrown inside a CATCH block, it is not caught — unless there is a second, inner CATCH that handles it.

The insides of a CATCH block typically consist of when clauses, and sometimes a default clause. If any of those matches the error object, the error is considered to be handled. If no clause matches (and no default block is present), the exception is rethrown.
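The rethrowing behaviour can be observed with two nested routines. This is only a sketch: X::Demo, risky, inner and outer are names made up for the example, and it is written against a current Rakudo.

```raku
use v6;

# Hypothetical exception class for illustration.
class X::Demo is Exception {
    method message() { 'demo failure' }
}

my @log;

sub risky(Bool $typed) {
    die X::Demo.new if $typed;
    die 'untyped failure';
}

sub inner(Bool $typed) {
    risky($typed);
    CATCH {
        # Only X::Demo is handled here; anything else is rethrown.
        when X::Demo { @log.push: "inner: " ~ .message }
    }
}

sub outer(Bool $typed) {
    inner($typed);
    CATCH {
        # Catches whatever inner did not handle.
        default { @log.push: "outer: " ~ .message }
    }
}

outer(True);
outer(False);
.say for @log;
# inner: demo failure
# outer: untyped failure
```

The first error stops at the inner CATCH; the second one has no matching when clause there, so it travels on to the caller.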

Comparing the output from rakudo to the one that niecza produces for the same code, one can see that the last line differs:

 Other error: Nominal type check failed in binding Real $x in might_die; got Complex, needed Real

This highlights a problem in the current state: The wording of error messages is not yet specified, and thus differs among implementations.

I am working on rectifying that situation, and also throwing interesting types of error objects. In the past week, I have managed to start throwing specific error objects from within the Rakudo compiler. Here is an example:

 $ ./perl6 -e 'try eval q[ class A { $!x } ]; say "error: $!"; say $!.perl'
 error: Attribute $!x not declared in class A
 X::Attribute::Undeclared.new(
         name => "\$!x",
         package-type => "class",
         package-name => "A", filename => "",
         line => 1,
         column => Any,
         message => "Attribute \$!x not declared in class A"
 )
 # output reformatted for clarity

The string that is passed to eval is not a valid Perl 6 program, because it accesses an attribute that wasn’t declared in class A. The exception thrown is of type X::Attribute::Undeclared, and it contains several details: the name of the attribute, the type of package it was missing in (could be class, module, grammar and maybe others), the name of the package, the actual error message, and information about the source of the error: line, file name (empty because eval() operates on a string, not on a file), and column, though column isn’t set to a useful value yet.

X::Attribute::Undeclared inherits from type X::Comp, which is the common superclass for all compile time errors. Once all compile time errors in Rakudo are switched to X::Comp objects, one will be able to check if errors were produced at run time or at compile time with code like

 eval $some-string;
 CATCH {
     when X::Comp { say 'compile time' }
     default      { say 'run time'     }
 }

The when block smart-matches the error object against the X::Comp type object, which succeeds whenever the error object conforms to that type (so, is of that type or a subclass of X::Comp).

Writing and using new error classes is quite easy:

 class X::PermissionDenied is X::Base {
     has $.reason;
     method message() { "Permission denied: $.reason" };
 }
 # and using it somewhere:
 die X::PermissionDenied.new( reason => "I don't like your nose");

So Perl 6 has a rather flexible error handling mechanism, and libraries and applications can choose to throw error objects with rich information. The plan is to have the Perl 6 compilers throw such easily introspectable error objects too, and at the same time unify their error messages.

Many thanks go to Ian Hague and The Perl Foundation for funding my work on exceptions.

Lexicality and Optimizability

December 8, 2011

Traditional optimizations in compilers rely on compile-time knowledge about the program. Usually statically typed languages like Java and C are rather good at that, and dynamic languages like Perl 5, Ruby and Python are not.

Perl 6 offers the flexibility of dynamic languages, but tries to provide much optimizability nonetheless by gradual typing, that is offering optional static type annotations.

But even in the presence of type annotations, another piece is needed for compile time dispatch decision and inlining: the knowledge about the available routines (and in the case of multi subs, the available candidates).

To provide that knowledge, Perl 6 installs subroutines in lexical scopes (and not packages / symbol tables, as in Perl 5), and lexical scopes are immutable at run time. (Variables inside the lexical scopes are still mutable, you just cannot add or remove entries at run time).

To provide the necessary flexibility, Perl 6 allows code to run at compile time. A typical way to run code at compile time is with the use directive:

 {
    use Test;  # imports routines into the current
               # lexical scope, at compile time
    plan 1;
    ok 1, 'success';
 }
 # plan() and ok() are not available here,
 # outside the scope into which the routines have been imported.

The upside is that a sufficiently smart compiler can complain before runtime about missing routines and dispatches that are bound to fail. Current Rakudo does that, though there are certainly cases that Rakudo does not detect yet, but which are possible to detect.

 sub f(Int $x) {
     say $x * 2;
 }
 say "got here";
 f('some string');

produces this output with current Rakudo:

 ===SORRY!===
 CHECK FAILED:
 Calling 'f' will never work with argument types (str) (line 5)
     Expected: :(Int $x)

Since built-in routines are provided in an outer scope to the user’s program, all built-in routines are automatically subjected to all the same rules and optimizations as user-provided routines.

Note that this has other implications: require, which loads modules at run time, now needs a list of symbols to stub in at compile time, which are later wired up to the symbols loaded from the module.

The days are past where "a sufficiently smart compiler" was a legend; these days we have a compiler that can provide measurable speed-ups. There is still room for improvements, but we are now seeing the benefits from static knowledge and lexical scoping.

