Author Archive

Where Have All The References Gone?

December 16, 2011

Perl 5 programmers who start to learn Perl 6 often ask me how to take a reference to something, and my answers usually aren’t really helpful. In Perl 6, everything that can be held in a variable is an object, and objects are passed by reference everywhere (though you don’t always notice that, because objects like strings and numbers are immutable, so there’s no difference from passing by value). So, everything is already treated as a reference in some sense, and there’s no point in explicitly taking references.

But people aren’t happy with that answer, because it doesn’t explain how to get stuff done that involved references in Perl 5. So here are a few typical use cases of references, and how Perl 6 handles them.

Creating Objects

In Perl 5, an object is really just a reference to a blessed value (but people usually say "blessed reference", because you virtually never use the blessed value without going through a reference).

So, in Perl 5 you’d write

 use 5.010;   # enables say and the // operator
 package My::Class;
 # constructor
 sub new { bless {}, shift };
 # an accessor
 sub foo {
     my $self = shift; 
     # the ->{} dereferences $self as a hash
     $self->{foo} // 5;
 }
 # use the object:
 say My::Class->new->foo;

In Perl 6, you just don’t think about references; classes are much more declarative, and there’s no need for dereferencing anything anywhere:

 class My::Class {
     # attribute with accessor (indicated by the dot)
     # and default value
     has $.foo = 5;
 }
 # use it:
 say My::Class.new.foo

If you don’t like the default constructor, you can still use bless explicitly, but even then you don’t have to think about references:

 method new() {
     # the * specifies the storage, and means "default storage"
     self.bless(*);
 }

So, no explicit reference handling when dealing with OO. Great.

Nested Data Structures

In both Perl 5 and Perl 6, lists flatten automatically by default. So if you write


 my @a = (1, 2);
 my @b = (3, 4);
 push @a, @b

then @a ends up with the four elements 1, 2, 3, 4, not with three elements of which the third is an array.

In Perl 5, nesting the data structure happens by taking a reference to @b:

 push @a, \@b;

In Perl 6, item context replaces this use of references. It is best illustrated by a rather clumsy method to achieve the same thing:

 my $temp = @b;
 push @a, $temp;  # does not flatten the two items in $temp,
                 # because $temp is a scalar

Of course there are shortcuts; the following lines work too:

 push @a, $(@b);
 push @a, item @b;

(As a side note, push @a, $@b is currently not allowed, because the compiler tries to catch a p5ism there. I will try to persuade Larry and the other language designers to allow it, and have it mean the same thing as the other two.)

On the flip side you need explicit dereferencing to get the values out of item context:

 my @a = 1, 2;
 my $scalar = @a;
 for @a { 
     # two iterations
 }
 for $scalar {
     # one iteration only
 }
 for @$scalar {
     # two iterations again
 }

This explicit use of scalar and list context is the closest analogy to Perl 5 references, because it requires explicit context annotations in the same places where referencing and dereferencing is used in Perl 5.

But it’s not really the same, because there are cases where Perl 5 needs references, but Perl 6 can deduce the item context all on its own:

 @a[3] = @b; # automatically puts @b in item context
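As a minimal sketch of item versus list context, assuming a current Rakudo release, the following counts loop iterations and checks the shape of a nested element. The variable names are illustrative only and do not come from the examples above.

```raku
my @b = 3, 4;
my @a = 1, 2, 0;

# Assigning an array to a single element stores it as one item
@a[3] = @b;
say @a.elems;        # 4
say @a[3].elems;     # 2

# A scalar holding an array is treated as a single item by for
my $scalar = @b;
my $item-iterations = 0;
$item-iterations++ for $scalar;

# @$scalar asks for the individual elements again
my $list-iterations = 0;
$list-iterations++ for @$scalar;

say $item-iterations;   # 1
say $list-iterations;   # 2
```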

Mutating Arguments

Another use of references in Perl 5 is for passing data to routines that should be modified inside the routine:

 sub set_five {
     my $x = shift;
     # explicit dereferencing with another $:
     $$x = 5;
 }
 my $var;
 # explicit taking of a reference
 set_five \$var;

In Perl 6, there is a separate mechanism for this use case:

 sub set_five($x is rw) {
     # no dereferencing
     $x = 5;
 }
 my $var;
 # no explicit reference taking
 set_five $var;

So again a use case of Perl 5 references is realized by another mechanism in Perl 6 (signature binding, or binding in general).
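To make the contrast concrete, here is a small sketch that puts is rw next to is copy, assuming a current Rakudo release. The routine names swap-values and bump-copy are made up for this illustration.

```raku
# $a and $b are bound to the caller's containers, not to copies
sub swap-values($a is rw, $b is rw) {
    ($a, $b) = ($b, $a);
}

# 'is copy' gives the routine its own private, modifiable copy
sub bump-copy($n is copy) {
    $n++;
    $n;
}

my $x = 1;
my $y = 2;
swap-values($x, $y);
say "$x $y";          # 2 1

my $z = 5;
say bump-copy($z);    # 6
say $z;               # 5, the caller's variable is untouched
```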

Summary

Nearly everything is a reference in Perl 6, but you still don’t see them, unless you take a very close look. The control of list flattening with item and list context is the one area where Perl 5’s referencing and dereferencing shines through the most.

Day 15 – Something Exceptional

December 15, 2011

The Perl 6 exception system is currently in development; here is a small example demonstrating a part of the current state:

 use v6;
 
 sub might_die(Real $x) {
     die "negative" if $x < 0;
     $x.sqrt;
 }
 
 for 5, 0, -3, 1+2i -> $n {
     say "The square root of $n is ", might_die($n);
 
     CATCH {
         # CATCH sets $_ to the error object,
         # and then checks the various cases:
         when 'negative' {
             # note that $n is still in scope,
             # since the CATCH block is *inside* the
             # to-be-handled block
             say "Cannot take square root of $n: negative"
         }
         default {
             say "Other error: $_";
         }
     }
 }

This produces the following output under rakudo:

 The square root of 5 is 2.23606797749979
 The square root of 0 is 0
 Cannot take square root of -3: negative
 Other error: Nominal type check failed for parameter '$x'; expected Real but got Complex instead

A few interesting points: the presence of a CATCH block automatically makes the surrounding block catch exceptions. Inside the CATCH block, all lexical variables from the outside are normally accessible, so all the interesting information is available for error processing.

Inside the CATCH block, the error object is available in the $_ variable, on the outside it is available in $!. If an exception is thrown inside a CATCH block, it is not caught — unless there is a second, inner CATCH that handles it.

The insides of a CATCH block typically consist of when clauses, and sometimes a default clause. If any of those matches the error object, the error is considered to be handled. If no clause matches (and no default block is present), the exception is rethrown.

Comparing the output from rakudo to the one that niecza produces for the same code, one can see that the last line differs:

 Other error: Nominal type check failed in binding Real $x in might_die; got Complex, needed Real

This highlights a problem in the current state: the wording of error messages is not yet specified, and thus differs among implementations.

I am working on rectifying that situation, and also throwing interesting types of error objects. In the past week, I have managed to start throwing specific error objects from within the Rakudo compiler. Here is an example:

 $ ./perl6 -e 'try EVAL q[ class A; method { $!x } ]; say "error: $!"; say $!.perl'
 error: Attribute $!x not declared in class A
 X::Attribute::Undeclared.new(
         name => "\$!x",
         package-type => "class",
         package-name => "A", filename => "",
         line => 1,
         column => Any,
         message => "Attribute \$!x not declared in class A"
 )
 # output reformatted for clarity

The string that is passed to EVAL is not a valid Perl 6 program, because it accesses an attribute that wasn’t declared in class A. The exception thrown is of type X::Attribute::Undeclared, and it contains several details: the name of the attribute, the type of package it was missing in (could be class, module, grammar and maybe others), the name of the package, the actual error message, and information about the source of the error: line, file name (empty because EVAL() operates on a string, not on a file) and column (though column isn’t set to a useful value yet).

X::Attribute::Undeclared inherits from type X::Comp, which is the common superclass for all compile time errors. Once all compile time errors in Rakudo are switched to X::Comp objects, one will be able to check whether errors were produced at run time or at compile time with code like

 EVAL $some-string;
 CATCH {
     when X::Comp { say 'compile time' }
     default      { say 'run time'     }
 }

The when block smart-matches the error object against the X::Comp type object, which succeeds whenever the error object conforms to that type (so, is of that type or of a subclass of X::Comp).

Writing and using new error classes is quite easy:

 class X::PermissionDenied is X::Base {
     has $.reason;
     method message() { "Permission denied: $.reason" };
 }
 # and using it somewhere:
 die X::PermissionDenied.new( reason => "I don't like your nose");

So Perl 6 has a rather flexible error handling mechanism, and libraries and applications can choose to throw error objects with rich information. The plan is to have the Perl 6 compilers throw such easily introspectable error objects too, and at the same time unify their error messages.
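The pieces above (a custom error class, typed when clauses, a default clause and $!) can be combined in one small sketch. It assumes a current Rakudo release, where the common base class is called Exception rather than X::Base. The class X::TooBig and the routine describe are hypothetical names invented for this example.

```raku
# Hypothetical error class; current Rakudo uses Exception as the base class
class X::TooBig is Exception {
    has $.value;
    method message() { "value $.value is too big" }
}

sub describe($n) {
    my $outcome;
    {
        die X::TooBig.new(value => $n) if $n > 10;
        die "negative" if $n < 0;
        $outcome = "ok: $n";

        # Handles exceptions thrown anywhere in the enclosing block
        CATCH {
            when X::TooBig { $outcome = "too big: " ~ .value }
            default        { $outcome = "other error: " ~ .message }
        }
    }
    $outcome;
}

say describe(3);     # ok: 3
say describe(42);    # too big: 42
say describe(-1);    # other error: negative

# Outside of a CATCH block, the error object is available in $!
try die X::TooBig.new(value => 99);
my $message = $!.message;
say $message;        # value 99 is too big
```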

Many thanks go to Ian Hague and The Perl Foundation for funding my work on exceptions.

Lexicality and Optimizability

December 8, 2011

Traditional optimizations in compilers rely on compile-time knowledge about the program. Usually statically typed languages like Java and C are rather good at that, and dynamic languages like Perl 5, Ruby and Python are not.

Perl 6 offers the flexibility of dynamic languages, but tries to provide much optimizability nonetheless by gradual typing, that is offering optional static type annotations.

But even in the presence of type annotations, another piece is needed for compile time dispatch decision and inlining: the knowledge about the available routines (and in the case of multi subs, the available candidates).

To provide that knowledge, Perl 6 installs subroutines in lexical scopes (and not in packages / symbol tables, as in Perl 5), and lexical scopes are immutable at run time. (Variables inside the lexical scopes are still mutable; you just cannot add or remove entries at run time.)

To provide the necessary flexibility, Perl 6 allows code to run at compile time. A typical way to run code at compile time is with the use directive:

 {
    use Test;  # imports routines into the current
               # lexical scope, at compile time
    plan 1;
    ok 1, 'success';
 }
 # plan() and ok() are not available here,
 # outside the scope into which the routines have been imported.
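The same lexical scoping applies to ordinary subroutine declarations, not only to imported ones. The following sketch assumes a current Rakudo release; the routine name greeting is made up for the illustration. It shows that each call is resolved against the scope in which it is written.

```raku
sub greeting() { "outer" }

my @seen;
{
    # This declaration only exists inside the enclosing block
    sub greeting() { "inner" }
    @seen.push: greeting();
}
# Back outside, the name refers to the outer routine again
@seen.push: greeting();

say @seen;   # [inner outer]
```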

The upside is that a sufficiently smart compiler can complain before run time about missing routines and dispatches that are bound to fail. Current Rakudo does that, though there are certainly cases that Rakudo does not detect yet, but which are possible to detect.

 sub f(Int $x) {
     say $x * 2;
 }
 say "got here";
 f('some string');

produces this output with current Rakudo:

 ===SORRY!===
 CHECK FAILED:
 Calling 'f' will never work with argument types (str) (line 5)
     Expected: :(Int $x)

Since built-in routines are provided in an outer scope to the user’s program, all built-in routines are automatically subjected to all the same rules and optimizations as user-provided routines.

Note that this has other implications: require, which loads modules at run time, now needs a list of symbols to stub in at compile time, which are later wired up to the symbols loaded from the module.

The days are past where "a sufficiently smart compiler" was a legend; these days we have a compiler that can provide measurable speed-ups. There is still room for improvements, but we are now seeing the benefits from static knowledge and lexical scoping.

Traits — Meta Data With Character

December 4, 2011

Traits are a nice, extensible way to attach meta data to all sorts of objects in Perl 6.

An example is the is cached trait that automatically caches the function’s return value, based on the argument(s) passed to it.

Here is a simple implementation of that trait:

 # this gets called when 'is cached' is added
 # to a routine
 multi sub trait_mod:<is>(Routine $r, :$cached!) {
     my %cache;
     # wrap the routine in a block that ...
     $r.wrap(-> $arg {
         # looks up the argument in the cache
         %cache{$arg}:exists
             ?? %cache{$arg}
             # ... and calls the original, if it
             # is not found in the cache
             !! (%cache{$arg} = callwith($arg))
         }
     );
 }
 
 # example application:
 sub fib($x) is cached {
     say("fib($x)");
     $x <= 1 ?? 1 !! fib($x - 1) + fib($x - 2);
 }
 # only one call for each value from 0 to 10
 say fib(10);

A trait is applied with a verb, here is. That verb appears in the routine name that handles the trait, here trait_mod:<is>. The arguments to that handler are the object on which the trait is applied, and the name of the trait (here cached) as a named argument.

Note that a production grade is cached would need to handle multiple arguments, and maybe things like limiting the cache size.
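One possible way to handle multiple arguments is sketched below. It is not a production grade cache, only an illustration under a few assumptions: a current Rakudo release, a made-up trait name memoized, and a cache key built from the code representation of the argument capture, which only works when that representation identifies the arguments uniquely. There is no limit on the cache size.

```raku
# Hypothetical trait name, chosen for this sketch
multi sub trait_mod:<is>(Routine $r, :$memoized!) {
    my %cache;
    $r.wrap(-> |args {
        # The capture's code representation serves as the cache key
        my $key = args.raku;
        %cache{$key}:exists
            ?? %cache{$key}
            !! (%cache{$key} = callsame)
    });
}

# Counts how often the original routine body really runs
my $calls = 0;
sub slow-add($a, $b) is memoized {
    $calls++;
    $a + $b;
}

say slow-add(1, 2);   # 3
say slow-add(1, 2);   # 3, served from the cache
say slow-add(2, 2);   # 4
say $calls;           # 2
```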

In this example, the .wrap method is called on the routine, but of course you can do whatever you want. Common applications are mixing roles into the routine or adding them to a dispatch table.

Traits can not only be applied to routines, but also to parameters, attributes and variables. For example writable accessors are realized with the is rw trait:

 class Book {
     has @.pages is rw;
     ...
 }

Traits are also used to attach documentation to classes and attributes (stay tuned for an advent calendar post on Pod6), to mark routine parameters as writable, and to declare class inheritance and role application.

This flexibility makes them ideal for writing libraries that make the user code look like a domain-specific language, and supplying meta data in a safe way.

Buffers and Binary IO

December 3, 2011

Perl 5 is known to have very good Unicode support (starting from version 5.8, the later the better), but people still complain that it is hard to use. The most important reason for that is that the programmer needs to keep track of which strings have been decoded, and which are meant to be treated as binary strings. And there is no way to reliably introspect variables to find out if they are binary or text strings.

In Perl 6, this problem has been addressed by introducing separate types. Str holds text strings. String literals in Perl 6 are of type Str. Binary data is stored in Buf objects. There is no way to confuse the two. Converting back and forth is done with the encode and decode methods.

    my $buf = Buf.new(0x6d, 0xc3, 0xb8, 0xc3, 0xbe, 0x0a);
    $*OUT.write($buf);

    my $str = $buf.decode('UTF-8');
    print $str;

Both of those output operations have the same effect, and print møþ to the standard output stream, followed by a newline. Buf.new(...) takes a list of integers between 0 and 255, which are the byte values from which the new byte buffer is constructed. $*OUT.write($buf) writes the $buf buffer to standard output.

$buf.decode('UTF-8') decodes the buffer, and returns a Str object (or dies if the buffer doesn’t constitute valid UTF-8). The reverse operation is $str.encode($encoding). A Str can simply be printed with print.
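A round trip through encode and decode can be sketched as follows, assuming a current Rakudo release and a UTF-8 encoded source file. The variable names are illustrative. Note how the number of characters and the number of bytes differ.

```raku
my $text  = "møþ\n";
my $bytes = $text.encode('UTF-8');

say $text.chars;                       # 4
say $bytes.elems;                      # 6
say $bytes.list.fmt('%02x');           # 6d c3 b8 c3 be 0a
say $bytes.decode('UTF-8') eq $text;   # True
```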

Of course print also needs to convert the string to a binary representation somewhere in the process. There is a default encoding for this and other operations, and it is UTF-8. The Perl 6 specification allows the user to change the default, but no compiler implements that yet.

For reading, you can use the .read($no-of-bytes) method to read a Buf, and .get for reading a line as a Str.

The read and write methods are also present on sockets, not just on the ordinary file and stream handles.

One of the particularly nasty things you can accidentally do in Perl 5 is concatenating text and binary strings, or combining them in another way (like with join or string interpolation). The result of such an operation is a string that happens to be broken, but only if the binary string contains any bytes above 127, which can be a nightmare to debug.

In Perl 6, you get Cannot use a Buf as a string when you try that, avoiding that trap.

The existing Perl 6 compilers do not yet provide the same level of Unicode support as Perl 5 does, but the bits that are there are much harder to misuse.

Day 22 – The Meta-Object Protocol

December 22, 2010

Have you ever wondered how to create a class in your favorite programming language, not by writing a class definition, but by running some code? Some languages allow that by simple API calls. The API behind it is called the Meta-Object Protocol, or MOP for short.

Perl 6 has a MOP, and it allows you to create classes, roles and grammars, add methods and attributes, and to introspect classes. For example we can use calls to the MOP in Rakudo to find out how the Rat type (rational numbers) is implemented. Calls to methods of the MOP generally start with .^ instead of just .:

 $ perl6
 > say join ', ', Rat.^attributes
 $!numerator, $!denominator
 > # the list of all methods is a bit long,
 > # so here is a random selection
 > say join ', ', Rat.^methods(:local).pick(5)
 unpolar, ceiling, reals, Str, round
 > say Rat.^methods(:local).grep('log').[0].signature.perl
 :(Numeric $x: Numeric $base = { ... };; *%_)

Most of these lines should be fairly self-explanatory: objects of class Rat have two attributes, $!numerator and $!denominator, as well as many methods. The log method takes a Numeric value as invocant (marked by the colon after the parameter name), and an optional second parameter called $base, which has a default value (Euler’s number, which Rakudo can’t show you here).
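The same introspection calls work on user-defined classes. The sketch below assumes a current Rakudo release; the class Point is a made-up example. Because the exact list returned by .^methods can contain compiler-generated entries, it only checks that the expected method is present.

```raku
class Point {
    has $.x;
    has $.y;
    method norm() { sqrt($!x ** 2 + $!y ** 2) }
}

say Point.^name;                       # Point
say Point.^attributes.map(*.name);     # ($!x $!y)

# Names of the methods declared in the class itself
my @local = Point.^methods(:local).map(*.name);
say so @local.grep('norm');            # True

say Point.new(x => 3, y => 4).norm;    # 5
```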

A nice use case comes from the Perl 6 database interface. It has the option to log calls on an object, and to restrict this to only log methods from a certain role (for example only a role related to connection management, or related to data retrieval). Here is the example, and a possible way to call it:

 sub log-calls($obj, Role $r) {
     my $wrapper = RoleHOW.new;
     for $r.^methods -> $m {
         $wrapper.^add_method($m.name, method (|$c) {
             # print logging information
             # note() writes to standard error
             note ">> $m";
             # call the next method of the same name,
             # with the same arguments
             nextsame;
         });
     }
     $wrapper.^compose();
     # the 'does' operator works like 'but', except that it
     # modifies the object itself instead of a copy
     $obj does $wrapper;
 }
 role Greet {
     method greet($x) {
         say "hello, $x";
     }
 }
 class SomeGreeter does Greet {
     method LOLGREET($x) {
         say "OH HAI "~ uc $x;
     }
 }
 my $o = log-calls(SomeGreeter.new, Greet);
 # logged, since provided by role Greet
 $o.greet('you');
 # not logged, because not provided by the role
 $o.LOLGREET('u');

Output:

 >> greet
 hello, you
 OH HAI U

So with a Meta-Object Protocol, classes, roles and grammars are not just accessible by special syntax, but can be accessed as a normal API. This gives new flexibility to object oriented code, and allows easy introspection and modification of objects.

Day 8 – Different Names of Different Things

December 8, 2010

(By masak and moritz)

Newcomers to the Perl programming language, version 5, often complain that they can’t reverse strings. There’s a built-in reverse function, but it doesn’t seem to work at first glance:

    $ perl -E "say reverse 'hello'"
    hello

When such programmers ask more experienced Perl programmers, the solution is quickly found: reverse actually has two different modes of operation. In list context it reverses lists, in scalar context it reverses strings.

    $ perl -E "say scalar reverse 'hello'"
    olleh

Sadly this is an exception to Perl’s usual context model. For most operators or functions the operator determines the context, and the data is interpreted in that context. For example + and * work on numbers, and . works on strings (concatenation). So a symbol (or function name, in the case of uc) stands for an operation, and provides context. Not so reverse.

In Perl 6, we try to learn from previous mistakes, and get rid of historical inconsistencies. This is why list reversal, string reversal and hash inversion have been split up into separate built-ins:

    # string reversal, sorry, flipping:
    $ perl6 -e 'say flip "hello"'
    olleh
    # reversing lists
    $ perl6 -e 'say join ", ", reverse <ab cd ef>'
    ef, cd, ab
    # hash inversion
    $ perl6 -e 'my %capitals = France => "Paris", UK => "London";
              say %capitals.invert.perl'
    ("Paris" => "France", "London" => "UK")

Hash inversion differs from the other two operations in that the result is of a different type than the input. Since hash values need not be unique, inverting a hash and returning the result as a hash would either change the structure (by grouping the keys of non-unique values together somehow), or lose information (by having one value override the other ones).

Instead hash inversion returns a list of pairs, and the user can decide which mode of operation they want for turning the result back into a hash, if that’s necessary at all.

Here’s how you do it if you want the hash inversion operation to be non-destructive:

    my %inverse;
    %inverse.push( %original.invert );

As pairs with already-seen keys get pushed onto the %inverse hash, the original value isn’t overridden but instead "promoted" to an array. Here’s a small demonstration:

    my %h;
    %h.push('foo' => 1);    # foo => 1
    %h.push('foo' => 2);    # foo => [1, 2]
    %h.push('foo' => 3);    # foo => [1, 2, 3]
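Putting the three built-ins side by side, here is a sketch with a hash whose values are not unique. It assumes a current Rakudo release, and the data is illustrative only. Since hash ordering is not guaranteed, the promoted array is sorted before printing.

```raku
say "hello".flip;                      # olleh
say <ab cd ef>.reverse.join(", ");     # ef, cd, ab

# Two keys share the value "London"
my %original = France => "Paris", UK => "London", England => "London";

my %inverse;
%inverse.push: %original.invert;

say %inverse<Paris>;                   # France
say %inverse<London>.sort;             # (England UK)
```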

All three (flip/reverse/invert) coerce their arguments to the expected type (if possible). For example if you pass a list to flip, it is coerced into a string and then the string is rever^W flipped.

Day 5 – Why Perl syntax does what you want

December 5, 2010

Opening the fifth door of our advent calendar, we don’t find a recipe of how to do something cool with Perl 6 – rather an explanation of how some of the intuitiveness of the language works.

As an example, consider these two lines of code:

    say 6 / 3;
    say 'Price: 15 Euro' ~~ /\d+/;

They print out 2 and 15, respectively. For a Perl programmer this is not surprising. But look closer: the forward slash / serves two very different purposes. It denotes numerical division in the first line, and delimits a regex in the second.

How can Perl know when a / means what? It certainly doesn’t look at the text after the slash to decide, because a regex can look just like normal code.

The answer is that Perl keeps track of what it expects. Most important are two things it expects: terms and operators.

A term can be a literal like 23 or "a string". After the parser finds such a literal, there can either be the end of a statement (indicated by a semicolon), or an operator like +, * or /. After an operator, the parser expects a term again.

And that’s already the answer: When the parser expects a term, a slash is recognized as the start of a regex. When it expects an operator, it counts as a numerical division operator.

This has far-reaching consequences. Subroutines can be called without parentheses, and after a subroutine name an argument list is expected, which starts with a term. On the other hand type names are followed by operators, so at parse time all type names must be known.

On the upside, many characters can be reused for two different syntaxes in a very convenient way.

Day 3 – File operations

December 3, 2010

Directories

Instead of opendir and friends, in Perl 6 there is a single dir subroutine, returning a list of the files in a specified directory, defaulting to the current directory. A piece of code speaks a thousand words (some result lines are line-wrapped for better readability):

    # in the Rakudo source directory
    > dir
    build parrot_install Makefile VERSION parrot docs Configure.pl 
    README dynext t src tools CREDITS LICENSE Test.pm
    > dir 't'
    00-parrot 02-embed spec harness 01-sanity pmc spectest.data

dir also has an optional named parameter test, used to grep the results:


    > dir 'src/core', test => any(/^C/, /^P/)
    Parcel.pm Cool.pm Parameter.pm Code.pm Complex.pm
    CallFrame.pm Positional.pm Capture.pm Pair.pm Cool-num.pm Callable.pm Cool-str.pm

Directories are created with mkdir, as in mkdir('foo').

Files

The easiest way to read a file in Perl 6 is using slurp. slurp returns the contents of a file as a string (a Str):


    > slurp 'VERSION'
    2010.11

The good old way of using filehandles is of course still available:

    > my $fh = open 'CREDITS'
    IO()<0x1105a068>
    > $fh.getc # reads a single character
    =
    > $fh.get # reads a single line
    pod
    > $fh.close; $fh = open 'new', :w # open for writing
    IO()<0x10f3e704>
    > $fh.print('foo')
    Bool::True
    > $fh.say('bar')
    Bool::True
    > $fh.close; say slurp('new')
    foobar

File tests

Testing the existence and types of files is done with smartmatching (~~). Again, the code:

    > 'LICENSE'.IO ~~ :e # does the file exist?
    Bool::True
    > 'LICENSE'.IO ~~ :d # is it a directory?
    Bool::False
    > 'LICENSE'.IO ~~ :f # a file then?
    Bool::True

Easy peasy.

File::Find

When the standard features are not enough, modules come in handy. File::Find (available in the File::Tools package) traverses the directory tree looking for the files you need, and generates a lazy list of the found ones. File::Find comes shipped with Rakudo Star, and can be easily installed with neutro if you have just a bare Rakudo.

Example usage? Sure. find(:dir<t/dir1>, :type<file>, :name(/foo/)) will generate a lazy list of files (and files only) in a directory named t/dir1 and with a name matching the regex /foo/. Notice how the elements of the list are not just plain strings: they’re objects which stringify to the full path, but also provide accessors for the directory they’re in (dir) and the filename itself (name). For more info please refer to the documentation.

Useful idioms

Creating a new file
    open('new', :w).close
"Anonymous" filehandle
    given open('foo', :w) {
        .say('Hello, world!');
        .close
    }
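The idioms and file tests above can be combined into one self-cleaning sketch. It assumes a current Rakudo release, where $*TMPDIR and the add method on paths are available, and it writes to a scratch file whose name is made up for this example.

```raku
# Hypothetical scratch file in the system temporary directory
my $path = $*TMPDIR.add("advent-file-demo-{$*PID}.txt");

# "Anonymous" filehandle: open, write one line, close
given open($path, :w) {
    .say('Hello, world!');
    .close;
}

my $exists  = $path ~~ :e;
my $is-file = $path ~~ :f;
my $is-dir  = $path ~~ :d;
my $content = slurp($path);

say $exists;     # True
say $is-file;    # True
say $is-dir;     # False
print $content;  # Hello, world!

# Clean up, then check that the file is gone
unlink $path;
say $path.e;     # False
```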

Day 2 – Interacting with the command line with MAIN subs

December 2, 2010

In a Unix environment, many scripts take arguments and options from the command line. With Perl 6 it’s very easy to accept those:

    $ cat add.pl
    sub MAIN($x, $y) {
        say $x + $y
    }
    $ perl6 add.pl 3 4
    7
    $ perl6 add.pl too many arguments
    Usage:
    add.pl x y

By just writing a subroutine called MAIN with a signature, you automatically get a command line parser, binding from the command line arguments into the signature variables $x and $y, and a usage message if the command line arguments don’t fit.

The usage message is customizable by adding another sub called USAGE:

    $ cat add2.pl
    sub MAIN($x, $y) {
        say $x + $y
    }
    sub USAGE() {
        say "Usage: add.pl <num1> <num2>";
    }
    $ perl6 add2.pl too many arguments
    Usage: add.pl <num1> <num2>

Declaring the MAIN sub as multi allows declaring several alternative syntaxes, or dispatch based on some constant:

    $ cat calc
    #!/usr/bin/env perl6
    multi MAIN('add', $x, $y)  { say $x + $y }
    multi MAIN('div', $x, $y)  { say $x / $y }
    multi MAIN('mult', $x, $y) { say $x * $y }
    $ ./calc add 3 5
    8
    $ ./calc mult 3 5
    15
    $ ./calc
    Usage:
    ./calc add x y
    or
    ./calc div x y
    or
    ./calc mult x y

Named parameters correspond to options:

    $ cat copy.pl
    sub MAIN($source, $target, Bool :$verbose) {
        say "Copying '$source' to '$target'" if $verbose;
        run "cp $source $target";
    }
    $ perl6 copy.pl calc calc2
    $ perl6 copy.pl  --verbose calc calc2
    Copying 'calc' to 'calc2'

Declaring the parameter as Bool makes it accept no value; without a type constraint of Bool it will take an argument:

    $ cat do-nothing.pl
    sub MAIN(:$how = 'fast') {
        say "Do nothing, but do it $how";
    }
    $ perl6 do-nothing.pl
    Do nothing, but do it fast
    $ perl6 do-nothing.pl --how=well
    Do nothing, but do it well
    $ perl6 do-nothing.pl what?
    Usage:
    do-nothing.pl [--how=value-of-how]

In summary, Perl 6 offers you built-in command line parsing and usage messages, just by using subroutine signatures and multi subs.

Writing good, declarative code has never been so easy before.

