Archive for December, 2010

Day 5 – Why Perl syntax does what you want

December 5, 2010

Opening the fifth door of our advent calendar, we don’t find a recipe of how to do something cool with Perl 6 – rather an explanation of how some of the intuitiveness of the language works.

As an example, consider these two lines of code:

    say 6 / 3;
    say 'Price: 15 Euro' ~~ /\d+/;

They print out 2 and 15, respectively. For a Perl programmer this is not surprising. But look closer: the forward slash / serves two very different purposes, the numerical division in the first line, and delimits a regex in the second line.

How can Perl know when a / means what? It certainly doesn’t look at the text after the slash to decide, because a regex can look just like normal code.

The answer is that Perl keeps track of what it expects. Most important are two things it expects: terms and operators.

A term can be literal like 23 or "a string". After parser finds such a literal, there can either be the end of a statement (indicated by a semicolon), or an operator like +, * or /. After an operator, the parser expects a term again.

And that’s already the answer: When the parser expects a term, a slash is recognized as the start of a regex. When it expects an operator, it counts as a numerical division operator.

This has far reaching consequences. Subroutines can be called without parenthesis, and after a subroutine name an argument list is expected, which starts with a term. On the other hand type names are followed by operators, so at parse time all type names must be known.

On the upside, many characters can be reused for two different syntaxes in a very convenient way.

Day 4 – The Sequence Operators

December 4, 2010

Last year, there was a brief tease of the sequence operator (tweaked slightly to be correct after a year’s worth of changes to the spec):

my @even-numbers  := 0, 2 ... *;    # arithmetic seq
my @odd-numbers   := 1, 3 ... *;
my @powers-of-two := 1, 2, 4 ... *; # geometric seq

This now works in Rakudo:

> my @powers-of-two := 1, 2, 4 ... *; 1;
1
> @powers-of-two[^10]
1 2 4 8 16 32 64 128 256 512

(Note: All the code examples in this post have been run in Rakudo’s REPL, which you can reach by running the perl6 executable with no command line arguments. Lines that start with > I what I typed; the other lines are Rakudo’s response, which is generally the value of the last expression in the line. Because the variable @powers-of-two is an infinite lazy list, I’ve added 1; at the end of the line, so the REPL prints that instead of going into an infinite loop.)

We need to trim the infinite list so that Rakudo doesn’t spend an infinitely long time calculating it. In this case, I used [^10], which is a quick way of saying “Give me the first ten elements.” (Note that when you bind a lazy list to an array variable like this, values which have been calculated are remembered; it’s a quick form of memoization.)

The sequence operator ... is a very powerful tool for generating lazy lists. The above examples just start to hint at what it can do. Given one number, it just starts counting up from that number (unless the terminal end of the sequence is a lower number, in which case it counts down). Given two numbers to start a sequence, it will treat it as an arithmetic sequence, adding the difference between those first two numbers to the last number generated to generate the next one. Given three numbers, it checks to see if they represent the start of an arithmetic or a geometric sequence, and will continue it.

Of course, many interesting sequences are neither arithmetic nor geometric, in which case you need to explicitly provide the sub to generate the next number in the sequence:

> my @Fibonacci := 0, 1, -> $a, $b { $a + $b } ... *; 1;
1
> @Fibonacci[^10]
0 1 1 2 3 5 8 13 21 34

The -> $a, $b { $a + $b } there is a pointy block (ie a lambda function) which takes two arguments and returns their sum. The sequence operator figures out how many arguments the block takes, and passes the needed arguments from the end of the sequence so far to generate the next number in the sequence. And so on, forever.

Or not forever. So far all these examples have had the Whatever star on the right hand side, which means “There is no terminating condition.” If you instead have a number there, the list will terminate when that number is exactly reached.

> 1, 1.1 ... 2
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2
> 1, 1.1 ... 2.01
... Rakudo spins its wheels, because this is an infinite list ...
> (1, 1.1 ... 2.01)[^14]
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3

The first one of those terminates naturally, but the second one missed the terminator and kept right on going. The result is an infinite list, so I limited it to the first 14 elements so that we could see what it was doing.

Those of you with backgrounds doing floating point math are probably sputtering about the dangers of assuming that adding .1 repeatedly will add up to exactly 2. In Perl 6, that’s not quite such an issue because it will use Rat (ie fractional) math where possible. But the general point is still very solid. If I want to find all the Fibonacci numbers below 10000, needing to know exactly the number to stop on is a big hassle. Luckily, just as you can use a block to specify how to generate the next element in a sequence, you can also use one to test to see whether the sequence should end yet:

> 0, 1, -> $a, $b { $a + $b } ... -> $a { $a > 10000 };
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946

The pointy block -> $a { $a > 10000 } creates a block which takes one argument, and returns true when that argument is greater than 10000; just the test we want.

Except we were looking for all the Fibonacci less than 10000. We generated that plus the first Fibonacci number greater than 10000. When passed a block as a termination test, the sequence operator returns all its elements until that block returns true, then it returns that last element and stops. But there is alternative form of the sequence operator that will do the trick:

> 0, 1, -> $a, $b { $a + $b } ...^ -> $a { $a > 10000 };
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765

Switching from ... to ...^ means the resulting list does not include the first element for which the termination test returned true.

Two side notes on this. This is actually a long-winded way of specifying these sequences in Perl 6. I don’t have space to explain Whatever Closures here, but this post from last year talks about them. Using them, you can rewrite that last sequence as

> 0, 1, * + * ...^ * > 10000;
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765

It’s up to you whether or not you think this is clearer; there’s more than one way to do it.

Also, the left-hand-side of the sequence operator can be any list, even lazy ones. This means you can easily use a terminating block to get a limited portion of an existing lazy list:

> my @Fibonacci := 0, 1, * + * ... *; 1;
1
> @Fibonacci ...^ * > 10000
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
> @Fibonacci[30]
832040

(I stuck the last check there just to demonstrate that @Fibonacci still goes on past 10000.)

This only begins to scratch the surface of what sequences can do. For more information, see “List infix precedence” in the spec, and scroll down to the sequence operator. (Though note that it is still not completely implemented! It is an extremely complex operator.)

One particular twist I’d like to leave you with: the sequence operator is not constrained to working with numeric values. If you explicitly specify your own generator, you can make a sequence out of any type at all. But I’d like to leave that for a future Advent present…

Day 3 – File operations

December 3, 2010

Directories

Instead of opendir and friends, in Perl 6 there is a single dir subroutine, returning a list of the files in a specified directory, defaulting to the current directory. A piece of code speaks a thousand words (some result lines are line-wrapped for better readability):

    # in the Rakudo source directory
    > dir
    build parrot_install Makefile VERSION parrot docs Configure.pl 
    README dynext t src tools CREDITS LICENSE Test.pm
    > dir 't'
    00-parrot 02-embed spec harness 01-sanity pmc spectest.data

dir has also an optional named parameter test, used to grep the results


    > dir 'src/core', test => any(/^C/, /^P/)
    Parcel.pm Cool.pm Parameter.pm Code.pm Complex.pm
    CallFrame.pm Positional.pm Capture.pm Pair.pm Cool-num.pm Callable.pm Cool-str.pm

Directories are created with mkdir, as in mkdir('foo')

Files

The easiest way to read a file in Perl 6 is using slurp. slurp returns the contents of a file, as a String,


    > slurp 'VERSION'
    2010.11

The good, old way of using filehandles is of course still available

    > my $fh = open 'CREDITS'
    IO()<0x1105a068>
    > $fh.getc # reads a single character
    =
    > $fh.get # reads a single line
    pod
    > $fh.close; $fh = open 'new', :w # open for writing
    IO()<0x10f3e704>
    > $fh.print('foo')
    Bool::True
    > $fh.say('bar')
    Bool::True
    > $fh.close; say slurp('new')
    foobar

File tests

Testing the existence and types of files is done with smartmatching (~~). Again, the code:

    > 'LICENSE'.IO ~~ :e # does the file exist?
    Bool::True
    > 'LICENSE'.IO ~~ :d # is it a directory?
    Bool::False
    > 'LICENSE'.IO ~~ :f # a file then?
    Bool::True

Easy peasy.

File::Find

When the standard features are not enough, modules come in handy. File::Find (available in the File::Tools package) traverses the directory tree looking for the files you need, and generates a lazy lists of the found ones. File::Find comes shipped with Rakudo Star, and can be easily installed with neutro if you have just a bare Rakudo.

Example usage? Sure. find(:dir<t/dir1>, :type<file>, :name(/foo/)) will generate a lazy list of files (and files only) in a directory named t/dir1 and with a name matching the regex /foo/. Notice how the elements of a list are not just plain strings: they’re objects which strinigify to the full path, but also provide accessors for the directory they’re in (dir) and the filename itself (name). For more info please refer to the documentation.

Useful idioms

Creating a new file
    open('new', :w).close
"Anonymous" filehandle
    given open('foo', :w) {
        .say('Hello, world!');
        .close
    }

Day 2 – Interacting with the command line with MAIN subs

December 2, 2010

In Unix environment, many scripts take arguments and options from the command line. With Perl 6 it’s very easy to accept those:

    $ cat add.pl
    sub MAIN($x, $y) {
        say $x + $y
    }
    $ perl6 add.pl 3 4
    7
    $ perl6 add.pl too many arguments
    Usage:
    add.pl x y

By just writing a subroutine called MAIN with a signature, you automatically get a command line parser, binding from the command line arguments into the signature variables $x and $y, and a usage message if the command line arguments don’t fit.

The usage message is customizable by adding another sub called USAGE:

    $ cat add2.pl
    sub MAIN($x, $y) {
        say $x + $y
    }
    sub USAGE() {
        say "Usage: add.pl <num1> <num2>";
    }
    $ perl6 add2.pl too many arguments
    Usage: add.pl <num1> <num2>

Declaring the MAIN sub as multi allows declaring several alternative syntaxes, or dispatch based on some constant:

    $ cat calc
    #!/usr/bin/env perl6
    multi MAIN('add', $x, $y)  { say $x + $y }
    multi MAIN('div', $x, $y)  { say $x / $y }
    multi MAIN('mult', $x, $y) { say $x * $y }
    $ ./calc add 3 5
    8
    $ ./calc mult 3 5
    15
    $ ./calc
    Usage:
    ./calc add x y
    or
    ./calc div x y
    or
    ./calc mult x y

Named parameters correspond to options:

    $ cat copy.pl
    sub MAIN($source, $target, Bool :$verbose) {
        say "Copying '$source' to '$target'" if $verbose;
        run "cp $source $target";
    }
    $ perl6 copy.pl calc calc2
    $ perl6 copy.pl  --verbose calc calc2
    Copying 'calc' to 'calc2'

Declaring the parameter as Bool makes it accept no value; without a type constraint of Bool it will take an argument:

    $ cat do-nothing.pl
    sub MAIN(:$how = 'fast') {
        say "Do nothing, but do it $how";
    }
    $ perl6 do-nothing.pl
    Do nothing, but do it fast
    $ perl6 do-nothing.pl --how=well
    Do nothing, but do it well
    $ perl6 do-nothing.pl what?
    Usage:
    do-nothing.pl [--how=value-of-how]

In summary, Perl 6 offers you built-in command line parsing and usage messages, just by using subroutine signatures and multi subs.

Writing good, declarative code has never been so easy before.

Day 1 – Reaching the Stars

December 1, 2010

Coming from the Perl 5 world, or from some other background, you may think of the programming language and its implementation as the one thing, or at least things very strongly tied to each other. Perl 6 is different. The “official” part is the specification (the Synopses) and the tests suite. Perl 6 encourages multiple implementations. Any implementation passing the official test suite and fulfiling the Synopses may call itself “a Perl 6 implementation”. While there is still no such implementation, there is alredy a few compilers in the ecosystem.

Rakudo is targetting the Parrot virtual machine, and it’s the most complete implementation so far. Niecza is targetting .NET and is aiming to study performance issues. There is also Yapsi, whose “author has claimed that it’s official for over half a year now, and not once been contradicted” :) We will be using Rakudo for our examples as it is the most complete implementation around and has the largest ecosystem built around it.

Like with Perl 6, there is no “one Rakudo to rule them all”. The Rakudo development team has decided to keep the compiler separate from the distribution. That means the Rakudo release is not a ready-to-use Perl 6 tarball, with modules, documentation and a virtual machine. That’s what Rakudo Star is. First released about half a year ago, Rakudo Star contains the toolbox with everything needed for Perl 6 hacking: a release of Parrot virtual machine, the Perl 6 Book, a bunch of modules and the Rakudo itself.

Getting Rakudo Star

The Rakudo Star tarballs are located at https://github.com/rakudo/star/downloads. Go get one suitable for your system and become ready for Christmas… again!


Follow

Get every new post delivered to your Inbox.

Join 37 other followers