Day 17 – Of a new contributor

If you’re anything like me, you’ve read last year’s advent calendar posts with delight, squeeing a little bit at the neat things you could express so easily with Perl 6. You might have wondered – like I have – if all those sundry features are just shiny bells and whistles that were added on top of an okay language. And you might be wondering how to find out more, dig deeper, and maybe even contribute.

As you can tell from the fact that I’m writing this article, I’ve decided to become a contributor to the Perl 6 effort. I’m going to tell you how I got interested and then involved (and inloved) with Perl 6.

Before my involvement with Perl 6 I mostly used Python. However beautiful and flexible Python is, I worked on a project where it was a poor fit and writing proper test code was exceptionally uncomfortable. Thus, I often made mistakes – misused my data structures, passed values of incorrect types to methods – that I felt the language should be able to detect early without sacrificing flexibility. The “gradual typing” approach of Perl 6 sounded like a very good fit to me.

Having a friend show me bits and pieces of Perl 6 quickly led to looking at the Advent Calendar last year. I also joined the IRC channel and asked a whole bunch of questions. Not having done any Perl 5 programming before made figuring out the nooks and crannies of Perl 6 syntax a bit harder than I would have liked, especially when using the Perl 6 book. Fortunately, the IRC channel was always active and eager to help.

After having learnt a bit more about Perl 6, I quickly started helping out here and there, in part because I had already enjoyed doing that for PyPy (which is implemented in a subset of Python, much like Rakudo is implemented in NQP), but also because I kept hitting little problems and bugs.

So, how do you actually get to contributing?

Well, starting small is always good. Come to the IRC channel and ask for “low hanging fruit”. Check out bugs blocking on test coverage or easy tickets from the Perl 6 bug tracker. Try porting or creating one of the most wanted modules. Experiment with the language and see which bugs you hit. One thing that’s always good and often not very hard is fixing “LTA error messages”; those are error messages that are “less than awesome”.

At some point you’ll find something you’d like to fix. For me, the first thing was writing spectests for bugs that were already fixed but didn’t have a test yet. After that, giving a more helpful error message when a programmer tries to concatenate strings with . instead of ~. Then came fixing the output of nested Pair objects to have proper parentheses. And then I dug deep into the Grammar, Actions and World classes to implement a suggestion mechanism for typos in variables, classes and subs.

The fact that most of your potential contributions will likely be done either in Perl 6 code or at least in NQP code makes it rather easy to get started, since even NQP is fairly high-level. And if you work through the materials from the Rakudo and NQP Internals Workshop 2013, you’ll get a decent head start in understanding how Rakudo and NQP work.

Whenever you get stuck, the people in #perl6 – including me, of course – will be happy to help you out and give advice along the way.

Let’s try to tackle a simple “testneeded” bug. This bug about parsing the :$<a> pair shorthand syntax looks simple enough to write a test case for.

Here’s the executive summary:

The :$foo syntax was already introduced a few days ago by the post on adverby adverbs. The syntax here is a combination of that and the shorthand notation $<foo> to mean $/<foo>, which refers to the named match group “foo” from the regex we just matched.

So :$<foo> is supposed to give the pair "foo" => $<foo>. It didn’t in the past, but someone came along and fixed it. Now all we need to do is create a simple test case to make sure Rakudo doesn’t regress and other implementations don’t make the same mistake.

In the last comment of the discussion, jnthn already wrote a much shortened test case for this bug, hidden behind a “show quoted text” link:

'abc' ~~ /a $<a>=[\w+]/; say :$<a>.perl

Since sometimes it happens that such a test case causes a compilation error on one of the implementations, it’s useful to be able to “fudge” the whole thing in one go. That means writing something like #?niecza skip in front of a block. That’s why we wrap our whole little test in curly braces, like so:

# RT #76998
{
   my $res = do { 'abc' ~~ /a $<a>=[\w+]/; :$<a> };
   ok $res ~~ Pair, ':$<a> returns a pair';
   ok $res.key eq 'a', 'its key is "a"';
   ok $res.value ~~ Match:D, "the pair's value is a defined match object";
}

So what have we done here? We executed the code in a do block, so that we can grab the value of the last piece of the block. The next three lines inspect the return value of our code block: Is it a Pair object? Is its key “a”, like expected? Is its value a defined Match object instance? (See day 2’s post about the :D smiley).

Also, we wrapped the whole test in a big block and put a comment on top to mention the ticket ID from the bug tracker. That way, it’s easier to see why this test exists and we can reference the commit to the perl6/roast repository in the ticket discussion when we close it (or have someone close it for us).
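
If, for example, Niecza choked on compiling this block, the fudge line mentioned earlier would go right in front of it (the reason text here is made up):

# RT #76998
#?niecza skip ':$<a> pair shorthand'
{
    ...   # the three tests from above, unchanged
}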

Next up, we have to “bump up the plan”: we go to the beginning of the test file and increase the number passed to “plan” by 3 (which is exactly how many more ok/not ok outputs we expect our test case to generate).
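
For example, if the plan line at the top of the file currently read like this (the number is made up):

plan 45;

then after adding our three tests it would read:

plan 48;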

Then we can either fork perl6/roast on github and make a pull request from our change, or submit a patch created with git format-patch via IRC, the bug tracker, or the mailing lists.

Note that I haven’t committed anything I wrote here yet. If you’re faster than the other readers of this blog, feel free to go ahead and submit it. If you’re not, feel free to select a different ticket to improve upon.

Finally, if you apply for a commit bit – meaning the right to directly commit to the repositories on github – make sure to mention I sent you to get 50% off of your first 10 commits; what an incredible deal!

Day 16 – Slangs

use v6;
my $thing = "123abc";
say try $thing + 1; # this will fail

{
    use v5;
    say $thing + 1 # will print 124
}

Slangs are pretty interesting things in natural languages, so naturally they will be pretty awesome in computer languages as well. Without them, cross-language communication is like talking through a thin pipe, as it is when calling C functions. It does work, but calling functions is neither the only nor the most comfortable thing out there.

The example above shows that we create a variable in Perl 6 land and use it in a nested block which derives from another language. (This only works if the nested language is capable of handling the syntax, a dollar-sigilled variable in this case.)
We use this other language on purpose: it provides a feature that we need to solve our task.
I hope that slangs will pop up not just to provide functionality to solve a given problem, but also help in writing the code in a way that fits the nature of that said problem.

How does that even work?

The key is that the module that lets you switch to the slang provides a grammar and appropriate action methods. That is no different from how Perl 6 itself is implemented, or how JSON::Tiny works internally.
The grammar parses all statements in our nested block, and the actions translate the parsed source code (text) into something a compiler can handle better: usually abstracted operations in the form of a tree, called an AST.

The v5 slang compiles to QAST, which is the name of the AST that Rakudo uses. The benefit of that approach is that this data structure is already known by the guts of the Rakudo compiler, so our slang just needs to care about translating the foreign source code text into something known. The compiler then takes this AST and maps it to something the underlying backend understands.
So it does not matter whether we’re running on Parrot, on the JVM or something else: the slang’s job is done once it has produced the AST.

A slang was born.

In March this year at the GPW2013, I felt the need for something that glues Perl 6 and Perl 5 together. Many nice people shared this urge, so I investigated how to do it.
Then I found a Perl 5 parser in the std repository. Larry Wall took the Perl 6 parser years ago and modified it to conform to Perl 5. The Perl 6 parser it is based on is the very same one that Rakudo is built upon. So the first step was to take this Perl 5 grammar, take the action methods from Rakudo, and try to build something that compiles.
(In theory this is all we needed: grammar + action = slang.)

I can’t quite remember whether it took one week or two, but then there was a hacked Rakudo that could say “Hallo World”. It already insisted on putting parens around conditions, for example, which might be the first eye-catcher for anyone looking at both languages.
Since then there has been a lot of progress in merging in Perl 5’s test suite, implementing and fixing things, and turning it into a module rather than a hacked standalone Rakudo-ish thing.

Today I can proudly say that it passes more than 4600 of roughly 45000 tests. These 4600 passing tests are enough so you can play with it and feed it simple Perl 5 code. But the main work for the next weeks and months is to provide the core modules so that you can actually use a module from CPAN. Which, after all, was the main reason to create v5.

What is supported at the moment?

  • all control structures like loops and conditions
  • functions like shift, pop, chop, ord, sleep, require, …
  • mathematical operations
  • subroutine signatures that affect parsing
  • pragmas like vars, warnings, strict
  • core modules like Config, Cwd and English

There are still missing pieces that hurt. Loop labels for next LABEL, redo LABEL and last LABEL will land in Rakudo and v5 soon. The other missing parts will take their time but will happen :o).

The set goals of v5:

  • write Perl 5 code directly in Perl 6 code, usually as a closure
  • allow Perl 6 lexical blocks inside Perl 5 ones
  • make it easy to use variables declared in an outer block (outer means the other language here)
  • provide the behaviour of Perl 5 operators and built-ins for v5 blocks only, nested Perl 6 blocks should not be affected
  • and of course: make subs, packages, regexes, etc available to the other language

All of the statements above are already true today. If you do a numeric operation it will behave differently in a v5 block than a Perl 6 block like the example at the top shows. That is simply because in Perl 6 the + operator will dispatch to a subroutine called &infix:<+>, but in a v5 block it translates to &infix:<P5+>.

Oversimplified it looks a bit like this:

Perl 6/5 code:

1 + 2;
{
    use v5;
    3 + 4
}

Produced AST:

- QAST::CompUnit
    - QAST::Block 1 + 2; { use v5; 3 + 4 }
        - QAST::Stmts 1 + 2; { use v5; 3 + 4 }
            - QAST::Stmt
                - QAST::Op(call &infix:<+>) +
                    - QAST::IVal(1)
                    - QAST::IVal(2)
            - QAST::Block { use v5; 3 + 4 }
                - QAST::Stmts  use v5; 3 + 4 
                    - QAST::Stmt
                        - QAST::Op(call &infix:<P5+>)
                            - QAST::IVal(3)
                            - QAST::IVal(4)

The nice thing about this is that you can use foreign operators (of the used slang) in your Perl 6 code. For example, &prefix:<P5+>("123hurz") would be valid Perl 6 code that turns a string into a number even when there are trailing word characters.

To get v5 you should follow its README, but be warned, at the moment this involves recompiling Rakudo.

Conclusion: When was the last time you’ve seen a language you could extend that easily? Right. I was simply astonished at how easy it is to get started. Next on your TODO list: the COBOL slang. :o)

Day 15 – Numbers and ways of writing them

Consider the humble integer.

my $number-of-dwarfs = 7;
say $number-of-dwarfs.WHAT;   # (Int)

Integers are great. (Well, many of them are.) Computers basically run on integers. Also, God made the integers, and everything else is basically cheap counterfeit knock-offs, smuggled into the Platonic realm when no-one’s looking. If they’re so important, we’d expect to have lots of good ways of writing them in any respectable programming language.

Do we have lots of ways of writing integers in Perl 6? We do. (Does that make Perl 6 respectable? No, because it’s a necessary but not sufficient condition. Duh.)

One thing we might want to do, for example, is write our numbers in other number bases. The need for this crops up now and then, for example if you’re stranded on Binarium where everyone has a total of two fingers:

my $this-is-what-we-humans-call-ten = 0b1010;   # 'b' for 'binary', baby
my $and-your-ten-is-our-two = 0b10;

Or you may find yourself on the multi-faceted world of Hexa X, whose inhabitants (due to an evolutionary arms race involving the act of tickling) sport 16 fingers each:

my $open-your-mouth-and-say-ten = 0xA;    # 'x' is for, um, 'heXadecimal'
my $really-sixteen = 0x10;

If you’re really unlucky, you will find yourself stuck in a file permissions factory, with no other means to signal the outer world than by using base 8:

my $halp-i'm-stuck-in-this-weird-unix-factory = 0o644;

(Fans of other C-based languages may notice the deviation from tradition here: for your own sanity, we no longer write the above number as 0644 — doing so rewards you with a stern (turn-offable) warning. Maybe octal numbers were once important enough to merit a prefix of just 0, but they ain’t no more.)

Of course, just these bases are not enough. Sometimes you will need to count with a number of fingers that’s not any of 2, 8, or 16. For those special occasions, we have another nice syntax for you:

say :3<120>;       # 15 (== 1 * 3**2 + 2 * 3**1 + 0 * 3**0)
say :28<aaaargh>;  # 4997394433
say :5($n);        # let me parse that in base 5 for you

Yes, that’s the dear old pair syntax, here hijacked for base conversions from a certain base. You will recognize this special syntax by the fact that it uses colonpairs (:3<120>), not the "fat arrow" (3 => 120), and that the key is an integer. For the curious, the integer goes up to 36 (at which point we’ve reached ‘z’ in the alphabet, and there’s no natural way to make up more symbols for "digits").

I once used :4($_) to translate DNA into proteins. Perl 6 — fulfilling your bioinformatics dreams!
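
Something along these lines (the letter-to-digit mapping is my own illustration, not the original code):

my $codon  = 'GAT';
my $digits = $codon.trans('ACGT' => '0123');   # 'GAT' becomes '203'
say :4($digits);                               # 35, a number we can use as a lookup index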

Oh, and sometimes you want to convert into a certain base, essentially taking a good-old-integer and making it understandable for a being with a certain amount of fingers. We’ve got you covered there, too!

say 0xCAFEBABE;            # 3405691582
say 3405691582.base(16);   # CAFEBABE

Moving on. Perl 6 also has rationals.

my $tenth = 1/10;
say $tenth.WHAT;   # (Rat)

Usually computers (and programming languages) are pretty bad at representing numbers such as a tenth, because most of them are stranded on the planet Binarium. Not so Perl 6; it stores your rational numbers exactly, most of the time.

say (0, 1/10 ... 1);            # 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
say (1/3 + 1/6) * 2;            # 1
say 1/10 + 1/10 + 1/10 == 0.3;  # True (\o/)

The rule here is, if you give some number as a decimal (0.5) or as a ratio (1/2), it will be represented as a Rat. After that, Perl 6 does its darnedest to strike a balance between representing numbers without losing precision, and not losing too much performance in tight loops.
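
For instance, the nude method shows you the numerator and denominator a Rat keeps around:

say 0.5.WHAT;      # (Rat)
say (1/3).WHAT;    # (Rat)
say 0.75.nude;     # 3 4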

But sometimes you do reach for those lossy, precision-dropping, efficient numbers:

my $pi = 3.14e0;
my $earth-mass = 5.97e24;  # kg
say $earth-mass.WHAT;      # (Num)

You get floating-point numbers by writing things in the scientific notation (3.14e0, where the e0 means "times ten to the power of 0"). You can also get these numbers by doing some lossy calculation, such as exponentiation.

Do keep in mind that these are not exact, and will get you into trouble if you treat them as exact:

say (0, 1e-1 ... 1);             # misses the 1, goes on forever
say (0, 1e-1 ... * >= 1);        # 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 (oops)
say 1e0/10 + 1/10 + 1/10 == 0.3; # False (awww!)

So, don’t do that. Don’t treat them as exact, that is.

You’re supposed to be able to write lists of numbers really neatly with the good old Perl 6 quote-words syntax:

my @numbers = <1 2.0 3e0>;
say .WHAT for @numbers;          # (IntStr) (RatStr) (NumStr)

But this, unfortunately, is not implemented in Rakudo yet. (It is implemented in Niecza.)

If you’re into Mandelbrot fractals or Electrical Engineering, Perl 6 has complex numbers for you, too:

say i * i == -1;            # True
say e ** (i * pi) + 1;      # 0+1.22464679914735e-16i

That last one is really close to 0, but the exponentiation throws us into the imprecise realm of floating-point numbers, and we lose a tiny bit of precision. (But what’s a tenth of a quadrillionth between friends?)

Integers, by the way, are arbitrary-precision, so even a number as large as a googol is no trouble at all:

my $googol = EVAL( "1" ~ "0" x 100 );
say $googol;                            # 1000…0 (exactly 100 zeros)
say $googol.WHAT;                       # (Int)

Finally, if you have a taste for the infinite, Perl 6 allows handling, passing, and storage of Inf as a constant, and it compares predictably against numbers:

say Inf;              # Inf
say Inf.WHAT;         # (Num)
say Inf > $googol;    # True
my $max = -Inf;
$max max= 5;          # (max= means "store only if greater")
say $max;             # 5

Perl 6 meets your future requirements by giving you alien number bases, integers and rationals with excellent precision, floating point and complex numbers, and infinities. This is what has been added — now go forth and multiply.

Day 14 – Asynchronous Programming: Promises and Channels

Some of the most exciting progress in Perl 6 over the last year has been in the area of asynchronous, concurrent and parallel programming. In this post, we’ll take a look at two of the language features that relate to this: promises and channels. But first…

A Little Design Philosophy

Threads and locks are the assembly language of parallel programming. In the spirit of “make the hard things possible”, Perl 6 does let you spawn a thread and provide you with a Lock primitive. But these are absolutely aimed at those doing the hard things. I’ve written, code-reviewed and taught parallel programming in languages where these were the primary primitives for a while. Doing code reviews was often a fairly depressing affair. It’s not just that there were bugs, it’s that often it felt like the approach taken by the code’s author was, “just throw locks all over the place and all will be well”.

In this post, I’ll focus on the things we have in Perl 6 to help make the easy things easy. They are designed around a number of principles:

  • The paradigms we provide should have a strong focus on being composable, to make it easy to extend, re-use and refactor code
  • Furthermore, it should be easy to compose the various paradigms together, as well as having ways to move between the synchronous and asynchronous worlds where needed
  • Both asynchrony and synchronization should be explicit, happen at clearly defined boundaries, and be done at a fairly high level

In general, the Perl 6 approach is that you achieve concurrency by decomposing a problem into many pieces, communicating through the provided synchronization mechanisms (those in the language, and no doubt a bunch of extra ones that will be provided by the module ecosystem over time). The approach is not about mutating shared memory. That’s decidedly in the “hard things possible” category. The fact that it’s really hard to get right is the main problem, but from a performance perspective, lots of threads competing to write to the same bit of memory is the worst case for CPU caches – which really matter these days.

Promises

A promise is a synchronization primitive for a piece of asynchronous work that will produce a single result at some point in the future, or fail to do so because something went wrong. Different languages have evolved different terms for this idea, or use the terms with different nuances. Both “future” and “task” are often used.

The easiest way to create a promise is:

my $p10000 = start {
    (1..Inf).grep(*.is-prime)[9999]
}

This schedules the work in the block to be done. By default, this means it will be scheduled to run on a pool of threads. Thus, start introduces asynchrony into a program. We continue by executing the next line of code, and the work we specified will be done on another thread. If it runs to completion and produces a result, we say that the promise was kept. If, by contrast, it dies by throwing an exception, then we say the promise was broken.
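
A quick sketch of my own (not from the original post) showing both outcomes; the status method reports Planned, Kept or Broken:

my $ok  = start { 42 };
my $bad = start { die "computer says no" };
sleep 1;            # crude, but gives the thread pool time to run both blocks
say $ok.status;     # Kept
say $bad.status;    # Broken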

So, what can you do with a promise? Well, you can ask it for the result:

say $p10000.result;

This blocks until the promise is kept or broken. If it is kept, the value it produced is returned. If it’s broken, the exception is thrown. There’s a neater way to write this:

say await $p10000;

This may take many promises, and so you can do things like:

my @quotes = await @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });

Although this will throw an exception if any of them fail. Thus, we may wish to wait on all of them, then just extract those that produced a result:

my @getting = @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });
await Promise.allof(@getting);
my @quotes = @getting.grep(*.status == Kept).map(*.result);

There’s something a little interesting in there: allof. This is an example of a promise combinator: something that takes one or more promises as its arguments and creates some kind of composite promise that relates to them. And this brings us to the next interesting and important thing: a promise need not be backed by a piece of asynchronously executing code! For example, we can create a promise that will be kept after a certain amount of time has elapsed:

my $kept_in_10 = Promise.in(10);

Thus, we might provide a basic timeout mechanism, making sure we don’t block on any exchange that fails to give us a result within 5 seconds:

my @getting = @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });
await Promise.anyof(Promise.allof(@getting), Promise.in(5));
my @quotes = @getting.grep(*.status == Kept).map(*.result);

Of course, sitting around and waiting for results is just one thing we can do with a promise. We can also provide things that should be done upon the promise being completed. These will also be scheduled and run asynchronously. For example:

my $p10000 = start {
    (1..Inf).grep(*.is-prime)[9999]
}
my $base16 = $p10000.then(sub ($res) {
    $res.result.base(16)
});
my $pwrite = $base16.then(sub ($res) {
    spurt 'p10000.txt', $res.result;
    return 'p10000.txt';
});

Here, we use then in order to specify something that should be done after the promise is kept or broken. This also returns a promise, meaning you can chain another operation into the process. And you can call then multiple times on one promise too, giving a kind of one-off publish/subscribe mechanism (see a future article on supplies for a much richer way to do this kind of thing, however). Note that the promise takes care internally to make sure races work out OK (for example, when the work being done in the promise has already completed by the time we call then).

You can also create your own promises, keeping or breaking them as you desire. This is as simple as:

# Create the promise.
my $p = Promise.new;

# Take the "vow" object, used to keep/break it.
my $v = $p.vow;

# Later, one of...
$v.keep($result);
$v.break($exception_or_message);

Thus, you can write your own promise factories and combinators too.
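
For instance, here is a tiny promise factory of my own (not something from the standard library): a promise that is kept after a given number of seconds, much like the built-in Promise.in:

sub after-seconds($seconds) {
    my $p = Promise.new;
    my $v = $p.vow;
    # do the waiting on the thread pool, then keep the promise
    start { sleep $seconds; $v.keep(True) };
    return $p;
}

say await after-seconds(2);   # True, roughly two seconds later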

Channels

A promise is OK for conveying a single result, but what about producer/consumer scenarios where the producer will produce many values over time, and the consumer will process them as they are available? This is where a channel can come in useful.
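
At its simplest, a channel is a thread-safe queue: one piece of code sends values into it and another receives them. A minimal sketch:

my $c = Channel.new;
start {
    $c.send($_) for 1..5;   # produce values asynchronously
    $c.close;
}
say $c.receive for 1..5;    # prints 1 through 5, each on its own line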

Let’s say we want to read in a bunch of INI configuration files, parse each one using a grammar, and then flatten the configuration results into a single hash. There are three distinct steps here, in a producer/consumer relationship, which we can do in parallel. While the final result is a single value, and so a promise feels suitable, there are many files to read and parse. This is where channels come in. Let’s explore them using this example.

First, here is the top level of the program:

sub MAIN() {
    loop {
        my @files = prompt('Files: ').words;
        read_all(@files);
    }
}

This prompts the user for a bunch of filenames, then calls read_all. This is a little more interesting:

sub read_all(@files) {
    my $read = Channel.new;
    my $parsed = Channel.new;
    read_worker(@files, $read);
    parse_worker($read, $parsed) for 1..2;
    my %all_config = await config_combiner($parsed);
    say %all_config.perl;
}

This creates two channels, $read and $parsed. The $read channel will be used by read_worker in order to send the contents of each of the files it reads in along to the parse_worker. Here is read_worker:

sub read_worker(@files, $dest) {
    start {
        for @files -> $file {
            $dest.send(slurp($file));
        }
        $dest.close();
        CATCH { $dest.fail($_) }
    }
}

It uses the send method in order to send along the contents of each file it slurps. After slurping them all, it calls close on the channel to indicate there will be no more. The CATCH block calls fail on the channel to indicate that the producer failed. This will, when reached, throw an exception in the consumer. A channel that has been closed or had fail called on it can no longer be used to send values. Finally, the whole thing is wrapped in a start block so it is done on a thread in the thread pool.

The parse_worker is a little more interesting:

sub parse_worker($source, $dest) {
    my grammar INIFile {
        token TOP {
            ^
            <entries>
            <section>+
            $
        }

        token section {
            '[' ~ ']' <key> \n
            <entries>
        }

        token entries {
            [
            | <entry> \n
            | \n
            ]*
        }

        rule entry { <key> '=' <value> }

        token key   { \w+ }
        token value { \N+ }

        token ws { \h* }
    }

    my class INIFileActions {
        method TOP($/) {
            my %result;
            %result<_> = $<entries>.ast;
            for @<section> -> $sec {
                %result{$sec<key>} = $sec<entries>.ast;
            }
            make %result;
        }

        method entries($/) {
            my %entries;
            for @<entry> -> $e {
                %entries{$e<key>} = ~$e<value>;
            }
            make %entries;
        }
    }

    start {
        loop {
            winner $source {
                more $source {
                    if INIFile.parse($_, :actions(INIFileActions)) -> $parsed {
                        $dest.send($parsed.ast);
                    }
                    else {
                        $dest.fail("Could not parse INI file");
                        last;
                    }
                }
                done $source { last }
            }
        }
        $dest.close();
        CATCH { $dest.fail($_) }
    }
}

It starts off with a grammar and actions class for INI files. We then sit in a loop, watching the $source channel, which is the one that read_worker is placing results in. If the channel has a value available, the more block will be called. Inside it, $_ will contain the slurped contents of an INI file. We then parse it and, provided this worked out, send along the hash of hashes representing the INI file’s content (sections at the top level, then key/value pairs). Again, we take care to call fail and close appropriately.

Finally, config_combiner takes each of those hash of hashes, and does the work to combine them into a single hash. It uses a promise to convey the final, single, result.

sub config_combiner($source) {
    my $p = Promise.new;
    my $v = $p.vow;
    start {
        my %result;
        loop {
            winner $source {
                more $source {
                    for %^content.kv -> $sec, %kvs {
                        for %kvs.kv -> $k, $v {
                            %result{$sec}{$k} = $v;
                        }
                    }
                }
                done $source { last }
            }
        }
        $v.keep(%result);
        CATCH { $v.break($_) }
    }
    return $p;
}

And there we have it: a program using promises and channels happily together, in a producer/consumer, map/reduce style.

Day 13 – Roasting Rakudo Star

When is Perl 6 going to be Ready? We get this question a lot in the Perl 6 community, and the answer is never as simple as we or the inquirers would like.

One part of the answer involves the specification: when we have an implementation that passes all of the tests marked “perl 6.0”, that will be a Perl 6.

Many people think of the specs as the Synopses, but Patrick Michaud makes a good point that the specification is really more about the tests.

Thus it was recognized early on (in Synopsis 1) that acceptance tests provide a far more objective measure of specification conformance than an English description. There are likely things that need to be “spec” that cannot be fully captured by testing… but I still believe that the test suite should be paramount.

Every language feature must have corresponding spec tests. Trying to find a test? Tests are broken up first by Synopsis (which themselves follow the numbering scheme of the Camel chapters), with multiple directories broken out by a group of features, then individual tests. For example, Synopsis 4 (S04) is about Blocks and Statements, including phasers. So to find the tests for the BEGIN phaser, you’ll want S04-phasers/begin.t in the roast suite.

We call the specification tests roast to follow in the tradition of “smoke test”, and also because TimToady can’t resist a punny backronym: “Repository Of All Spec Tests”.

Each of these files tries to thoroughly test something from the Synopsis including a lot of edge cases that aren’t necessarily mentioned in the prose. This goes to Patrick’s point about the tests being the more canonical answer about what the spec is.

Are we there yet?

Just a little further… 

So, how can we use these tests to determine if we’re done?

Each time a developer adds a feature or a language designer documents something in the Synopses, the team must add corresponding tests in roast – each change in the Synopses text may potentially increase the gap between prose and tests, and we have to regularly verify that both sets of documents are in agreement.

The tests even turn out to inform the prose, because they are concrete code – if you cannot write a coherent test, not just because it isn’t implemented in one of the compilers yet, but because it’s inefficient or breaks other tests, this will in turn require changes to the Synopses. Over the course of Perl 6’s development, compiler authors have pushed back on the specification in this way.

Before the compiler author checks in any code, ideally they should run the full spectest suite (roast) to ensure not only that their new tests work, but also that nothing else broke. This can be time consuming, so it’s possible they might just run a few test files for that particular feature or Synopsis.

So, despite best efforts, failing tests might be introduced. Even if you run all of roast, something might work on your compiler, but might have problems on another. Or another VM, or another OS, or hardware, or… So, there’s a need for regular testing that’s outside of the normal code/test/commit (or test/code/commit, if you like) workflow.

Speaking of other compilers and VMs, the current landscape of Perl 6 compilers is dominated by Rakudo. It has the most passing tests, two functioning backends (parrot and the JVM), a third on the way (MoarVM), and a possible fourth (JavaScript) landing in 2014. All of the virtual machines here support a wide variety of actual hardware and OSes. Niecza is another compiler, implemented in C#. It passes a substantial number of tests. We also test Pugs (targeting Haskell), but only for historical reasons.

Roasting

It can take a while to build and run the full roast suite for even a single compiler, and we are trying to keep track of between four and six at the moment (and that’s just for one architecture!). So, with limited infrastructure, rather than having continuous integration, we set up a single daily run that builds the latest version of every compiler from a fresh checkout using a shared copy of roast (so we can compare like to like), and then saves out the information into a github repository so we can see the current state. Github provides a nice interface for viewing the CSV data.

So, every day, we get a list of which test files are failing for each compiler/backend, and which of the compilers is in the lead. When Rakudo/JVM started passing more spectests than Rakudo/Parrot, we were able to see that immediately on the daily run. Given the historical data available in the github repository, one could easily chart out things like:

[Chart: number of passing tests per compiler/backend on the first of each month during 2013]

Ecosystem 

But that’s just the compiler. The Rakudo team bundles a distribution called Rakudo Star (also spelled Rakudo *) that includes many modules from the Perl 6 ecosystem – kind of a mini-CPAN. This source distribution includes everything you need to build Rakudo from scratch and get a bunch of usable modules. Right now it’s Rakudo/Parrot, but work is being done for the distribution to support Rakudo/JVM and other backends.

We’ve had issues in the past where modules didn’t keep up with spec changes, and the person cutting the Star release would find issues with modules just before a release, causing delays.

Now we have a daily test that builds Rakudo Star using the latest version of every module, Rakudo and NQP, and runs all the module tests, allowing us to catch any deprecation warnings or test errors as soon as they are made, rather than when we’re trying to cut a release. It’s plain text at the moment, but functions well as a warning indicator.

Test to the Future

Two other projects are in place for testing:

  • Colomon has an ad hoc process that tests all the modules in the ecosystem, not just star.
  • japhb has created a benchmark suite to help us prevent performance regressions across the various compilers. Here’s a video presentation.

Going forward, we need to set up and encourage the use of a smoke server, so that we can take the daily runs on the testing platform and combine them with the results from other platforms, compilers, etc.

Drop by the #perl6 channel on freenode or leave a comment here if you want to chat more about testing!

[Coke]


Day 12 – Slicing with adverbs, the only way!

My involvement with adverbs in Perl 6 began very innocently. I had the idea of creating a small, lightning-talk-sized presentation about how the Perl 5 fat arrow corresponds to Perl 6’s fat arrow and adverbs, and how they relate to hash / array slices. And then I found out you couldn’t combine them on hash / array slices, nor could you pass values to them.

And so started my first bigger project on Rakudo Perl 6: making adverbs work as specced on hashes and arrays, and, along the way, expanding the spec as well. So, do they work now? Well, all spectests pass. But while preparing this blog post, I happened to find a bug which is now waiting for my further attention. There’s always one more bug.

What are the adverbs you can use with hash and array slices?

name      description
:exists   whether element(s) exist(ed)
:delete   remove element(s), return value (if any)
:kv       return key(s) and value(s) as Parcel
:p        return key(s) and value(s) as Parcel of Pairs
:k        return key(s) only
:v        return value(s) only

:exists

This adverb replaces the now deprecated .exists method. Adverbs provide a generic interface to hashes and arrays, regardless of number of elements requested. The .exists method only ever allowed checking for a single key.

Examples speak louder than words. To check whether a single key exists:

$ perl6 -e 'my %h = a=>1, b=>2; say %h<a>:exists'
True

If we expand this to a slice, we get a Parcel of boolean values:

$ perl6 -e 'my %h = a=>1, b=>2; say %h<a b c>:exists'
True True False

Note that if we ask for a single key, we get a boolean value back, not a Parcel with one Bool in it.

$ perl6 -e 'my %h = a=>1, b=>2; say (%h<a>:exists).WHAT'
(Bool)

If it is clear that we ask for multiple keys, or not clear at compile time that we are only checking for one key, we get back a Parcel:

$ perl6 -e 'my %h = a=>1, b=>2; say (%h<a b c>:exists).WHAT'
(Parcel)
$ perl6 -e 'my @a="a"; my %h = a=>1, b=>2; say (%h{@a}:exists).WHAT'
(Parcel)

Sometimes it is handier to know if something does not exist. You can easily do this by negating the adverb by prefixing it with !: they’re really just like named parameters anyway!

$ perl6 -e 'my %h = a=>1, b=>2; say %h<c>:!exists'
True

:delete

This is the only adverb that actually can make changes to the hash or array it is (indirectly) applied to. It replaces the now deprecated .delete method.

$ perl6 -e 'my %h = a=>1, b=>2; say %h<a>:delete; say %h.perl'
1
("b" => 2).hash

Of course, you can also delete slices:

$ perl6 -e 'my %h = a=>1, b=>2; say %h<a b c>:delete; say %h.perl'
1 2 (Any)
().hash

Note that the (Any) is the value returned for the non-existing key. If you happened to have given the hash a default value, it would have looked like this:

$ perl6 -e 'my %h is default(42) = a=>1, b=>2; say %h<a b c>:delete; say %h.perl'
1 2 42
().hash

But the behaviour of is default probably warrants a blog post of its own, so I won’t go into it now.

Like with :exists, you can negate the :delete adverb. But there wouldn’t be much point, as you might as well not have specified it at all. However, since adverbs are basically just named parameters, you can make the :delete attribute conditional:

$ perl6 -e 'my $really = True; my %h = a=>1, b=>2; say %h<a b c>:delete($really); say %h.perl'
1 2 (Any)
().hash

Because the value passed to the adverb was true, the deletion actually took place. However, if we pass a false value:

$ perl6 -e 'my $really; my %h = a=>1, b=>2; say %h<a b c>:delete($really); say %h.perl'
1 2 (Any)
("a" => 1, "b" => 2).hash

It doesn’t. Note that the return value did not change: the deletion was simply not performed. This can e.g. be very handy if you have a subroutine or method doing some kind of custom slice, and you want to have an optional parameter indicating whether the slice should be deleted as well: simply pass that parameter as the adverb’s value!

:kv, :p, :k, :v

These 4 attributes modify the returned values from any hash / array slice. The :kv attribute returns a Parcel with keys and values interspersed. The :p attribute returns a Parcel of Pairs. The :k and :v attributes return the key only, or the value only.

$ perl6
> my %h = a => 1, b => 2;
("a" => 1, "b" => 2).hash
> %h<a>:kv
a 1
> %h<a>:p
"a" => 1
> %h<a>:k
a
> %h<a>:v
1

Apart from modifying the return value, these attributes also act as a filter for existing keys only. Please note the difference in return values:

> %h<a b c>
1 2 (Any)
> %h<a b c>:v
1 2

Because the :v attribute acts as a filter, there is no (Any). But sometimes, you want to not have this behaviour. To achieve this, you can negate the attribute:

> %h<a b c>:k
a b
> %h<a b c>:!k
a b c

Combining adverbs

You can also combine adverbs on hash / array slices. The most useful combinations are with one or two of :exists and :delete, with zero or one of :kv, :p, :k, :v. Some examples, like putting a slice out of one hash into a new hash:

$ perl6 -e 'my %h = a=>1, b=>2; my %i = (%h<a c>:delete:p).list; say %h.perl; say %i.perl'
("b" => 2).hash
("a" => 1).hash

Or the keys that were actually deleted:

$ perl6 -e 'my %h = a=>1, b=>2; say %h<a b c>:delete:k'
a b

We actually have a spec that describes which combinations are valid, and what they should return.

Arrays are not Hashes

Apart from hashes using {} for slices and arrays using [] for slices, the adverbial syntax for hash and array slices is the same. But there are some subtle differences. First of all, the “key” of an element in an array is its index. So, to show the indexes of elements in an array that have a defined value, one can use the :k attribute:

$ perl6 -e 'my @a; @a[3] = 1; say @a[]:k'
3

Or, to create a Parcel with all elements in an array:

$ perl6 -e 'my @a; @a[3] = 1; say @a[]:!k'
0 1 2 3

However, deleting an element from an array is similar to assigning Nil to it, so it will return its default value (usually (Any)):

$ perl6 -e 'my @a = ^10; @a[3]:delete; say @a[2,3,4]; say @a[2,3,4]:exists'
2 (Any) 4
True False True

If we have specified a default value for the array, the result is slightly different:

$ perl6 -e 'my @a is default(42) = ^10; @a[3]:delete; say @a[2,3,4]; say @a[2,3,4]:exists'
2 42 4
True False True

So, even though the element “does not exist”, it can return a defined value! As said earlier, that may become a blog post for another day!

Conclusion

Slices with adverbs are a powerful way of handling your data structures, be they hashes or arrays. It will take a while to get used to all of the combinations of adverbs that can be specified. But once you’re used to them, they provide you with a concise way of dicing and slicing your data that would previously have involved more elaborate structures with loops and conditionals. Of course, if you want to, you can still do that: it’s not illegal to program Perl 5 in Perl 6 :-)

Day 11 – Installing Modules

“Honey, I can’t find my keys!”
– “Hmmm, have you already looked at home or site?”

Preface: This post is about a new feature which currently resides in the branches rakudo/eleven and panda/eleven.

So this post is about installing “modules” and finding them again later. I quoted the word “modules” here because we are not really talking about modules. Even if by that term we mean classes, roles, grammars and every other packagy type, we’re in fact talking about distributions.

That is what we see when we look at modules.perl6.org. These things that have a name, an author or authority and hopefully a version, are the things that provide compilation units which then can be loaded later using statements like use Foo, need Bar or require Baz.

But these distributions can ship other information as well: executable scripts, or music, graphics or fonts that are used by another application.
And this bunch of information is put in a paper bag called a distribution, labeled with name/auth/ver, meant to be downloaded by an installer (panda), placed safely on your harddisk, your stick or a webspace, and easily locatable later when we need it.

But we are devs, right? We want to use our in-development modules without the need to install them. So there should be a way of telling the compiler that we have a directory structure where our github clones are. These directories should be considered when searching for candidates for a use statement. And since we are lacking the paper bag in such a situation, these should be preferred, whatever name/auth/version trait a use statement may have attached.

This could be one of our rules of thumb: modules that are not installed but found in a known path are preferred over installed ones.

Our first crux, or: Use it.

use Foo:ver<1.2.3> does not mean you are loading a module Foo with version v1.2.3. You are instead loading a package Foo that is part of a distribution which has the required version and provides such a namespace.

All right, we are all good hackers, we can handle that. We would just need a (sort of) database where we can put all installed distributions, which we would query later, say, when use-ing a module.

After a few days and the first prototype, we would come to a point where we play with panda, our installer toolchain.
We would be ready insofar as panda would install dists into our database. Our tests would show that we could load these installed modules by name, auth and version, even when several distributions supply modules that only differ by version number.
Wasn’t that hard… All fine now?

The second crux, or: The installer installs the installer.

Even panda itself must be installed in our new environment. And that becomes interesting in two ways. We take the pathy way first:
What panda does when we execute its bootstrap.pl script is load the not-yet-installed File::Find, for example, compile it, and install it to the destination path, just to pick it up again to compile Shell::Command. That breaks our rule of thumb badly: now an installed module should be preferred.
It seems like we would need some sort of ordering there.¹

The third crux, or: I thought it was all about modules?

Panda (or perhaps pandora) offers another box for us: It is our first distribution that has executable files.
Okay, we have a problem here. Our task is to install several versions of the same distribution, but all of them are going to provide executables with the same name, likely with different functionality.
Clearly we need a way of invoking the correct executable. Our shell would just pick whichever executable is found in PATH first. We need something better.
What if we only created one `bin` folder per installation repository? We could have a script that delegates to the correct version of the wanted executable. Querying our wrapper would then look like this:

panda --ver=1.2 install Foo::Bar

Our wrapper would only need to know about parameters named `--auth`, `--name` and `--ver`, and would just pass everything else to the original executable, panda in this case.
Luckily this helps us in another aspect: we could also install wrappers like panda-p and panda-j, which would explicitly invoke the Parrot and JVM backends.

The final chapter.

Let us forget about the subjunctive for a moment, what can we do *now*?

There are two interesting branches: rakudo/eleven and panda/eleven, named after today’s date and the fact that the corresponding spec is S11.
With these two branches you are able to:

  1. configure your directories for vendor, perl, site and home and also your development paths using the libraries.cfg.
  2. bootstrap panda, which gives you panda, panda-p and panda-j executables
  3. install modules the “new” way, and also locate them in the following way:
    use Foo:ver(*);
    use Foo:ver(1.*);
    use Foo:ver(/alpha$/);
    use Foo:auth<FROGGS>
    use Foo:auth({ .substr(0,3) eq 'Bar' });
    ...
  4. you can invoke executables like:
    myscript --auth=Peter rec0001.wav
    yourscript --ver="2.*" index.html
    ...
    

I hope this will land in the master/nom branch soon, but I think there are a few glitches that need to be discovered and fixed before doing so. (One glitch might simply be a lack of Windows® testing on my side.)

Another glitch, now that I think about it: When you load a specific version of a module or execute a script, the magic must make sure that it prefers its own distribution when it loads modules without the need to specify this in the use statements. Otherwise you would execute the panda script version v1 while this loads modules of version v2.
This will require additional thought in the S11 specification.

A note for module authors:

You probably know about the META.info file; in most cases you need to add a “provides” section as shown here.
Without that, the packages can’t be used. This “provides” section will not break current code, so please add it.
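
A minimal sketch of what such a section could look like (the module names and paths here are made up):

{
    "name"     : "Foo::Bar",
    "version"  : "0.1.0",
    "provides" : {
        "Foo::Bar"       : "lib/Foo/Bar.pm",
        "Foo::Bar::Util" : "lib/Foo/Bar/Util.pm"
    }
}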

¹) You can set the ordering of the repositories in your libraries.cfg and in -I compiler switches like:

perl6 -ICompUnitRepo::Local::File:prio[10]=/home/peter/project-a:/home/peter/project-b