Author Archive

Day 19 – Perl 6 Supplies Reactive Programming

December 19, 2013

Several days back, we took a look at promises and channels. Promises provided a synchronization mechanisms for asynchronous things that produced a single result, while channels were ideal for setting up producer/consumer style workflows, with producers and consumers able to work in parallel. Today we’ll take a look at a third mechanism aimed at introducing asynchrony and coping with concurrency: supplies!

Synchronous = Pull, Asynchronous = Push

One of the most important features of promises is the then method. It enables one or more things to be run whenever the asynchronous work the promise represents is completed. In a sense, it’s like a publish/subscribe mechanism: you call then to subscribe, and when the work is done then the notification is published. This happens asynchronously. You don’t sit around waiting for the promise to complete, but instead just say what to do when this takes place. Thinking about the way data flows, values are pushed along to the next step of the process.

This is rather different to things like sub and method calls. There, we make the call, then we block until it has completed. Iterating over a lazy list is the same: you can’t progress with the iteration until the next value is available. Thus, iteration is really about pulling stuff from a source. So, if a promise can be thought of as something that can push a single values out as it becomes available, do we have something that can push a whole stream of values outwards, as they are produced over time?

Supplies! We do!

A while back, it was realized that the observer pattern is the mathematical dual of the iterator pattern. Why is this exciting? Quite simply, because it means that all the things you can sensibly do with something you can iterate (map, grep, zip, etc.), you can also sensibly do with something you can observe. Out of this was born reactive programming, and the Rx (Reactive Extensions) library, which has now been ported to many platforms. In Perl 6, we’re providing support for this in core.

The basics

Let’s start out simple. First, we create a new Supply:

my $measurements = Supply.new;

We can then tap the supply, passing a closure that should be called whenever a value is made available:

$measurements.tap(-> $value {
    say "Measured: $value";
});

Finally, we produce some values:

$measurements.more(1.5);
$measurements.more(2.3);
$measurements.more(4.6);

On each of these calls, the closure we tapped the supply with is invoked. Just as we can call then many times, so we can tap many times too:

$measurements.tap(-> $value {
    say "Also measured: $value";
});

Now, when we produce a value:

$measurements.more(2.8);

Both of the closures tapping the supply will be called. Note that tap returns an object which can be used to express you’re no longer interested in the supply, essentially turning that tap off.

Note that we didn’t introduce any asynchrony so far. However, supplies are built for it. You can safely have multiple threads supplying values. By default, the supplying thread is used to execute the taps.

Enter the combinators!

Since supplies are essentially a thread-safe observer implementation, we can define many of the same things on them as we’re used to having on lists. For example, imagine we just wanted to tap high measurements. Well, we just re-use knowledge from how we’d filter a list: using grep!

$measurements.grep(* > 4).tap(-> $value {
    say "HIGH: $value";
});

Calling grep on a supply produces another supply, just as calling grep on a list gives another list. We could, if we wished, store it in a variable and tap this derived supply many times, grep it again, map it, etc.

Supply factories

There are ways to get supplies besides simply creating them directly. The Supply class has various factory methods that create various interesting kinds of supply, while introducing asynchrony. For example, interval gives a supply that, when tapped, will produce an ascending integer once per time interval.

my $secs = Supply.interval(1);
$secs.tap(-> $s { say "Started $s seconds ago" });
sleep 10;

Factories can also help map between paradigms. The Supply.for method produces a supply that, when tapped, will iterate the specified (potentially lazy) list and push the values out to the tapper. It does the iteration asynchronously. While it’s not implemented yet, we’ll be able to define a similar mechanism for taking a Channel and tapping each value that is received.

Crossing the streams

Some of the most powerful – and tricky to implement – combinators are those that involve multiple supplies. For example, merge gives a single supply whose values are those of the two other supplies it tapped, and zip pairs together values from two different supplies. These are tricky to implement because it’s entirely possible that two different threads will be supplying values. Thankfully, though, we just need to manage this once inside of the supplies implementation, and save those using them from worrying about the problem! In a sense, combinators on lists factor out flow control, while combinators on supplies factor out both flow control and synchronization. Both let us program in a more declarative style, getting the imperative clutter out of our code.

Let’s bring all of this together with an example from one of my recent presentations. We simulate a situation where we have two sets of readings coming in: first, measurements from a belt, arriving in batches of 100, which we need to calculate the mean of, and second another simpler value arriving once every 5 seconds. We want to label them, and get a single supply merging these two streams of readings together. Here’s how it can be done:

my $belt_raw = Supply.interval(1).map({ rand xx 100 });
my $belt_avg = $belt_raw.map(sub (@values) {
    ([+] @values) / @values
});
my $belt_labeled = $belt_avg.map({ "Belt: $_" });
my $samples = Supply.interval(5).map({ rand });
my $samples_labeled = $samples.map({ "Sample: $_" });
my $merged = $belt_labeled.merge($samples_labeled);
$merged.tap(&say);
sleep 20;

Notice how it’s not actually that much harder than code that maps and greps lists we already had – and yet we’re managing to deal with both time and concurrently arriving data.

The future

Supplies are one of the most recently implemented things in Rakudo, and what’s done so far works on Rakudo on the JVM. With time, we’ll flesh out the range of combinators and factories, keep growing our test coverage, and deliver this functionality on Rakudo on MoarVM too, for those who don’t want to use the JVM.

Day 14 – Asynchronous Programming: Promises and Channels

December 14, 2013

Some of the most exciting progress in Perl 6 over the last year has been in the area of asynchronous, concurrent and parallel programming. In this post, we’ll take a look at two of the language features that relate to this: promises and channels. But first…

A Little Design Philosophy

Threads and locks are the assembly language of parallel programming. In the spirit of “make the hard things possible”, Perl 6 does let you spawn a thread and provide you with a Lock primitive. But these are absolutely aimed at those doing the hard things. I’ve written, code-reviewed and taught parallel programming in languages where these were the primary primitives for a while. Doing code reviews was often a fairly depressing affair. It’s not just that there were bugs, it’s that often it felt like the approach taken by the code’s author was, “just throw locks all over the place and all will be well”.

In this post, I’ll focus on the things we have in Perl 6 to help make the easy things easy. They are designed around a number of principles:

  • The paradigms we provide should have a strong focus on being composable, to make it easy to extend, re-use and refactor code
  • Furthermore, it should be easy to compose the various paradigms together, as well as having ways to move between the synchronous and asynchronous worlds where needed
  • Both asynchrony and synchronization should be explicit, happen at clearly defined boundaries, and be done at a fairly high level

In general, the Perl 6 approach is that you achieve concurrency by decomposing a problem into many pieces, communicating through the provided synchronization mechanisms (those in the language, and no doubt a bunch of extra ones that will be provided by the module ecosystem over time). The approach is not about mutating shared memory. That’s decidedly in the “hard things possible” category. The fact that it’s really hard to get right is the main problem, but from a performance perspective, lots of threads competing to write to the same bit of memory is the worst case for CPU caches – which really matter these days.

Promises

A promise is a synchronization primitive for a piece of asynchronous work that will produce a single result at some point in the future, or fail to do so because something went wrong. Different languages have evolved different terms for this idea, or use the terms with different nuances. Both “future” and “task” are often used.

The easiest way to create a promise is:

my $p10000 = start {
    (1..Inf).grep(*.is-prime)[9999]
}

This schedules the work in the block to be done. By default, this means it will be scheduled to run on a pool of threads. Thus, start introduces asynchrony into a program. We continue by executing the next line of code, and the work we specified will be done on another thread. If it runs to completion and produces a result, we say that the promise was kept. If, by contrast, it dies by throwing an exception, then we say the promise was broken.

So, what can you do with a promise? Well, you can ask it for the result:

say $p10000.result;

This blocks until the promise is kept or broken. If it is kept, the value it produced is returned. If it’s broken, the exception is thrown. There’s a neater way to write this:

say await $p10000;

This may take many promises, and so you can do things like:

my @quotes = await @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });

Although this will throw an exception if any of them fail. Thus, we may wish to wait on all of them, then just extract those that produced a result:

my @getting = @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });
await Promise.allof(@getting);
my @quotes = @getting.grep(*.status == Kept).map(*.result);

There’s something a little interesting in there: allof. This is an example of a promise combinator: something that takes one or more promises as its arguments and creates some kind of composite promise that relates to them. And this brings us to the next interesting and important thing: a promise need not be backed by a piece of asynchronously executing code! For example, we can create a promise that will be kept after a certain amount of time has elapsed:

my $kept_in_10 = Promise.in(10);

Thus, we might provide a basic timeout mechanism, making sure any exchange that doesn’t give us a result in 5 seconds doesn’t get blocked on:

my @getting = @currency_exchanges.map(-> $ex { start { $ex.get_quote($val) } });
await Promise.anyof(Promise.allof(@getting), Promise.in(5));
my @quotes = @getting.grep(*.status == Kept).map(*.result);

Of course, sitting around and waiting for results is just one thing we can do with a promise. We can also provide things that should be done upon the promise being completed. These will also be scheduled and run asynchronously. For example:

my $p10000 = start {
    (1..Inf).grep(*.is-prime)[9999]
}
my $base16 = $p10000.then(sub ($res) {
    $res.result.base(16)
});
my $pwrite = $base16.then(sub ($res) {
    spurt 'p10000.txt', $res.result;
    return 'p10000.txt';
});

Here, we use then in order to specify something that should be done after the promise is kept or broken. This also returns a promise, meaning you can chain another operation into the process. And you can call then multiple times on one promise too, giving a kind of one-off publish/subscribe mechanism (see a future article on supplies for a much richer way to do this kind of thing, however). Note that promise takes care internally to make sure races work out OK (for example, the work being done in the promise is already completed by the time we call then).

You can also create your own promises, keeping or breaking them as you desire. This is as simple as:

# Create the promise.
my $p = Promise.new;

# Take the "vow" object, used to keep/break it.
my $v = $p.vow;

# Later, one of...
$v.keep($result);
$v.break($exception_or_message);

Thus, you can write your own promise factories and combinators too.

Channels

A promise is OK for conveying a single result, but what about producer/consumer scenarios where the producer will produce many values over time, and the consumer will process them as they are available? This is where a channel can come in useful.

Let’s say we want to read in a bunch of INI configuration files, parse each one using a grammar, and then flatten the configuration results into a single hash. There are three distinct steps here, in a producer/consumer relationship, which we can do in parallel. While the final result is a single value, and so a promise feels suitable, there are many files to read and parse. This is where channels come in. Let’s explore them using this example.

First, here is the top level of the program:

sub MAIN() {
    loop {
        my @files = prompt('Files: ').words;
        read_all(@files);
    }
}

This prompts the user for a bunch of filenames, then calls read_all. This is a little more interesting:

sub read_all(@files) {
    my $read = Channel.new;
    my $parsed = Channel.new;
    read_worker(@files, $read);
    parse_worker($read, $parsed) for 1..2;
    my %all_config = await config_combiner($parsed);
    say %all_config.perl;
}

This creates two channels, $read and $parsed. The $read channel will be used by read_worker in order to send the contents of each of the files it reads in along to the parse_worker. Here is read_worker:

sub read_worker(@files, $dest) {
    start {
        for @files -> $file {
            $dest.send(slurp($file));
        }
        $dest.close();
        CATCH { $dest.fail($_) }
    }
}

It uses the send method in order to send along the contents of each file it slurps. After slurping them all, it calls last on the channel to indicate there will be no more. The CATCH block calls fail on the channel to indicate that the producer failed. This will, when reached, throw an exception in the consumer. A channel that has had last or fail called on it can no longer be used to send values. Finally, the whole thing is wrapped in a start block so it is done on a thread in the thread pool.

The parse_worker is a little more interesting:

sub parse_worker($source, $dest) {
    my grammar INIFile {
        token TOP {
            ^
            <entries>
            <section>+
            $
        }

        token section {
            '[' ~ ']' <key> \n
            <entries>
        }

        token entries {
            [
            | <entry> \n
            | \n
            ]*
        }

        rule entry { <key> '=' <value> }

        token key   { \w+ }
        token value { \N+ }

        token ws { \h* }
    }

    my class INIFileActions {
        method TOP($/) {
            my %result;
            %result<_> = $<entries>.ast;
            for @<section> -> $sec {
                %result{$sec<key>} = $sec<entries>.ast;
            }
            make %result;
        }

        method entries($/) {
            my %entries;
            for @<entry> -> $e {
                %entries{$e<key>} = ~$e<value>;
            }
            make %entries;
        }
    }

    start {
        loop {
            winner $source {
                more $source {
                    if INIFile.parse($_, :actions(INIFileActions)) -> $parsed {
                        $dest.send($parsed.ast);
                    }
                    else {
                        $dest.fail("Could not parse INI file");
                        last;
                    }
                }
                done $source { last }
            }
        }
        $dest.close();
        CATCH { $dest.fail($_) }
    }
}

It starts off with a grammar and actions class for INI files. We then sit in a loop, watching the $source channel, which is the one that read_worker is placing results in. If a channel has one more value available, then the more block will be called. Inside it, $_ will contain the slurped contents of an INI file. We then parse it, and provided this worked out send along the hash of hashes representing the INI file’s content (sections at the top level, then key/value pairs). Again, we take care to call fail and last appropriately.

Finally, config_combiner takes each of those hash of hashes, and does the work to combine them into a single hash. It uses a promise to convey the final, single, result.

sub config_combiner($source) {
    my $p = Promise.new;
    my $v = $p.vow;
    start {
        my %result;
        loop {
            winner $source {
                more $source {
                    for %^content.kv -> $sec, %kvs {
                        for %kvs.kv -> $k, $v {
                            %result{$sec}{$k} = $v;
                        }
                    }
                }
                done $source { last }
            }
        }
        $v.keep(%result);
        CATCH { $v.break($_) }
    }
    return $p;
}

And there we have it: a program using promises and channels happily together, in a producer/consumer, map/reduce style.

Day 03 – Rakudo Perl 6 on the JVM

December 3, 2013

There have been a number of exciting developments for Perl 6 during 2013. In this post, we’ll take a look at one of them in some detail: running Perl 6 on the JVM (Java Virtual Machine).

Why the JVM?

There are many reasons for a language to have an implementation that targets the JVM. Here are some that drove us to bring Perl 6 to this platform.

  • The JVM is a stable, widely deployed,  trusted-in-the-enterprise platform. There are places where they don’t mind which language you write it, but they do care that it can run on the JVM.
  • The JVM has been very well optimized over the years. It most certainly isn’t fast to get started – but for long running things it typically performs well.
  • These days, the JVM is most certainly not just for Java. In fact, the commitment to run other languages – including those very different to Java – is serious. For example, there’s now a yearly JVM Language Summit, and the invokedynamic instruction and infrastructure was added in JDK7, and being improved in JDK8. Since Perl 6 is a gradually typed language, a VM that can play host to both static and dynamic languages is a good fit. Furthermore, every other major dynamic language is on the JVM. So, why not Perl too?
  • The JVM has widely used, well exercised support for concurrent, parallel and asynchronous programming. A wide range of primitives are available. Given that before this year, the Perl 6 story in these areas was also rather weak so far as implementation went, being on the JVM would provide an opportunity for fast prototyping and exploration, to help drive things forward.

But…another Perl 6 implementation?!

Implementing Perl 6 is a large undertaking – as those of us who got sucked into the process along the way have discovered. Many languages have got to the JVM by having a JVM-specific implementation of the language: JRuby, Jython, Nashorn, etc. For Perl 6, we’ve taken a different path.

The Rakudo Perl 6 compiler may only have targeted Parrot for much of its life, but those designing it have had VM portability in mind for a good while. Furthermore, the basic architecture has always been to have strongly isolated compilation stages, communicating by well-defined data structures. This put Rakudo in a good place to gain a JVM backend – at least, in theory. Over the course of the last year, what we hoped would work out well in theory has played out very nicely in practice.

The vast majority of the Rakudo codebase is not in any way VM-specific. Better still, the bits that need to change most often and that undergo most active development are almost always VM-independent. Many developers working on Rakudo test their changes against a single backend, and it’s relatively uncommon to find breakage on the other backend as a result. That said, we have automated daily spectest runs to catch any regressions.

Status

First, let’s consider the specification test suite. You might think at this point I’d mention how close Rakudo on JVM is to passing the number of specification tests that Rakudo on Parrot does. In fact, we need to do it the other way around these days: Rakudo on Parrot passes 99.64% of the spectests that Rakudo on the JVM does. “Huh,” you might think. “How’d the JVM backend come out ahead?” The answer is relatively simple: the JVM backend runs a bunch of concurrency tests that we don’t run on the Parrot backend. There actually are a small number of tests (tens rather than hundreds) that only pass on Rakudo on Parrot, largely due to “interesting” edge-case behaviors that have yet to be hunted down. However, these days the vast majority of programs run unmodified on both.

In the wider ecosystem, things are not quite so polished yet. Panda, the module installer, runs on Rakudo on the JVM. However, a number of modules depend on the NativeCall library, for calling into native code. The NativeCall porting effort is very much underway; last time I looked at it I could do basic things, like calling simple Win32 APIs. But it’s not all the way there yet. This is, however, really the last major missing piece. To say this time last year, we couldn’t run Perl 6 on the JVM at all, we’ve come a very long way.

Is it faster?

Well, it depends. For quick one liners and short-running scripts? No, startup will kill you. For something long running? Yes, usually it’s faster, and sometimes it’s significantly faster (perhaps five times or even forty times). And that’s before we’ve really done a great deal of optimization work on the JVM backend; the focus thus far has largely been “make it work”.

Can I call Java libraries?

Yes, but… :-) We do have some basic interop support in place already. Here’s an example:

use java::util::zip::CRC32:from<java>;

my $crc = CRC32.new();
for 'Hello, Java'.encode('utf-8') {
    $crc.'method/update/(B)V'($_);
}
say $crc.getValue();

It doesn’t look all that bad until you hit the method call in the loop. What’s that funny method/update/(B)V thing about? In Java you can statically overload methods. When there’s no overloading, we quite happily give you a short name. When there’s multiple, for now you need to use the JVM’s method descriptors to indicate the desired one. We’ll improve that, and many other aspects of interop, over the coming months. In summary, it’s often quite possible to call code from Java libraries today, it’s just not pleasant yet.

The future

Much has been done, yet of course there’s still plenty to do. Once NativeCall support is in shape, we’ll be able to add the JVM as an option to the Rakudo Star distribution release (for now, it’s only available in compiler releases – or fresh from Git, of course). Beyond that, the main areas of focus will be convergence, Java interop and performance. Given that this year took us from zero JVM support to Rakudo on JVM being the implementation passing the most spectests, it’s exciting to think where we’ll be in another year from now.

Day 15 – Phasers set to stun

December 15, 2012

When writing programs, it’s important not only to separate the concerns that need separating, but also to try and keep related things close to each other. This gives the program a sense of cohesion, and helps to avoid the inevitable problems that arise when updating one part of a program demands an update in another far-away part. One especially tricky problem can be when the things we want to do are distributed over time. This can cause us to move related things apart in order to get them to happen at the times we want.

Phasers in Perl 6 help you keep related concepts together in your code, while also indicating that certain aspects of them should happen at different points during the lifetime of the current program, invocation or loop construct. Let’s take a look at some of them.

ENTER and LEAVE

One of the things I had most fun writing in Perl 6 recently was the debugger. There are various things that need a little care. For example, the debugger needs to look out for exceptions and, when they are thrown, give the user a prompt to let them debug why the exception was thrown. However, there is also a feature where, at the prompt, you can evaluate an expression. The debugger shouldn’t re-enter itself if this expression throws, so we need to keep track of if we’re already showing the prompt. This meant setting and clearing a flag. Thing is, the prompt method is relatively lengthy; it has a given/when to identify the various different commands. I could, of course, have set the prompt flag at the start and cleared it at the end. But that would have spread out the concern of maintaining the flag. Here’s what I did instead:

method issue_prompt($ctx, $cur_file) {
    ENTER $in_prompt = True;
    LEAVE $in_prompt = False;

    # Lots of stuff here
}

This ensures the flag is set when we enter the method, cleared when we leave the method – and lets me keep the two together.

INIT and END

We’re writing a small utility and want to log what happens as we run it. Time wise, we want to:

  • Open the log file at the start of the program, creating it if needed and overwriting an existing one otherwise
  • Write log entries at various points during the program’s execution
  • Close the log file at the end

Those three actions are fairly spread out in time, but we’d like to collect them together. This time, the INIT and END phasers come to the rescue.

sub log($msg) {
    my $fh = INIT open("logfile", :w);
    $fh.say($msg);
    END $fh.close;
}

Here, we use INIT to perform an action at program start time. It turns out that INIT also keeps around the value produced by the expression following it, meaning it can be used as an r-value. This means we have the file handle available to us, and can write to it during the program. Then, at the END of the program, we close the file handle. All of these have block forms, should you wish to do something more involved:

sub log($msg) {
    my $fh = INIT open("logfile", :w);
    $fh.say($msg);
    END {
        $fh.say("Ran in {now - INIT now} seconds");
        $fh.close;
    }
}

Note the second use of INIT in this example, to compute and remember the program start time so we can use it in the subtraction later on.

FIRST, NEXT and LAST

These phasers work with loops. They fire the first time the loop body executes, at the end of every loop body execution, and after the last loop body execution. FIRST and LAST are especially powerful in so far as they let us move code that wants to special-case the first and last time the loop body runs inside of the loop construct itself. This makes the relationship between these bits of code and the loop especially clear, and lessens the chance somebody moves or copies the loop and forgets the related bits it has.

As an example, let’s imagine we are rendering a table of scores from a game. We want to write a header row, and also do a little ASCII art to denote the start and end of the table. Furthermore, we’d like to keep track of the best score each time around the loop, and then at the end print out the best score. Here’s how we could write it.

for %scores.kv -> $player, $score {
    FIRST say "Score\tPlayer";
    FIRST say "-----\t------";
    LAST  say "-----\t------";

    NEXT (state $best_score) max= $score;
    LAST say "BEST SCORE: $best_score";

    say "$score\t$player";
}

Notice how we keep the header/footer code together, as well as being able to keep the best score tracking code together. It’s also all inside the loop, making its relationship to the loop clear. Note how the state variable also comes in useful here. It too is a construct that lets us keep a variable scoped inside a block even if its usage spans multiple invocations of the block.

KEEP and UNDO

These are variants of LEAVE that trigger conditional on the block being successful (KEEP) or not (UNDO). A successful block completes without unhandled exceptions and returns a defined value. An unsuccessful block exits due to an exception or because it returns an undefined value. Say we were processing a bunch of files and want to build up arrays of successful files and failed files. We could write something like:

sub process($file) {
    KEEP push @success, $file;
    UNDO push @failure, $file;

    my $fh = open($file);
    # ...
}

There are probably a bunch of transaction-like constructs that can also be very neatly implemented with these two.

And there’s more!

While I’ve covered a bunch of the phasers here, there are some others. For example, there’s also BEGIN, which lets you do some computation at compile time. Hopefully, though, this set of examples gives you some inspiration in how phasers can be used effectively, as well as a better grasp of the motivation for them. Bringing related things together and setting unrelated things apart is something we need to think carefully about every day as developers, and phasers help us keep related concerns together, even if they should take place at different phases of our program’s execution.

Day 10 – Don’t quote me on it…

December 10, 2012

In many areas, Perl 6 provides you with a range of sane defaults for the common cases along with the power to do something a little more interesting when you need it. Quoting is no exception.

The Basics

The two most common quoting constructs are the single and double quotes. Single quotes are simplest: they let you quote a string and just about the only “magic” they provide is being able to stick a backslash before a single quote, which escapes it. Since backslash has this special meaning, you can write an explicit backslash with \\. However, you don’t even need to do that, since any other backslashes just pass on straight through. Here’s some examples.

> say 'Everybody loves Magical Trevor'
Everybody loves Magical Trevor
> say 'Oh wow, it\'s backslashed!'
Oh wow, it's backslashed!
> say 'You can include a \\ like this'
You can include a \ like this
> say 'Nothing like \n is available'
Nothing like \n is available
> say 'And a \ on its own is no problem'
And a \ on its own is no problem

Double quotes are, naturally, twice as powerful. :-) They support a range of backslash escapes, but more importantly they allow for interpolation. This means that variables and closures can be placed within them, saving you from having to use the concatenation operator or other string formatting constructs so often. Here are some simple examples.

> say "Ooh look!\nLine breaks!"
Ooh look!
Line breaks!
> my $who = 'Ninochka'; say "Hello, dear $who"
Hello, dear Ninochka
> say "Hello, { prompt 'Enter your name: ' }!"
Enter your name: Jonathan
Hello, Jonathan!

The second example shows the interpolation of a scalar, and the third shows how closures can be placed inside double quoted strings also. The value the closure produces will be stringified and interpolated into the string. But what about all the other sigils besides “$”? The rule is that you can interpolate all of them, but only if they are followed by some kind of postcircumfix (that is, an array or hash subscript, parentheses to make an invocation, or a method call). In fact, you can put all of these on a scalar too.

> my @beer = <Chimay Hobgoblin Yeti>;
Chimay Hobgoblin Yeti
> say "First up, a @beer[0]"
First up, a Chimay
> say "Then @beer[1,2].join(' and ')!"
Then Hobgoblin and Yeti!
> say "Tu je &prompt('Ktore pivo chces? ')"
Ktore pivo chces? Starobrno
Tu je Starobrno

Here you can see interpolation of an array element, a slice that we then call a method on and even a function call. The postcircumfix rule happily means that we don’t go screwing up your email address any more.

> say "Please spam me at blackhole@jnthn.net"
Please spam me at blackhole@jnthn.net

Choose Your Own Delimiters

The single and double quotes are suitable for a bunch of cases, but what if you want to use a bunch of single or double quotes inside the string? Escaping them would rather suck. Thing is, you could probably make that argument about any choice of quoting characters. So instead of making the choice for you, Perl 6 lets you pick. The q and qq quote constructs expect to be followed by a delimiter. If it’s something with a matching closer, it will look for that (for example, if you use an opening curly then your string is terminated by a closing curly; note that there’s only a finite set of these, and no, it doesn’t include having a comet be terminated by a snowman). Otherwise it looks for the same thing to terminate the string. Note you can also use multi-character openers and closers too (but only by repeating the same character). Otherwise, the q gives you the same semantics as single quotes, and qq gives you the same semantics as double quotes.

> say q{C'est la vie}
C'est la vie
> say q{{Unmatched } and { are { OK } in { here}}
Unmatched } and { are { OK } in { here
> say qq!Lottery results: {(1..49).roll(6).sort}!
Lottery results: 12 13 26 34 36 46

Heredocs

All of the quoting constructs demonstrated so far allow you to include multiple lines of content. However, for that there’s usually a better way: here documents. There can be started with either q or qq, and then with the :to adverb being used to specify the string we expect to find, on a line of its own, at the end of the quoted text. Let’s see how this works, illustrated by a touching story.

print q:to/THE END/
    Once upon a time, there was a pub. The pub had
    lots of awesome beer. One day, a Perl workshop
    was held near to the pub. The hackers drank
    the pub dry. The pub owner could finally afford
    a vacation.
    THE END

The output of this script is as follows:

Once upon a time, there was a pub. The pub had
lots of awesome beer. One day, a Perl workshop
was held near to the pub. The hackers drank
the pub dry. The pub owner could finally afford
a vacation.

Notice how the text is not indented like in the program source. Heredocs remove indentation automatically, up to the indentation level of the terminator. If we’d used qq, we could have interpolated things into the heredoc. Note that this is all implemented by using the indent method on strings, but if your string doesn’t do any interpolation we do the call to indent at compile time as an optimization.

You can also have multiple heredocs, and even call methods on the data that will be located in the heredoc (note the call to lines in the following program).

my ($input, @searches) = q:to/INPUT/, q:to/SEARCHES/.lines;
    Once upon a time, there was a pub. The pub had
    lots of awesome beer. One day, a Perl workshop
    was held near to the pub. The hackers drank
    the pub dry. The pub owner could finally afford
    a vacation.
    INPUT
    beer
    masak
    vacation
    whisky
    SEARCHES

for @searches -> $s {
    say $input ~~ /$s/
        ?? "Found $s"
        !! "Didn't find $s";
}

The output of this program is:

Found beer
Didn't find masak
Found vacation
Didn't find whisky

Quote Adverbs for Custom Quoting Constructs

The single and double quote semantics, also available through q and qq, cover most cases. But what if you have a situation where you want to, say, interpolate closures but not scalars? This is where quote adverbs come in. They allow you to turn certain quoting features on and off. Here’s an example.

> say qq:!s"It costs $10 to {<eat nom>.pick} here."
It costs $10 to eat here.

Here, we use the semantics of qq, but then turn off scalar interpolation. This means we can write the price without worrying about it trying to interpolate the 11th capture of the last regex. Note that this is just using the standard colonpair syntax. If you want to start from a quote construct that supports basically nothing, and then just turn on some options, you can use the Q construct.

> say Q{$*OS\n&sin(3)}
$*OS\n&sin(3)
> say Q:s{$*OS\n&sin(3)}
MSWin32\n&sin(3)
> say Q:s:b{$*OS\n&sin(3)}
MSWin32
&sin(3)
> say Q:s:b:f{$*OS\n&sin(3)}
MSWin32
0.141120008059867

Here we start with a featureless quoting construct, then turn on extra features: first scalar interpolation, then backslash escapes, then function interpolation. Note that we could have chosen any delimiter we wished too.

Quote Constructs are Languages

Finally, it’s worth mentioning that when the parser enters a quoting construct, really it is switching to parsing a different language. When we build up quoting constructs from adverbs, really this is just mixing extra roles into the base quoting language to turn on extra features. For the curious, here’s how Rakudo does it. Whenever we hit a closure or some other interpolation, the language is temporarily switched back to the main language. This is why you can do things like:

> say "Hello, { prompt "Enter your name: " }!"
Enter your name: Jonathan
Hello, Jonathan!

And the parser doesn’t get terribly confused about the fact that the closure being interpolated contains another double quoted string. That is, we’re parsing the main language, then slip into a quoting language, then recurse into the main language again, and finally recurse into the quoting language again to parse the string in the closure in the string in the program. It’s like the Perl 6 parser wants to give us all matryoshka dolls for Christmas. :-)

Day 5 – A Perl 6 Debugger

December 5, 2012

There’s much more to the developer experience of a language than its design, features and implementations. While the language and its implementations are perhaps the thing developers will spend most time with, the overall experience will also involve interaction with the community, reading documentation, using modules and employing various development tools. Thus, it’s important that Perl 6 make progress on these fronts too. Over the past year, we’ve taken some good steps forward in these areas; there’s now doc.perl6.org, the module ecosystem has grown, and the module installation tooling has improved. Another big step forward with regards to tooling – and the topic of this post – is that an interactive Perl 6 debugger is now available.

Running With The Debugger

The debugger has been included with the last few Rakudo * releases. If you have one of those, you’re all set. Just run perl6-debug instead of perl6. It takes the same set of options, so if your normal invocation involves, for example, using the -I flag to set the include path for modules, it’ll Just Work Like Usual. Of course, what happens next is entirely different. The debugger will show you each module it is loading, followed by placing you at the first interesting statement of the program, highlighted in yellow.

dbg0

Note how it takes care to put you on the first line that actually does something, skipping the my statement above it (it’s getting increasingly smart about this).

The Basics

Hitting enter allows you to single-step through the program. At any point, you can look at variables, call methods on variables, or even evaluate expressions.

dbg1

If you want to move statement by statement, but never descend into a function call or method call, type an s, followed by enter. To step out of the current sub (that is, run until it returns then break in its caller), use so to step out. To run the program until it hits an exception, just use r. Even at the point you get an exception, you can still access variables to try and dissect what went wrong.

dbg2

One final variant, rt, will run until an exception is thrown, but handled. You’ll break at the point of the throw. This means you’re not disadvantaged in the debugger if you took care to handle exceptions well in your program; you can still break when they are thrown and use the debugger to help understand why. :-)

Breakpoints

Sometimes, you know exactly where the juicy stuff happens in your program that you wish to debug. If only you could just run until you got there. Turns out you can – that’s what breakpoints are for. We can add one, use r to run, and it will stop where we placed the breakpoint.

dbg3

Note that you don’t have to type out the full name of the file you want to put the breakpoint in; any unambiguous substring of the name of a file that is loaded will be sufficient.

I won’t cover them here, but there are also tracepoints, which instead of breaking will log the value of an expression each time a certain place in the program is hit. Later, you can display the log. It’s like adding print statements, but without the print statement going in your code, removing the risk of them accidentally making it into a commit (‘cus we’ve all done that one, right? :-))

Regex and Grammar Debugging

When the debugger detects you are in a regex or grammar, it offers a little extra help. As well as allowing you to single-step your way through the regex, atom by atom, it also displays the match text, indicating what has been matched so far.

dbg4

Here, you can see that the pattern already successfully matched SELECT, and is now looking for a literal * or will try to call the field list rule. If in a regex, which may backtrack, the match position  jumps backwards when backtracking happens, so you can understand the backtracking behavior of the pattern.

Yes, Perl 5 Regexes Too!

Rakudo has some support for the :P5 adverb on regexes, which allows use of the Perl 5 regex syntax. Here the debugger is used in REPL mode (where you enter an expression, then can immediately debug it) to explore the difference between alternations in Perl 5 and Perl 6 (in Perl 5 they go left to right, in Perl 6 they have longest token matching semantics, such that it tries the thing that will match most characters first).

dbg5

The debugger in REPL mode is great for exploring and understanding how things will execute (and as such can serve as a learning or teaching aid). Another use is for debugging modules without having to write a test script; just write a use statement in the debugger or you can even supply the module using the -M command line flag and the debugger will load it!

What About Funky Stuff, Like Macros?

That is, macros, BEGIN time, eval, and those other things where your Perl 6 program does the time warp again, doing a bit of runtime at compile time or compiling some more stuff at runtime. The debugger is built for it. If a macro is applied, the debugger will place you in it. Notice below how we’re still in the process of loading the second file, and did not get to the third yet – we really are debugging at BEGIN time!

dbg6

Any lines of code that are stripped out by the macro are simply never hit at runtime. And what about statements in quasi blocks? The debugger will take you there, so you not only know what macro was applied, but can dig into exactly what it does too.

dbg7

Just as bits of runtime happening at compile time work out fine, any code that gets eval‘d at runtime is also compiled with debug hooks, meaning that you can step straight into it and debug the evaluated code.

Written in Perl 6 and NQP!

You might think that writing a debugger must involve all kinds of low-level hackery. In fact, that’s not the case. The debug hooks mechanism is written in NQP, and the command line user interface is written in Perl 6. This is significant from a couple of angles. The first is the fact that we can write something like this without breaking the encapsulation of the compiler, but instead just by subclassing the Grammar, Actions and Compiler objects and twiddling with the AST. In fact, the debugger was built without any changes being required to Rakudo as it already existed. This provides important feedback on our compiler architecture – this time, very positive feedback. Things are extensible in the ways they were designed to be. The second is that writing so much of it in Perl 6 is a healthy bit of dogfooding – using the product in order to build further products. My hope is that, since most of what people would want to change is actually written in the Perl 6 part, it will feel quite hackable by the community at large.

And What Of Future Plans?

Various features are still to come: conditional breakpoints, dumping tracepoint output to a file, showing the path taken through a grammar to get to the current point, and various bits of configurability. The command line interface is nice, but of course having some extra options would be even nicer. I’m interested in a web-based interface, but also in integration with tools like Padre. There’s some work afoot on a common protocol for these things, which could make such integration possible without having to re-invent too many wheels. In the meantime, having an interactive debugger which is aware of and works well with a wide range of Perl 6 language features is a solid step forward. Happy debugging, and feature ideas (or patches ;-)) are welcome; here’s the GitHub repo!

The view from the inside: using meta-programming to implement Rakudo

December 18, 2011

In my previous article for the Perl 6 advent calendar, I looked at how we can use the meta-programming facilities of Rakudo Perl 6 in order to build a range of tools, tweak the object system to our liking or even add major new features “from the outside”. While it’s nice that you can do these things, the Perl 6 object system that you get by default is already very rich and powerful, supporting a wide range of features. Amongst them are:

  • Classes
  • Parametric roles
  • Attributes
  • Methods (including private ones)
  • Delegation
  • Introspection
  • Subset (aka. refinement) types
  • Enums

That’s a lot of stuff to implement, but it’s all done by implementing meta-objects, and therefore we can take advantage of OOP – with both classes and roles – to factor it. The only real difference between the meta-programming we saw in my last article and the meta-programming we do while implementing the core Perl 6 object system in Rakudo is that the meta-objects are mostly written in NQP. NQP is a vastly smaller, much more easily optimizable and portable subset of Perl 6. Being able to use it also helps us to avoid many painful bootstrapping issues. Since it is mostly a small subset of Perl 6, it’s relatively easy to get in to.

In this article, I want to take you inside of Rakudo and, through implementing a missing feature, give you a taste of what it’s like to hack on the core language. So, what are we going to implement? Well, one feature of roles is that they can also serve as interfaces. That is, if you write:

role Describable {
    method describe() { ... }
}
class Page does Describable {
}

Then we are meant to get a compile time error, since the class Page does not implement the method “describe”. At the moment, however, there is no error at compile time; we don’t get any kind of failure until we call the describe method at runtime. So, let’s make this work!

One key thing we’re going to need to know is whether a method is just a stub, with a body containing just “…”, “???” or “!!!”. This is available to us by checking its .yada method. So, we have that bit. Question is, where to check it?

Unlike classes, which have the meta-object ClassHOW by default, there  isn’t a single RoleHOW. In fact, roles show up in no less than four different forms. The two most worth knowing about are ParametricRoleHOW and ConcreteRoleHOW. Every role is born parametric. Whether you explicitly give it extra parameters or not, it is always parametric on at least the type of the invocant. Before we can ever use a role, it has to be composed into a class. Along the way, we have to specialize it, taking all the parametric things and replacing them with concrete ones. The outcome of this is a role type with a meta-object of type ConcreteRoleHOW, which is now ready for composition into the class.

So that’s roles themselves, but what about composing them? Well, the actual composition is performed by two classes, RoleToClassApplier and RoleToRoleApplier. RoleToClassApplier is actually only capable of applying a single role to a class. This may seem a little odd: classes can do multiple roles, after all. However, it turns out that a neat way to factor this is to always “sum” multiple roles to a single one, and then apply that to the class. Anyway, it would seem that we need to be doing some kind of check in RoleToClassApplier. Looking through, we see this:

my @methods := $to_compose_meta.methods($to_compose, :local(1));
for @methods {
    my $name;
    try { $name := $_.name }
    unless $name { $name := ~$_ }
    unless has_method($target, $name, 1) {
        $target.HOW.add_method($target, $name, $_);
    }
}

OK, so, it’s having a bit of “fun” with, of all things, looking up the name of the method. Actually it’s trying to cope with NQP and Rakudo methods having slightly different ideas about how the name of a method is looked up. But that aside, it’s really just a loop going over the methods in a role and adding them to the class. Seems like a relatively opportune time to spot the yada case, which indicates we require a method rather than want to compose one into the class. So, we change it do this:

my @methods := $to_compose_meta.methods($to_compose, :local(1));
for @methods {
    my $name;
    my $yada := 0;
    try { $name := $_.name }
    unless $name { $name := ~$_ }
    try { $yada := $_.yada }
    if $yada {
        unless has_method($target, $name, 0) {
            pir::die("Method '$name' must be implemented by " ~
            $target.HOW.name($target) ~
            " because it is required by a role");
        }
    }
    elsif !has_method($target, $name, 1) {
        $target.HOW.add_method($target, $name, $_);
    }
}

A couple of notes. The first is that we’re doing binding, because NQP does not have assignment. Binding is easier to analyze and generate code for. Also, the has_method call is passing an argument of 0 or 1, which indicates whether we want to consider methods in just the target class or any of its parents (note that there’s no True/False in NQP). If the class inherits a method then we’ll consider that as good enough: it has it.

So, now we run our program and we get:

===SORRY!===
Method 'describe' must be implemented by Page because it is required by a role

Which is what we were after. Note that the “SORRY!” indicates it is a compile time error. Success!

So, are we done? Not so fast! First, let’s check the inherited method case works out. Here’s an example.

role Describable {
    method describe() { ... }
}
class SiteItem {
    method describe() { say "It's a thingy" }
}
class Page is SiteItem does Describable {
}

And…oh dear. It gives an error. Fail. So, back to RoleToClassApplier. And…aha.

sub has_method($target, $name, $local) {
    my %mt := $target.HOW.method_table($target);
    return nqp::existskey(%mt, $name)
}

Yup. It’s ignoring the $local argument. Seems it was written with the later need to do a required methods check in mind, but never implemented to handle it. OK, that’s an easy fix – we just need to go walking the MRO (that is, the transitive list of parents in dispatch order).

sub has_method($target, $name, $local) {
    if $local {
        my %mt := $target.HOW.method_table($target);
        return nqp::existskey(%mt, $name);
    }
    else {
        for $target.HOW.mro($target) {
            my %mt := $_.HOW.method_table($_);
            if nqp::existskey(%mt, $name) {
                return 1;
            }
        }
        return 0;
    }
}

With that fixed, we’re in better shape. However, you may be able to imagine another case that we didn’t yet handle. What if another role provides the method? Well, first let’s see what the current failure mode is. Here’s the code.

role Describable {
    method describe() { ... }
}
role DefaultStuff {
    method describe() { say "It's a thingy" }
}
class Page does Describable does DefaultStuff {
}

And here’s the failure.

===SORRY!===
Method 'describe' must be resolved by class Page because it exists
in multiple roles (DefaultStuff, Describable)

So, it’s actually considering this as a collision. So where do collisions actually get added? Happily, that just happens in one place: in RoleToRoleApplier. Here’s the code in question.

if +@add_meths == 1 {
    $target.HOW.add_method($target, $name, @add_meths[0]);
}
else {
    # More than one - add to collisions list.
    $target.HOW.add_collision($target, $name, %meth_providers{$name});
}

We needn’t worry if we just have one method and it’s a requirement rather than an actual implementation – it’ll just do the right thing. So it’s just the second branch that needs consideration. Here’s how we change things.

if +@add_meths == 1 {
    $target.HOW.add_method($target, $name, @add_meths[0]);
}
else {
    # Find if any of the methods are actually requirements, not
    # implementations.
    my @impl_meths;
    for @add_meths {
        my $yada := 0;
        try { $yada := $_.yada; }
        unless $yada {
            @impl_meths.push($_);
        }
    }

    # If there's still more than one possible - add to collisions list.
    # If we got down to just one, add it. If they were all requirements,
    # just choose one.
    if +@impl_meths == 1 {
        $target.HOW.add_method($target, $name, @impl_meths[0]);
    }
    elsif +@impl_meths == 0 {
        $target.HOW.add_method($target, $name, @add_meths[0]);
    }
    else {
        $target.HOW.add_collision($target, $name, %meth_providers{$name});
    }
}

Essentially, we filter out those that are implementations of the method rather than just requirements. If we are left with just a single method, then it’s the only implementation, and it satisfies the requirements, so we add it and we don’t need to do anything further. If we discover they are all requirements, then we don’t want to flag up a collision, but instead we just pick any of the required methods and pass it along. They’ll all give the same error. Otherwise, if we have multiple implementations, then it’s a real collision so we add it just as before. And…it works!

So, we run the test suite, things look good…and commit.

3 files changed, 48 insertions(+), 6 deletions(-)

And there we go – Rakudo now supports a part of the spec that it never has before, and it wasn’t terribly much effort to put in. And that just leaves me to go to the fridge and grab a Christmas ale to relax after a little meta-hacking. Cheers!

Meta-programming: what, why and how

December 14, 2011

Sometimes, it’s good to take ones understanding of a topic, throw it away and try to build a new mental model of it from scratch. I did that in the last couple of years with object orientation. Some things feel ever so slightly strange to let go of and re-evaluate. For many people, an object really is “an instance of a class” and inheritance really is a core building block of OOP. I suspect many people who read this post will at this point be thinking, “huh, of course they really are” – and if so, that’s totally fair enough. Most people’s view of OOP will, naturally, be based around the languages they’ve applied object orientation in, and most of the mainstream languages really do have objects that are instances of classes and really do have inheritance as a core principle.

Step back and look around, however, and things get a bit more blurry. JavaScript doesn’t have any notion of classes. CLOS (the Common Lisp Object System) does have classes, but they don’t have methods. And even if we do just stick with the languages that have classes with methods, there’s a dizzying array of “extras” playing their part in the language’s OO world view; amongst them are interfaces, mixins and roles.

Roles – more often known as traits in the literature – are a relatively recent arrival on the OO scene, and they serve as an important reminder than object orientation is not finished yet. It’s a living, breathing paradigm, undergoing its own evolution just as our programming languages in general are.

And that brings me nicely on to Perl 6 – a language that from the start has set out to be able to evolve. At a syntax level, that’s done by opening up the grammar to mutation – in a very carefully controlled way, such that you always know what language any given lexical scope is in. Meta-programming plays that same role, but in the object orientation and type system space.

So what is a meta-object? A meta-object is simply an object that describes how a piece of our language works. What sorts of things in Perl 6 have meta-objects? Here’s a partial list.

  • Classes
  • Roles
  • Subsets
  • Enumerations
  • Attributes
  • Subroutines
  • Methods
  • Signatures
  • Parameters

So that’s meta-objects, but what about the protocol? You can read protocol as “API” or “interface”. It’s an agreed set of methods that a meta-object will provide if it wants to expose certain features. Let’s consider the API for anything that can have methods, such as classes and roles. At a minimum, it will provide:

  • add_method – adds a method to the object
  • methods – enables introspection of the methods that the object has
  • method_table – provides a hash of the methods in this type, excluding any that may be inherited

What about something that you can call a method on? It just has to provide one thing:

  • find_method – takes an object and a name, and returns the method if one exists

By now you may be thinking, “wait a moment, is there something that you can call a method on, but that does not have methods”? And the answer is – yes. For example, an enum has values that you can call a method on – the methods that the underlying type of the enumeration provides. You can’t actually add a method to an enum itself, however.

What’s striking about this is that we are now doing object oriented programming…to implement our object oriented language features. And this in turn means that we can tweak and extend our language – perhaps by subclassing an existing meta-object, or even by writing a new one from scratch. To demonstrate this, we’ll do a simple example, then a trickier one.

Suppose we wanted to forbid multiple inheritance. Here’s the code that we need to write.

my class SingleInheritanceClassHOW
    is Metamodel::ClassHOW
{
    method add_parent(Mu $obj, Mu $parent) {
        if +self.parents($obj, :local) > 0 {
            die "Multiple inheritance is forbidden!";
        }
        callsame;
    }
}
my module EXPORTHOW { }
EXPORTHOW.WHO.<class> = SingleInheritanceClassHOW;

What are we doing here? First, we inherit from the standard Perl 6 implementation of classes, which is defined by the class Metamodel::ClassHOW. (For now, we also inherit from Mu, since meta-objects currently consider themselves outside of the standard type hierarchy. This may change.) We then override the add_parent method, which is called whenever we want to add a parent to a class. We check the current number of (local) parents that a class has; if it already has one, then we die. Otherwise, we use callsame in order to just call the normal add_parent method, which actually adds the parent.

You may wonder what the $obj parameter that we’re taking is, and why it is needed. It is there because if we were implementing a prototype model of OOP, then adding a method to an object would operate on the individual object, rather than stashing the method away in the meta-object.

Finally, we need to export our new meta-object to anything that uses our module, so that it will be used in place of the “class” package declarator. Do do this, we stick it in the EXPORTHOW module, under the name “class”. The importer pays special attention to this module, if it exists. So, here it is in action, assuming we put our code in a module si.pm. This program works as usual:

use si;
class A { }
class B is A { }

While this one:

class A { }
class B { }
class C is A is B { }

Will die with:

===SORRY!===
Multiple inheritance is forbidden!

At compile time.

Now for the trickier one. Let’s do a really, really simple implementation of aspect oriented programming. We’ll write an aspects module. First, we declare a class that we’ll use to mark aspects.

my class MethodBoundaryAspect is export {
}

Next, when a class is declared with “is SomeAspect”, where SomeAspect inherits from MethodBoundaryAspect, we don’t want to treat it as inheritance. Instead, we’d like to add it to a list of aspects. Here’s an extra trait modifier to do that.

multi trait_mod:<is>(Mu:U $type, MethodBoundaryAspect:U $aspect) is export {
    $aspect === MethodBoundaryAspect ??
        $type.HOW.add_parent($type, $aspect) !!
        $type.HOW.add_aspect($type, $aspect);
}

We take care to make sure that the declaration of aspects themselves – which will directly derive from this class – still works out by continuing to call add_parent for those. Otherwise, we call a method add_aspect, which we’ll define in a moment.

Supposing that our aspects work by optionally implementing entry and exit methods, which get passed the details of the call, here’s our custom meta-class, and the code to export it, just as before.

my class ClassWithAspectsHOW
    is Metamodel::ClassHOW
{
    has @!aspects;
    method add_aspect(Mu $obj, MethodBoundaryAspect:U $aspect) {
        @!aspects.push($aspect);
    }
    method compose(Mu $obj) {
        for @!aspects -> $a {
        for self.methods($obj, :local) -> $m {
            $m.wrap(-> $obj, |args {
                $a.?entry($m.name, $obj, args);
                my $result := callsame;
                $a.?exit($m.name, $obj, args, $result);
                $result
            });
        }
        }
        callsame;
    }
}
my module EXPORTHOW { }
EXPORTHOW.WHO.<class> = ClassWithAspectsHOW;

Here, we see how add_aspect is implemented – it just pushes the aspect onto a list. The magic all happens at class composition time. The compose method is called after we’ve parsed the closing curly of a class declaration, and is the point at which we finalize things relating to the class declaration. Ahead of that, we loop over any aspects we have, and the wrap each method declared in the class body up so that it will make the call to the entry and exit methods.

Here’s an example of the module in use.

use aspects;
class LoggingAspect is MethodBoundaryAspect {
    method entry($method, $obj, $args) {
        say "Called $method with $args";
    }
    method exit($method, $obj, $args, $result) {
        say "$method returned with $result.perl()";
    }
}
class Example is LoggingAspect {
    method double($x) { $x * 2 }
    method square($x) { $x ** 2 }
}
say Example.double(3);
say Example.square(3);

And the output is:

Called double with 3
double returned with 6
6
Called square with 3
square returned with 9
9

So, a module providing basic aspect orientation support in 30 or so lines. Not so bad.

As you can imagine, we can go a long way with meta-programming, whether we want to create policies, development tools (like Grammar::Debugger) or try to add entirely new concepts to our language. Happy meta-hacking.

Privacy and OOP

December 11, 2011

There are a number of ways in which Perl 6 encourages you to restrict the scope of elements of your program. By doing so, you can better understand how they are used and will be able to refactor them more easily later, potentially aiding agility. Lexical scoping is one such mechanism, and subroutines are by default lexically scoped.

Let’s take a look at a class that demonstrates some of the object oriented related privacy mechanisms.

    class Order {
        my class Item {
            has $.name;
            has $.price;
        }
        
        has Item @!items;
        
        method add_item($name, $price) {
            @!items.push(Item.new(:$name, :$price))
        }
        
        method discount() {
            self!compute_discount()
        }
        
        method total() {
            self!compute_subtotal() - self!compute_discount();
        }
        
        method !compute_subtotal() {
            [+] @!items>>.price
        }
        
        method !compute_discount() {
            my $sum = self!compute_subtotal();
            if $sum >= 1000 {
                $sum * 0.15
            }
            elsif $sum >= 100 {
                $sum * 0.1
            }
            else {
                0
            }
        }
    }

Taking a look at this, the first thing we notice is that Item is a lexical class. A class declared with “my” scope can never be referenced outside of the scope it is declared within. In our case, we never leak instances of it outside of our Order class either. This makes our class an example of the aggregate pattern: it prevents outside code from holding direct references to the things inside of it. Should we ever decide to change the way that our class represents its items on the inside, we have complete freedom to do so.

The other example of a privacy mechanism at work in this class is the use of private methods. A private method is declared just like an ordinary method, but with an exclamation mark appearing before its name. This gives it the same visibility as an attribute (which, you’ll note, are also declared with an exclamation mark – a nice bit of consistency). It also means you need to call it differently, using the exclamation mark instead of the dot.

Private methods are non-virtual. This may seem a little odd at first, but is consistent: attributes are also not visible to subclasses. By being non-virtual, we also get some other benefits. The latest Rakudo, with its optimizer cranked up to its highest level, optimizes calls to private methods and complains about missing ones at compile time. Thus a typo:

    self!compite_subtotal() - self!compute_discount();

Will get us a compile time error:

    ===SORRY!===
    CHECK FAILED:
    Undefined private method 'compite_subtotal' called (line 18)

You may worry a little over the fact that we now can’t subclass the discount computation, but that’s likely not a good design anyway; for one, we’d need to also expose the list of items, breaking our aggregate boundary. If we do want pluggable discount mechanisms we’d probably be better implementing the strategy pattern.

Private methods can, of course, not be called from outside of the class, which is also a compile time error. First, if you try:

    say $order!compute_discount;

You’ll be informed:

    ===SORRY!===
    Private method call to 'compute_discount' must be fully qualified
    with the package containing the method

Which isn’t so surprising, given they are non-virtual. But even if we do:

    say $o!Order::compute_discount;

Our encapsulation-busting efforts just get us:

    ===SORRY!===
    Cannot call private method 'compute_discount' on package Order
    because it does not trust GLOBAL

This does, however, hint at the get-out clause for private methods: a class may choose to trust another one (or, indeed, any other package) to be able to call its private methods. Critically, this is the decision of the class itself; if the class declaration didn’t decide to trust you, you’re out of luck. Generally, you won’t need “trusts”, but occasionally you may be in a situation where you have two very closely coupled classes. That’s usually undesirable in itself, though. Don’t trust too readily. :-)

So, lexical classes, private methods and some nice compiler support to help catch mistakes. Have an agile advent. :-)

Grammar::Tracer and Grammar::Debugger

December 2, 2011

Grammars are, for many people, one of the most exciting features of Perl 6. They unify parsing with object orientation, with each production rule in your grammar being represented by a method. These methods are a little special: they are declared using the keywords “regex”, “rule” or “token”, each of which gives you different defaults on backtracking and whitespace handling. In common is that they lead to the body of the method being parsed using the Perl 6 rule syntax. Under the hood, however, they really are just methods, and production rules that refer to others are really just method calls.

Perl 6 grammars also give you a seamless way to combine declarative and imperative parsing. This means efficient mechanisms, such as NFAs and DFAs, may be used to handle the declarative parts – the things that your tokens tend to be made up of – while a more imperative mechanism drives the parsing of larger structures. This in turn means that you don’t need to write a tokenizer; it can be derived from the rules that you write in the grammar.

So what is the result of parsing some text with a grammar? Well, provided it’s able to match your input, you get back a parse tree. This data structure – made up of Match objects – captures the structure of the input. You can treat each Match node a little bit like a hash, indexing in to it to look at the values that its production rules matched. While you can build up your own tree or other data structure while parsing, sometimes the Match tree you get back by default will be convenient enough to extract the information you need.

That’s wonderful, but there was a key clause in all of this: “provided it’s able to match”. In the case that the grammar fails to match your input, then it tells you so – by giving back an empty Match object that, in boolean context, is false. It’s at this point that many people stop feeling the wonder of grammars and start feeling the pain of trying to figure out why on earth their seemingly fine grammar did not accept the input they gave it. Often, it’s something silly – but in a grammar of dozens of production rules – or sometimes even just ten – the place where things go wrong can be elusive.

Thankfully, help is now at hand, in the form of two modules: Grammar::Tracer, which gives you a tree-like trace output of your grammar, and Grammar::Debugger, which gives the same trace output but also enables you to set breakpoints and single step through the grammar.

A picture is worth a thousand words, so here’s how Grammar::Tracer looks in action!

What we’re seeing here is a tree representation of the production rules that were called, starting at “TOP”, next trying to parse a production rule called “country”, which in turn wants to parse a name, two “num”s and an “integer”. The green indicates a successful match, and next to it we see the snippet of text that was captured.

So what happens when things go wrong? In that case, we see something like this:

Here, we see that something happened during the parse that caused a cascade of failures all the way back up to the “TOP” production rule, which meant that the parse failed overall. Happily, though, we now have a really good clue where to look. Here is the text my grammar was trying to match at the time:

Russia
	Ulan Ude : 51.833333,107.600000 : 1
	Moscow : 55.75000,37.616667 : 4

Looking at this, we see that the “name” rule appears to have picked up “Ulan”, but actually the place in question is “Ulan Ude”. This leads us directly to the name production in our grammar:

token name { \w+ }

Just a smattering of regex fu is enough to spot the problem here: we don’t parse names that happen to have spaces in them. Happily, that’s an easy fix.

token name { \w+ [\h+ \w+]* }

So how do we turn on the tracing? Actually, that’s easy: just take the file containing the grammar you wish to trace, and add at the top:

use Grammar::Tracer;

And that’s it; now whenever you use the grammar, it will be traced. Note that this statement has lexical effect, so if you’re using modules that also happen to have grammars – which you likely don’t care about – they will not end up getting the tracing behavior.

You can also do this:

use Grammar::Debugger;

The debugger is the tracer’s big sister, and knows a few more tricks. Here’s an example of it in action.

Instead of getting the full trace, now as soon as we hit the TOP production rule the program execution breaks and we get a prompt. Pressing enter allows you to step rule by rule through the parse. For some people, this may be preferable; others prefer to get the full trace output and analyze it. However, there are a few more tricks. In the example above, I added a breakpoint on the “name” rule. Using “r” informs the debugger to keep running through the production rules until it hits one called “name”, at which point it breaks. It is also possible to add breakpoints in code, for more extended debugging sessions with many runs. There’s one additional feature in code, which is to set a conditional breakpoint.

Sound interesting? You can get modules from GitHub, and if you want to see a live demo of a grammar being debugged using it, then there is a video of my Debugging Perl 6 Grammars talk from YAPC::Europe 2011; slides are also available to make the sample code more clear than it is on the video. Note that the modules need one of the compiler releases from the Rakudo “nom” development branch; we’ll be making a distribution release later this month based on that, though, and these modules will come with it.

You may also be thinking: I bet these are complex modules doing lots of guts stuff! In fact, they are 44 lines (Grammar::Tracer) and 171 lines (Grammar::Debugger), and written in Perl 6. They are built using the meta-programming support we’ve been working on in the Rakudo Perl 6 compiler during the course of the last year – and if you want to know more about that, be sure to check out my meta-programming post coming up later on in this year’s advent calendar.


Follow

Get every new post delivered to your Inbox.

Join 37 other followers