Day 14 – A nice supplies: syntactic relief for working with asynchronous data

Asynchronous data is all around us. What is it? It’s data that arrives whenever it pleases, as opposed to when we ask for it. Some common examples are:

  • GUI events
  • Data arriving over an async socket
  • Messages arriving from a message queue
  • Ticks of a timer
  • File change notifications
  • Signals

These contrast with synchronous data, which we get when we ask for it. For example, when we query a database and iterate the resulting rows, or iterate through file system entries in a directory, we block until the data we wish to have is available.

Our programming languages tend to be pretty good about synchronous data, giving us a range of syntactic constructs for producing and processing it. For example, in Perl 6 we have for loops for consuming streams of synchronously produced values, and gather blocks for producing them – perhaps doing so by in turn using for loops to consume other synchronous data sources. Inside for and gather, we’re free to use state (kept in variables), and use our comfortable range of conditionals (if, when, etc.) That is to say, we’re free to act like normal, down to earth, imperative programmers in dealing with our synchronous data. Sure, sometimes something just looks far nicer using the functional style, and we map, grep, zip, and reduce our way to the solution. But some problems are just unnatural – at least for most of us, me included – to see in that way.

In Perl 6, we have a common interface for talking about things that produce values over time – that is, synchronous data. These things do the Iterable role, and work in terms of Iterator objects. You never really see an Iterator in day to day Perl 6, and instead see the Seq type. These Seqs are the things gather blocks produce, and for loops can consume.

In the last years, we introduced supplies to Perl 6. Supplies are our common interface for talking about asynchronous data. These have, from the beginning, been rather well received. With supplies, you can grep UI events, map file system notifications, and zip whatever you fancied with ticks of a timer. And that’s just great when it’s easy to think about your problem in functional terms. But what about the rest of the time?

In the last year, we’ve also added supply/react blocks, along with the whenever asynchronous looping construct. In this post, we’ll take a look at how they might be used.

STOMP!

As I wrote the list of sources of asynchronous data at the start of this post, I realized that there was only one of them that I’d not yet done in Perl 6: working with message queues. Hmm. Two hours until midnight. Advent post needs completing before I sleep. Challenge accepted!

10 minutes later, and I’ve RabbitMQ happily installed and am reading the STOMP specification. STOMP is a simple text-oriented message protocol – that is, a text-based way to deal with message queues. Within an hour, I’d got a working STOMP client handling the absolute basics of sending messages to a queue and subscribing to a queue to get messages. So, let’s take a look at how I did it.

First, I sketched in a Stomp::Client class. I figured it would need the host and port we want to connect to, the connection credentials (yes, yes, this is not the height of security engineering), and something to hold the socket representing the connection to the server.

class Stomp::Client {
    has Str $.host is required;
    has Int $.port is required;
    has Str $.login = 'guest';
    has Str $.password = 'guest';
    has $!connection;
}

So, let’s get connected. The connect method will contain a start block, because we want it to function asynchronously. This means the method will return a Promise that the caller can await. To connect to a stomp server, you establish a TCP connection and send a CONNECT frame:

method connect() {
    start {
        my $conn = await IO::Socket::Async.connect($!host, $!port);
        await $conn.print: qq:to/FRAME/;
            CONNECT
            accept-version:1.2
            login:$!login
            passcode:$!password
            
            \0
            FRAME
        True
    }
}

But…then what? Well, then you wait for the server to either return a CONNECTED frame or an ERROR frame. Hm, seems we’ve some work to do. First is to translate the BNF from the STOMP specification into a Perl 6 grammar:

my grammar StompMessage {
    token TOP {
        <command> \n
        [<header> \n]*
        \n
        <body>
        \n*
    }
    token command {
        < CONNECTED MESSAGE RECEIPT ERROR >
    }
    token header {
        <header-name> ":" <header-value>
    }
    token header-name {
        <-[:\r\n]>+
    }
    token header-value {
        <-[:\r\n]>*
    }
    token body {
        <-[\x0]>* )> \x0
    }
}

Note it’s lexically scoped inside of our STOMP::Client class. So is the following little message data structure:

my class Message {
    has $.command;
    has %.headers;
    has $.body;
}

An IO::Socket::Async connection exposes incoming data as a Supply. There are a few things to consider:

  • A message may be spread over multiple packets
  • Two messages may be in one packet
  • The next packet may arrive while we’re processing the current one, and since Perl 6 delivers I/O notifications on the thread pool – like various other languages – we might worry about data races

In fact, the third point is a non-problem: Perl 6 promises that, unless you go out of your way to avoid it, supplies are serial, and will only have you processing one message at a time on a given message pipeline. Supplies are a tool for taming concurrency rather than introducing it.

Now, if I was writing this synchronously, and I wanted to expose a Seq of the incoming messages, I’d use a gather block, keep myself a buffer, recv each incoming chunk of data from the socket, and try to parse a new message at the start of the buffer. Whenever I got a message I’d take it. What about with supplies? Here’s how it looks:

method !process-messages($incoming) {
    supply {
        my $buffer = '';
        whenever $incoming -> $data {
            $buffer ~= $data;
            while StompMessage.subparse($buffer) -> $/ {
                $buffer .= substr($/.chars);
                if $<command> eq 'ERROR' {
                    die ~$<body>;
                }
                else {
                    emit Message.new(
                        command => ~$<command>,
                        headers => $<header>
                            .map({ ~.<header-name> => ~.<header-value> })
                            .hash,
                        body => ~$<body>
                    );
                }
            }
        }
    }
}

Taking it from the top, $incoming will be the Supply of data arriving from the socket, decoded to strings. We wrap the code in this method up in a supply block, since we’d like to return a new Supply. We declare $buffer, to hold data we’ve yet to parse. In this case, our thread safety is fairly trivial: $incoming will deliver a message at a time. But what if we were going to bring together data from many supplies? We’d still be fine: a supply block promises that, per tapping of the Supply, only one thread will ever be allowed in the code inside of the supply block at a time.

The whenever construct is a asynchronous loop. It taps the supply we specify, and the loop body will run whenever a value is emitted by that supply. In our case, that is every time we get data from the socket. We concatenate the incoming $data onto the $buffer. We then try to parse a message at the start of the buffer (subparse means that we don’t mind if we don’t reach the end of the buffer, unlike parse which will complain if it doesn’t completely parse the incoming string). When we do parse data, we lop it off the start of the buffer.

For each message we parse, we first see if it’s an ERROR frame; if so, we simply die with the message. A die inside a whenever block will propagate the error to whatever tapped the supply, so errors will not be lost. Otherwise, we’ve some kind of more interesting message; we turn the parsed data into a Message instance and emit it.

A supply block gives an “on-demand” supply. Each tap performed on the block will run the supply block, which in turn will tap each of the whenevers. That’s the right thing in most cases, but for my Stomp::Client I’d like to have various bits of code tap the same messages, to look for interesting things. One of these bits of code will look for a single CONNECTED message. So, I’ll add a $!incoming attribute to my class:

has $!incoming;

And then, after establishing the connection to the message queue server, do this:

$!incoming = self!process-messages($conn.Supply).share;

That is, I share the result of processing messages out amongst everything that is interested. And interested things will tap the $!incoming supply and filter out what they care about. For example, here’s how I create a Promise that will be kept when we receive a CONNECTED frame:

my $connected = $!incoming
    .grep(*.command eq 'CONNECTED')
    .head(1)
    .Promise;

Here is the connect method in full:

method connect() {
    start {
        my $conn = await IO::Socket::Async.connect($!host, $!port);
        $!incoming = self!process-messages($conn.Supply).share;
        
        my $connected = $!incoming
            .grep(*.command eq 'CONNECTED')
            .head(1)
            .Promise;
        await $conn.print: qq:to/FRAME/;
            CONNECT
            accept-version:1.2
            login:$!login
            passcode:$!password
            
            \0
            FRAME
        await $connected;
        $!connection = $conn;
        
        True
    }
}

Sending messages is easy:

method send($topic, $body) {
    self!ensure-connected;
    $!connection.print: qq:to/FRAME/;
        SEND
        destination:/queue/$topic
        content-type:text/plain
 
        $body\0
        FRAME
}

And subscription is implemented using another supply block, and a bit of simple filtering on the incoming message:

method subscribe($topic) {
    self!ensure-connected;
    state $next-id = 0;
    supply {
        my $id = $next-id++;
        
        $!connection.print: qq:to/FRAME/;
            SUBSCRIBE
            destination:/queue/$topic
            id:$id
 
            \0
            FRAME
        
        whenever $!incoming {
            if .command eq 'MESSAGE' && .headers<subscription> == $id {
                emit .body;
            }
        }
    }
}

And with that, we have a working STOMP client. Let’s try it:

my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
await $queue.connect();
await $queue.send("test", "hello world");
react {
    whenever $queue.subscribe('test') {
        .say;
    }
}

Here, for the first time, we encounter the react block. What’s that? Well, if you imagine that supply blocks are a little like gather – that is, used to create something that can produce values – a react block is for the places you might normally use a for loop when processing synchronous data. However, in the asynchronous world you’re usually interested in a range of things that might happen. A react block runs, sets up all the whenevers, and then blocks until either all of the supplies tapped by whenever blocks are done, or the “done” control operator is used somewhere inside of react.

A little example

Suppose that we wanted to build a simple monitoring system. Workers “sign on”, and must send us a ping every 10 seconds. If they fail to do so, then a monitor will report this. Workers can sign off when finished. Here is the worker process:

sub MAIN($name) {
    my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
    await $queue.connect();
    $queue.send('signon', $name);
    react {
        whenever Supply.interval(10, 5) {
            if <y y n>.pick eq 'y' {
                $queue.send('ping', $name);
            }
        }
    
        whenever signal(SIGINT) {
            $queue.send('signoff', $name);
            done;
        }
    }
}

We start it from the command line with a name. Every 10 seconds it sends a ping – or at least, two thirds of the time it will. And if we hit Ctrl+C the worker signs off.

Next, here’s the monitor:

my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
await $queue.connect();
react {
    my %last-active;
 
    whenever $queue.subscribe('signon') -> $name {
        %last-active{$name} = now;
    }
  
    whenever $queue.subscribe('ping') -> $name {
        %last-active{$name} = now;
    }
  
    whenever $queue.subscribe('signoff') -> $name {
        %last-active{$name}:delete;
    }
 
    whenever Supply.interval(10) {
        for %last-active.kv -> $name, $last-ping {
            if now - $last-ping > 11 {
                say "No ping from $name";
            }
        }
    }
}

Note that the react block, like supply block, enforces that only a single thread at a time may be operating across the various whenever blocks, so the %last-active hash is safe. We have three subscriptions to queues, and every ten seconds look for missing pings. We give an extra second’s grace on those.

Notice how, by exposing the message queue in terms of supplies, we are able to effortlessly compose this asynchronous data with both signals and time. This is where the real power of a uniform interface to asynchronous data comes in. And, with a little syntactic support, we are no longer required to put our functional hat on to wield that power.

Day 13 – A new trait for old methods

use v6;

El_Che expressed the desire on IRC to have Perl 6 watch his fingers when typing named arguments. It is indeed quite easy to send a wrong name the right way. Luckily we can help him.

First we need some exception to throw. Perl 6 kindly creates a constructor for us that deals with filling all attributes with values, so we can provide details when we throw the exception.

class X::Parameter::ExtraNamed is Exception {
    has $.extra-parameters;
    has $.classname;
    has $.method-name;
    method message () {
        "The method $.method-name of $.classname is offended by the named parameter(s): $.extra-parameters";
    }
}

We want to modify a method (.new in this case) so we need a trait. A trait is a modification of the Perl 6 grammar and a subroutine. It’s name comes from the required named argument. The thing it operates on is stored in the first positional. In our case that’s a method.

multi sub trait_mod:<is>(Method $m, :$strict!){

We want to check named arguments against the list of arguments that are defined in the methods signature. We have to store them somewhere.

    my @named-params;

When we provide some error message we may want to name the class the method is member of.

    my $invocant-type;
    my $method-name = $m.name;

The signature of a method can provide us with a list of parameters.

    for $m.signature.params -> $p {

But first the classname.

        $invocant-type = $p.type if $p.invocant;

We are only interested in named arguments.

        next unless $p.named;

Each named argument starts with a sigil ($ in this case). That would get in our way, so we strip it from the method name and store the rest for safekeeping.

        @named-params.push: $p.name.substr(1);
    }

After we got all the information we need, we come to the business end of our trait. We wrap the method in a new method. We only want to peek into it’s argument list so we take a capture. Conveniently a capture provided us with a method to get a list of all named arguments it contains.

    $m.wrap: method (|args) {

We first check with the subset operator (<=) if there are any named arguments we don’t like and if so, we use the set difference operator (-) to name only those. The exception we defined earlier takes care of the rest.

        X::Parameter::ExtraNamed.new(
            method-name => $method-name,
            classname => $invocant-type.perl, 
            extra-parameters => args.hash.keys (-) @named-params
        ).throw unless args.hash.keys (<=) @named-params;

If all named arguments are fine, we can call the wrapped method and forward all arguments kept in the capture.

        callwith(self, |args);
    }
}

Let’s test it.

class Foo {
    has $.valid;

Perl 6 will deal with the positional part, our trait is strict will learn about :$valid.

    method new (Int $p1, :$valid) is strict { self.bless(valid => $valid) }
    method say () { say 'Foo' }
}

New instance is new and got 2 extra named arguments the method doesn’t know of.

Foo.new(42, valid => True, :invalid, :also-invalid).say;

OUTPUT:

<<The method new of Foo is offended by the named parameter(s): invalid also-invalid>>

Traits are a nice way to generalise behaviour of methods and subroutines. Combined with exceptionally good introspection Perl 6 can be mended and bended to fit into any Christmas package. Merry 1.0 and a happy new Perl 6 to you all!

Day 12 – I just felt a disturbance in the switch

So I said “I’m going to advent post about the spec change to given/when earlier this year, OK?”

And more than one person said “What spec change?”

And I said, “Exactly.”


We speak quite proudly about “whirlpool development” in Perl 6-land. Many forces push on a particular feature and force it to converge to an ideal point: specification, implementation, bugs, corner cases, actual real-world usage…

Perl 6 is not alone in this. By my count, the people behind the HTML specification have now realized at least twice that if you try to lead by specifying, you will find to your dismay that users do something different than you expected them to, and that in the resulting competition between reality and specification, reality wins out by virtue of being real.

I guess my point is: if you have a specification and a user community, often the right thing is to specify stuff that eventually ends up in implementations that the user community benefits from. But information can also come flowing back from actual real-world usage, and in some of those cases, it’s good to go adapting the spec.


Alright. What has happened to given/when since (say) some clown described them in an advent post six years ago?

To answer that question, first let me say that everything in that post is still true, and still works. (And not because I went back and changed it. I promise!)

There are two small changes, and they are both of the kind that enable new behavior, not restrict existing behavior.

First small change: we already knew (see the old post) that the switching behavior of when works not just in a given block, but in any “topicalizer” block such as a for loop or subroutine that takes $_ as a parameter.

given $answer {
    when "Atlantis" { say "that is CORRECT" }
    default { say "BZZZZZZZZZT!" }
}
for 1..100 {
    when * %% 15 { say "Fizzbuzz" }
    when * %% 3 { say "Fizz" }
    when * %% 5 { say "Buzz" }
    default { say $_ }
}
sub expand($_) {
    when "R" { say "Red Mars" }
    when "G" { say "Green Mars" }
    when "B" { say "Blue Mars" }
    default { die "Unknown contraction '$_'" }
}

But even subroutines that don’t take $_ as a parameter get their own lexical $_ to modify. So the rule is actually less about special topicalizer blocks and more about “is there something nice in $_ right now that I might want to switch on?”. We can even set $_ ourselves if we want.

sub seek-the-answer() {
    $_ = (^100).pick;
    when 42 { say "The answer!" }
    default { say "A number" }
}

In other words, we already knew that when (and default) were pretty separate from given. But this shows that the rift is bigger than previously thought. The switch statement logic is all in the when statements. In this light, given is just a handy topicalizer block for when we temporarily want to set $_ to something.

Second small change: you can nest when statements!

I didn’t see that one coming. I’m pretty sure I’ve never used it in the wild. But apparently people have! And yes, I can see it being very useful sometimes.

when * > 2 {
    when 4 { say 'four!' }
    default { say 'huge' }
}
default {
    say 'little'
}

You might remember that a when block has an implicit succeed statement at the end which makes the surrounding topicalizer block exit. (Meaning you don’t have to remember to break out of the switch manually, because it’s done for you by default.) If you want to override the succeed statement and continue past the when block, then you write an explicit proceed at the end of the when block. Fall-through, as it were, is opt-in.

given $verse-number {
    when * >= 12 { say "Twelve drummers drumming"; proceed }
    when * >= 11 { say "Eleven pipers piping"; proceed }
    # ...
    when * >= 5 { say "FIIIIIIVE GOLDEN RINGS!"; proceed }
    when * >= 4 { say "Four calling birds"; proceed }
    when * >= 3 { say "Three French hens"; proceed }
    when * >= 2 {
        say "Two turtle doves";
        say "and a partridge in a pear tree";
    }
    say "a partridge in a pear tree";
}

All that is still true, save for the word “topicalizer”, since we now realize that when blocks show up basically anywhere. The new rule is this: when makes the surrounding block exit. If you’re in a nested-when-type situation, “the surrounding block” is taken to be the innermost block that isn’t a when block. (So, usually a given block or similar.)


It’s nice to see spec changes happening due to how things are being used in practice. This particular spec change came about because jnthn tried to implement a patch in Rakudo to detect topicalizer blocks more strictly, and he came across userland cases such as the above two, and decided (correctly) to align the spec with those cases instead of the other way around.

The given/when spec has been very stable, and so it’s rare to see a change happen in that part of the synopses. I find the change to be an improvement, and even something of a simplification. I’m glad Perl 6 is the type of language that adapts the spec to the users instead of vice versa.

Just felt like sharing this. Enjoy your Advent.

Day 11 – The Source will be with you, always

Reportings from a Learnathon

This past weekend I had the pleasure of hosting a Perl 6 learnathon with a friend who contacted me specifically to have a night of absorbing this new version of Perl. I thought it might be interesting to share some of what we learned during the process. I will begin with by explaining the single line of code which ended up and then show you some examples of where our evening took us.

Pick Whatever, rotor by five

As we opened our laptops, I was excited to show the newest example I had pushed to my Terminal::Print project. It’s taken quite some time to achieve, but asynchronous printing is now working with this module. It’s not fast, yet, but my long saught multi-threaded “Matrix-ish” example is working. Each column is being printed from an individual thread. This approach of spinning off a bunch of new threads and then throwing them away is not efficient, but as a proof of concept I find it very exciting.

pick-rotor

This line contains a few things that inspired questions. The first is the precise meaning of pick(*), which here means that we want to produce a randomized version of @columns. You can think of the Whatever here as a meaning “as many as possible”. It triggers a different multi method code path which knows to use the invoking objects’s own array size as the length of the list to pick.

The second part to explain was rotor. This is another one of those Perl 6 English-isms that at first felt quite strange but quickly became a favorite as I began visualizing a huge turbine rotor-ing all of my lists into whatever shape I desire whenever I use it. In this case, I want to assemble a list of 5-element arrays from a random version of @columns. By default, rotor will only give you back fully formed 5-element arrays, but by passing the :partial we trigger the multi-method form that will include a non 5-element array if the length of @columns is not divisible by 5 (‘Divisibility’ is easily queried Perl 6, by the way. We phrase it as $n %% 5.)

Put another way: rotor is a list-of-lists generator, taking one list and chopping it up into chunks according to your specifications. My friend mentioned that this resembles a question that he asks in interviews, inviting an investigation into the underlying implementation.

I’ve always considered Rakudo’s Perl 6-implemented-in-Perl 6 approach as a secret weapon that often goes overlooked in assessment of Perl 6’s viability. Even with the current reliance on NQP, there is a great deal of Perl 6 which is implemented in the language itself. To this end, I could immediately open src/core/Any.pm in an editor and continue explaining the language by reading the implementation of the language. Not just explaining the implementation, which can be accomplished by the right combination of language/implementation language and polyglot coverage. I mean explaining the language by looking at how it is used to implement itself, by following different threads from questions that arise as a result of looking at that code.

A word of caution

Now, I don’t mean to imply that one’s initial exposure to core can’t be a shocking experience. You are looking at Perl 6 code that is written according to constraints that don’t exist in perl6 which arise from not being fully bootstrapped and performance concerns. These are expressed in core by NQP and relative placement in the compilation process, on the one hand, and in prototyping and hoop jumping, on the other.

In other words: you are not expected to write code like this and core code does not represent an accurate representation of what Perl 6 code looks like in practice. It is 100% really Perl 6 code, though,  and if you look at NQP stuff as funky library calls, everything you are seeing is possible in your own code. You just don’t normally need to.

Down the rabbit hole

From src/core/Any.pm, here is the code for rotor:

any-rotor.png

And the code for pick:

pick-any

These are not the actual implementations, mind you. Those live in List.pm. But already these signatures inspire some questions

What is this |c thing in the signature?

This represents a Capture, which is an object representing the arguments passed into the method call. In this case it is being used such that no unpacking/containerization occurs. Rather we simply accept what we have been given and pass them as a single parameter to self.list.rotor without looking at them.

In the proto signature for pick we see that there is no name for the Capture, but rather a bare ‘|‘ which tells the method dispatcher that there can be zero or more arguments. Put another way: there are no mandatory arguments that apply to all the pick candidates.

What is this proto thing?

The answer is that it is a clarification that you can apply to your multi methods that constrains them to a specific “shape”.  It is commonly used in conjunction with a Capture, and in fact we see this in our pick example.

As the docs put it, a proto is ‘a way to formally declare commonalities between multi candidates’. The prototype for pick specifies that there are no mandatory arguments, but there might be. This is basically the default behavior for multi method dispatch, but here it allows us to specify the is nodal trait, which provides further information to the dispatcher relating to the list-y. Also due to bootstrapping issues, all multi methods built into Rakudo need an explicit proto.In practice we do not need either of these for our code. But they are there when you need them. One example of a trait that you might use regularly is the pure trait: when a given input has a guaranteed output, you can specify that it is pure and the dispatcher can return results for cached arguments without repeated calculations.

Midnight Golf

As promised, here are a few code examples from the learnathon.

stateful.pngstateful-results.png

This is using anonymous state variables, which are available to subroutines and methods but not to other Callables like blocks.  My friend shared my marvel at the power and flexibility of the Perl 6 parser to be able to handle statements like $--*5, when every character besides the 5 in that statement has a different meaning according to context. Meanwhile Perl 6 gives you defaults for your subroutine parameters by using the assignment operator.

Note that each bare $ in a subroutine creates/addresses a new, individual state variable. Some people will hate this, as they hate much that I appreciate about the language. These anonymous state variables are for situations where a name doesn’t matter, such as golfing on the command line. They can be confusing to get a full grasp of, though.

Here is another example we generated while exploring (and failing many times) to grasp the nuances of these dynamics.

more-state.pngmore-state-results.png

Gone is the anonymous state variable. This is by necessity, because you can only refer to an anonymous state variable once in a subroutine. We’ve switched $a to be optional. The parens around the state declaration are necessary because trying to declare a variable in terms of itself is undefined behavior and Perl 6 knows that.

The same thing, expressed in slight variations but with the same meaning:

more-more-state.png

The bottom example shows that the assignment to zero in our initial more-state is actually unnecessary. The middle shows creating a variable and binding it to a state var, which is a rougher way to get the same behavior as openly declaring a state variable. The top example shows what might be considered the ‘politely verbose’ option.

Concluding thoughts

I was hoping to share more from the evening with you, but it’s been a lot of words already and we’ve only scratched the surface of the first example we examined! Instead, I recommend that you spend some time with src/core in an editor,  the perl6 repl in a terminal, and #perl6 in an IRC client (or web browser) and just explore.

This language is deep. Opening src/core is like diving into the deep end of the ocean from the top of a Star Destroyer. The reward for your exploration of the language, however, is an increasing mastery of a programming language that is designed according to a human-centric philosophy and which approaches a level of expressivity that, some would argue, is unparalleled in the world of computing.

 

Day 10 – Perl 6 Pod

Whenever you write code, you also write documentation, right? Wait, don’t answer that! Let me just pretend that you said “yes”.

As with Perl 5, Perl 6 has a built-in documentation format called Pod. In Perl 5, POD was short for “Plain Ol’ Documentation”, but in Perl 6 we just call it Pod and there’s no acronym. One of the great things about Pod in Perl 6 is that the compiler parses Pod for you. The resulting structure is essentially a tree (really, an Array of trees).

Let’s take a look at a few examples …

use v6;

=begin pod

=head1 NAME

example1.p6 - It's a script to show you how Pod works in Perl 6!

=end pod

sub MAIN {
    say $=pod.WHAT;
    say $=pod[0].WHAT;
}

When I run this script I get:

$ perl6 ./bin/example1.p6 
(Array)
(Named)

The $=pod variable is a reference to the parsed Pod from the current file.

From the code above we can see that the $=pod variable contains an Array. Each element of that array is a Pod node of some sort, and those nodes in turn will usually contain more Pod nodes, and so on. Essentially, Pod is parsed into a tree that you can recursively descend and examine.

The $=pod variable is an array is because you can have several distinct chunks of Pod in a single file …

use v6;

=begin pod

=head1 NAME

example2.p6 - It's a script to show you how Pod works in Perl 6!

=end pod

sub MAIN {
    say $=pod.elems;
    say $=pod[0].WHAT;
    say $=pod[1].WHAT;
}

=begin pod

=head1 DESCRIPTION

This is where I'd put some more stuff about the script if it did anything.

=end pod

When we run this script we get:

$ perl6 ./bin/example2.p6 
2
(Named)
(Named)

So there are now two elements in $=pod, one for each chunk of Pod delimited by our =begin pod/=end pod pairs.

Pod Syntax in Perl 6

Up until this point I’ve shown you some Pod without actually explaining what it is. Let’s take a look at some valid Pod constructs in Perl 6 so you can get a better sense of how it looks, both when you write it and when you’re writing code to process it.

Perl 6 Pod is purely semantic. In other words, you describe what something is, rather than how to present it. It’s left up to code that turns Pod into something else (HTML, Markdown, man pages) to decide how to present any given construct.

Perl 6 Pod has quite a bit more to it than we can cover here, so see Synopsis 26 for all the details. We’ll cover some of the highlights and then look more closely at how to process Pod programmatically.

Blocks

Pod supports several different ways to declare a block. You can use =begin and =end markers …

=begin head1
Heading Goes Here
=end head1

Or you can use a =for marker …

=for head1
Heading Goes Here

These block styles allow you to include configuration information:

=begin head3 :include-in-toc
This will show up in TOC
=end head3

=for head4 :!include-in-toc
Not in the TOC

With a =for block, only the lines immediately following the =for is part of that block. Once there is an empty line, anything that follows is separate.

You can use an “abbreviated” block …

=head1 Heading Goes Here

… but abbreviated blocks don’t allow for configuration information. Like =for blocks, abbreviated blocks end at the first empty line.

There are many types of blocks, including table, code, I/O blocks (input and output), lists, and more.

Paragraphs

Text that isn’t explicitly part of a block is considered to be a paragraph block …

use v6;

=begin pod

=begin head1

NAME

=end head1

This is a paragraph block.

=end pod

sub MAIN {
    say $=pod[0].contents[1].WHAT;
}

The output will be (Para), because the second element of contents is a plain paragraph.

Lists

Lists are much simpler in Perl 6 than they were in Perl 5 …

=begin pod

=item1 The first item
=item1 The second item
=item2 This is a sub-item of the second item
=item1 The third item

Of course, you can also use the =for and =begin/=end form with list items too.

Formatting Codes

There are a variety of formatting codes available …

See L<http://design.perl6.org/S26.html> for details.

Some times you need to mark some text as C<code>,
B<being the basis of the sentence>,
or I<important>.

And More

There are a lot of new things with Perl 6 Pod, and if you want to learn more I encourage you to read Synopsis 26 for all of the gory details.

Processing Pod with Perl 6

As I said before, the Perl 6 compiler parses Pod in your program and makes it available to you via various variables. We already saw the use of $=pod, which contains all of the Pod in the current file. That pod is an array of Pod::Block objects. In future versions of Rakudo, you will also be able to get at specific types of Pod blocks by name with variables such as $=head1 or $=TITLE, but this is not yet implemented.

Let’s write a simple program to show us an outline of the Pod in a given file. For example, given some Pod like this …

=head1 NAME

...

=head1 DESCRIPTION

...

=head1 METHODS

...

=METHOD $obj.foo

...

=METHOD $obj.bar

...

 

… we want to print this output …

NAME
DESCRIPTION
METHODS
  $obj.foo (METHOD)
  $obj.bar (METHOD)

Here’s the code to do just that …

use v6;

sub MAIN {
    for $=pod.kv -> $i, $block {
        say "Block $i";
        recurse-blocks($block);
        print "\n";
    }
}

sub recurse-blocks (Pod::Block $block) {
    given $block {
        when Pod::Block::Named {
            if $block.name eq 'pod' {
                for $block.contents.values -> $b {
                    recurse-blocks($b);
                }
            }
            else {
                my $output = $block.contents[0].contents[0]
                    ~ " ({$block.name})";
                depth-say( 2, $output );
            }
        }
                
        when Pod::Heading {
            depth-say( $block.level,
                       $block.contents[0].contents[0] );
        }
    }
}

sub depth-say (Int $depth, Str $thing) {
    print '  ' x $depth;
    say $thing;
}

 

This code takes some shortcuts wherever it uses $block.contents[0].contents[0] by making the very big assumption that a block contains one element which is a single Pod::Block::Para. It then assumes that the Para block contains one element which is plain text.

In practice, blocks may contain arbitrarily nested blocks because of things like formatting codes, but this gives you an idea of how you might walk through the Pod tree.

Of course, because Perl 6 is still Perl, there’s a module for that! I’ve sketched out a preliminary module called Pod::TreeWalker that will do the tree walking for you, generating events as it finds different nodes. You provide a listener object that is called for each event.

GitHub Repo

All of the code shown in these examples is on GitHub. I’ve also written an additional script that uses Pod::TreeWalker to implement the outlining I demonstrate above.

Day 9 – Perl 6 and the wolf pack

This year I rejoin with Perl6 once again and found out that this Christmas is THE Christmas.

With the announcement of Larry Wall and the imminent liberation of a 1.0 version of Rakudo I got really excited and right away started a couple of side projects using it.

And just when I was thinking that Perl6 couldn’t get any cooler I found the Jonathan Worthington, YAPC::Asia conference: «Parallelism, Concurrency, and Asynchrony in Perl 6».

Perl6 parallelism realy impressed me so implemented an  adaptation of the Grey wolf optimizer created by Mirjalili et all in 2014.

This algorithm (by wikipedia’s definition) is a meta heuristic that mimics the social and hunting behavior of the wolf packs to search for “good enough” solutions in a wide range of problems.

The pseudo-code here:

pseudocodigo-gwo

And from my implementation the interesting parallel bits:

uno

Initially I used a loop block here (I like them more), but that caused that all the wolves end up with a random value instead of keeping the passed $wolf_number value.

After a couple hours and a lot of questions on the #Perl6 IRC channel (wonderful, patient people by the way) I found the problem: the $wolf_number variable disappear with the end of the loop block leaving the fitness_libsvm with nothing.

After that is just await and vòila! process the results.

dos

Never ever have I seen a simpler more elegant code for parallelism than the one produced by Perl6… I just love it!

Postdata: The full implementation is available at https://github.com/Sufrostico/perl6-gwo

Day 8 – Grammars generating grammars

By now you have probably gotten used to the prefix “meta” appearing here and there in the Perl 6 world. Metaclasses, metaobjects, metaoperators, the mythical Meta-Object Protocol. Sounds like nothing scary, all good and familiar and you’ve seen it all, eh? Nothing further from the truth! Today, on the Perl 6 Advent Calendar we’re going full meta. We’re going to have grammars that parse grammars then generate grammars that are going to use to parse grammars. I’ll let this sink in for a moment while you scroll down for the next paragraph.

Grammars are hands-down one of the killer features of Perl 6. Taking Perl’s no less than legendary ability to process text and taking it to the next level. Regular expressions, to many people’s disappointment are just what they say they are: regular (well, regular regular expressions do. Perl regexes are a bit of a different beast, but that’s a material for another story). They parse regular, as opposed to, say, context-free languages, to the neverending disappointment of all the people who would just love to parse XML with them, like in that everyone’s favourite Stackoverflow answer. Say no more to silly theoretical limitations of language theory though, since now we have grammars which are all we ever missed in regexes: readability, composability and, of course, ability to parse even Perl 6 itself — and if that doesn’t sell its absolute power, I don’t know what does.

Writing parsers for predefined grammars (say, in Bachus-Naur Form) was always a bit of a dull job, almost as exciting as copypasting stuff. If you ever sat down and wrote a parser from scratch (perhaps while going through the excellent Let’s Build a Compiler book), you probably recognize the pattern all too well: take a single rule from your grammar, write a subroutine for it, have it call (perhaps recursively) other subroutines similarly defined for other grammar rules, rinse, repeat. Well say no more to that now that we have Perl 6 grammars! In this wonderful new world we need not write subroutines for every token to get the work done. Now we write a grammar class, where we put tokens, rules and regexes for every symbol in our grammar, and inside them we write regexes (or code) that refer to (perhaps recursively) other symbols in our Perl 6 grammar. Now, if you ever went through both of those alternatives, you will definitely realize how massive of a convenience the grammars are in Perl 6. Instead of painstakingly cranking out repetitive and error-prone code we have a wonderful, declarative way to specify our language, with an impressive collection of utility to get most of our common, boring work out of the way.

But what if we already have a grammar, specified perhaps in the previously mentioned BNF? What we do then is carefully retype the existing grammar (parsing it in our head, actually) into our new, shiny Perl 6 grammar that represents the exact same thing, but has the clear advantage of actually being executable code. A fair deal, you could say. For most people, no big deal at all. We are not most people. We are programmers. We have the resources. The will. To make these grammars count! Our job revolves around building things that will do the work for us so we don’t have to. To take the well-defined specifications and turn them into tools that work for us, or for themselves. Tools that take the repetitive and automatable parts of our work the way. Why should building parsers be any different? Why, I’m glad you asked.

The wonderful thing about the Perl 6 grammars is that they are no more magical than any other element of the language. Just as classes are first class citizen that we can introspect, augment and build programatically, so are grammars. In fact, you can look at the source code of the compiler itself and notice that grammars are nothing else than a specialized kind of classes. They follow the same rules as classes do, allowing us to create them on the fly, add tokens to them on the fly and eventually finalize them in order to have a proper, instantiatable class object. So now that we can parse BNF grammars (since they’re just ordinary text) and create Perl 6 grammars from code, let’s put those pieces together and write us something that will save us the effort of converting a BNF grammar to Perl 6 grammar manually.

The grammar for a BNF grammar

grammar Grammar::BNF {
    token TOP { \s* <rule>+ \s* }

    token rule {
        <opt-ws> '<' <rule-name> '>' <opt-ws> '::=' <opt-ws> <expression> <line-end>
    }

    token expression {
        <list-of-terms> +% [\s* '|' <opt-ws>]
    }

    token term {
        <literal> | '<' <rule-name> '>'
    }

    token list-of-terms { <term> +% <opt-ws> }

    token rule-name { <-[>]>+ }

    token opt-ws { \h* }

    token line-end { [ <opt-ws> \n ]+ }

    token literal { '"' <-["]>* '"' | "'" <-[']>* "'" }

    ...
}

The interesting stuff happens in the three, almost-topmost tokens. rule is the core building block of a BNF grammar: a <symbol> ::= <expression> pair, followed by a new line. The entire grammar is no more than a list of those. Each expression is a list of terms, or possibly and alternative of them. Each term is either a literal, or a symbol name surrounded by angle brackets. Easy enough! That covers the parsing part. Let’s look at the generating itself. We do have a built-in mechanism of “doing stuff for each token in the grammar”, in the form of Actions, so let’s go ahead and use that:

my class Actions {
    has $.name = 'BNFGrammar';
    method TOP($/) {
        my $grmr := Metamodel::GrammarHOW.new_type(:$.name);
        $grmr.^add_method('TOP',
            EVAL 'token { <' ~ $<rule>[0].ast.key ~ '> }');
        for $<rule>.map(*.ast) -> $rule {
            $grmr.^add_method($rule.key, $rule.value);
        }
        $grmr.^compose;
        make $grmr;
    }

    method expression($/) {
        make EVAL 'token { ' ~ ~$/ ~ ' }';
    }

    method rule($/) {
        make ~$<rule-name> => $<expression>.ast;
    }
}

The TOP method is definitely the most magical and scary, so let’s tackle that first to make the rest of the stuff look trivial in comparison. Basically, three things happen there:

1. We create a new grammar, as a new Perl 6 type
2. We add tokens to it using the `^add_method` method
3. We finalize the grammar using the `^compose` method

While Perl 6 specifies that the token named TOP is where the parsing starts, in BNF the first rule is always the starting point. To adapt one to the other, we craft a phony TOP token which just calls the first rule specfied in the BNF grammar. Unavoidably, the scary and discouraging EVAL catches our attention, as if it said “horrible stuff happens here!” It’s not entirely wrong when it says that, but since we have no other way of programmatically constructing the individual regexes (that I know of), we’ll have to accept this little discomfort in the name of the Greater Good, and look a little bit closer at what we’re actually EVALing in a moment.

After TOP we proceed to add the rest of the BNF rules to our grammar, this time preserving their original names, then ^compose() the whole thing and finally make it the result of the parsing: a readymade parser class.

In the expression method we glue the parsed BNF elements together in order to produce valid Perl 6 code. This turns out to be pretty easy, since both separate symbols with whitespace, alternate them with the pipe character and sorround symbol names with angle brackets. So for a rule that looks like this:

<foo> ::= 'bar' | <baz>

The Perl 6 code that we EVAL turns becomes:

token { 'bar' | <baz> }

Since we already validated in the grammar part of our code that the BNF we parse is correct, there’s nothing stopping us from literally pasting the entire expression into our code and wrap it in a token { }, so let’s go ahead and do just that.

Last but not least, for each BNF rule we parse we produce a nice Pair, so our TOP method has an easier times processing each of them.

Seems like we’re about done here, but just for the users’ convenience, let’s write a nice method that takes a BNF grammar and produces a ready to use type object for us. As we remember, grammars are just classes, so there’s nothing stopping us from just adding it straight to our grammar:

grammar Grammar::BNF {
    ...

    method generate($source, :$name = 'BNFGrammar') {
        my $actions = Actions.new(:$name);
        my $ret = self.new.parse($source, :$actions).ast;
        return $ret.WHAT;
    }
}

Looks good from here! Before you start copypasting all this into your own projects, remember that Grammar::BNF is a Perl 6 module available in your Perl 6 Module Ecosystem, installable with your favourite module manager. It also ships with some goodies like a Perl 6 slang, allowing you to write BNF grammars straight in your Perl 6 code (as opposed to including them as strings), or parsing the more powerful (and perhaps more widespread) ABNF grammars as well.

If you indeed took the time for the beginning of this post to sink in, you may remember that I promised that we’re going to have grammars (a-one) that parse grammars (a-two) then generate grammars (a-three) that are going to use to parse grammars (a-four). So far we’ve seen the BNF::Grammar grammar (that’s our a-one), that parses a BNF grammar (that’s our a-two), generates a Perl 6 grammar in a form of a type object (that’s a-three) and… that’s it. We’re still missing the last part, using this whole thing to parse grammars. We’ve only gone 75%-meta, and that is just not good enough in this day and age. Why stop now? Why not take a BNF grammar of a BNF grammar, parse that with a Perl 6 grammar and use the resulting Perl 6 BNF grammar to parse our original BNF grammar of a BNF grammar? Wouldn’t that be sweet? Sure it will! That is, however, left as an exercise for you, my dear readers. After all, how fair would it be if I had all the fun while you just sit there and watch? Would you like it if you opened the little paper window in your advent calendar only to find a note saying. „There was a chocolate here for you. I ate it. It was truly delivious?” Me neither! There’s a BNF grammar for BNF both on Wikipedia and in Grammar::BNF’s very own test suite; the latter even includes a little breadcrumb that can help you with your adventure. I eagerly await thy results, and as always, thank you all for reading and I wish you a wonderful advent!