Day 18 – Asynchronous Workflow with Tinky

State Machines

The idea of “state” and the “state machine” is so ubiquitous in software that we tend to use it all the time without thinking about it very much: having a fixed number of states for some object or system, with a fixed set of allowable transitions between them, is almost inevitable once we start modelling a physical process or business procedure.

I quite like the much-quoted description of a state machine from the Erlang documentation:

If we are in state S and the event E occurs, we should perform the actions A and make a transition to the state S’.

Once we string a number of these together we can consider them to form a workflow. Whole books, definition languages and software systems have been built around this stuff.

Managing State

Often we find ourselves implementing state management in an application on an ad-hoc basis: the constraints on state transitions are defined in code, often in a separate part of the application, and we rely on the other parts of the application to do the right thing. I’m sure I’m not alone in having seen code in the wild that purports to implement a state machine but proves entirely non-deterministic once new states or actions are added to the system. Or systems where a new action has to be performed when some object enters a new state, but the state can be entered in different ways in different parts of the application, so new code has to be added in a number of different places, with the inevitable consequence that one place gets missed and the feature doesn’t work as defined.

Using a single source of state management with a consistent definition within an application can alleviate these kinds of problems and can actually make the design of the application simpler and clearer than it might otherwise have been.

Thus I was inspired to write Tinky, which is basically a state management system that allows you to create workflows in an application.

Tinky allows you to compose a number of states and the transitions between them into an application workflow, exposing the transitions between the states as a set of Supplies: it enforces the constraints on the allowed transitions while letting you implement the actions on those transitions in whatever place or manner suits your application.

Simple Workflow

Perhaps the canonical example for this kind of software is bug tracking, so let’s start with the simplest possible example: three states and two transitions between them:

use Tinky;

my $state-new  = Tinky::State.new(name => "new");
my $state-open = Tinky::State.new(name => "open");
my $state-done = Tinky::State.new(name => "done");

my @states = ( $state-new, $state-open, $state-done);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);

my @transitions = ( $new-open, $open-done);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

This defines our three states “new”, “open” and “done” and two transitions between them, from “new” to “open” and from “open” to “done”: a “workflow” in which the state must pass through “open” before it can become “done”.

Obviously this doesn’t do very much without an object that can take part in this workflow, so Tinky provides a role Tinky::Object that can be applied to a class whose state you want to manage:

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket = Ticket.new;

$ticket.apply-workflow($workflow);

say $ticket.state.name;          # new
$ticket.next-states>>.name.say;  # (open)

$ticket.state = $state-open;
say $ticket.state.name;          # open

The Tinky::Object role provides the accessors state and next-states on the object, the latter returning a list of the possible states that the object can be transitioned to (in this example there is only one, but there can be as many as your workflow definition allows). You’ll also notice that the state of the object defaults to “new”, which is the state provided as initial-state to the Tinky::Workflow constructor.

The assignment to state is constrained by the workflow definition, so if in the above you were to do:

$ticket.state = $state-done;

This would result in an exception “No Transition for 'new' to 'done'” and the state of the object would not be changed.

As a convenience, the workflow object defines a role that provides methods named after the transitions; this role is applied to the object when apply-workflow is called, so the setting of the state above could instead be written as:

$ticket.open

However, this feature has an additional subtlety (one I unashamedly stole from a JavaScript library): if there are two transitions with the same name it will still create a single method, which uses the current state of the object to select which transition to apply. Typically you might do this where the to state is the same, for example if we added a new state ‘rejected’ which can be entered from both ‘new’ and ‘open’:

use Tinky;

my $state-new      = Tinky::State.new(name => "new");
my $state-open     = Tinky::State.new(name => "open");
my $state-done     = Tinky::State.new(name => "done");
my $state-rejected = Tinky::State.new(name => "rejected");


my @states = ( $state-new, $state-open, $state-done, $state-rejected);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $new-rejected   = Tinky::Transition.new(name => "reject", from => $state-new, to => $state-rejected);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);
my $open-rejected  = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

my @transitions = ( $new-open,$new-rejected, $open-done, $open-rejected);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

You are not strictly limited to having the similarly named transitions enter the same state, but they must have different from states (otherwise the generated method wouldn’t know which transition to apply). Obviously, if the method is called on an object whose current state has no matching transition, an exception will be thrown, as sketched below.
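For example, using the Ticket class and workflow from above, calling a generated transition method in a state with no matching transition throws; a small sketch (the exact exception text may differ):

my $stuck-ticket = Ticket.new;
$stuck-ticket.apply-workflow($workflow);
$stuck-ticket.reject;               # now in state 'rejected'

try $stuck-ticket.open;             # no 'open' transition from 'rejected'
say .message with $!;               # e.g. "No Transition for 'rejected' to 'open'"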

So what about this asynchronous thing

All of this might be somewhat useful if we are merely concerned with constraining the sequence of states an object might be in, but typically we want to perform some action upon transition from one state to another (and this is explicitly stated in the definition above). So, for instance, in our ticketing example we might want to send a notification, recalculate resource scheduling or make a branch in a version control system.

Tinky provides for these state transition actions by means of a set of Supplies on the states and transitions, to which the object undergoing the transition is emitted. These “events” are emitted on the state that is being left, the state that is being entered and the actual transition that was performed, and the supplies are conveniently aggregated at the workflow level.

So if, in the example above, we wanted to log every state transition of a ticket and additionally send a message when the ticket enters the “open” state, we can simply tap the appropriate Supplies to perform these actions:

use Tinky;

my $state-new      = Tinky::State.new(name => "new");
my $state-open     = Tinky::State.new(name => "open");
my $state-done     = Tinky::State.new(name => "done");
my $state-rejected = Tinky::State.new(name => "rejected");


my @states = ( $state-new, $state-open, $state-done, $state-rejected);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $new-rejected   = Tinky::Transition.new(name => "reject", from => $state-new, to => $state-rejected);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);
my $open-rejected  = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

my @transitions = ( $new-open,$new-rejected, $open-done, $open-rejected);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

# Make the required actions

$workflow.transition-supply.tap(-> ($trans, $object) { say "Ticket '{ $object.ticket-number }' went from '{ $trans.from.name }' to '{ $trans.to.name }'" });
$state-open.enter-supply.tap(-> $object { say "Ticket '{ $object.ticket-number }' is opened, sending email" });

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

Which will give some output like

[open rejected]
Ticket '00015475' went from 'new' to 'rejected'
rejected
Ticket '00053735' is opened, sending email
Ticket '00053735' went from 'new' to 'open'
[done rejected]
Ticket '00053735' went from 'open' to 'rejected'
rejected

The beauty of this kind of arrangement, for me at least, is that the actions can be defined at the most appropriate place in the code rather than all in one place, and can also be added and removed at run time if required. It also plays nicely with other sources of asynchronous events in Perl 6 such as timers, signals or file system notifications.
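Because these are ordinary Supply taps, they can be attached and detached while the program is running, and other asynchronous sources can sit alongside them. A small sketch using the objects from the example above (the “business hours” idea is invented for illustration):

# Notify on every ticket that enters 'open'
my $notify-tap = $state-open.enter-supply.tap(-> $ticket {
    say "Ticket '{ $ticket.ticket-number }' opened, sending email";
});

# ... later, perhaps outside business hours, stop the notifications
$notify-tap.close;

# A timer Supply can sit alongside the workflow supplies
Supply.interval(600).tap({ say "reminder: check the ticket queue" });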

Defining a Machine

Defining a large set of states and transitions could prove somewhat tiresome and error-prone if done in code like the above, so you could choose to build it from some configuration file or from a database of some sort; for convenience I have recently released Tinky::JSON, which allows you to define all of your states and transitions in a single JSON document.

The above example would then become something like:

use Tinky;
use Tinky::JSON;

my $json = q:to/JSON/;
{
    "states" : [ "new", "open", "done", "rejected" ],
    "transitions" : [
        {
            "name" : "open",
            "from" : "new",
            "to"   : "open"
        },
        {
            "name" : "done",
            "from" : "open",
            "to"   : "done"
        },
        {
            "name" : "reject",
            "from" : "new",
            "to"   : "rejected"
        },
        {
            "name" : "reject",
            "from" : "open",
            "to"   : "rejected"
        }
    ],
    "initial-state" : "new"
}
JSON

my $workflow = Tinky::JSON::Workflow.from-json($json);

$workflow.transition-supply.tap(-> ($trans, $object) { say "Ticket '{ $object.ticket-number }' went from '{ $trans.from.name }' to '{ $trans.to.name }'" });
$workflow.enter-supply("open").tap(-> $object { say "Ticket '{ $object.ticket-number }' is opened, sending email" });

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

As well as providing the means of constructing the workflow object from a JSON description, it adds methods for accessing the states and transitions and their respective supplies by name, rather than having to have the objects themselves to hand, which may be more convenient in your application. I’m still working out how to provide the definition of actions in a similarly convenient declarative way.
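In practice the JSON description would more likely live in its own file than a heredoc; a minimal sketch, assuming it has been saved as ticket-workflow.json alongside the program (the file name is made up):

use Tinky;
use Tinky::JSON;

my $json     = $*PROGRAM.parent.child('ticket-workflow.json').slurp;
my $workflow = Tinky::JSON::Workflow.from-json($json);

# The by-name methods mean no State object needs to be in scope here
$workflow.enter-supply('open').tap(-> $ticket { say "opened" });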

It would probably be easy to make something similar that can obtain the definition from an XML file (probably using XML::Class), so let me know if you might find that useful.

Making something useful

My prime driver for making Tinky in the first place was a still-in-progress online radio management application, which could potentially have several different workflows for different types of objects. The media for streaming may need to be uploaded, possibly encoded to a streamable format, have silence detection performed and its metadata normalised, and so forth, before it is usable in a show; the shows themselves need to have either media added or a live streaming source configured, then be scheduled at the appropriate time and possibly also recorded (with the recording fed back into the media workflow). All of this is a little too complex for a short example, but an example that ships with Tinky::JSON is inspired by the media portion of this and was actually made in response to something someone was asking about on IRC a while ago.

The basic idea is that a process waits for WAV files to appear in some directory and then copies them to another directory where they are encoded (in this case to FLAC). The nice thing about using the workflow model for this is that the code stays quite compact and clear: failure conditions can be handled locally to the action for that step in the process, so deeply nested conditions or early returns are avoided, and because it all happens asynchronously it makes the best use of the processor time.

So the workflow is described in JSON as:

{
    "states" : [ "new", "ready", "copied", "done", "failed", "rejected" ],
    "transitions" : [
        {
            "name" : "ready",
            "from" : "new",
            "to"   : "ready"
        },
        {
            "name" : "reject",
            "from" : "new",
            "to"   : "rejected"
        },
        {
            "name" : "copied",
            "from" : "ready",
            "to"   : "copied"
        },
        {
            "name" : "fail",
            "from" : "ready",
            "to"   : "failed"
        },
        {
            "name" : "done",
            "from" : "copied",
            "to"   : "done"
        },
        {
            "name" : "fail",
            "from" : "copied",
            "to"   : "failed"
        }
    ],
    "initial-state" : "new"
}

This defines our six states and the transitions between them. The "rejected" state is entered (from state "new") if the file has been seen before, and the "failed" state may be entered if there was a problem with either the copying or the encoding.

The program expects this to be in a file called “encoder.json” in the same directory as the program itself.

This example uses the ‘flac’ encoder but you could alter this to something else if you want.

use Tinky;
use Tinky::JSON;
use File::Which;


class ProcessFile does Tinky::Object {
    has Str $.path      is required;
    has Str $.out-dir   is required;
    has Str $.new-path;
    has Str $.flac-file;
    has     @.errors;
    method new-path() returns Str {
        $!new-path //= $!out-dir.IO.child($!path.IO.basename).Str;
    }
    method flac-file() returns Str {
        $!flac-file //= self.new-path.subst(/\.wav$/, '.flac');
        $!flac-file;
    }
}


multi sub MAIN($dir, Str :$out-dir = '/tmp/flac') {
    my ProcessFile @process-files;

    my $json = $*PROGRAM.parent.child('encoder.json').slurp;
    my $workflow = Tinky::JSON::Workflow.from-json($json);

    my $flac = which('flac') or die "no flac encoder";
    my $cp   = which('cp');

    my $watch-supply = IO::Notification.watch-path($dir).grep({ $_.path ~~ /\.wav$/ }).unique(as => { $_.path }, expires => 5);

    say "Watching '$dir'";

    react {
        whenever $watch-supply -> $change {
            my $pf = ProcessFile.new(path => $change.path, :$out-dir);
            say "Processing '{ $pf.path }'";
            $pf.apply-workflow($workflow);
        }
        whenever $workflow.applied-supply() -> $pf {
            if @process-files.grep({ $_.path eq $pf.path }) {
                $*ERR.say: "** Already processing '", $pf.path, "' **";
                $pf.reject;
            }
            else {
                @process-files.append: $pf;
                $pf.ready;
            }
        }
        whenever $workflow.enter-supply('ready') -> $pf {
            my $copy = Proc::Async.new($cp, $pf.path, $pf.new-path, :r);
            whenever $copy.stderr -> $error {
                $pf.errors.append: $error.chomp;
            }
            whenever $copy.start -> $proc {
                if $proc.exitcode {
                    $pf.fail;
                }
                else {
                    $pf.copied;
                }
            }
        }
        whenever $workflow.enter-supply('copied') -> $pf {
            my $encode = Proc::Async.new($flac,'-s',$pf.new-path, :r);
            whenever $encode.stderr -> $error {
                $pf.errors.append: $error.chomp;
            }
            whenever $encode.start -> $proc {
                if $proc.exitcode {
                    $pf.fail;
                }
                else {
                    $pf.done;
                }
            }
        }
        whenever $workflow.enter-supply('done') -> $pf {
            say "File '{ $pf.path }' has been processed to '{ $pf.flac-file }'";
        }
        whenever $workflow.enter-supply('failed') -> $pf {
            say "Processing of file '{ $pf.path }' failed with '{ $pf.errors }'";
        }
        whenever $workflow.transition-supply -> ($trans, $pf ) {
            $*ERR.say("File '{ $pf.path }' went from '{ $trans.from.name }' to '{ $trans.to.name }'");
        }
    }
}

If you start this with an argument of the directory where you want to pick up the files from, it will wait until new files appear, then create a new ProcessFile object and apply the workflow to it. Every object to which the workflow is applied is sent to the applied-supply, which is tapped to check whether the file has already been processed: if it has (and this can happen because the directory watch may emit more than one event for the creation of a file), the object is moved to state ‘rejected’ and no further processing happens; otherwise it is moved to state ‘ready’, whereupon it is copied and, if that succeeds, encoded.

Additional states (and transitions to enter them) could easily be added to, for instance, store the details of the encoded file in a database, or even start playing it, and new actions could be added for existing states with additional “whenever” blocks. As it stands this will block forever waiting for new files; however, it could be integrated into a larger program by running it on another thread, for instance as sketched below.
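One hedged way to do that (the sub name is invented; the body stands in for the react block shown above):

sub watch-and-encode($dir, $out-dir = '/tmp/flac') {
    react {
        # ... the whenever blocks from the example above go here ...
    }
}

# Run the watcher on the thread pool alongside the rest of the application
my $watcher = start watch-and-encode($dir);

# ... other application work happens here ...

await $watcher;    # only reached if the watcher ever terminates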

The program and the JSON file are in the examples directory for Tinky::JSON, please feel free to grab and tinker with them.

Not quite all

Tinky has a fair bit more functionality than I have space to describe here: there are facilities for run-time validation of transition application, and additional supplies that are emitted to at various stages of the workflow lifecycle. Hopefully your interest is sufficiently piqued that you might take a look at the documentation.

I am considering adding a cookbook-style document for the module for some common patterns that might arise in programs that might use it. If you have any ideas or questions please feel free to drop me a note.

Finally, I chose a deliberately un-descriptive name for the module as I didn’t want to make a claim that this would be the last word in the problem space. There are probably many more ways that a state managed workflow could be implemented nicely in Perl 6. I would be equally delighted if you totally disagree with my approach and release your own design as I would be if you decide to use Tinky.

Tinky Winky is the purple Teletubby with a red bag.

Day 17 – Testing in virtual time

Over the last month, most of my work time has been spent building a proof of concept for a project that I’ll serve as architect for next year. When doing software design, I find spikes (time-boxed explorations of problems) and rapid prototyping really useful ways to gain knowledge of new problem spaces that I will need to work in. Finding myself with a month to do this before the “real” start of the project has been highly valuable. Thanks to being under NDA, I can’t say much about the problem domain itself. I can, however, say that it involves a reasonable amount of concurrency: juggling different tasks that overlap in time.

Perl’s “whipuptitude” – the ability to quickly put something together – is fairly well known. Figuring that Perl 6’s various built-in concurrency constructs would allow me to whip up concurrent things rapidly, I decided to build the proof of concept in Perl 6. I’m happy to report that the bet paid off pretty well, and by now the proof of concept has covered all of the areas I hoped to explore – and some further ones that turned out to matter but that weren’t obvious at the start.

To me, building rapid prototypes explicitly does not mean writing crappy code. For sure, simplifications and assumptions of things not critical to the problem space are important. But my prototype code was both well tested and well structured. Why? Because part of rapid prototyping is being able to evolve the prototype quickly. That means being able to refactor rapidly. Decent quality, well-structured, well-tested code is important to that. In the end, I had ~2,500 lines of code covered by ~3,500 lines of tests.

So, I’ve spent a lot of time testing concurrent code. That went pretty well, and I was able to make good use of Test::Mock in order to mock components that returned a Promise or Supply also. The fact that Perl 6 has, from the initial language release, had ways to express asynchronous values (Promise) or asynchronous streams of values (Supply) is in itself a win for testing. Concurrent APIs expressed via these standard data structures are easy to fake, since you can put anything you want behind a Promise or a Supply.

My work project didn’t involve a huge amount of dealing with time, but in the odd place it did, and I realized that testing this code effectively would be a challenge. That gave me the idea of writing about testing time-based code for this year’s Perl 6 advent, which in turn gave me the final nudge I needed to write a module that’s been on my todo list all year. Using it, testing things involving time can be a lot more pleasant.

Today’s example: a failover mechanism

Timeouts are one of the most obvious and familiar places that time comes up in fairly “everyday” code. To that end, let’s build a simple failover mechanism. It should be used as follows:

my $failover-source = failover($source-a, $source-b, $timeout);

Where:

  • $source-a is a Supply
  • $source-b is a Supply
  • $timeout is a timeout in seconds (any Real number)
  • The result, assigned to $failover-source, is also a Supply

And it should function as follows:

  • The Supply passed as $source-a is immediately tapped (which means it is requested to do whatever is needed to start producing values)
  • If it produces its first value within $timeout seconds, then we simply emit every value it produces to the result Supply and ignore $source-b
  • Otherwise, after $timeout seconds, we also tap $source-b
  • Whichever source then produces a value first is the one that we “latch” on to; any results from the other should be discarded

Consider, for example, that $source-a and $source-b are supplies that, when tapped, will send the same query to two different servers, which will stream back results over time. Normally we expect the first result within a couple of seconds. However, if the server queried by $source-a is overloaded or has other issues, then we’d like to try using the other one, $source-b, to see if it can produce results faster. It’s a race, but where A gets a head start.

Stubbing stuff in

So, in a Failover.pm6, let’s stub in the failover sub as follows:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
    }
}

A t/failover.t then starts off as:

use Failover;
use Test;

# Tests will go here

done-testing;

And we’re ready to dig in to the fun stuff.

The first test

The simplest possible case for failover is when $source-a produces its first value in time. In this case, $source-b should be ignored totally. Here’s a test case for that:

subtest 'When first Supply produces a value in time, second not used', {
    my $test-source-a = supply {
        whenever Promise.in(1) {
            emit 'a 1';
        }
        whenever Promise.in(3) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        die "Should never be used";
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

Here, we set up $test-source-a as a Supply that, when tapped, will emit a 1 after 1 second, and a 2 after 3 seconds. If $test-source-b is ever tapped it will die. We expect that if this wrongly happens, it will be at the 2 second mark, which is why a 2 is set to be emitted after 3 seconds. We then obtain a Channel from the resulting $failover-supply, which we can use to pull values from at will and check we got the right things. (On coercing a Supply to a Channel, the Supply is tapped, starting the flow of values, and each result value is fed into the Channel. Both completion and errors are also conveyed.)
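The coercion deserves a tiny illustration of its own; a sketch independent of the failover code:

# Tapping via .Channel lets test code pull asynchronously produced
# values one at a time, in order, with .receive
my $channel = supply { emit 'x'; emit 'y' }.Channel;
say $channel.receive;   # x
say $channel.receive;   # y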

Making it pass

There are a couple of ways that we might make this test pass. The absolute easiest one would be:

sub failover(Supply $source-a, Supply $source-b, Real $timeout) is export {
    return $source-a;
}

Which feels like cheating, but in TDD the code that passes the first test case almost always does. (It sometimes feels pointless to write said tests. More than once, they’ve ended up saving me when – while making a hard thing work – I ended up breaking the trivial thing.)

An equivalent, more forward-looking solution would be:

sub failover(Supply $source-a, Supply $source-b, Real $timeout) is export {
    supply {
        whenever $source-a {
            .emit;
        }
    }
}

Which is the identity operator on a Supply (just spit out everything you get). For those not familiar with supplies, it’s worth noting that this supply block does 3 useful things for you for free:

  • Passes along errors from $source-a
  • Passes along completion from $source-a
  • Closes the tap on $source-a – thus freeing up resources – if the tap on the supply we’re defining here is closed

Subscription management and error management are two common places for errors in asynchronous code; the supply/whenever syntax tries to do the Right Thing for you on both fronts. It’s more than just a bit of tinsel on the Christmas callback.

When the timeout…times out

So, time for a more interesting test case. This one covers the case where the $source-a fails to produce a value by the timeout. Then, $source-b produces a value within 1 second of being tapped – meaning its value should be relayed. We also want to ensure that even if $test-source-a were to emit a value a little later on, we’d disregard it. Here’s the test:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $test-source-a = supply {
        whenever Promise.in(4) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(1) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(3) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

We expect a 1 to be ignored, because we chose $source-b. So, how can we make this pass? Here goes:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my $emitted-value = False;

        whenever $source-a {
            $emitted-value = True;
            .emit;
        }

        whenever Promise.in($timeout) {
            unless $emitted-value {
                whenever $source-b {
                    .emit;
                }
            }
        }
    }
}

Will this pass the test? Both subtests?

Think about it…

Well, no, it won’t. Why? Because it doesn’t do anything about disregarding $source-a after it has started spitting out values from $source-b. It needs to commit to one or the other. Didn’t spot that? Good job we have tests! So, here’s a more complete solution that makes both subtests pass:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed <None A B>;
        my $committed = None;

        whenever $source-a -> $value {
            given $committed {
                when None {
                    $committed = A;
                    emit $value;
                }
                when A {
                    emit $value;
                }
            }
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    $committed = B;
                    emit $value;
                }
            }
        }
    }
}

So tired of waiting

You’d think I’d be happy with this progress. Two passing test cases. Surely the end is in sight! Alas, development is getting…tedious. Yes, after just two test cases. Why? Here’s why:

$ time perl6-m -Ilib t/failover-bad.t
    ok 1 - Received first value from source A
    ok 2 - Received second value from source A
    1..2
ok 1 - When first Supply produces a value in time, second not used
    ok 1 - Received first value from source B
    ok 2 - Received second value from source B
    1..2
ok 2 - When timeout reached, second Supply is used instead if it produces value first
1..2

real    0m8.694s
user    0m0.600s
sys     0m0.072s

Every time I run my tests I’m waiting around 9 seconds now. And when I add more tests? Even longer! Now imagine I was going to write a whole suite of these failover and timeout routines, as a nice module. Or I was testing timeouts in a sizable app and would have dozens, even hundreds, of such tests.

Ouch.

Maybe, though, I could just make the timeouts smaller. Yes, that’ll do it! Here is how the second test looks now, for example:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $test-source-a = supply {
        whenever Promise.in(0.04) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(0.01) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(0.03) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 0.02);
    my $output = $failover-supply.Channel;

    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

You want it faster? Divide by 100! Job done.

Of course, anybody who has actually done this knows precisely what comes next. The first 3 times I ran my tests after this change, all was well. But guess what happened on the fourth time?

ok 1 - When first Supply produces a value in time, second not used
    not ok 1 - Received first value from source B

    # Failed test 'Received first value from source B'
    # at t/failover-short.t line 41
    # expected: 'b 1'
    #      got: 'a 1'
    not ok 2 - Received second value from source B

    # Failed test 'Received second value from source B'
    # at t/failover-short.t line 42
    # expected: 'b 2'
    #      got: 'b 1'
    1..2
    # Looks like you failed 2 tests of 2
not ok 2 - When timeout reached, second Supply is used instead if it produces value first

Mysteriously…it failed. Why? Bad luck. My computer is a busy little machine. It can’t just give my test programs all the CPU all of the time. It needs to decode that music I’m listening to, check if I need to install the 10th set of security updates so far this month, and cope with my web browser wanting to do stuff because somebody tweeted something or emailed me. And so, once in a while, just after the clock hits 0.01 seconds and a thread grabs the whenever block to work on, that thread will be dragged off the CPU. Then, before it can get back on again, the one set to run at 0.04 seconds gets to go, and spits out its value first.

Sufficiently large times mean slow tests. Smaller values mean unreliable tests. Heck, suspend the computer in the middle of running the test suite and even a couple of seconds is too short for reliable tests.

Stop! Virtual time!

This is why I wrote Test::Scheduler. It’s an implementation of the Perl 6 Scheduler role that virtualizes time. Let’s go back to our test code and see if we can do better. First, I’ll import the module:

use Test::Scheduler;

Here’s the first test, modified to use Test::Scheduler:

subtest 'When first Supply produces a value in time, second not used', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(1) {
            emit 'a 1';
        }
        whenever Promise.in(3) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        die "Should never be used";
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(3);
    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

Perhaps the most striking thing is how much hasn’t changed. The changes amount to two additions:

  1. The creation of a Test::Scheduler instance and the assignment to the $*SCHEDULER variable. This dynamic variable is used to specify the current scheduler to use, and overriding it allows us to swap in a different one, much like you can declare a $*OUT to do stuff like capturing I/O.
  2. A line to advance the test scheduler by 3 seconds prior to the two assertions.
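To see the virtual clock in isolation, here is a minimal sketch (not part of the failover tests): a ten-second timer completes as soon as virtual time is advanced past it.

use Test::Scheduler;

my $*SCHEDULER = Test::Scheduler.new;
my $timer = Promise.in(10);     # scheduled against the virtual clock
$*SCHEDULER.advance-by(10);     # no real waiting happens here
await $timer;
say "fired without waiting ten real seconds";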

The changes for the second test are very similar:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(4) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(1) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(3) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(6);
    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

And what difference does this make to the runtime of my tests? Here we go:

$ time perl6-m -Ilib t/failover-good.t
    ok 1 - Received first value from source A
    ok 2 - Received second value from source A
    1..2
ok 1 - When first Supply produces a value in time, second not used
    ok 1 - Received first value from source B
    ok 2 - Received second value from source B
    1..2
ok 2 - When timeout reached, second Supply is used instead if it produces value first
1..2

real    0m0.679s
user    0m0.628s
sys     0m0.060s

From 9 seconds to sub-second – and much of that will be fixed overhead rather than the time running the tests.

One more test

Let’s deal with the final requirement, just to round off the test writing and get to a more complete solution to the original problem. The remaining test we need is for the case where the timeout is reached, and we tap $source-b. Then, before it can produce a value, $source-a emits its first value. Therefore, we should latch on to $source-a.

subtest 'When timeout reached, and second Supply tapped, first value still wins', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(3) {
            emit 'a 1';
        }
        whenever Promise.in(4) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(2) { # start time + 2 (timeout) + 2
            emit 'b 1';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(4);
    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

This fails, because the latch logic wasn’t included inside of the whenever block that subscribes to $source-b. Here’s the easy fix for that:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed <None A B>;
        my $committed = None;

        whenever $source-a -> $value {
            given $committed {
                when None {
                    $committed = A;
                    emit $value;
                }
                when A {
                    emit $value;
                }
            }
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    given $committed {
                        when None {
                            $committed = B;
                            emit $value;
                        }
                        when B {
                            emit $value;
                        }
                    }
                }
            }
        }
    }
}

The easy thing is just a little bit repetitive, however. It would be nice to factor out the commonality into a sub. Here goes:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed <None A B>;
        my $committed = None;

        sub latch($onto) {
            given $committed {
                when None {
                    $committed = $onto;
                    True
                }
                when $onto {
                    True
                }
            }
        }

        whenever $source-a -> $value {
            emit $value if latch(A);
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    emit $value if latch(B);
                }
            }
        }
    }
}

And in under a second, the tests can now assure us that this was indeed a successful refactor. Note that this does not yet cancel a discarded request, which could save some duplicate work; I’ll leave that as an exercise for the reader, though one possible approach is sketched below.
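For instance (an untested sketch, slotting into the supply block above in place of the existing whenever for $source-b): inside a whenever block, last closes that block’s subscription, so a source could drop out as soon as the latch tells it the other one has won.

whenever $source-b -> $value {
    if latch(B) {
        emit $value;
    }
    else {
        last;   # A already won; unsubscribe from B
    }
}
# ... with the symmetric change in the whenever block for $source-a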

Safety and realism

One thing you might wonder about here is whether this code is really thread safe. The default Perl 6 scheduler will schedule code across a bunch of threads. What if $source-a and $source-b emit their first value almost simultaneously?

The answer is that supply (and react) blocks promise Actor-like semantics, processing just one message at a time. So, if we’re inside of the whenever block for $source-a, and $source-b emits a message on another thread, then it will be queued up for processing afterwards.

One interesting question that follows on from this is whether the test scheduler somehow serializes everything onto the test thread in order to do its job. The answer is that no, it doesn’t do that. It wraps the default ThreadPoolScheduler and always delegates to it to actually run code. This means that, just as with the real scheduler, the code will run across multiple threads and on the thread pool. This helps to avoid a couple of problems. Firstly, it means that testing code that relies on having real threads (by doing stuff that really blocks a thread) is possible. Secondly, it means that Test::Scheduler is less likely to hide real data race bugs that may exist in the code under test.

Of course, it’s important to keep in mind that virtual time is still very much a simulation of real time. It doesn’t account for the fact that running code takes time; virtual time stands still while code runs. At the same time, it goes to some amount of effort to get the right sequencing when a time-based event triggered in virtual time leads to additional time-based events being scheduled. For example, imagine we schedule E1 in 2s and E2 in 4s, and then advance the virtual time by 4s. If the triggering of E1 schedules E3 in 1s (so, 3s relative to the start point), we need to have it happen before E2. To have this work means trying to identify when all the consequences of E1 have been shaken out before proceeding (which is easier said than done). Doing this will, however, prevent some possible overlaps that could take place in real time.

In summary…

Unit tests for code involving time can easily end up being slow and/or unreliable. However, if we can virtualize time, it’s possible to write tests that are both fast and reliable – as good unit tests should be. The Test::Scheduler module provides a way to do this in Perl 6. At the same time, virtual time is not a simulation of the real thing. The usual rules apply: a good unit test suite will get you far, but don’t forget to have some integration tests too!

Day 16: The Meta spec, Distribution, and CompUnit::Repository explained-ish


This is a tool chain post. You have been warned.

In this post I’m going to explain the typical journey of a module from a request for a distribution (such as zef install My::Module) to the point you are able to perl6 -e 'use My::Module;'. Afterwards I’ll go over implementation details, and how they allow us to do cool things like use a .tar.gz archive or the github API for loading source code.

Most of this stuff is not documented yet, but the source code for these items is not difficult to grok and commented thoroughly. Don’t expect to understand how to actually do anything new after reading this. This is intended to give a high level description of the design considerations used and what they will allow you to do.

And I’ll be ignoring precompilation because that’s hard.

The Journey

We’ll start at zef install My::Module, but it could just as well be a URI or something like My::Module:ver<1.*>:auth<me@foo.com>. The module manager either proxies this request to an external recommendation manager (such as MetaCPAN) [1], or acts as the recommendation manager itself (grepping the current perl6 ecosystem package.json file).[2] The request returns a META6 hash for a matching distribution and likely includes some non-spec fields to hint at the download URI.

 ______                              ______________________
|client| 1==request My::Module====> |recommendation manager|
|______| <==META6 representation==2 |______________________|

The package manager then uses an appropriate Distribution object that understands the META6 fields and how to fetch the files it references. The Distribution will encapsulate all the behavior rakudo expects.

 ______                                     ___________________
|client| 3==META6 representation=========> |Distribution Lookup|
|______| <==Distribution implementation==4 |___________________|

The Distribution is passed to the rakudo core class CompUnit::Repository::Installation.install($dist) (CURI for short). CURI then saves all the files the Distribution represents to its own hierarchy and naming conventions.

 ______                                ______
|client| 5==.install(Distribution)==> | CURI |
|______|                              |______|

If you call .resolve($name) on CURI it will return a Distribution object that it creates from its own structure. .resolve($name) also has to decide what to do when multiple names match. In this way CompUnit::Repository acts as a basic recommendation manager.

 ______                                     ___________
|      | 1==.resolve('My::Module')=======> | CURI#site |
|client| <==Distribution implementation==2 |___________|
|      |                                   ____________ 
|      | 3==.install(Distribution)======> | CURI#home  |
|______|                                  |____________|

The Details

META6

{
    "name"        : "My::Module",
    "version"     : "0.001",
    "auth"        : "me@cpan.org",
    "provides"    : {
        "My::Module"      : "lib/My/Module.pm6",
        "My::Module::Foo" : "lib/My/Module/Foo.pm6"
    },
    "resources" : [
        "config.json",
        "scripts/clean.pl",
        "libraries/mylib"
    ]
}

In most cases a distribution’s meta data is stored in its root folder as META6.json, but where it comes from is irrelevant. We’ll only be interested in a few of the possible fields, which serve two different purposes:

  1. A “unique” identifier. The identifier is the full name of a distribution, and includes the name, version, and auth (there is also api but we’re ignoring that one). In this case it’s My::Module:ver<0.001>:auth<me@cpan.org> [3]. If you were to install this distribution you could use it with:

    use My::Module:ver<0.001>:auth<me@cpan.org>; (although use My::Module; would probably suffice)

  2. File mapping
  • provides is a key/value mapping where the key is the package name and the value is a content-id (forward slash relative file path). The content-id, while being a file path, might not represent a file that exists yet.
  • resources is a list of resource content-ids included with your distribution, usually covering all of the files in the resources/ directory. These files are accessible to modules in your distribution via %?RESOURCES<$name-path> (see the short sketch after this list). In the example the first two items follow this pattern and would be resources/config.json and resources/scripts/clean.pl, but the last one is special: if the first path part of the content-id is “libraries/” then any path under it will have its name mangled to whatever naming convention rakudo thinks is right for the OS it is running on. This can be useful for distributions that compile or generate libraries at build time and expect them to be named a certain way; libraries/mylib.dll on Windows, libraries/libmylib.so on Linux, and libraries/libmylib.1.so on OSX. This allows you to reference the library in a (probably NativeCall) module as %?RESOURCES<mylib> instead of guessing whether it’s %?RESOURCES<mylib.dll> or %?RESOURCES<libmylib.so>.
  • files (optional) – this would usually be populated automatically by the Distribution, but I’ll mention it here because you can construct it manually. It is a key/value mapping of $name-path => $content-id; the two may or may not be the same. Combined with provides this gives you a way to get a list of all content-ids that can be used with Distribution.content(...):
    # Before CURI.install - Usually generated by the Distribution itself
    "files" : {
        "bin/my-script" => "bin/my-script",
        "resources/libraries/foolib" => "resources/libraries/libfoolib.so"
    }
    
    # After CURI.install
    "files" : {
        "bin/my-script" => "SDfDFIHIUHuhfue9f3fJ930j",
        "resources/libraries/foolib" => "j98jf9fjFJLJFi3f.so"
    }
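
To make the %?RESOURCES access mentioned above concrete, here is a short hypothetical sketch of a module inside the example distribution reading its bundled config.json (the module and sub names are invented):

unit module My::Module::Config;

sub load-config() is export {
    # %?RESOURCES keys are the content-ids relative to resources/
    %?RESOURCES<config.json>.slurp;
}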
    

Distribution

role Distribution {
    method meta(--> Hash) {...}
    method content($content-id --> IO::Handle) {...}
}

Distribution is the IO interface a CompUnit::Repository uses. It only needs to implement two methods:

  • method meta – gives access to the meta data of a distribution; this does not have to come from a local file.
  • method content – given a content-id such as lib/My/Module.pm or libraries/mylib, returns an IO::Handle from which the appropriate data can be read.

When you (or your module installer) pass a Distribution to CompUnit::Repository::Installation.install($dist) it will look at $dist.meta() to figure out all the content-ids it needs to install, and then calls $dist.content($content-id).slurp-rest to get the actual content. [4]
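To get a feel for how small the interface is, here is a minimal, hypothetical Distribution backed by a plain directory (the core Distribution::Path already does this properly; JSON::Fast is assumed only for parsing the META6):

use JSON::Fast;

class Distribution::Directory does Distribution {
    has IO::Path $.prefix is required;

    method meta(--> Hash) {
        from-json($!prefix.child('META6.json').slurp);
    }

    method content($content-id --> IO::Handle) {
        $!prefix.child($content-id).open(:r);
    }
}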

CompUnit::Repository

At the most basic level a CompUnit::Repository is used to store and/or lookup distributions. How the distribution is stored or loaded is up to the CompUnit::Repository.

CompUnit::Repository::Installation is unique among the core CompUnit::Repository classes in that it has an install method that takes a Distribution implementation and maps it to the file system. Currently this means changing all file names to a sha1 string. It also returns its own implementation of Distribution that still allows us to access My::Module as $dist.content('lib/My/Module.pm'). Its path-spec is inst#, so if you install the dependencies of a distribution to inst#local/ you could do one of:

  • perl6 -Iinst#local/ -Ilib -e 'use My::Module'
  • PERL6LIB=inst#local/ perl6 -Ilib -e 'use My::Module'

CompUnit::Repository::FileSystem, on the other hand, works with original path names, although it does not install – it only loads and resolves identities. This is what gets used when you use a module found via -I mylib or use lib "mylib" (both short for file#mylib/). If a META6.json file is found it will use the provides field to map namespaces to paths, but if there is no META6.json file (such as when you start developing a module) it will fall back to the usual Perl 5 semantics of ($name =~ s{::}{/}g) . ".pm6".

CompUnit::Repository::AbsolutePath is different in that it represents a single module (and not an entire Distribution of modules), such as: require '/home/perl6/repos/my-module/lib/my/module.pm6'.

A gross oversimplification of the interface
role CompUnit::Repository {
    method id()  { ... }

    method path-spec() { ... } # file#, inst#, etc

    method need(CompUnit::DependencySpecification $spec, |c --> CompUnit) { ... }
    
    method load(|) { ... }
}

Cool stuff

Distribution implementations (non-core)

If you were to pass this to CompUnit::Repository::Installation.install($dist) it would make an HTTP request for each source file (found in the META6 – also fetched with an HTTP request) and save the content to the final installation path:

use Distribution::Common::Remote::Github;

my $github-dist = Distribution::Common::Remote::Github.new(
    user   => "zoffixznet",
    repo   => "perl6-CoreHackers-Sourcery", # XXX: No [missing] dependencies
    branch => "master",
);

say "Source code: " ~ $github-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($github-dist) ?? 0 !! 1;

Similarly you could pipe data from running a command such as tar: [5]

use Distribution::Common::Tar;
use Net::HTTP::GET;
use File::Temp;

my $distribution-uri       = 'https://github.com/zoffixznet/perl6-CoreHackers-Sourcery/archive/master.tar.gz';
my ($filepath,$filehandle) = tempfile("******", :unlink);
spurt $filepath, Net::HTTP::GET($distribution-uri).body;

my $tar-dist = Distribution::Common::Tar.new($filepath.IO);

say "Source code: " ~ $tar-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($tar-dist) ?? 0 !! 1;

That’s not to say you couldn’t just clone or untar the distribution and use the built-in Distribution::Path($path) – this simply makes other possibilities trivial to implement.
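For comparison, the low-tech route might look something like this sketch (the name of the extracted directory is assumed):

my $dist = Distribution::Path.new('perl6-CoreHackers-Sourcery-master'.IO);
my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($dist) ?? 0 !! 1;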

CompUnit::Repository implementations (non-core)

You can also make your own CompUnit::Repository, such as CompUnit::Repository::Tar.

use CompUnit::Repository::Tar;
use lib "CompUnit::Repository::Tar#perl6-repos/my-module.tar.gz";
use My::Module;

… which is very similar to CompUnit::Repository::FileSystem, but it uses Distribution::Common::Tar to interface with the distribution. This means you can reuse the loading code from core CompUnit::Repository::* modules with very little modification (and won’t be covered in this post).

It would not take too much effort to use Distribution::Common::Remote::Github as the Distribution interface used when loading/resolving, giving a CloudPAN-like way to load modules.


Some ideas for modules:

  • Distribution::Dpkg – Adapter for the dpkg package format
  • Distribution::Gist – Install a distribution from a gist, because…
  • CompUnit::Repository::IPFS – InterPlanetary File System content storage backend
  • CompUnit::Repository::Temp – This CUR will self destruct in…
  • CompUnit::Repository::Tar::FatPack – Read all dependencies for an application from a single tar archive

More reading

gpw2016 Stefan Seifert – A look behind the curtains – module loading in Perl 6

Synopsis 22: Distributions, Recommendations, Delivery and Installation (non-authoritative)

Slightly less basic Perl6 module management


  1. Transformations on the identity may need to be made before sending it to a recommendation manager. MetaCPAN may not understand :ver<1.*>, but it’s only a matter of representing that as an Elasticsearch parameter.
  2. The recommendation engine also gets to determine what to return for My::Module when it has both My::Module:auth<foo> and My::Module:auth<bar> indexed, so it may become a best practice to declare the auth when you use a module.
  3. It should be noted that auth does not tell you where a module should be downloaded from. For instance: My::Module:auth<github:me> does not mean “download My::Module from github.com/me” – it’s nothing more than an additional identifying part, which is why using an email address is a better example. That exact identity might be found on github, cpan, or a darkpan. Recommendation managers could choose to only index distributions that use an auth they can verify.
  4. method content doesn’t actually constrain the return value to an IO::Handle, but it does expect it to act like one. This was done so that a socket, while not an IO::Handle, could still be used with a thin wrapper, allowing resources to be fetched at the moment they are to be installed.
  5. This is a lie: it actually extracts files under resources/ to a temporary file, but only because %?RESOURCES<...> has to return an IO::Path (instead of an IO::Handle). Without this constraint the temporary file would not be needed.

Day 15 – Parsing Globs using Grammars

UPDATE 12/21/16: After some discussions here and at reddit, I realized I’d made a couple important mistakes. Sorry about that. I have corrected GlobAction and a bug in the glob helper subroutine. I also replaced the use of “AST” with “parse tree” below as well as that is more technically correct.

If you are at all familiar with Perl, you are likely to be familiar with regexes. In Perl 6, regexes have been extended to provide features and syntax allowing complex parsing tasks to be simplified and grouped together into grammars. To give a taste of this, let’s consider how to parse file globs. It’s simple, it’s well-defined, and I have some practice writing parsers for the task.

Introducing File Globs

In case you are not familiar with them, a file glob is a sort of pattern match for file names. For example, consider this glob:

*.txt

This will match any file ending in “.txt”. The asterisk (“*”) means “match as many characters as needed to make the match”. This matches “foo.txt” and “blah.blah.txt”, but not “z.html”.

Here’s another glob to consider:

.??*

This will match any file that starts with a period (“.”) followed by 2 or more characters. That means it will match a file named “.zshrc”, but not one named “..”, making it a popular heuristic for finding “hidden” files on a Unix file system.

Basic Grammar

What does a basic glob parsing grammar look like? Let me show a simple one and then I will explain the interesting bits:

grammar Glob {
    token TOP { <term>+ }
    token term {
        || <match>
        || <char>
    }
    token match { <match-any> | <match-one> }
    token match-any { '*' }
    token match-one { '?' }
    token char { . }
}

This is just about the simplest possible grammar for a glob that covers the two most popular forms. Let’s consider the parts.

A grammar is just a sort of class. Grammars may contain rules and tokens. Rules and tokens are like methods, but they are defined as regexes and are used to match the input to the grammar. Here we define a grammar named “Glob” to parse our glob.

Every grammar may define a special rule or token named “TOP”. This becomes the default regex to use when matching input with that grammar. Without it, any attempt to parse with the grammar must name the rule or token to begin with.
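
For instance, we could ask the Glob grammar defined above to start matching at a specific token rather than TOP by passing the :rule named argument to parse (a standard part of the grammar interface):

my $m = Glob.parse('*', :rule<match-any>);
say $m ?? 'matched' !! 'no match';   # matched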

Aside from that, each rule and token defines a regex to use when matching the input. For example, the match-any token here will match only an asterisk (“*”). Similarly, the char token will match any single character.

To be useful as a grammar, rules and tokens refer to one another using angle brackets (“<>”). Here, TOP contains one or more term matches, and term matches either a match or a char (using || to try each alternative in the order given).

Whitespace Handling

Why do I only use the token keyword and never make use of rule? Because of whitespace. A rule inserts some implicit whitespace handling, which is helpful in keeping parsers readable. However, this handling is not helpful when parsing file globs.

Consider the following rule and token:

token ab { 'a' 'b'+ }
rule  ab { 'a' 'b'+ }

The token will match “abbbbb” and “ab” but not “a b” or “a bbbb”. The rule, however, will not match “abbbbb” or “ab” but will match “a b” and “a bbbb”. This is the whitespace handling at work. The rule above is implicitly similar to this token:

token ab { 'a' <.ws> 'b'+ <.ws> }

That ws is a predefined token available in all grammars. The built-in definition implies there is a word-break and zero or more spaces, newlines, or other whitespace.
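
To see the difference in action, here is a small self-contained grammar (the names are made up for illustration) exercising both forms:

grammar WsDemo {
    token ab-token { 'a' 'b'+ }
    rule  ab-rule  { 'a' 'b'+ }
}

say so WsDemo.parse('abbb',  :rule<ab-token>);   # True
say so WsDemo.parse('a bbb', :rule<ab-token>);   # False
say so WsDemo.parse('a bbb', :rule<ab-rule>);    # True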

Depending on what works better for you, you may redefine ws if you prefer something different:

token ws { "\n" } # whitespace is always newlines

Or you may just use tokens to avoid the implicit whitespace handling. This latter solution is what I believe makes the most sense for this grammar.

Parsing Input

Now that we have our basic grammar, how do we parse the input? Here’s a simple example:

my $ast = Glob.parse("*.?");
dd $ast;
# OUTPUT is an ugly, long parse tree
# Match $ast = Match.new(ast => Any, list => (), hash =>
# Map.new((:term([Match.new(ast => Any, list => (), hash =>
# Map.new((:match(Match.new(ast => Any, list => (), hash => Map.new(()), 
# orig => "*.?", to => 1, from => 0)))), orig => "*.?", to => 1, from => 
# 0), Match.new(ast => Any, list => (), hash => Map.new((:char(Match.new(
# ast => Any, list => (), hash => Map.new(()), orig => "*.?", to => 2, 
# from => 1)))), orig => "*.?", to => 2, from => 1), Match.new(ast => Any, 
# list => (), hash => Map.new((:match(Match.new(ast => Any, list => (), 
# hash => Map.new(()), orig => "*.?", to => 3, from => 2)))), orig => 
# "*.?", to => 3, from => 2)]))), orig => "*.?", to => 3, from => 0) 

Since we got output, the parse succeeded. Given how accepting our grammar is, everything but an empty string ought to succeed. If a parse fails, the result of parse is undefined.
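
A quick way to check for that, using the empty string that our grammar cannot match:

my $result = Glob.parse('');
say $result.defined ?? 'parsed' !! 'parse failed';   # parse failed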

If you’ve used regexes in Perl 6 at all, this parse tree should look very familiar. Grammars are really nothing more than a little bit of extra structure around regexes. The parse trees they generate are just regex Match objects that may have additional matches nested within.

To do something useful with this parse tree, you have some options. You could write a tree walker that goes through the parse tree and does something interesting. However, Perl 6 grammars provide a built-in mechanism to do that work in the middle of the parse, called an action class. Let’s use an action class instead.

Action Classes

An action class is a regular class that can be paired to a grammar to perform actions based upon each rule or token matched. The methods in the action class will mirror the names of the rules and tokens in the grammar and be called for each.

Here is an action class that is able to translate the parsed glob into a matcher class:

class GlobAction {
    method TOP($/) { make GlobMatcher.new(terms => $<term>».made) }
    method term($/) { make $/.values».made.first }
    method match($/) { make $/.values».made.first }
    method match-any($/) { make '*' }
    method match-one($/) { make '?' }
    method char($/) { make ~$/ }
}

Each method will receive a single argument which is the match object. If the argument list is not given, the implicit match variable $/ is used instead (here we make that explicit). The action class can perform any action in response to the match it wants. Often, this involves rewriting the return value of the match by using the make command. This sets the value returned by the .made method on the match (as you can see is used in term and match).
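
If make and .made are new to you, here is a tiny self-contained example, with a throwaway grammar and action class invented just for illustration, showing how a made value flows out of a parse:

grammar Digits       { token TOP { \d+ } }
class  DigitsActions { method TOP($/) { make +$/ } }

say Digits.parse('42', :actions(DigitsActions.new)).made + 1;   # 43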

To put this action class into use, you pass the action class during parsing like so:

my $matcher = Glob.parse("*.txt", :actions(GlobAction.new)).made;
for <foo.txt foo.html bar.txt> -> $f {
    say "$f matches" if $f ~~ $matcher;
}

Now all we need is the magic defined within GlobMatcher, which does the real work of testing whether a given file name matches the parsed glob.

Putting it all together

Here is an implementation of matching. It uses the ACCEPTS method to define a smartmatch operation that can be applied to any defined string.

class GlobMatcher {
    has @.terms;

    method ACCEPTS(Str:D $s) {
        my @backtracks = [ \([ $s.comb ], [ @!terms ]) ];
        BACKTRACK: while @backtracks.pop -> (@letters, @terms) {
            LETTER: while @letters.shift -> $c {
                last LETTER unless @terms;

                my $this-term = @terms.shift;
                my $next-term = @terms[0];

                given $this-term {
                    when '*' {
                        # Continue to the next term if we can
                        if @terms 
                        and $next-term ne '*' | '?' 
                        and $c eq $next-term {
                            push @backtracks, 
                                ([ @letters ], [ '*', |@terms ]);
                            redo;
                        }

                        # Match anything, and try again next round
                        unshift @terms, $this-term;
                    }

                    # We have a letter, so we match!
                    when '?' { }

                    # Only match exactly
                    default {
                        # If not an exact match, we fail; 
                        # try again if we can
                        next BACKTRACK if $c ne $this-term;
                    }
                }
            }

            # If we matched everything, we succeed
            return True unless @terms;

            # Otherwise, try the next backtrack, if any
        }

        # We ran out of back tracks, so we fail
        False;
    }
}

This algorithm is kind of ugly and might be prettier if I’d implemented it via a DFA or translated to regular expressions or something, but it works.
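
For the curious, here is a rough sketch of that regular-expression idea; it is not the implementation used in this post, just an illustration of turning the parsed terms into a single anchored regex:

sub glob-terms-to-regex(@terms) {
    my $pattern = @terms.map({
        $_ eq '*' ?? '.*' !!
        $_ eq '?' ?? '.'  !!
        .subst(/\W/, -> $m { '\\' ~ $m }, :g)   # escape literal characters
    }).join;
    rx/^ <$pattern> $/
}

say so 'foo.txt' ~~ glob-terms-to-regex(['*', '.', 't', 'x', 't']);   # True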

Typing out the code to parse is a bit tedious, so we’ll build ourselves a little helper to make it cleaner looking:

sub glob($glob, :$globlang = Glob) { 
    $globlang.parse($glob, :actions(GlobAction.new)).made; 
}

for <foo.txt foo.html bar.txt> -> $f {
    next unless $f ~~ glob('*.txt');
    say "$f matches";
}

We could even use it in a nice grep:

<foo.txt foo.html bar.txt>.grep(glob('*.txt')).say;

Extending the Grammar

Now that we have a basic working grammar, let’s say we want to expand it to support a variant based on ANSI/ISO SQL’s LIKE operator. It is very much like a glob, but uses percent (“%”) in place of asterisk (“*”) and underscore (“_”) in place of question mark (“?”). We could implement this as a brand new grammar, but let’s make our new grammar a sub-class instead, so we can avoid duplicating work.

grammar SQLGlob is Glob {
    token match-any { '%' }
    token match-one { '_' }
}

<foo.txt foo.html bar.txt>.grep(
    glob('%.txt', :globlang(SQLGlob))
).say;

That was too easy.

Expanding the Grammar

From here, we might also try to expand the grammar to handle character classes or curly-expansions:

[abc]*.{txt,html}

In BSD-like globs, the above will match any file starting with “a” or “b” or “c” that also ends with “.txt” or “.html”. In which case, we might consider using a proto to make it easy to expand the kinds of matches we allow. A full demonstration of this is beyond the scope of this post, but it could look something like this:

grammar Glob {
    ...

    token term {
        || <expansion>
        || <match>
        || <char>
    }
    proto token expansion { * }
    proto token match { * }

    ...
}

grammar BasicGlob is Glob {
    token match:any { '*' }
    token match:one { '?' }
}

grammar BSDGlob is BasicGlob {
    token expansion:list { '{' <term>+ % ',' '}' }
    token match:char-class { '[' <-[ \] ]>+ ']' }
}

grammar SQLGlob is Glob {
    token match:any { '%' }
    token match:one { '_' }
}

By using a proto token or proto rule we gain a nice way of handling alternative matches within grammars. This allows us to create a variety of grammar variants without repeating ourselves. It also gives developers using your grammars the flexibility to subclass and modify them to suit their custom needs without monkey patching.
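
As a sketch of what that downstream flexibility could look like, a user might add a new kind of match simply by subclassing; the escaped-character syntax below is invented for illustration, and a corresponding action method would still be needed:

grammar MyGlob is BSDGlob {
    token match:escaped { '\\' <char> }   # a backslash followed by any character, treated as a literal
}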

A working example of something like this can be found in the source for IO::Glob. (Though much of that was written long enough ago that I’m not sure it is the best possible example of what to do, but it is an example.)

Grammars are a really nifty way of grouping a set of regexes together to build a complete parser as a unit. They are easy to read, provide tools that allow you to avoid repeating yourself, and I find them to be an enjoyable tool to play with and use in Perl 6.

Cheers.

Day 14 – Targetting MoarVM, the Wrong Way

MoarVM is a virtual machine specifically designed to be a backend for the NQP compiler toolchain in general and the Rakudo Perl 6 compiler in particular.

It is not restricted to running Perl 6, though, and if anyone wants to implement their own language on top of it, Jonathan has been kind enough to provide free course material that walks you through the process. In particular, the code examples for PHPish and Rubyish are worth a quick look to see how things are supposed to work.

However, where there’s a Right Way of doing things, there’s also a Wrong Way, and that’s what we’re gonna look at today!

Generating Bytecode

MoarVM bytecode is generated from MAST trees, defined in lib/MAST/Nodes.nqp of your MoarVM checkout. The file states:

# This file contains a set of nodes that are compiled into MoarVM
# bytecode. These nodes constitute the official high-level interface
# to the VM. At some point, the bytecode itself will be declared
# official also. Note that no text-based mapping to/from these nodes
# will ever be official, however.

This has historical reasons: Parrot, the VM that Rakudo used to target, had an unhealthy overreliance on its textual intermediate representation PIR. Personally, I think it is a good idea to have some semi-official text-based bytecode representation – you just shouldn’t use it as the exchange format between compilation stages.

That’s where doing things the Wrong Way comes in: During the last two weeks, I’ve started writing an assembler targetting MAST and a compiler for a tiny low-level language targetting this assembly dialect, doing exactly what I just told you not to do.

Why did I? What I hope to accomplish eventually is providing a bootstrapped alternative to the NQP toolchain, and you have to start your bootstrapping process somewhere.

Currently, only a few bits and pieces have been implemented, but these bits and pieces are somewhat functional and you can do such useful things as echo input from stdin to stdout:

$ cat t/echo.tiny
fn main() {
    obj stdin = getstdin
    do {
        str line = readline stdin
        int len = chars line
        done unless len
        print line
        redo
    }
    exit 0
}

You can either run the code directly

$ ./moartl0 --run t/echo.tiny

compile it first

$ ./moartl0 --compile t/echo.tiny

$ moar t/echo.moarvm

or take a look at the generated assembly

$ ./moartl0 --dump t/echo.tiny
.hll tiny
.frame main
.label bra0_main
    .var obj v0_stdin
    getstdin $v0_stdin
.label bra1_do
    .var str v1_line
    readline_fh $v1_line $v0_stdin
    .var int v2_len
    chars $v2_len $v1_line
    unless_i $v2_len @ket1_do
    print $v1_line
    goto @bra1_do
.label ket1_do
    .var int i0
    const_i64 $i0 0
    exit $i0
.label ket0_main
# ok

There isn’t really anything fancy going on here: Text goes in, text goes out, we can explain that.

Note that the assembly language is not yet finalized, but so far I’ve opted for a minimalistic syntax that has VM instructions separated from their operands by whitespace, accompanied by assembler directives prefixed with a dot.

Under the Hood

If you were to look at the source code of the compiler (as we probably should – this is supposed to be the Perl 6 advent calendar, after all), you might discover some useful idioms like using a proto declaration

proto MAIN(|) {
    CATCH {
        ... # handle errors
        exit 1;
    }

    ... # preprocess @*ARGS
    {*}
}

to accompany our multi MAIN subs that define the command line interface.
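
Given the command line examples earlier (--run, --compile, --dump), the multi candidates that the proto dispatches to might look roughly like this; the exact signatures are guessed here, not taken from the actual source:

multi MAIN(Str $file, Bool :$run!)     { ... }
multi MAIN(Str $file, Bool :$compile!) { ... }
multi MAIN(Str $file, Bool :$dump!)    { ... }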

However, you would also come across things that might not necessarily be considered best practice.

For one, the compiler is not reentrant: In general, we’re supposed to pass state along the call chain either in the form of arguments (the implicit self parameter of methods is a special case of that) or possibly as dynamic variables. When writing compilers specifically, the latter tend to be useful for implementing recursive declarations like nested lexical scopes: a lexical frame of the target language will correspond to a dynamic frame of the parser. If you don’t care about reentrancy, though, you can just go with global variables and use the temp prefix to achieve the same result.
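
Here is a minimal sketch of that trick, with invented names: temp saves the current value of a variable and restores it when the enclosing block is left, giving you dynamically scoped state without threading arguments through every call.

my $frame-depth = 0;            # global parser state

sub parse-frame() {
    temp $frame-depth;          # previous value is restored when the sub returns
    $frame-depth += 1;
    say "parsing a frame at depth $frame-depth";
    # ... recurse into nested frames here ...
}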

For another, the compiler also doesn’t use grammars, but instead, the body of the line-based parsing loop is a single regex, essentially

# next-line keeps track of line numbering and trims the string
while ($_ := next-line) !=:= IterationEnd { /^[
    | ['#'|$]                       # ignore comments and empty lines
    | (:s ld (\w+)'()' '{'${ ... }) # start of load frame definition
    | (:s fn (\w+)${ ... })         # forward declaration of a function
    | ...                           # more statements
    || {bailout}
]/ }

The blocks { ... } represent the actions that have been embedded into the regex after $ anchors terminating each line.

That’s not really a style of programming I’d be comfortable advocating for in general – but Perl being Perl, There’s More Than One Way to Do It: For better or worse, Perl 6 gives programmers a lot of freedom to structure code how they see fit. As the stage 0 compiler is supposed to be supplanted anyway, I decided to have some fun instead of crafting a proper architecture.

In comparison, the assembler implemented in NQP is far more vanilla, with state held by an actions object.

But… Why?

The grinches among you may ask, What is this even doing here? Is this just someone’s personal side project that just happens to be written in Perl 6, of no greater use to the community at large?

Well, potentially, but not necessarily:

First, I do plan on writing a disassembler for MoarVM bytecode, and that may come in handy for bug hunting, testing or when looking for optimization opportunities.

Second, when running on MoarVM, Perl 6 code may load and interact with compilation units written in our tiny language or even hand-optimized VM assembly. The benefit over something like NativeCall is that we never leave the VM sandbox, and in contrast to foreign code that has to be treated as black boxes, the JIT compiler will be able to do its thing.

Third, an expansion of the MoarVM ecosystem might attract the attention of language enthusiasts beyond the Perl community, with at least the chance that MoarVM could deliver on what Parrot promised.

However, for now that’s all just idle speculation – remember, all there is right now is a two-week-old toy I came up with when looking for something to write about for this advent calendar. It’s a very real possibility that this project will die a quiet death before amounting to anything. But on the off chance it does not, it’s nice to have a hot cup of the preferred beverage of your choice and dream about a future where MoarVM rises as a butterfly-winged phoenix from the ashes of a dead parrot….

Day 13 – Audio Streaming done Completely Wrong

Starting out

I made the audio streaming source client Audio::Libshout about a year and a half ago, and it works quite well: with the speed improvements in Rakudo I have been able to stream 320kb/s MP3 without a problem. But it always annoyed me that it was difficult to test properly, and that even testing it at all required an Icecast server. Even with an Icecast server that I could stream to, it would be necessary to actually listen to the stream in order to determine whether it was functioning correctly.

This all came to a head earlier in the year when I discovered that even the somewhat rudimentary tests I had been using for the asynchronous streaming support had failed to detect that the stream wasn’t being fed at all. What I really needed was a sort of dumb streaming server that could act in place of the real Icecast and could be instrumented to determine whether a connection was being made and whether the correct audio data was being received. How hard could it be? After all, it was just a special kind of web server.

I should have known from experience that this was a slippery slope, but hey.

A little aside about Audio Streaming

An Icecast streaming server is basically an HTTP server that feeds a continuous stream of encoded audio data to the listening clients, who connect with an HTTP GET request. The data to be streamed is typically provided by a source client, which will connect to the server, probably authenticate using HTTP Authentication, and start sending the data at a steady rate proportional to the bitrate of the stream. libshout connects with a custom request method of SOURCE, which is inherited from its earlier shoutcast origins, though Icecast itself understands PUT as well for the source client. Because it is the responsibility of the listening client to supply the decoded audio data to the soundcard at exactly the right rate, and the encoded data contains the bitrate of the stream as transmitted from the source, the timing demands on the server are not too rigorous: it just has to be consistent and fast enough that a buffer on the client can be kept sufficiently full to supply the audio data to the sound card. Icecast does a little more in detail to, for instance, adjust for clients that don’t seem to be reading fast enough and so forth, but in principle it’s all relatively simple.

Putting it together

As might be obvious by now, an audio streaming server differs from a typical HTTP server in that, rather than serving some content from disk or generated by the program itself, it needs to share data received on one client connection with one or more other client connections. In the simplest C implementation one might have a set of shared buffers, one of which is being populated from the source connection at any given time, whilst the others are being consumed by the client connections, alternating between filling and depletion. Whether the implementation settles on a non-blocking or threaded design, possibly the most critical part of the code will be the synchronisation between the source writer and the client readers, ensuring that a buffer is not being read from and written to at the same time.

In Perl 6 of course you’d hope you didn’t need to worry about these kind of annoying details as there are well thought out concurrency features that abstract away most of the nasty details.

From one to many

Perhaps the simplest program that illustrates how easy this might be would be this standalone version of the old Unix service chargen:

use v6.c;

my $supplier = Supplier.new;

start {
    loop {
        for ( 33 ... 126 ).map( { Buf.new($_) }) -> $c {
            sleep 0.05;
            $supplier.emit($c);
        }
    }
}

my $sig = signal(SIGPIPE).tap( -> $v {
    $sig.close;
});

react {
    whenever IO::Socket::Async.listen('localhost', 3333) -> $conn {
        my $p = Promise.new;
        CATCH {
            when /'broken pipe'/ {
                if $p.status !~~ Kept {
                    $p.keep: "done";
                }
            }
        }

        my $write = $supplier.Supply.tap( -> $v {
            if $p.status ~~ Planned {
                $conn.write: $v;
            }
            else {
                $write.close;
            }
        });
    }
}

In this the Supplier is being fed asynchronously to stand in for the possible source client in our streaming server, and each client that connects on port 3333 will receive the same stream of characters – if you connect with two clients (telnet or netcat for instance,) you will see they are getting the same data at roughly the same time.

The Supplier provides a shared sequence of data of which there can be many consumers, so each connection provided by the IO::Socket::Async will be fed the data emitted to the Supplier starting at the point the client connected.
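
The effect is easy to see in isolation; a tap only receives values emitted after it was set up:

my $supplier = Supplier.new;
my $supply   = $supplier.Supply;

$supply.tap(-> $v { say "first tap:  $v" });
$supplier.emit(1);                             # seen only by the first tap
$supply.tap(-> $v { say "second tap: $v" });
$supplier.emit(2);                             # seen by both taps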

The CATCH here is to deal with the client disconnecting: the first our code will know about this is when we try to write to the connection, since we’re not expecting any input from the client that we could check, and besides, the characters becoming available to write may happen sooner than an attempted read would register the close. So, while it may seem like a bit of a hack, protecting further writes with a Promise is the most reliable and simple way of doing this. If, by way of experiment, you were to omit the CATCH, you would find that the server would quit without warning the first time the first client disconnected.

I’ll gloss over the signal outside the react as that only seems necessary in the case where we didn’t get any input data on the first connection.

Making something nearly useful

The above example is almost all we need to make something that you might be able to use, all we need is for it to handle an HTTP connection from the clients and get a source of actual MP3 data into the Supply and we’re good. To handle the HTTP parts we’ll just use the handy HTTP::Server::Tiny which will conveniently take a Supply that will feed the output data, so in fact we end up with quite a tiny program:

use HTTP::Server::Tiny;

sub MAIN(Str $file) {

    my $supplier = Supplier.new;

    my $handle = $file.IO.open(:bin);

    my $control-promise = start {
        while !$handle.eof {
            my $c = $handle.read(1024);
            $supplier.emit($c);
        }
    }

    sub stream(%env) {
        return 200, [ Content-Type => 'audio/mpeg', Pragma => 'no-cache', icy-name => 'My First Streaming Server'], $supplier.Supply;
    }
    HTTP::Server::Tiny.new(port => 3333).run(&stream);
}

The HTTP::Server::Tiny will run the stream subroutine for every request, and all we need to do is return the status code, some headers, and a supply from which the output data will be read; the client connection will be closed when the Supply is done (that is, when the done method is called on the source Supplier). It couldn’t really be much simpler.

Just start the program with the path to a file containing MP3 audio and then point your favourite streaming client at port 3333 on your localhost, and you should get the stream. I say should, as it makes no attempt to regulate the rate at which the audio data is fed to the client, but for constant bit-rate MP3 data and a client that will buffer as much as it can get, it works.

Of course a real streaming server would read the data frame by frame and adjust the rate according to the bit-rate of each frame. I actually have made (but not yet released) Audio::Format::MP3::Frame to help do this, but it would be overkill for this example.
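
As a very crude sketch of what rate regulation could look like (this is not what Audio::Format::MP3::Frame does, and it assumes a known, constant bit-rate), the reading loop from the earlier example could simply sleep for the duration of audio that each chunk represents:

my $bitrate  = 128_000;                    # bits per second, assumed constant
my $chunk    = 4096;                       # bytes per read
my $interval = $chunk * 8 / $bitrate;      # seconds of audio per chunk

my $control-promise = start {
    while !$handle.eof {
        $supplier.emit($handle.read($chunk));
        sleep $interval;                   # pace the feed to roughly real time
    }
}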

Relaying a stream

Of course the original intent of this streaming server was to be able to test a streaming source client, so we are going to have to add another part that will recognise a source connection, read from that and relay it to the normal clients in a similar way to the above.

You’ll recall that the libshout client library will connect with a request method of SOURCE so we can adjust the file streaming example to identify the source connect and feed the Supplier with the data read from the connection:

use HTTP::Server::Tiny;

sub MAIN() {

    my $supplier = Supplier.new;

    my Bool $got-source = False;

    sub stream(%env) {
        if %env<REQUEST_METHOD> eq 'GET' {
            if $got-source {
                return 200, [ Content-Type => 'audio/mpeg', Pragma => 'no-cache', icy-name => 'My First Streaming Server'], $supplier.Supply;
            }
            else {
                return 404, [ Content-Type => 'text/plain'], "no stream connected";
            }
        }
        elsif %env<REQUEST_METHOD> eq 'SOURCE' {
            my $connection = %env<p6sgix.io>;

            my $finish-promise = Promise.new;

            $connection.Supply(:bin).tap(-> $v {
                $supplier.emit($v);
            }, done => -> { $finish-promise.keep: "done" });

            $got-source = True;
            return 200, [ Content-Type => 'audio/mpeg' ], supply { whenever $finish-promise { done; } };
        }
    }
    HTTP::Server::Tiny.new(port => 3333).run(&stream);
}

You’ll see immediately that nearly all of the action is happening in the request handler subroutine: the GET branch is almost unchanged from the previous example (except that it will bail with a 404 if the source isn’t connected), and the SOURCE branch replaces the reading of the file. HTTP::Server::Tiny makes reading the streamed data from the source client really easy, as it provides the connected IO::Socket::Async to the handler in p6sgix.io (which I understand was originally primarily to support the WebSocket module); the Supply of that socket is tapped to feed the shared Supplier that conveys the audio data to the clients. All else that is necessary is to return a Supply that is not intended to provide any data, but just to close when the client closes their connection.

Now all you have to do is run the script and feed it with some source client like the following:

use Audio::Libshout;

multi sub MAIN(Str $file, Int $port = 3333, Str $password = 'hackme', Str $mount = '/foo') {

    my $shout = Audio::Libshout.new(:$port, :$password, :$mount, format => Audio::Libshout::Format::MP3);
    $shout.open;
    my $fh = $file.IO.open(:bin);

    while not $fh.eof {
        my $buf = $fh.read(4096);
        say $buf.elems;
        $shout.send($buf);
        $shout.sync;
    }

    $fh.close;
    $shout.close;
}

(Providing the path to some MP3 file again,) then if you connect a streaming audio player you will be getting some audio again.

You might notice that the script can take a password and a mount which aren’t used in this case, this is because Audio::Libshout requires them and also because this is basically the script that I have been using to test streaming servers for the last year or so.

Surprisingly this tiny streaming server works quite well: in testing I found that it ran out of threads before it got too bogged down handling the streams with multiple clients, showing how relatively efficient and well thought out the Perl 6 asynchronous model is. And how simple it is to put together a program that would probably require a lot more code in many other languages.

Where do we go from here

I’m pretty sure that you wouldn’t want to use this code to serve up a popular radio station, but it would definitely be sufficient for my original testing purposes with a little additional instrumentation.

Of course I couldn’t just stop there, so I worked up the as yet unreleased Audio::StreamThing which uses the same basic design with shared supplies, but works more like Icecast in having multiple mounts for individual streams, provision for authentication and better exception handling.

If you’d find it useful I might even release it.

Postscript

I’d just like to get a mention in for the fabulous DJ Mike Stern as I always use his recorded
sets for testing this kind of stuff for some reason.

Have fun and make more noise.

Day 12 – Avoiding User Namespace Pollution with Perl 6 Modules

A bit of trickery is needed

As Perl 5 module authors, we were able to allow selective exporting easily by use of the @EXPORT_OK array where we listed all the objects that the user of our module could import by name, thereby giving the user full control over possible namespace clashes. However, in spite of all the new goodies provided by Perl 6, that feature is not yet available to us Perl 6 module authors. (For more information see issue #127305 on https://rt.perl.org.)

But the good news is there is a way to get the same effect, albeit with a bit more effort. The trick is in using the full power of tags with the export trait available with every object in a module.

One might get a little glassy-eyed when reading the Perl 6 docs concerning module import and export, of which a portion of code is shown here:

unit module MyModule;
my package EXPORT::DEFAULT {
    our sub foo { ... }
}
my package EXPORT::other {
    our sub bar { ... }
}

where each object to be exported is enclosed in a named export category block. That method has serious limitations (in addition to requiring another set of child blocks):

  • the objects cannot appear in more than one grouping and
  • the objects cannot be assigned other tags via the is export adjective on individual objects.

Consequently, to emulate Perl 5’s @EXPORT_OK array, each object would have to be wrapped in a unique block and could not have multiple tags. But then the docs say that all that ugly package exporting code can be replaced by this much simpler code for each item to be exported:

sub foo is export { ... }
sub bar is export(:other) { ... }

And therein lies our solution: We will use export tags to (1) allow the user to import only what he or she desires to import or (2) import everything. The module author can also be helpful by adding other tags which will allow importing sets of objects (but such sets are not always so easily determined in practice).

Note one restriction before we continue: only valid identifier names are allowed in the tags, so we can’t use sigils or other special characters as we could in Perl 5.

For an example we use a simple, but unusual, Perl 6 module with multiple objects to be exported (and all have the same name, probably not a good practice, but handy for demonstration of the disambiguation problem):

unit module Foo;
sub bar() is export { say "bar" }
constant $bar is export = 99;
constant %bar is export = (a => 1);
constant @bar is export = <a b>;

We can then have access to all of those objects by merely using the module in a program file:

use Foo;
bar;
say $bar;
say %bar;
say @bar;

Executing it yields:

bar
99
{a => 1}
[a b]

as expected. Now let’s modify the module by adding an export tag to the bar routine:

sub bar() is export(:bar) { say "bar" }

and then, when our program is executed, we get:

===SORRY!=== Error while compiling...
Undeclared routine:
    bar used at line 7. Did you mean 'VAR', 'bag'?

so we modify our use line in the program to:

use Foo :bar;

and execute the program again to get:

===SORRY!=== Error while compiling...
Variable '$bar' is not declared. Did you mean '&bar'?

It seems that as soon as we, the users, restrict the use statement with a tag, the non-tagged objects are not available! Now we, the authors, have two choices:

  • tag all objects with the same tag or
  • tag them with separate tags.

If we tag them the same, then all will be available—which is probably not a problem. However, that would defeat our goal of having unique tags.

If we name them with unique tags, we will need some way to distinguish them (remember, we can’t use their sigils), which leads us to a possible convention we show here in the final module:

unit module Foo;
sub bar() is export(:bar :SUB) { say "bar" }
constant $bar is export(:bar-s :CONST) = 99;
constant %bar is export(:bar-h :CONST) = (a => 1);
constant @bar is export(:bar-a :CONST) = <a b>;

Notice the suffixes on the non-subroutine objects (all constants) have their kind indicated by the ‘-X’ where the ‘X’ is ‘s’ for scalar, ‘h’ for hash, and ‘a’ for array. We have also added an additional tag which groups the objects into the general categories of subroutines or constants so the user could import, say, just the constants as a set, e.g., use Foo :CONST;.
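
With those grouping tags in place, a user who only wants the constants can import just that set:

use Foo :CONST;
say $bar;    # 99
say %bar;    # {a => 1}
say @bar;    # [a b]
# &bar was tagged :SUB, so calling bar here would be a compile-time error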

Now we just change the use line to

use Foo :bar, :bar-s, :bar-h, :bar-a;

and get

bar
99
{a => 1}
[a b]

again.

Thus the good news is we can add a tag to the export trait that is the same as the name of the subroutine, but the sad news is we can’t use the appropriate sigil as in Perl 5 to disambiguate among objects with the same name. The solution to that is to use some convention as demonstrated above (akin to the ugly Hungarian notation in C) that will have the same effect.

Of course in this somewhat-contrived example, the user could have imported all the objects at once by using the special, automatically-defined tag :ALL:

use Foo :ALL;

but the author has given users of the module complete flexibility in how they apply it.

Conclusion

We now have a way to protect the user’s namespace by requiring him or her to selectively import objects by “name” (or perhaps other subsets of objects) unless the user chooses to import everything. The only downsides I see are:

  • the extra effort required by the module author to explicitly tag all exportable objects and
  • the restriction on selecting an appropriate tag for different objects with the same name.

I love Perl 6, and the old Perl motto TIMTOWTDI still holds true as we’ve just seen!

Merry Christmas! // Happy Holidays! I hope you will enjoy the gift of Perl 6 throughout the coming year!


Note: Thanks to Moritz Lenz (IRC #perl6: user moritz) for constructive comments about the bad practice of exporting variables—hence the example now exports constants instead.