Day 24 – Make It Snow

Hello again, fellow sixers! Today I’d like to take the opportunity to highlight a little module of mine that has grown up in some significant ways this year. It’s called Terminal::Print and I’m suspecting you might already have a hint of what it can do just from the name. I’ve learned a lot from writing this module and I hope to share a few of my takeaways.

Concurrency is hard

Earlier in the year I decided to finally try to tackle multi-threading in Terminal::Print and… succeeded more or less, but rather miserably. I wrapped the access to the underlying grid (a two-dimensional array of Cell objects) in a react block and had change-cell and print-cell emit their respective actions on a Supply. The react block then handled these actions. Rather slowly, unfortunately.

Yet, there was hope. After jnthn++ fixed a constructor bug in OO::Monitors I was able to remove all the crufty hand-crafted handling code and instead ensure that method calls to the Terminal::Print::Grid object would only run in a single thread at any given time. (This is the class which holds the two-dimensional array mentioned before and was likewise the site of my react block experiment).

Here below are the necessary changes:

- unit class Terminal::Print::Grid;
+ use OO::Monitors;
+ unit monitor Terminal::Print::Grid;

This shift not only improved the readability and robustness of the code, it was significantly faster! Win! To me this is really an amazing dynamic of Perl 6. jnthn’s brilliant, twisted mind can write a meta-programming module that makes it dead simple for me to add concurrency guarantees at a specific layer of my library. My library in turn makes it dead simple to print from multiple threads at once on the screen! It’s whipuptitude enhancers all the the way down!

That said, our example today will not be writing from multiple threads. For some example code that utilizes async, I point you to examples/async.p6 and examples/matrix-ish.p6.

Widget Hero

Terminal::Print is really my first open source library in the sense that it is the first time that I have started my own module from scratch with the specific goal of filling a gap in a given language’s ecosystem. It is also the first time that I am not the sole contributor! I would be remiss not to mention japhb++ in this post, who has contributed a great deal in a relatively short amount of time.

In addition to all the performance related work and the introduction of a span-oriented printing mechanism, japhb’s work on widgets especially deserves its own post! For now let’s just say that it has been a real pleasure to see the codebase grow and extend even as I have been too distracted to do much good. My takeaway here is a new personal milestone in my participation in libre/open software (my first core contributor!) that reinforces all the positive dynamics it can have on a code base.

Oh, and I’ll just leave this here as a teaser of what the widget work has in store for us:

rpg-ui-p6

You can check it out in real-time and read the code at examples/rpg-ui.p6.

Snow on the Golf Course

Now you are probably wondering, where is the darn, snow! Well, here we go! The full code with syntax highlighting is available in examples/snowfall.p6. I will be going through it step by step below.

use Terminal::Print;

class Coord {
    has Int $.x is rw where * <= T.columns = 0;
    has Int $.y is rw where * <= T.rows = 0 ;
}

Here we import Terminal::Print. The library takes the position that when you import it somewhere, you are planning to print to the screen. To this end we export an instantiated Terminal::Print object into the importer’s lexical scope as T. This allows me to immediately start clarifying the x and y boundaries of our coordinate system based on run-time values derived from the current terminal window.

class Snowflake {
    has $.flake = ('❆','❅','❄').roll;
    has $.pos = Coord.new;
}

sub create-flake {
    state @cols = ^T.columns .pick(*); # shuffled
    if +@cols > 0 {
        my $rand-x = @cols.pop;
        my $start-pos = Coord.new: x => $rand-x;
        return Snowflake.new: pos => $start-pos;
    } else {
        @cols = ^T.columns .pick(*);
        return create-flake;
    }
}

Here we create an extremely simple Snowflake class. What is nice here is that we can leverage the default value of the $.flake attribute to always be random at construction time.

Then in create-flake we are composing a way to make sure we have hit every x coordinate as a starting point for the snowfall. Whenever create-flake gets called, we pop a random x coordinate out of the @cols state variable. The state variable enables this cool approach because we can manually fill @cols with a new randomized set of our available x coordinates once it is depleted.

draw( -> $promise {

start {
    my @flakes = create-flake() xx T.columns;
    my @falling;
    
    Promise.at(now + 33).then: { $promise.keep };
    loop {
        # how fast is the snowfall?
        sleep 0.1; 
    
        if (+@flakes) {
            # how heavy is the snowfall?
            my $limit = @flakes > 2 ?? 2            
                                    !! +@flakes;
            # can include 0, but then *cannot* exclude $limit!
            @falling.push: |(@flakes.pop xx (0..$limit).roll);  
        } else {
            @flakes = create-flake() xx T.columns;
        }
    
        for @falling.kv -> $idx, $flake {
            with $flake.pos.y -> $y {
                if $y > 0 {
                    T.print-cell: $flake.pos.x, ($flake.pos.y - 1), ' ';
                }

                if $y < T.rows {
                    T.print-cell: $flake.pos.x, $flake.pos.y, $flake.flake;            
                }

                try {
                    $flake.pos.y += 1;
                    CATCH {
                        # the flake has fallen all the way
                        # remove it and carry on!
                        @falling.splice($idx,1,());
                        .resume;
                    }
                }
            }
        }
    }
}

});

Let’s unpack that a bit, shall we?

So the first thing to explain is draw. This is a handy helper routine that is also imported into the current lexical scope. It takes as its single argument a block which accepts a Promise. The block should include a start block so that keeping the argument promise works as expected. The implementation of draw is shockingly simple.

So draw is really just short-hand for making sure the screen is set up and torn down properly. It leverages promises as (I’m told) a “conv-var” which according to the Promises spec might be an abuse of promises. I’m not very futzed about it, to be honest, since it suits my needs quite well.

This approach also makes it quite easy to create a “time limit” for the snowfall by scheduling a promise to be kept at now + 33 — thirty three seconds from when the loop starts. then we keep the promise and draw shuts down the screen for us. This makes “escape” logic for your screensavers quite easy to implement (note that SIGINT also restores your screen properly. The most basic exit strategy works as expected, too :) ).

The rest is pretty straightforward, though I’d point to the try block as a slightly clever (but not dangerously so) combination of where constraints on Coord‘s attributes and Perl 6’s resumable exceptions.

Make it snow!

And so, sixers, I bid you farewell for today with a little unconditional love from ab5tract’s curious little corner of the universe. Cheers!

snowfall-p6

Day 24 – One Year On

This time of year invites one to look back on things that have been, things that are and things that will be.

Have Been

I was reminded of things that have been when I got my new notebook a few weeks ago. Looking for a good first sticker to put on it, I came across an old ActiveState sticker:

If you don’t know Perl
you don’t know Dick

A sticker from 2000! It struck me that that sticker was as old as Perl 6. Only very few people now remember that a guy called Dick Hardt was actually the CEO of ActiveState at the time. So even though the pun may be lost on most due to the mists of time, the premise still rings true to me: that Perl is more about a state of mind, then about versions. There will always be another version of Perl. Those who don’t know Perl are doomed to re-implement it, poorly. Which, to me, is why so many ideas were borrowed from Perl. And still are!

Are

Where are we now? Is it the moment we know, We know, we know? I don’t think we are at twenty thousand people using Perl 6 just yet. But we’re keeping our fingers crossed. Just in case.

We are now 12 compiler releases after the initial Christmas release of Perl 6. In this year, many, many areas of Rakudo Perl 6 and MoarVM have dramatically improved in speed and stability. Our canary-in-the-coalmine test has dropped from around 14 seconds a year ago to around 5.5 seconds today. A complete spectest run is now about 3 minutes, whereas it was about 4.5 minutes at the beginning of the year, while about 4000 tests were added (from about 50K to 54K). And we now have 757 modules in the Perl 6 ecosystem (aka temporary CPAN for Perl 6 modules), with a few more added every week.

The #perl6 IRC channel has been too busy for me to follow consistently. But if you have a question related to Perl 6 and you want a quick answer, the #perl6 channel is the place to be. You don’t even have to install an IRC client: you can also use a browser to chat, or just follow “live” what is being said.

There are also quite a few useful bots on that channel: they e.g. take care of running a piece of Perl 6 code for you. Or find out at which commit the behaviour of a specific piece of code changed. These are very helpful for the developers of Perl 6, who usually also hang out on the #perl6-dev IRC channel. Which could be you! The past year, at least one contributor was added to the CREDITS every month!

Will Be

The coming year will see at least three Perl 6 books being published. First one will be Think Perl 6 – How To Think Like A Computer Scientist by Laurent Rosenfeld. It is an introduction to programming using Perl 6. But even for those of you with programming experience, it will be a good book to start learning Perl 6. And I can know. Because I’ve already read it :-)

Second one will be Learning Perl 6 by veteran Perl developer and writer brian d foy. It will have the advantage of being written by a seasoned writer going through the newbie experience that most people will have when coming from Perl 5.

The third one will be Perl 6 By Example by Moritz Lenz, which will, as the title already gives away, introduce Perl 6 topics by example.

There’ll be at least two (larger) Perl Conferences apart from many smaller Perl workshops: the The Perl Conference NA on 18-23 June, and the The Perl Conference in Amsterdam on 9-11 August. Where you will meet all sorts of nice people!

And for the rest? Expect a faster, leaner, Perl 6 and MoarVM compiler release on the 3rd Saturday every month. And an update of weekly events in the Perl 6 Weekly on every Monday evening/Tuesday morning (depending on where you live).

Day 23 – Everything is either wrong or less than awesome

Have you ever spent your precious time on submitting a bug report for some project, only to get a response that you’re an idiot and you should f⊄∞÷ off?

Right! Well, perhaps consider spending your time on Perl 6 to see that not every free/open-source project is like this.

In the Perl 6 community, there is a very interesting attitude towards bug reports. Is it something that was defined explicitly early on? Or did it just grow organically? This remains to be a Christmas mystery. But the thing is, if it wasn’t for that, I wouldn’t be willing to submit all the bugs that I submitted over the last year (more than 100). You made me like this.

Every time someone submits a bug report, Perl 6 hackers always try to see if there is something that can done better. Yes, sometimes the bug report is invalid. But even if it is, is there any way to improve the situation? Perhaps a warning could be thrown? Well, if so, then we treat the behavior as LTA (Less Than Awesome), and therefore the bug report is actually valid! We just have to tweak it a little bit, meaning that the ticket will now be repurposed to improve or add the error message, not change the behavior of Perl 6.

The concept of LTA behavior is probably one of the key things that keeps us from rejecting features that may seem to do too little good for the amount of effort required to implement them, but in the end become game changers. Another closely related concept that comes to mind is “Torment the implementors on behalf of the users”.

OK, but what if this behavior is well-defined and is actually valid? In this case, it is still probably our fault. Why did the user get into this situation? Maybe the documentation is not good enough? Very often that is the issue, and we acknowledge that. So in a case of a problem with the documentation, we will usually ask you to submit a bug report for the documentation, but very often we will do it ourselves.

Alright, but what if the documentation for this particular case is in place? Well, maybe the thing is not easily searchable? That could be the reason why the user didn’t find it in the first place. Or maybe we lack some links? Maybe the places that should link to this important bit of information are not doing so? In other words, perhaps there are still ways to improve the docs!

But if not, then yes, we will have to write some tests for this particular case (if there are no tests yet) and reject the ticket. This happens sometimes.

The last bit, even if obvious to some, is still worth mentioning. We do not mark tickets resolved without tests. One reason is that we want roast (which is a Perl 6 spec) to be as full as possible. The other reason is that we don’t want regressions to happen (thanks captain obvious!). As the first version of Perl 6 was released one year ago, we are no longer making any changes that would affect the behavior of your code. However, occasional regressions do happen, but we have found an easy way to deal with those!

If you are not on #perl6 channel very often, you might not know that we have a couple of interesting bots. One of them is bisectable. In short, Bisectable performs a more user-friendly version of git bisect, but instead of building Rakudo on each commit, it has done it before you even asked it to! That is, it has over 5500 rakudo builds, one for every commit done in the last year and a half. This turns the time to run git bisect from minutes to about 10 seconds (Yes, 10 seconds is less than awesome! We are working on speeding it up!). And there are other bots that help us inspect the progress. The most recent one is Statisfiable, here is one of the graphs it can produce.

So if you pop up on #perl6 with a problem that seems to be a regression, we will be able to find the cause in seconds. Fixing the issue will usually take a bit more than that though, but when the problem is significant, it will usually happen in a day or two. Sorry for breaking your code in attempts to make it faster, we will do better next time!

But as you are reading this, perhaps you may be interested in seeing some bug reports? I thought that I’d go through the list of bugs of the last year to show how horribly broken things were, just to motivate the reader to go hunting for bugs. The bad news (oops, good news I mean), it seems that the number of “horrible” bugs is decreasing a bit too fast. Thanks to many Rakudo hackers, things are getting more stable at a very rapid pace.

Anyway, there are still some interesting things I was able to dig up:

  • RT #128804 – this is one of the examples where we attempt to print something better than “syntax error”, but have a problem in the error message itself. This was fixed, and now the error message says Cannot convert string to number: malformed base-35 number in 'li⏏zmat' (indicated by ⏏). Can you spot why this error message is Less Than Awesome?

  • RT #128421 – sometimes we are just wrong for no good reason. Makes you wonder how many other bugs like this are hiding somewhere. Can you find one?

That being said, my favorite bug of all times is RT #127473. Three symbols in the source code causing it to go into an infinite loop printing stuff about QAST nodes. That’s a rather unique issue, don’t you think?

I hope this post gave you a little insight on how we approach bugs, especially if you are not hanging around on #perl6 very often. Is our approach less than awesome? Do you have some ideas for other bots that could help us work with bugs? Leave it in the comments, we would like to know!

Day 22 – Generative Testing

OK! So say you finished writing your code and it’s looking good. Let’s say it’s this incredible sum function:

module Sum {
   sub sum($a, $bis export {
      $a + $b
   }
}

Great, isn’t it?! Let’s use it:

use Sum;
say sum 2, 3; # 5

That worked! We summed the number 2 with the number 3 as you saw. If you carefully read the function you’ll see the variables $a and $b haven’t a type set. If you don’t type a variable it’s, by default, of type Any. 2 and 3 are Ints… Ints are Any. So that’s OK! But do you know what’s Any too? Str (just a example)!

Let’s try using strings?

use Sum;
say sum "bla", "ble";

We got a big error:

Cannot convert string to number: base-10 number must begin with valid digits or '.' in 'bla' (indicated by ⏏)
  in sub sum at sum.p6 line 1
  in block  at sum.p6 line 7

Actually thrown at:
  in sub sum at sum.p6 line 1
  in block  at sum.p6 line 7

Looks like it does not accept Strs… It seems like Any may not be the best type to use in this case.

Worrying about every possible input type for all our functions can prove to demand way too much work, as well as still being prone to human error. Thankfully there’s a module to help us with that! Test::Fuzz is a perl6 module that implements the “technique” of generative testing/fuzz testing.

Generative testing or Fuzz Testing is a technique of generating random/extreme data and using this data to call the function being tested.

Test::Fuzz gets the signature of your functions and decides what generators it should use to test it. After that it runs your functions giving it (100, by default) different arguments and testing if it will break.

To test our function, all that’s required is:

module Sum {
   use Test::Fuzz;
   sub sum($a, $bis export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}

And run:

perl6 Sum.pm6 --fuzz

This case will still show a lot of errors:

Use of uninitialized value of type Thread in numeric context
  in sub sum at Sum.pm6 line 4
Use of uninitialized value of type int in numeric context
  in sub sum at Sum.pm6 line 4
    ok 1 - sum(Thread, int)
Use of uninitialized value of type X::IO::Symlink in numeric context
  in sub sum at Sum.pm6 line 4
    ok 2 - sum(X::IO::Symlink, -3222031972)
Use of uninitialized value of type X::Attribute::Package in numeric context
  in sub sum at Sum.pm6 line 4
    ok 3 - sum(X::Attribute::Package, -9999999999)
Use of uninitialized value of type Routine in numeric context
  in sub sum at Sum.pm6 line 4
    not ok 4 - sum(áéíóú, (Routine))
...

What does that mean?

That means we should use one of the big features of perl6: Gradual typing. $a and $b should have types.

So, let’s modify the function and test again:

module Sum {
   use Test::Fuzz;
   sub sum(Int $a, Int $bis export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}
    ok 1 - sum(-2991774675, 0)
    ok 2 - sum(5471569889, 7905158424)
    ok 3 - sum(8930867907, 5132583935)
    ok 4 - sum(-6390728076, -1)
    ok 5 - sum(-3558165707, 4067089440)
    ok 6 - sum(-8930867907, -5471569889)
    ok 7 - sum(3090653502, -2099633631)
    ok 8 - sum(-2255887318, 1517560219)
    ok 9 - sum(-6085119010, -3942121686)
    ok 10 - sum(-7059342689, 8930867907)
    ok 11 - sum(-2244597851, -6390728076)
    ok 12 - sum(-5948408450, 2244597851)
    ok 13 - sum(0, -5062049498)
    ok 14 - sum(-7229942697, 3090653502)
    not ok 15 - sum((Int), 1)

    # Failed test 'sum((Int), 1)'
    # at site#sources/FB587F3186E6B6BDDB9F5C5F8E73C55195B73C86 (Test::Fuzz) line 62
    # Invocant requires an instance of type Int, but a type object was passed.  Did you forget a .new?
...

A lot of OKs!  \o/

But there’re still some errors… We can’t sum undefined values…

We didn’t say the attributes should be defined (with :D). So Test::Fuzz generated every undefined sub-type of Int that it could find. It uses every generator of a sub-type of Int to generate values. It also works if you use a subset or even if you use a where in your signature. It’ll use a super-type generator and grep the valid values.

So, let’s change it again!

module Sum {
   use Test::Fuzz;
   sub sum(Int:D $a, Int:D $bis export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}
    ok 1 - sum(6023702597, -8270141809)
    ok 2 - sum(-8270141809, -3762529280)
    ok 3 - sum(242796759, -7408209799)
    ok 4 - sum(-5813412117, -5280261945)
    ok 5 - sum(2623325683, 2015644992)
    ok 6 - sum(-696696815, -7039670011)
    ok 7 - sum(1, -4327620877)
    ok 8 - sum(-7712774875, 349132637)
    ok 9 - sum(3956553645, -7039670011)
    ok 10 - sum(-8554836757, 7039670011)
    ok 11 - sum(1170220615, -3)
    ok 12 - sum(-242796759, 2015644992)
    ok 13 - sum(-9558159978, -8442233570)
    ok 14 - sum(-3937367230, 349132637)
    ok 15 - sum(5813412117, 1170220615)
    ok 16 - sum(-7408209799, 6565554452)
    ok 17 - sum(2474679799, -3099404826)
    ok 18 - sum(-5813412117, 9524548586)
    ok 19 - sum(-6770230387, -7408209799)
    ok 20 - sum(-7712774875, -2015644992)
    ok 21 - sum(8442233570, -1)
    ok 22 - sum(-6565554452, 9999999999)
    ok 23 - sum(242796759, 5719635608)
    ok 24 - sum(-7712774875, 7039670011)
    ok 25 - sum(7408209799, -8235752818)
    ok 26 - sum(5719635608, -8518891049)
    ok 27 - sum(8518891049, -242796759)
    ok 28 - sum(-2474679799, 2299757592)
    ok 29 - sum(5356064609, 349132637)
    ok 30 - sum(-3491438968, 3438417115)
    ok 31 - sum(-2299757592, 7580671928)
    ok 32 - sum(-8104597621, -8158438801)
    ok 33 - sum(-2015644992, -3)
    ok 34 - sum(-6023702597, 8104597621)
    ok 35 - sum(2474679799, -2623325683)
    ok 36 - sum(8270141809, 7039670011)
    ok 37 - sum(-1534092807, -8518891049)
    ok 38 - sum(3551099668, 0)
    ok 39 - sum(7039670011, 4327620877)
    ok 40 - sum(9524548586, -8235752818)
    ok 41 - sum(6151880628, 3762529280)
    ok 42 - sum(-8518891049, 349132637)
    ok 43 - sum(7580671928, 9999999999)
    ok 44 - sum(-8235752818, -7645883481)
    ok 45 - sum(6460424228, 9999999999)
    ok 46 - sum(7039670011, -7788162753)
    ok 47 - sum(-9999999999, 5356064609)
    ok 48 - sum(8510706378, -2474679799)
    ok 49 - sum(242796759, -5813412117)
    ok 50 - sum(-3438417115, 9558159978)
    ok 51 - sum(8554836757, -7788162753)
    ok 52 - sum(-9999999999, 3956553645)
    ok 53 - sum(-6460424228, -8442233570)
    ok 54 - sum(7039670011, -7712774875)
    ok 55 - sum(-3956553645, 1577669672)
    ok 56 - sum(0, 9524548586)
    ok 57 - sum(242796759, -6151880628)
    ok 58 - sum(7580671928, 3937367230)
    ok 59 - sum(-8554836757, 7712774875)
    ok 60 - sum(9524548586, 2474679799)
    ok 61 - sum(-7712774875, 2450227203)
    ok 62 - sum(3, 1257247905)
    ok 63 - sum(8270141809, -2015644992)
    ok 64 - sum(242796759, -3937367230)
    ok 65 - sum(6770230387, -6023702597)
    ok 66 - sum(2623325683, -3937367230)
    ok 67 - sum(-5719635608, -7645883481)
    ok 68 - sum(1, 6770230387)
    ok 69 - sum(3937367230, 7712774875)
    ok 70 - sum(6565554452, -5813412117)
    ok 71 - sum(7039670011, -8104597621)
    ok 72 - sum(7645883481, 9558159978)
    ok 73 - sum(-6023702597, 6770230387)
    ok 74 - sum(-3956553645, -7788162753)
    ok 75 - sum(-7712774875, 8518891049)
    ok 76 - sum(-6770230387, 6565554452)
    ok 77 - sum(-8554836757, 5356064609)
    ok 78 - sum(6460424228, 8518891049)
    ok 79 - sum(-3438417115, -9999999999)
    ok 80 - sum(-1577669672, -1257247905)
    ok 81 - sum(-5813412117, -3099404826)
    ok 82 - sum(8158438801, -3551099668)
    ok 83 - sum(-8554836757, 1534092807)
    ok 84 - sum(6565554452, -5719635608)
    ok 85 - sum(-5813412117, -2623325683)
    ok 86 - sum(-8158438801, -3937367230)
    ok 87 - sum(5813412117, -46698532)
    ok 88 - sum(9524548586, -2474679799)
    ok 89 - sum(3762529280, -2474679799)
    ok 90 - sum(7788162753, 9558159978)
    ok 91 - sum(6770230387, -46698532)
    ok 92 - sum(1577669672, 6460424228)
    ok 93 - sum(4327620877, 3762529280)
    ok 94 - sum(-6023702597, -2299757592)
    ok 95 - sum(1257247905, -8518891049)
    ok 96 - sum(-8235752818, -6151880628)
    ok 97 - sum(1577669672, 7408209799)
    ok 98 - sum(349132637, 6770230387)
    ok 99 - sum(-7788162753, 46698532)
    ok 100 - sum(-7408209799, 0)
    1..100
ok 1 - sum

No errors!!!

Currently Test::Fuzz only implement generators for Int and Str, but as I said, it will be used for its super and sub classes. If you want to have generators for your custom class, you just need to implement a “static” method called generate-samples that returns sample instances of your class, infinite number of instances if possible.

Test::Fuzz is under development and isn’t in perl6 ecosystem yet. And we’re needing some help!

EDITED: New now you can only call run-tests()

Day 21 – Show me the data!

Over the years, I have enjoyed using the different data dumpers that Perl5 offers. From the basic Data::Dumper to modules dumping in hexadecimal, JSON, with colors, handling closures, with a GUI, as graphs via dot and many other that fellow module developers have posted on CPAN (https://metacpan.org/search?q=data+dump&search_type=modules).

I always find things easier to understand when I can see data and relationships. The funkiest display belonging to ddd (https://www.gnu.org/software/ddd/) that I happen to fire up now and then just for the fun (in the example showing C data but it works as well with the Perl debugger).

ddd

Many dumpers are geared towards data transformation and data transmission/storage. A few modules specialize in generating output for the end user to read; I have worked on system that generated hundreds of thousands lines of output and it is close to impossible to read dumps generated by, say, Data::Dumper.

When I started using Perl6, I immediately felt the need to dump data structures (mainly because my noob code wasn’t doing what I expected it to do); This led me to port my Perl5 module (https://metacpan.org/pod/Data::TreeDumper  https://github.com/nkh/P6-Data-Dump-Tree) to Perl6. I am now also thinking about porting my HexDump module. I recommend warmly learning Perl6 by porting your modules (if you have any on CPAN), it’s fun, educative, useful for the Perl6 community, and your modules implement a need in a domain that you master leaving you time to concentrate on the Perl6.

My Perl5 module was ripe for a re-write and I wanted to see if and how it would be better if written in Perl6, I was not disappointed.

Perl6 is a big language, it takes time to get the pieces right, for a beginner it may seem daunting, even if one has years of experience, the secret is to take it easy, not give up and listen. Porting a module is the perfect exercise, you can take it easy because you have already done it before, you’re not going to give up because you know you can do it, and you have time to listen to people that have more experience (they also need your work), the Perl6 community has been examplary, helpful, patient, supportive and always present; if you haven visited #perl6 irc channel yet, now is a good time.

.perl

Every object in Perl6 has a ‘perl’ method, it can be used to dump the object and objects under it. The official documentation (https://docs.perl6.org/language/5to6-nutshell#Data%3A%3ADumper) provides a good example.

.gist

Every object also inherits a ‘gist’ method from Mu, the official documentation (https://docs.perl6.org/routine/gist#(Mu)_routine_gist) states: “Returns a string representation of the invocant, optimized for fast recognition by humans.”

dd, the micro dumper

It took me a while to discover this one, I saw that in a post on IRC. You know how it feel when you discover something simple after typing .perl and .gist a few hundred times, bahhh!

https://docs.perl6.org/routine/dd

The three dumpers above are built-in. They are also the fastest way to dump data but as much as their output is welcome, I know that it is possible to present data in a more legible way.

Enter Data::Dump

You can find the module on https://modules.perl6.org/ where all the Perl6 modules are. Perl6 modules link to repositories, Data::Dump source is on https://github.com/tony-o/perl6-data-dump.

Data::dump introduces color, depth limitation, and type specific dumps. The code is a compact hundred lines that is quite easy to understand. This module was quite helpful for a few cases that I had. It also dumps all the methods associated with objects. Unfortunately, it did fail on a few types of objects. Give it a try.

Data::Dump::Tree

Emboldened by the Perl6 community, the fact that I really needed a Dumper for visualization, and the experience from my Perl5 module (mainly the things that I wanted to be done differently) I started working on the module. I had some difficulties at the beginning, I knew nothing about the details of Perl6 and even if there is a resemblance with Perl5, it’s another beast. But I love it, it’s advanced, clean, and well designed, I am grateful for all the efforts that where invested in Perl6.

P6 vs P5 implementation

It’s less than half the size and does as much, which makes it clearer (as much as my newbie code can be considered clean). The old code was one monolithic module with a few long functions, the new code has a better organisation and some functionality was split out to extra modules. It may sound like bit-rot (and it probably is a little) but writing the new code in Perl6 made the changes possible, multi dispatch, traits and other built-in mechanism greatly facilitate the re-factoring.

What does it do that the other modules don’t?

I’ll only talk about a few points here and refer you to the documentation for all the details (https://raw.githubusercontent.com/nkh/P6-Data-Dump-Tree/master/lib/Data/Dump/Tree.pod); also have a look at the examples in the distribution.

The main goal for Data::Dump::Tree is readability, that is achieved with filter, type specific dumpers, colors, and dumper specialization via traits. In the examples directory, you can find JSON_parsed.pl which parses 20 lines of JSON by JSON::Tiny(https://github.com/moritz/json),. I’ll use it as an example below. The parsed data is dumped with .perl,  .gist , Data::Dump, and Data::Dump::Tree

.perl output (500 lines, unusable for any average human, Gods can manage)screenshot_20161219_185724

.gist (400 lines, quite readable, no color and long lines limit the readability a bit). Also note that it looks better here than on my terminal who has problems handling unicode properly.screenshot_20161219_190004

Data::Dump (4200 lines!, removing the methods would probably make it usable)screenshot_20161219_190439

The methods dump does not help.screenshot_20161219_190601

Data::Dump::Tree (100 lines, and you are the judge for readability as I am biased). Of course, Data::Dump::Tree is designed for this specific usage, first it understands Match objects, second it can display only part of the string that are matched, which greatly reduces the noise.
screenshot_20161219_190932

Tweeking output

The options are explained in the documentation but here is a little list
– Defining type specific dumper
screenshot_20161219_185409

– filtering to remove data or add a representation for a data set;  below the data structure is dumped as it is and then filtered (a filter that shows what it is doing).

As filtering happens on the “header” and “footer” is should be easy to make a HTML/DHTML plugin; Althoug bcat (https://rtomayko.github.io/bcat/), when using ASCII glyphs, works fine.

screenshot_20161219_191525
– set the display colors
– change the glyphs
– display address information or not
– use subscripts for indexes
– use ASCII, ANSI, or unicode for the glyphs

Diffs

I tried to implement a diff display with the Perl5 module but failed miserably as it needed architectural changes, The Perl6 version was much easier, in fact, it’s an add-on, a trait, that synchronizes two data dumps. This could be used in tests to show differences between expected and gotten data.

screenshot_20161219_184701
Of course we can eliminate the extra glyphs and the data that is equivalent (I also changed the glyph types to ASCII)screenshot_20161219_185035

From here

Above anything else, I hope many authors will start writing Perl6 modules. And I also hope to see other data dumping modules. As for Data::Dump::Tree, as it gathers more users, I hope to get requests for change, patches, and error reports.

Day 20 – Bridging the gap

Perl 6 arrived last year, after quite a time, I might say! It promised a number of things, which we awaited eagerly. Some it delivered (a solid and sane threading model, for example), some it delivered in perhaps unexpected ways (it was ready by Christmas, after all), but some of them didn’t end up quite how we expected them to. One of those things is Perl 5 interoperability.

It should come as no surprise that CPAN, our legendary repository of modules for Perl 5, with readymade, well documented and thoroughly tested solutions for just about anything you can imagine[1] is quite unlikely to get ported to a completely new language. The bold move to drop backwards compatibility between Perl 5 and Perl 6 was necessary, but it came at a price: the pride of our community, the thing we used to call our language with Perl 5 itself being merely a VM to run it, our bread and butter is now doomed as a mere historical artifact, a legacy we may never live up to, but in the brave, new world of Perl 6 hardly useful anymore. What a loss that would be! Good news is that one of the assumptions of Perl 6 was that it’ll be capable of running Perl 5 code alongside Perl 6 code, loading Perl 5 modules and using them from Perl 6, among other things. Entire power of “good, old” CPAN available at your fingertips, dear reader. Bad news is, however, that this particular functionality we have not entirely get.

The original spec…ulation[2] documents said that you should be able to start a Perl 5 block of code in your Perl 6 code and expect it to work, as so:

grammar Shiny-New-Perl6-Things {
    ...
}

{
    use v5;

    sub good-old-perl5-things($$&) { ... }
}

That requires the Perl 6 compiler to be able to parse Perl 5 (yes, yes, I know; thankfully, the speculation also allows for a “well-behaved subset (…) much like PPI” :)) and execute it, providing interoperability between one and the other. Admirable as it is, it turns out to be much more complicated than it looks to be (and if it does look simple to you, remember to hug a Perl 6 core hacker next time you meet one). It doesn’t mean that there is no effort to make it reality (v5 being one of them, check it out!), but it’s not quite the promised land we were… promised. Yet, I hope! But now is now, and work needs to be done. So what are we left with?

Well, one of the things Perl 6 did deliver marvelously is the foreign function interface, most often used as a way to call C code from Perl 6. So some of us sat down and thought: well, Perl 5 is embeddable and is available to be ran as a C library, so to say. Perl 6 can call functions from C, C can execute Perl 5 code, what’s really stopping us from putting the two and two together? Quite a bit of hard work, but the goal is worthy, and smart, hardworking people is something we have quite in handy in our community. And so, Inline::Perl5 was born.

~~~

The simplest thing are hardly the most exciting ones, but let’s start somewhere:

use Inline::Perl5; # I will be skipping this from now on

my $p5 = Inline::Perl5.new;
$p5.run('print "Hello, older sister!\n"');

Or in a more Perl6-y way (and shorter, so better, right?):

EVAL 'print "Hello, older sister!\n"', :lang<Perl5>;

“But tadzik, how is that better than shelling out and having the external Perl 5 process do some predetermined thing?” Ah, it is better though: we can already drag the results out of the Perl 5 land and get it as a proper, Perl 6 object:

use Inline::Perl5;

my %thing = EVAL 'my %stuff = split(/[=,]/, "foo=bar,baz=beta"); \%stuff', :lang<Perl5>;
for %thing.kv -> $k, $v {
    say "$k => $v"
}

When a little elf saw this, it shouted “woo! Does that mean I can stuff whatever complex magic I want in that EVAL block and I’ll get back a working Perl 6 object?” Why yes, little elf. You can load modules, create objects and pass them around between one Perl and the other, and they’ll work just as you think they will. Check this hot stuff out:

my $mech = EVAL ' use WWW::Mechanize; WWW::Mechanize->new()', :lang<Perl5>;
say $mech.WHAT; # Perl5Object
$mech.get("https://perl6.org");

“Holy Moly!” the little elf could hardly contain itself. This means that the dream of having CPAN still available is still there, still reachable, and as usable as always! Can things get any better?

Hah! Why, I’m glad you asked. Remember how we made EVAL, the Perl 6 mechanism work in the Perl 5-augmented way and produce meaningful results? Syntactic magic is not a Perl 5 exclusive thing, you know. What makes you think we can’t teach use and alike to cooperate with Inline::Perl5 as well?

“Wait, you don’t mean…”

Oh yes, little elf. Yes I do:

use WWW::Mechanize:from<Perl5>;

my $mech = WWW::Mechanize.new;
$mech.get("https://perl6.org");

You don’t always see an elf cry, but when you do, it’s the tears of joy. Just like the ones I saw now, reflected in the comforting, blue-ish[3] light of the computer screen. It’s all there, integrated almost seamlessly, just as the speculations promised. The only price you pay is one use statement and some annotations on the ones that come from The World That Used To Be, and the rest works perfectly fine. All the power of CPAN in all the delicious wrapping paper of the new, shiny Christmas gift of a language. Little elf is a practical little being though, so it immediately started looking for edge cases that’ll prevent its real-world code from being utilized in this way. “I can create objects from Perl 6, call methods from Perl 6, get results back… as long as it can be represented with an object and method calls there’s is no limit to what I can do; in the worst case I can always wrap the good old Perl 5 module in something that plays nice with our limitations…”

I raise my eyebrow, Spock-like. “When would you need to do that?”

“Oh, you know”, started the little elf, “when an exception gets thrown for example.”

We almost couldn’t believe our ears when we heard Santa himself mutter “LOL” from behind his screen. We looked in confusion as he sent us this snippet:

my $mech = WWW::Mechanize.new;
try {
    $mech.get("xmas://perl6.org");
    CATCH {
        default {
            say $_.perl;
        }
    }
}
# Output: X::AdHoc.new(payload => "Error GETing xmas://perl6.org: Protocol scheme 'xmas' is not supported at -e line 0.\n")

The little elf looked almost indignant. “Oh for dancing reindeers, is there anything this thing can’t do!?”

“Like what?” Rudolph asked casually, passing by on its four legs.

“Gee, I don’t think I… ah!” it exclaimed suddenly, finding in its memory a particularly complicated piece of code running the facility at this very moment. “I guess if I need to create objects that are subclasses of existing Perl 5 classes I’m a bit out of luck, am I not?”

“What makes you think that wouldn’t just work as you expect it to?”

“I… heh, I guess I should’ve tried it first. So all the complex things, DBIx::Class, Catalyst, they can all work with this just fine?”

“Been there, done that” muttered Santa from behind his screen again, while the servers hummed peacefully running the now part Perl 5, part Perl 6 production code in the gift factory.

“Well, the future sure does seem bright, does it not?”

“Yes it does”, I replied. “It really does.”

Light years away, a star shone bright, and while it’ll take us a few more days to actually see it above our heads, it is already there, sooner than we thought it would be.

~~~

We may not have gotten the seamless interop we asked for, but for all practical purposes our marvelous Perl 5 legacy is far from gone. It may be almost surprising how much you can already get done with it. Those few battle-tested modules that need glueing together? This nasty script that needs refactoring so badly it may as well end up being a full-blown rewrite? That little project that you dreamed of using Perl 6 for, but never expected to have all the dependencies available (and good, and fast, and proved to work)? It’s time to try it. Your star may already be shining.

[1] I swear I remember seeing a module that existed for the sole purpose of uploading the My Little Pony fanfiction to an appropriate place. If you can help me find it, there’s a prize, why the heck not!

[2] We try very hard to not call it specification :)

[3] Yes, we know about Redshift in the Santa Claus Magical factory, but it’s December, crunch time, we can’t afford to go to bed quite at the time we wish we would be able to. Compromises had to be made.

Day 19 – Fixing Flow

Finding flow while coding is one of the joys of programming.

Encountering simple syntactic bugs, however, can sometimes interrupt flow. A single missing semicolon, for example, can result in a “WAT!?” followed by a “DOH!”

Perl 6 helps you around the code -> run -> fix cycle by identifying the cause and location of a bug and often suggesting a solution. Take this program:

say "hello"
say "world";

When you run it, Perl 6 will suggest what’s wrong, where the problem is, and what to do next …

===SORRY!=== Error while compiling /home/nige/hello-world.pl
Two terms in a row across lines (missing semicolon or comma?)
at /home/nige/hello-world.pl:6
------> say "hello"⏏

That helps to keep things flowing.

Normally, at this point, it’s off to your $EDITOR to manually add the semicolon and the cycle repeats.

What if Perl 6 could suggest a fix and apply it for you?

Introducing perl6fix

Here is the beginnings of a command-line utility called, perl6fix, that takes the hint from Perl 6 and tries to speed up the code -> run -> fix cycle by applying the fix for you.

hello-world

Let’s look at the code.

It needs a handle on bug descriptions found in Perl 6 output.

class Bug {
   has Int        $.line-number;
   has SourceFile $.source-file;
   has Str        $.description;
   has Str        $.pre-context;
}

And a way to describe fixes:

class Fix {

    has $.prompt;
    has $.pattern;
    has $.action;

    method apply ($bug) {
        $!action($bug);
    }
    method applies-to ($bug) {
        return True without $.pattern;
        return $bug.description ~~ $.pattern;
    }
}

And a way to update the source file:

class SourceFile is IO::Path {

    has @!content-lines = self.IO.slurp.lines;

    method append-to-first-matching-line ($append-char, $text-to-match) {
        return unless my $first-matching-index = @!content-lines.first(rx/$text-to-match/, :k);
        @!content-lines[$first-matching-index] ~= $append-char;
        self.save;
    }
    method swap-characters-on-line ($from-chars, $to-chars, $line-number) {
        @!content-lines[$line-number - 1].subst-mutate(/$from-chars/, $to-chars);
        self.save;
    }
    method save {
        self.IO.spurt(@!content-lines.join("\n"));
    }
}

Here is just some of the fixes I encountered while writing this program:

my @fixes = (
    Fix.new(
        prompt  => 'add semicolon',
        pattern => rx/'missing semicolon or comma?'/,
        action  => sub ($bug) {
            $bug.source-file.append-to-first-matching-line(';', $bug.pre-context);
        }
    ),
    Fix.new(
         prompt  => 'add comma',
         pattern => rx/'missing semicolon or comma?'/,
         action  => sub ($bug) {
             $bug.source-file.append-to-first-matching-line(',', $bug.pre-context);
         }
    ),
    Fix.new(
         prompt  => 'convert . to ~',
         pattern => rx/'Unsupported use of . to concatenate strings; in Perl 6 please use ~'/,
         action  => sub ($bug) {
              $bug.source-file.swap-characters-on-line('.', '~', $bug.line-number);
         }
    ),
    Fix.new(
         prompt  => 'convert qr to rx',
         pattern => rx/'Unsupported use of qr for regex quoting; in Perl 6 please use rx'/,
         action  => sub ($bug) {
              $bug.source-file.swap-characters-on-line('qr', 'rx', $bug.line-number);
         }
    ),
    # ADD YOUR OWN FIXES HERE
);

There’s many more potential fixes (I’m just starting).

The perl6fix script wraps perl6, captures STDERR (if there is any), and then looks for a bug report in the output:

sub find-bug ($perl6-command) {

    return unless my $error-output = capture-stderr($perl6-command);

    # show the error
    note($error-output); 

    # re-run the command again - this time grabbing a JSON version of the bug
    # set RAKUDO_EXCEPTIONS_HANDLER env var to JSON 
    return unless my $error-as-json = capture-stderr('RAKUDO_EXCEPTIONS_HANDLER=JSON ' ~ $perl6-command);
    return unless my $bug-description = from-json($error-as-json);

    # just handle these exception types to start with 
    for 'X::Syntax::Confused', 'X::Obsolete' -> $bug-type {
        next unless my $bug = $bug-description{$bug-type}; 
        return Bug.new(
            description => $bug<message>,
            source-file => SourceFile.new($bug<filename>),
            line-number => $bug<line>,
            pre-context => $bug<pre>
        );
    }
}

The next step is to see if there are any fixes for this type of bug:

sub fix-bugs ($perl6-command-line) {

    my $bug = find-bug($perl6-command-line);
 
    unless $bug {
        say 'No bugs found to fix.'; 
        exit;
    }

    # find a potential list of fixes for this type of bug
    my @found-fixes = @fixes.grep(*.applies-to($bug));

    say $bug.description;
    say $bug.source-file.path ~ ' ' ~ $bug.line-number;
    say 'Suggested fixes found: ';

    my $fix-count = 0;

    for @found-fixes -> $fix {
        $fix-count++;
        my $option = ($fix-count == 1)
                   ?? "*[1] " ~ $fix.prompt
                   !! " [$fix-count] " ~ $fix.prompt;
        say ' ' ~ $option;
     }

     my $answer = prompt('apply fix [1]? ') || 1;
     my $fix = @found-fixes[$answer - 1];

     $fix.apply($bug);

     # look for more bugs! until we're done
     fix-bugs($perl6-command-line);
}

With the help of a shell alias you can even run it to fix the previous perl6 command. Like so:

fix-image

Just add an alias to your bash or zsh profile. For example:

alias fix='/home/nige/perl6/perl6fix prev $(fc -ln -1)'

Now it’s your turn

As you can see this is just a start.

You’re welcome to take the full script and evolve your own set of automatic Perl 6 fixes.

You could even adapt the script to apply fixes for Perl 5 or other languages? What about a version using grammars? Or a macro-powered version integrated directly into Perl 6?

Well maybe that last one is something to look forward to next Christmas!

Hope you have a happy Christmas and a flowing, Perl 6 powered 2017!

Day 18 – Asynchronous Workflow with Tinky

State Machines

The idea of “state” and the “state machine” is so ubiquitous in software programming that we tend to use it all the time without even thinking about it very much: that we have a fixed number of states of some object or system and that there are fixed allowable transitions from one state to another will be almost inevitable when we start modelling some physical process or business procedure.

I quite like the much quoted description of a state machine used in the erlang documentation:

If we are in state S and the event E occurs, we should perform the actions A and make a transition to the state S’.

once we string a number of these together we could consider it to form a workflow. Whole books, definition languages and software systems have been made about this stuff.

Managing State

Often we find ourselves implementing state management in an application on an ad-hoc basis, defining the constraints on state transition in code, often in separate part of the application and relying on the other parts of the application to do the right thing. I’m sure I’m not alone to have seen code in the wild that purports to implement a state machine which in fact has proved to be entirely non-deterministic in the face of the addition of new states or actions in the system. Or systems where a new action has to be performed on some object when it enters a new state and the state can be entered in different ways in different parts of the application so new code has to be implemented in a number of different places with the inevitable consequence that one gets missed and it doesn’t work as defined.

Using a single source of state management with a consistent definition within an application can alleviate these kinds of problems and can actually make the design of the application simpler and clearer than it might otherwise have been.

Thus I was inspired to write Tinky which is basically a state management system that allows you to create workflows in an application.

Tinky allows you to compose a number of states and the transitions between them into an application workflow which can make the transitions between the states available as a set of Supplies: providing the required constraints on the allowed transitions and allowing you to implement the actions on those transitions in the appropriate place or manner for your application.

Simple Workflow

Perhaps the canonical example used for similar software is that of the bug tracking software, so let’s start with the simplest possible example with three states and two transitions between them:

use Tinky;

my $state-new  = Tinky::State.new(name => "new");
my $state-open = Tinky::State.new(name => "open");
my $state-done = Tinky::State.new(name => "done");

my @states = ( $state-new, $state-open, $state-done);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);

my @transitions = ( $new-open, $open-done);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

This defines our three states “new”, “open” and “done” and two transitions between them, from “new” to “open” and “open” to “done”. This defines a “workflow” in which the state must go through “open” before becoming “done”.

Obviously this doesn’t do very much without an object that can take part in this workflow, so Tinky provides a role Tinky::Object that can be applied to a class who’s state you want to manage:

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket = Ticket.new;

$ticket.apply-workflow($workflow);

say $ticket.state.name;          # new
$ticket.next-states>>.name.say;  # (open)

$ticket.state = $state-open;
say $ticket.state.name;          # open

The Tinky::Object role provides the accessors state and next-states to the object, the latter returning a list of the possible states that the object can be transitioned to (in this example there is only one, but there could be as many as your workflow definition allows,) you’ll notice that the state of the object is defaulted to new which is the state provided as initial-state to the Tinky::Workflow constructor.

The assignment to state is constrained by the workflow definition, so if in the above you were to do:

$ticket.state = $state-done;

This would result in an exception “No Transition for ‘new’ to ‘done'” and the state of the object would not be changed.

As a convenience the workflow object defines a role which provides methods named for the transitions and which is applied to the object when apply-workflow is called, thus the setting of the state in the above could be written as:

$ticket.open

However this feature has an additional subtlety (that I unashamedly stole from a javascript library,) in that if there are two transitions with the same name then it will still create a single method which will use the current state of the object to select which transition to apply; typically you might do this where the to state is the same, so for example if we added a new state ‘rejected’ which can be entered from both ‘new’ and ‘open’:

use Tinky;

my $state-new      = Tinky::State.new(name => "new");
my $state-open     = Tinky::State.new(name => "open");
my $state-done     = Tinky::State.new(name => "done");
my $state-rejected = Tinky::State.new(name => "rejected");


my @states = ( $state-new, $state-open, $state-done, $state-rejected);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $new-rejected   = Tinky::Transition.new(name => "reject", from => $state-new, to => $state-rejected);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);
my $open-rejected  = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

my @transitions = ( $new-open,$new-rejected, $open-done, $open-rejected);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

You are not strictly limited to having the similarly named transitions enter the same state, but they must have different from states (otherwise the method generated wouldn’t know which transition to apply).   Obviously if the method is called on an object which is not in a state for which there are any transitions an exception will be thrown.

So what about this asynchronous thing

All of this might be somewhat useful if we are merely concerned with constraining the sequence of states an object might be in, but typically we want to perform some action upon transition from one state to another (and this is explicitly stated in the definition above). So, for instance, in our ticketing example we might want to send some notification, recalculate resource scheduling or make a branch in a version control system for example.

Tinky provides for the state transition actions by means of a set of  Supplies on the states and transitions,  to which the object for which the transition has been performed is emitted. These “events” are emitted  on the state that is being left, the state that is being entered and the actual transition that was performed. The supplies are conveniently aggregated at the workflow level.

So, if, in the example above, we wanted to log every transition of state of a ticket and additional send a message when the ticket enters the “open” state we can simply tap the appropriate Supply to perform these actions:

use Tinky;

my $state-new      = Tinky::State.new(name => "new");
my $state-open     = Tinky::State.new(name => "open");
my $state-done     = Tinky::State.new(name => "done");
my $state-rejected = Tinky::State.new(name => "rejected");


my @states = ( $state-new, $state-open, $state-done, $state-rejected);

my $new-open   = Tinky::Transition.new(name => "open", from => $state-new, to => $state-open);
my $new-rejected   = Tinky::Transition.new(name => "reject", from => $state-new, to => $state-rejected);
my $open-done  = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);
my $open-rejected  = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

my @transitions = ( $new-open,$new-rejected, $open-done, $open-rejected);


my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

# Make the required actions

$workflow.transition-supply.tap(-> ($trans, $object) { say "Ticket '{ $object.ticket-number }' went from { $trans.from.name }' to '{ $trans.to.name }'" });
$state-open.enter-supply.tap(-> $object { say "Ticket '{ $object.ticket-number }' is opened, sending email" });

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

Which will give some output like

[open rejected]
Ticket '00015475' went from new' to 'rejected'
rejected
Ticket '00053735' is opened, sending email
Ticket '00053735' went from new' to 'open'
[done rejected]
Ticket '00053735' went from open' to 'rejected'
rejected

The beauty of this kind of arrangement, for me at least, is that the actions can be defined at the most appropriate place in the code rather than all in one place and can also be added and removed at run time if required, it also works nicely with other sources of asynchronous events in Perl 6 such as timers, signals or file system notifications.

Defining a Machine

Defining a large set of states and transitions could prove somewhat tiresome and error prone if doing it in code like the above, so you could choose to build it from some configuration file or from a database of some sort, but for convenience I have recently released Tinky::JSON which allows you to define all of your states and transitions in a single JSON document.

The above example would then become something like:

use Tinky;
use Tinky::JSON;

my $json = q:to/JSON/;
{
    "states" : [ "new", "open", "done", "rejected" ],
    "transitions" : [
        {
            "name" : "open",
            "from" : "new",
            "to"   : "open"
        },
        {
            "name" : "done",
            "from" : "open",
            "to"   : "done"
        },
        {
            "name" : "reject",
            "from" : "new",
            "to"   : "rejected"
        },
        {
            "name" : "reject",
            "from" : "open",
            "to"   : "rejected"
        }
    ],
    "initial-state" : "new"
}
JSON

my $workflow = Tinky::JSON::Workflow.from-json($json);

$workflow.transition-supply.tap(-> ($trans, $object) { say "Ticket '{ $object.ticket-number }' went from { $trans.from.name }' to '{ $trans.to.name }'" });
$workflow.enter-supply("open").tap(-> $object { say "Ticket '{ $object.ticket-number }' is opened, sending email" });

class Ticket does Tinky::Object {
    has Str $.ticket-number = (^100000).pick.fmt("%08d");
}

my $ticket-one = Ticket.new;
$ticket-one.apply-workflow($workflow);
$ticket-one.next-states>>.name.say;
$ticket-one.reject;
say $ticket-one.state.name;

my $ticket-two = Ticket.new;
$ticket-two.apply-workflow($workflow);
$ticket-two.open;
$ticket-two.next-states>>.name.say;
$ticket-two.reject;
say $ticket-two.state.name;

As well as providing the means of constructing the workflow object from a JSON description it adds methods for accessing the states and transitions and their respective supplies by name rather than having to have the objects themselves to hand, which may be more convenient in your application. I’m still working out how to provide the definition of actions in a similarly convenient declarative way.

It would probably be easy to make something similar that can obtain the definition from an XML file (probably using XML::Class,) so let me know if you might find that useful.

Making something useful

My prime driver for making Tinky in the first place was for a still-in-progress online radio management software, this could potentially have several different workflows for different types of objects: the media for streaming may need to be uploaded, it may possibly require encoding to a streamable format, have silence detection performed and its metadata normalised and so forth before it is usable in a show; the shows themselves need to have either media added or a live streaming source configured and then be scheduled at the appropriate time and possibly also be recorded (and then the recording fed back into the media workflow.) All of this might be a little too complex for a short example, but an example that ships with Tinky::JSON is inspired by the media portion of this and was actually made in response to something someone was asking about on IRC a while ago.

The basic idea is that a process waits for WAV files to appear in some directory and then copies them to another directory where they are encoded (in this case to FLAC.) The nice thing about using the workflow model for this is that the code is kept quite compact and clear, since failure conditions can be handled locally to the action for the step in the process so deeply nested conditions or early returns are avoided, also because it all happens asynchronously it makes best of the processor time.

So the workflow is described in JSON as:

{
    "states" : [ "new", "ready", "copied", "done", "failed", "rejected" ],
    "transitions" : [
        {
            "name" : "ready",
            "from" : "new",
            "to"   : "ready"
        },
        {
            "name" : "reject",
            "from" : "new",
            "to"   : "rejected"
        },
        {
            "name" : "copied",
            "from" : "ready",
            "to"   : "copied"
        },
        {
            "name" : "fail",
            "from" : "ready",
            "to"   : "failed"
        },
        {
            "name" : "done",
            "from" : "copied",
            "to"   : "done"
        },
        {
            "name" : "fail",
            "from" : "copied",
            "to"   : "failed"
        }
    ],
    "initial-state" : "new"
}

Which defines our six states and the transitions between them. The “rejected” state is entered if the file has been seen before (from state “new”,) and the “failed” state may occur if there was a problem with either the copying or the encoding.

The program expects this to be in a file called “encoder.json” in the same directory as the program itself.

This example uses the ‘flac’ encoder but you could alter this to something else if you want.

use Tinky;
use Tinky::JSON;
use File::Which;


class ProcessFile does Tinky::Object {
    has Str $.path      is required;
    has Str $.out-dir   is required;
    has Str $.new-path;
    has Str $.flac-file;
    has     @.errors;
    method new-path() returns Str {
        $!new-path //= $!out-dir.IO.child($!path.IO.basename).Str;
    }
    method flac-file() returns Str {
        $!flac-file //= self.new-path.subst(/\.wav$/, '.flac');
        $!flac-file;
    }
}


multi sub MAIN($dir, Str :$out-dir = '/tmp/flac') {
    my ProcessFile @process-files;

    my $json = $*PROGRAM.parent.child('encoder.json').slurp;
    my $workflow = Tinky::JSON::Workflow.from-json($json);

    my $flac = which('flac') or die "no flac encoder";
    my $cp   = which('cp');

    my $watch-supply = IO::Notification.watch-path($dir).grep({ $_.path ~~ /\.wav$/ }).unique(as => { $_.path }, expires => 5);

    say "Watching '$dir'";

    react {
        whenever $watch-supply -> $change {
            my $pf = ProcessFile.new(path => $change.path, :$out-dir);
            say "Processing '{ $pf.path }'";
            $pf.apply-workflow($workflow);
        }
        whenever $workflow.applied-supply() -> $pf {
            if @process-files.grep({ $_.path eq $pf.path }) {
                $*ERR.say: "** Already processing '", $pf.path, "' **";
                $pf.reject;
            }
            else {
                @process-files.append: $pf;
                $pf.ready;
            }
        }
        whenever $workflow.enter-supply('ready') -> $pf {
            my $copy = Proc::Async.new($cp, $pf.path, $pf.new-path, :r);
            whenever $copy.stderr -> $error {
                $pf.errors.append: $error.chomp;
            }
            whenever $copy.start -> $proc {
                if $proc.exitcode {
                    $pf.fail;
                }
                else {
                    $pf.copied;
                }
            }
        }
        whenever $workflow.enter-supply('copied') -> $pf {
            my $encode = Proc::Async.new($flac,'-s',$pf.new-path, :r);
            whenever $encode.stderr -> $error {
                $pf.errors.append: $error.chomp;
            }
            whenever $encode.start -> $proc {
                if $proc.exitcode {
                    $pf.fail;
                }
                else {
                    $pf.done;
                }
            }
        }
        whenever $workflow.enter-supply('done') -> $pf {
            say "File '{ $pf.path }' has been processed to '{ $pf.flac-file }'";
        }
        whenever $workflow.enter-supply('failed') -> $pf {
            say "Processing of file '{ $pf.path }' failed with '{ $pf.errors }'";
        }
        whenever $workflow.transition-supply -> ($trans, $pf ) {
            $*ERR.say("File '{ $pf.path }' went from '{ $trans.from.name }' to '{ $trans.to.name }'");
        }
    }
}

If you start this with an argument of the directory where you want to pick up the files ffrom, it will wait until new files appear then create a new ProcessFile object and apply the workflow to it, then every object to which the workflow is applied is sent to the applied-supply which is tapped to check whether the file has already been processed: if it has (and this can happen because the file directory watch may emit more than one event for the creation of the file,) the object is moved to state ‘rejected’ and no further processing happens, otherwise it is moved to state ‘ready’ whereupon it is copied, and if successfully encoded.

Additional states (and transitions to enter them,) could easily be added to, for instance, store the details of the encoded file in a database, or even start playing it, or new actions could be added for existing states by adding additional “whenever” blocks. As it stands this will block forever waiting for new files;  however this could be integrated into a larger program by starting this in a new thread for instance.

The program and the JSON file are in the examples directory for Tinky::JSON, please feel free to grab and tinker with them.

Not quite all

Tinky has a fair bit more functionality that I don’t think I have space to describe here: there are facilities for the run-time validation of transition application and additional supplies that are emitted to at various stages of the workflow lifecycle. Hopefully your interest is sufficiently picqued that you might look at the documentation.

I am considering adding a cookbook-style document for the module for some common patterns that might arise in programs that might use it. If you have any ideas or questions please feel free to drop me a note.

Finally, I chose a deliberately un-descriptive name for the module as I didn’t want to make a claim that this would be the last word in the problem space. There are probably many more ways that a state managed workflow could be implemented nicely in Perl 6. I would be equally delighted if you totally disagree with my approach and release your own design as I would be if you decide to use Tinky.

Tinky Winky is the purple Teletubby with a red bag.

Day 17 – Testing in virtual time

Over the last month, most of my work time has been spent building a proof of concept for a project that I’ll serve as architect for next year. When doing software design, I find spikes (time-boxed explorations of problems) and rapid prototyping really useful ways to gain knowledge of new problem spaces that I will need to work in. Finding myself with a month to do this before the “real” start of the project has been highly valuable. Thanks to being under NDA, I can’t say much about the problem domain itself. I can, however, say that it involves a reasonable amount of concurrency: juggling different tasks that overlap in time.

Perl’s “whipuptitude” – the ability to quickly put something together – is fairly well known. Figuring that Perl 6’s various built-in concurrency constructs would allow me to whip up concurrent things rapidly, I decided to build the proof of concept in Perl 6. I’m happy to report that the bet paid off pretty well, and by now the proof of concept has covered all of the areas I hoped to explore – and some further ones that turned out to matter but that weren’t obvious at the start.

To me, building rapid prototypes explicitly does not mean writing crappy code. For sure, simplifications and assumptions of things not critical to the problem space are important. But my prototype code was both well tested and well structured. Why? Because part of rapid prototyping is being able to evolve the prototype quickly. That means being able to refactor rapidly. Decent quality, well-structured, well-tested code is important to that. In the end, I had ~2,500 lines of code covered by ~3,500 lines of tests.

So, I’ve spent a lot of time testing concurrent code. That went pretty well, and I was able to make good use of Test::Mock in order to mock components that returned a Promise or Supply also. The fact that Perl 6 has, from the initial language release, had ways to express asynchronous values (Promise) or asynchronous streams of values (Supply) is in itself a win for testing. Concurrent APIs expressed via these standard data structures are easy to fake, since you can put anything you want behind a Promise or a Supply.

My work project didn’t involve a huge amount of dealing with time, but in the odd place it did, and I realized that testing this code effectively would be a challenge. That gave me the idea of writing about testing time-based code for this year’s Perl 6 advent, which in turn gave me the final nudge I needed to write a module that’s been on my todo list all year. Using it, testing things involving time can be a lot more pleasant.

Today’s example: a failover mechanism

Timeouts are one of the most obvious and familiar places that time comes up in fairly “everyday” code. To that end, let’s build a simple failover mechanism. It should be used as follows:

my $failover-source = failover($source-a, $source-b, $timeout);

Where:

  • $source-a is a Supply
  • $source-b is a Supply
  • $timeout is a timeout in seconds (any Real number)
  • The result, assigned to $failover-source, is also Supply

And it should function as follows:

  • The Supply passed as $source-a is immediately tapped (which means it is requested to do whatever is needed to start producing values)
  • If it produces its first value before $timeout seconds, then we simply emit every value it produces to the result Supply and ignore $source-b
  • Otherwise, after $timeout seconds, we also tap $source-b
  • Whichever source then produces a value first is the one that we “latch” on to; any results from the other should be discarded

Consider, for example, that $source-a and $source-b are supplies that, when tapped, will send the same query to two different servers, which will stream back results over time. Normally we expect the first result within a couple of seconds. However, if the server queried by $source-a is overloaded or has other issues, then we’d like to try using the other one, $source-b, to see if it can produce results faster. It’s a race, but where A gets a head start.

Stubbing stuff in

So, in a Failover.pm6, let’s stub in the failover sub as follows:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
    }
}

A t/failover.t then starts off as:

use Failover;
use Test;

# Tests will go here

done-testing;

And we’re ready to dig in to the fun stuff.

The first test

The simplest possible case for failover is when $source-a produces its first value in time. In this case, $source-b should be ignored totally. Here’s a test case for that:

subtest 'When first Supply produces a value in time, second not used', {
    my $test-source-a = supply {
        whenever Promise.in(1) {
            emit 'a 1';
        }
        whenever Promise.in(3) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        die "Should never be used";
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

Here, we set up $test-source-a as a Supply that, when tapped, will emit a 1 after 1 second, and a 2 after 3 seconds. If $test-source-b is ever tapped it will die. We expect that if this wrongly happens, it will be at the 2 second mark, which is why a 2 is set to be emitted after 3 seconds. We then obtain a Channel from the resulting $failover-supply, which we can use to pull values from at will and check we got the right things. (On coercing a Supply to a Channel, the Supply is tapped, starting the flow of values, and each result value is fed into the Channel. Both completion and errors are also conveyed.)

Making it pass

There are a couple of ways that we might make this test pass. The absolute easiest one would be:

sub failover(Supply $source-a, Supply $source-b, Real $timeout) is export {
    return $source-a;
}

Which feels like cheating, but in TDD the code that passes the first test case almost always does. (It sometimes feels pointless to write said tests. More than once, they’ve ended up saving me when – while making a hard thing work – I ended up breaking the trivial thing.)

An equivalent, more forward-looking solution would be:

sub failover(Supply $source-a, Supply $source-b, Real $timeout) is export {
    supply {
        whenever $source-a {
            .emit;
        }
    }
}

Which is the identity operator on a Supply (just spit out everything you get). For those not familiar with supplies, it’s worth noting that this supply block does 3 useful things for you for free:

  • Passes along errors from $source-a
  • Passes along completion from $source-a
  • Closes the tap on $source-a – thus freeing up resources – if the tap on the supply we’re defining here is closed

Subscription management and error management are two common places for errors in asynchronous code; the supply/whenever syntax tries to do the Right Thing for you on both fronts. It’s more than just a bit of tinsel on the Christmas callback.

When the timeout…times out

So, time for a more interesting test case. This one covers the case where the $source-a fails to produce a value by the timeout. Then, $source-b produces a value within 1 second of being tapped – meaning its value should be relayed. We also want to ensure that even if $test-source-a were to emit a value a little later on, we’d disregard it. Here’s the test:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $test-source-a = supply {
        whenever Promise.in(4) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(1) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(3) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

We expect a 1 to be ignored, because we chose $source-b. So, how can we make this pass? Here goes:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my $emitted-value = False;

        whenever $source-a {
            $emitted-value = True;
            .emit;
        }

        whenever Promise.in($timeout) {
            unless $emitted-value {
                whenever $source-b {
                    .emit;
                }
            }
        }
    }
}

Will this pass the test? Both subtests?

Think about it…

Well, no, it won’t. Why? Because it doesn’t do anything about disregarding $source-a after it has started spitting out values from $source-b. It needs to commit to one or the other. Didn’t spot that? Good job we have tests! So, here’s a more complete solution that makes both subests pass:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed ;
        my $committed = None;

        whenever $source-a -> $value {
            given $committed {
                when None {
                    $committed = A;
                    emit $value;
                }
                when A {
                    emit $value;
                }
            }
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    $committed = B;
                    emit $value;
                }
            }
        }
    }
}

So tired of waiting

You’d think I’d be happy with this progress. Two passing test cases. Surely the end is in sight! Alas, development is getting…tedious. Yes, after just two test cases. Why? Here’s why:

$ time perl6-m -Ilib t/failover-bad.t
    ok 1 - Received first value from source A
    ok 2 - Received second value from source A
    1..2
ok 1 - When first Supply produces a value in time, second not used
    ok 1 - Received first value from source B
    ok 2 - Received second value from source B
    1..2
ok 2 - When timeout reached, second Supply is used instead if it produces value first
1..2

real    0m8.694s
user    0m0.600s
sys     0m0.072s

Every time I run my tests I’m waiting around 9 seconds now. And when I add more tests? Even longer! Now imagine I was going to write a whole suite of these failover and timeout routines, as a nice module. Or I was testing timeouts in a sizable app and would have dozens, even hundreds, of such tests.

Ouch.

Maybe, though, I could just make the timeouts smaller. Yes, that’ll do it! Here is how the second test looks now, for example:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $test-source-a = supply {
        whenever Promise.in(0.04) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(0.01) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(0.03) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 0.02);
    my $output = $failover-supply.Channel;

    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

You want it faster? Divide by 100! Job done.

Of course, anybody who has actually done this knows precisely what comes next. The first 3 times I ran my tests after this change, all was well. But guess what happened on the forth time?

ok 1 - When first Supply produces a value in time, second not used
    not ok 1 - Received first value from source B

    # Failed test 'Received first value from source B'
    # at t/failover-short.t line 41
    # expected: 'b 1'
    #      got: 'a 1'
    not ok 2 - Received second value from source B

    # Failed test 'Received second value from source B'
    # at t/failover-short.t line 42
    # expected: 'b 2'
    #      got: 'b 1'
    1..2
    # Looks like you failed 2 tests of 2
not ok 2 - When timeout reached, second Supply is used instead if it produces value first

Mysteriously…it failed. Why? Bad luck. My computer is a busy little machine. It can’t just give my test programs all the CPU all of the time. It needs to decode that music I’m listening to, check if I need to install the 10th set of security updates so far this month, and cope with my web browser wanting to do stuff because somebody tweeted something or emailed me. And so, once in a while, just after the clock hits 0.01 seconds and a thread grabs the whenever block to work on, that thread will be dragged off the CPU. Then, before it can get back on again, the one set to run at 0.04 seconds gets to go, and spits out its value first.

Sufficiently large times mean slow tests. Smaller values mean unreliable tests. Heck, suspend the computer in the middle of running the test suite and even a couple of seconds is too short for reliable tests.

Stop! Virtual time!

This is why I wrote Test::Scheduler. It’s an implementation of the Perl 6 Scheduler role that virtualizes time. Let’s go back to our test code and see if we can do better. First, I’ll import the module:

use Test::Scheduler;

Here’s the first test, modified to use Test::Scheduler:

subtest 'When first Supply produces a value in time, second not used', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(1) {
            emit 'a 1';
        }
        whenever Promise.in(3) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        die "Should never be used";
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(3);
    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

Perhaps the most striking thing is how much hasn’t changed. The changes amount to two additions:

  1. The creation of a Test::Scheduler instance and the assignment to the $*SCHEDULER variable. This dynamic variable is used to specify the current scheduler to use, and overriding it allows us to swap in a different one, much like you can declare a $*OUT to do stuff like capturing I/O.
  2. A line to advance the test scheduler by 3 seconds prior to the two assertions.

The changes for the second test are very similar:

subtest 'When timeout reached, second Supply is used instead if it produces value first', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(4) {
            emit 'a 1';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(1) { # start time + 2 (timeout) + 1
            emit 'b 1';
        }
        whenever Promise.in(3) { # start time + 2 (timeout) + 3
            emit 'b 2';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(6);
    is $output.receive, 'b 1', 'Received first value from source B';
    is $output.receive, 'b 2', 'Received second value from source B';
}

And what difference does this make to the runtime of my tests? Here we go:

$ time perl6-m -Ilib t/failover-good.t
    ok 1 - Received first value from source A
    ok 2 - Received second value from source A
    1..2
ok 1 - When first Supply produces a value in time, second not used
    ok 1 - Received first value from source B
    ok 2 - Received second value from source B
    1..2
ok 2 - When timeout reached, second Supply is used instead if it produces value first
1..2

real    0m0.679s
user    0m0.628s
sys     0m0.060s

From 9 seconds to sub-second – and much of that will be fixed overhead rather than the time running the tests.

One more test

Let’s deal with the final of the requirements, just to round off the test writing and get to a more complete solution to the original problem. The remaining test we need is for the case where the timeout is reached, and we tap $source-b. Then, before it can produce a value, $source-a emits its first value. Therefore, we should latch on to $source-a.

subtest 'When timeout reached, and second Supply tapped, first value still wins', {
    my $*SCHEDULER = Test::Scheduler.new;

    my $test-source-a = supply {
        whenever Promise.in(3) {
            emit 'a 1';
        }
        whenever Promise.in(4) {
            emit 'a 2';
        }
    }
    my $test-source-b = supply {
        whenever Promise.in(2) { # start time + 2 (timeout) + 2
            emit 'b 1';
        }
    }
    my $failover-supply = failover($test-source-a, $test-source-b, 2);
    my $output = $failover-supply.Channel;

    $*SCHEDULER.advance-by(4);
    is $output.receive, 'a 1', 'Received first value from source A';
    is $output.receive, 'a 2', 'Received second value from source A';
}

This fails, because the latch logic wasn’t included inside of the whenever block that subscribes to $source-b. Here’s the easy fix for that:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed ;
        my $committed = None;

        whenever $source-a -> $value {
            given $committed {
                when None {
                    $committed = A;
                    emit $value;
                }
                when A {
                    emit $value;
                }
            }
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    given $committed {
                        when None {
                            $committed = B;
                            emit $value;
                        }
                        when B {
                            emit $value;
                        }
                    }
                }
            }
        }
    }
}

The easy thing is just a little bit repetitive, however. It would be nice to factor out the commonality into a sub. Here goes:

sub failover(Supply $source-a, Supply $source-b, Real $timeout --> Supply) is export {
    supply {
        my enum Committed ;
        my $committed = None;

        sub latch($onto) {
            given $committed {
                when None {
                    $committed = $onto;
                    True
                }
                when $onto {
                    True
                }
            }
        }

        whenever $source-a -> $value {
            emit $value if latch(A);
        }

        whenever Promise.in($timeout) {
            if $committed == None {
                whenever $source-b -> $value {
                    emit $value if latch(B);
                }
            }
        }
    }
}

And in under a second, the tests can now assure us that this was indeed a successful refactor. Note that this does not yet cancel a discarded request, perhaps saving duplicate work. I’ll leave that as an exercise for the reader.

Safety and realism

One thing you might wonder about here is whether this code is really thread safe. The default Perl 6 scheduler will schedule code across a bunch of threads. What if $source-a and $source-b emit their first value almost simultaneously?

The answer is that supply (and react) blocks promise Actor-like semantics, processing just one message at a time. So, if we’re inside of the whenever block for $source-a, and $source-b emits a message on another thread, then it will be queued up for processing afterwards.

One interesting question that follows on from this is whether the test scheduler somehow serializes everything onto the test thread in order to do its job. The answer is that no, it doesn’t do that. It wraps the default ThreadPoolScheduler and always delegates to it to actually run code. This means that, just as with the real scheduler, the code will run across multiple threads and on the thread pool. This helps to avoid a couple of problems. Firstly, it means that testing code that relies on having real threads (by doing stuff that really blocks a thread) is possible. Secondly, it means that Test::Scheduler is less likely to hide real data race bugs that may exist in the code under test.

Of course, it’s important to keep in mind that virtual time is still very much a simulation of real time. It doesn’t account for the fact that running code takes time; virtual time stands still while code runs. At the same time, it goes to some amount of effort to get the right sequencing when a time-based event triggered in virtual time leads to additional time-based events being scheduled. For example, imagine we schedule E1 in 2s and E2 in 4s, and then advance the virtual time by 4s. If the triggering of E1 schedules E3 in 1s (so, 3s relative to the start point), we need to have it happen before E2. To have this work means trying to identify when all the consequences of E1 have been shaken out before proceeding (which is easier said than done). Doing this will, however, prevent some possible overlaps that could take place in real time.

In summary…

Unit tests for code involving time can easily end up being slow and/or unreliable. However, if we can virtualize time, it’s possible to write tests that are both fast and reliable – as good unit tests should be. The Test::Scheduler module provides a way to do this in Perl 6. At the same time, virtual time is not a simulation of the real thing. The usual rules apply: a good unit test suite will get you far, but don’t forget to have some integration tests too!

Day 16: The Meta spec, Distribution, and CompUnit::Repository explained-ish


This is a tool chain post. You have been warned.

In this post I’m going to explain the typical journey of a module from a request for a distribution (such as zef install My::Module) to the point you are able to perl6 -e 'use My::Module;'. Afterwards I’ll go over implementation details, and how they allow us to do cool things like use a .tar.gz archive or the github API for loading source code.

Most of this stuff is not documented yet, but the source code for these items is not difficult to grok and commented thoroughly. Don’t expect to understand how to actually do anything new after reading this. This is intended to give a high level description of the design considerations used and what they will allow you to do.

And I’ll be ignoring precompilation because that’s hard.

The Journey

We’ll start at zef install My::Module, but it could just as well be a URI or something like My::Module:ver<1.*>:auth<me@foo.com>. The module manager either proxies this request to an external recommendation manager (such as MetaCPAN) [1], or acts as the recommendation manager itself (grepping the current perl6 ecosystem package.json file).[2] The request returns a META6 hash for a matching distribution and likely includes some non-spec fields to hint at the download URI.

 ______                              ______________________
|client| 1==request My::Module====> |recommendation manager|
|______| <==META6 representation==2 |______________________|

The package manager then uses an appropriate Distribution object that understands the META6 fields and how to fetch the files it references. The Distribution will encapsulate all the behavior rakudo expects.

 ______                                     ___________________
|client| 3==META6 representation=========> |Distribution Lookup|
|______| <==Distribution implementation==4 |___________________|

The Distribution is passed to the rakudo core class CompUnit::Repository::Installation.install($dist) (CURI for short). CURI then saves all the files the Distribution represents to its own hierarchy and naming conventions.

 ______                                ______
|client| 5==.install(Distribution)==> | CURI |
|______|                              |______|

If you call .resolve($name) on CURI it will return a Distribution object that it creates from its own structure. .resolve($name) also has to decide what to do when multiple names match. In this way CompUnit::Repository acts as a basic recommendation manager.

 ______                                     ___________
|      | 1==.resolve('My::Module')=======> | CURI#site |
|client| <==Distribution implementation==2 |___________|
|      |                                   ____________ 
|      | 3==.install(Distribution)======> | CURI#home  |
|______|                                  |____________|

The Details

META6

{
    "name"        : "My::Module",
    "version"     : "0.001",
    "auth"        : "me@cpan.org",
    "provides"    : {
        "My::Module"      : "lib/My/Module.pm6",
        "My::Module::Foo" : "lib/My/Module/Foo.pm6",
    },
    "resources" : [
        "config.json",
        "scripts/clean.pl",
        "libraries/mylib",
    ],
}

In most cases a distributions meta data is stored in its root folder as META6.json, but where it comes from is irrelevant. We’ll only be interested in a few of the possible fields, which serve two different purposes:

  1. A “unique” identifierThe identifier is the full name of a distribution, and includes the name, version, and auth (there is also api but we’re ignoring this one). In this case its My::Module:ver<0.001>:auth<me@cpan.org> [3] If you were to install this distribution you could use it with:

    use My::Module:ver<0.001>:auth<me@cpan.org>; (although use My::Module; would probably suffice)

  2. File mapping
  • provides is a key/value mapping where the key is the package name and the value is a content-id (forward slash relative file path). The content-id, while being a file path, might not represent a file that exists yet.
  • resources is a list of resource content-ids included with your distribution, usually including all of the files in the resources/ directory. These files will be accessible by modules in your distribution via %?RESOURCES<$name-path>. In the example the first two items follow this pattern and would be resources/config.json and resources/scripts/clean.pl, but the last one is special. If the first path part of the content-id is “libraries/” then any path under it will have its name mangled to whatever naming convention rakudo thinks is right for the OS it is running on. This can be useful for distributions that compile/generate libs at build time and expect to be named a certain way; libraries/mylib.dll on windows, libraries/libmylib.so on linux, and libraries/libmylib.1.so on OSX. This allows you to reference this library in a (probably NativeCall) module as %?RESOURCES<mylib> instead of guessing if its %?RESOURCES<mylib.dll> or %?RESOURCES<libmylib.so>
  • files optional this would usually be populated automatically by the Distribution but i’ll mention it here because you can construct this manually. It is a key/value of $name-path => $content-id. These may or not be the same. Combined with provides this gives you a way to get a list of all content-ids that can be used with Distribution.content(...)
    # Before CURI.install - Usually generated by the Distribution itself
    "files: {
        "bin/my-script" => "bin/my-script",
        "resources/libraries/foolib" => "resources/libraries/libfoolib.so"
    }
    
    # After CURI.install
    "files: {
        "bin/my-script" => "SDfDFIHIUHuhfue9f3fJ930j",
        "resources/libraries/foolib" => "j98jf9fjFJLJFi3f.so"
    }
    

Distribution

role Distribution {
    method meta(--> Hash) {...}
    method content($content-id --> IO::Handle) {...}
}

Distribution is the IO interface a CompUnit::Repository uses. It only needs to implement two methods:

  • method meta Access to the meta data of a distribution. This does not have to be a local file:
  • method content Given a content-id such as lib/My/Module.pm or libraries/mylib return an IO::Handle from which the appropriate data can be read.

When you (or your module installer) pass a Distribution to CompUnit::Repository::Installation.install($dist) it will look at $dist.meta() to figure out all the content-ids it needs to install, and then calls $dist.content($content-id).slurp-rest to get the actual content. [4]

CompUnit::Repository

At the most basic level a CompUnit::Repository is used to store and/or lookup distributions. How the distribution is stored or loaded is up to the CompUnit::Repository.

CompUnit::Repository::Installation is unique among the core CompUnit::Repository classes in that is has an install method that takes a Distribution implementation and maps it to the file system. Currently this means changing all file names to a sha1 string. So it also returns its own implementation of Distribution that still allows us to access My::Module as $dist.content('lib/My/Module.pm'). It’s path-spec is inst#, so if you install the dependencies of a distribution to inst#local/ you could do one of:

  • perl6 -Iinst#local/ -Ilib -e 'use My::Module'
  • PERL6LIB=inst#local/ perl6 -Ilib -e 'use My::Module'

CompUnit::Repository::FileSystem on the other hand works with original path names, although it does not install – only loads and resolves identities. This is what gets used when you use a module found in -I mylib or use lib "mylib" (both short for file#mylib/). If a META6.json file is found it will use the provides field to map namespaces to paths, but if there is no META6.json file (such as when you start developing a module) it will try the usual perl 5 schematics of ($name =~ s{::}{/}g) . ".pm6.

CompUnit::Repository::AbsolutePath is different in that it represents a single module (and not an entire Distribution of modules), such as: require '/home/perl6/repos/my-module/lib/my/module.pm6'.

A gross oversimplification of the interface
role CompUnit::Repository {
    method id()  { ... }

    method path-spec() { ... } # file#, inst#, etc

    method need(CompUnit::DependencySpecification $spec, |c --> CompUnit) { ... }
    
    method load(|) { ... }
}

Cool stuff

Distribution implementations (non-core)

If you were to pass this to CompUnit::Repository::Installation.install($dist) it would make a http request for each source file (found in the META6 – also fetched with a http request) and save the content to the final installation path:

use Distribution::Common::Remote::Github;

my $github-dist = Distribution::Common::Remote::Github.new(
    user   => "zoffixznet",
    repo   => "perl6-CoreHackers-Sourcery", # XXX: No [missing] dependencies
    branch => "master",
);

say "Source code: " ~ $github-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($github-dist) ?? 0 !! 1;

Similarly you could pipe data from running a command such as tar: [5]

use Distribution::Common::Tar;
use Net::HTTP::GET;
use File::Temp;

my $distribution-uri       = 'https://github.com/zoffixznet/perl6-CoreHackers-Sourcery/archive/master.tar.gz';
my ($filepath,$filehandle) = tempfile("******", :unlink);
spurt $filepath, Net::HTTP::GET($distribution-uri).body;

my $tar-dist = Distribution::Common::Tar.new($filepath.IO);

say "Source code: " ~ $tar-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($tar-dist) ?? 0 !! 1;

Thats not to say you couldn’t just clone or untar the distribution and use the built-in Distribution::Path($path) – this simply makes other possibilities trivial to implement.

CompUnit::Repository implementations (non-core)

You can also make your own CompUnit::Repository, such as CompUnit::Repository::Tar.

use CompUnit::Repository::Tar;
use lib "CompUnit::Repository::Tar#perl6-repos/my-module.tar.gz";
use My::Module;

… which is very similar to CompUnit::Repository::FileSystem, but it uses Distribution::Common::Tar to interface with the distribution. This means you can reuse the loading code from core CompUnit::Repository::* modules with very little modification (and won’t be covered in this post).

It would not take too much effort to use Distribution::Common::Remote::Github as the Distribution interface used when loading/resolving, giving a CloudPAN-like way to load modules.


Some ideas for modules:

  • Distribution::Dpkg – Adapater for the dpkg package format
  • Distribution::Gist – Install a distribution from a gist, because…
  • CompUnit::Repository::IPFS – InterPlanetary File System content storage backend
  • CompUnit::Repository::Temp – This CUR will self destruct in…
  • CompUnit::Repository::Tar::FatPack – Read all dependencies for an application from a single tar archive

More reading

gpw2016 Stefan Seifert – A look behind the curtains – module loading in Perl 6

Synopsis 22: Distributions, Recommendations, Delivery and Installation (non-authoritative)

Slightly less basic Perl6 module management


  1. Transformations on the identity may need to be made before sending to a recommendation manager. MetaCPAN may not understand :ver<1.*>, but its only a matter of representing that as an elastic search parameter.
  2. The recommendation engine also gets to determine what to return for My::Module when it has both My::Module:auth<foo> and My::Module:auth<bar> indexed, so it may become a best practice to declare the auth when you use a module.
  3. It should be noted that auth does not tell you where a module should be downloaded from. For instance: My::Module:auth<github:me> does not mean “download My::Module from github.com/me” – its nothing more than an additional identifying part, which is why using an email address is a better example. That exact identity might be found on github, cpan, or a darkpan. Such recommendation managers could choose to only index distributions that use an auth it can verify.
  4. method content doesn’t actually constrain the return value to an IO::Handle but it does expect it to act like one. This was done so that a socket, while not an IO::Handle, could still be used with a thin wrapper allowing resources to be fetched at the moment they are to be installed:
  5. This is a lie, it actually extracts to a temporary file for files under resources/ but only because %?RESOURCES<...> has to return an IO::Path (instead of an IO::Handle). Without this constraint the temporary file is not needed.