Day 22 – Perl6 and CPAN

Here’s the short version.

Not yet; but efforts are underway. Stay tuned! News at http://pl6anet.org. And, as always, hit up #perl6 on freenode if you want to talk about it.

Please continue below if you’ve any interest in an overly detailed advent post.

First, I’d like to point out that the imminent Christmas Release is largely not concerned with the topic of this post. I’ll leave it to someone more qualified to qualify the release in more precise terms but suffice it to say its about the Perl6 language and at least one implementation. That does not include non-core ecosystem concerns such as: packaging, distribution, searching, installing, testing services, linters, etc… In the Perl5 world these things are collectively known as “CPAN” and are a huge part of what makes Perl5 useful to many.

The second item I’d like to bring to your attention is that we’ve had an ecosystem solution for quite some time now. Its basically a collection of repos hosted at github (https://github.com/perl6/ecosystem/blob/master/META.list), which can be searched (modules.perl6.org) and installed (panda or zef).
If you want to publish or use Perl6 modules now then this is the way to go. Its probably worth noting that versioning support is not fully implemented yet.

And now onto the future! What should a Perl6 ecosystem be? This is still an open question and an active area of experimentation. Should it be based on shiny things like github and travis? This could probably be made to work by adding versioning support and a few other things to the existing ecosystem. But then there’s the issue of being dependent on entities we don’t control. Should it be built from scratch? I’m aware of one example of that: http://cpan6.org/. Or should it be based on, or even built on top of, Perl5’s CPAN?

I decided a while back I wanted to explore the last of those options.
See http://jdv79.blogspot.com/2015/10/perl6-and-cpan.html and
http://jdv79.blogspot.com/2015/10/perl6-and-cpan-metacpan-status-as-of.html. Those two posts cover it pretty well actually.

Since then all I’ve done is fix Perl6 related bugs in MetaCPAN. Things like syntax highlighting, Pod6 rendering, and of course searching more or less work. One of the better working dists is
http://hack.p6c.org:5001/release/Data-Selector since that’s the one I did the most testing with. And here’s the full listing: http://hack.p6c.org:5001/author/JDV.

Also, just a few days ago, Ranguard decided to help move us forward a bit by basically uploading the ecosystem onto CPAN under the PSIXDISTS user. Unfortunately, or fortunately depending on how you look at it, this led to the discovery of a bug in PAUSE which is not yet fixed (https://github.com/andk/pause/issues/194). It’s probably best to wait until that’s fixed before uploading any Perl6 dists to CPAN.

In summary, we only have the beginnings of PAUSE and MetaCPAN support. Once we get the installers working we’ll have a useful Perl6 CPAN that we can begin to play with.  Or throw out – who knows:) Until then use the ecosystem at http://modules.perl6.org/ for all your Perl6 module needs.

UPDATE: I just noticed the aforementioned PAUSE bug may be fixed as of about an hour ago (~ 1am cet here). Andk++

Day 21 – NativeCall-backs and Beyond C

One of my favorite features in Perl 6 is the NativeCall interface, because it allows gluing virtually any native library into it relatively easily. There have even been efforts to interface with other scripting languages so that you can use their libraries as well.

There have already been a pair of advent posts on NativeCall already, one about the basics in 2010 and one about objectiness in 2011. So this one won’t repeat itself in that regard, and instead be about Native Callbacks and C++ libraries.

Callbacks

While C isn’t quite as good as Perl at passing around functions as data, it does let you pass around pointers to functions to use them as callbacks. It’s used extensively when dealing with event-like stuff, such as signals using signal(2).

In the NativeCall docs, there’s a short quip about callbacks. But they can’t be that easy, can they?

Let’s take the Expat XML library as an example, which we want to use to parse this riveting XML document:

<calendar>
    <advent day="21">
        <topic title="NativeCall Bits and Pieces"/>
    </advent>
</calendar>

The Expat XML parser takes callbacks that are called whenever it finds and opening or closing XML tag. You tell it which callbacks to use with the following function:

XML_SetElementHandler(XML_Parser parser,
                      void (*start)(void *userdata, char *name, char **attrs),
                      void (*end)(void* userdata, char *name));

It associates the given parser with two function pointers to the start and end tag handlers. Turning this into a Perl 6 NativeCall subroutine is straight-forward:

use NativeCall;

sub XML_SetElementHandler(OpaquePointer $parser,
                          &start (OpaquePointer, Str, CArray[Str]),
                          &end   (OpaquePointer, Str))
    is native('expat') { ... }

As you can see, the function pointers turn into arguments with the & sigil, followed by their signature. The space between the name and the signature is required, but you’ll get an awesome error message if you forget.

Now we’ll just define the callbacks to use, they’ll just print an indented tree of opening and closing tag names. We aren’t required to put types and names in the signature, just like in most of Perl 6, so we’ll just leave them out where we can:

my $depth = 0;

sub start-element($, $elem, $)
{
    say "open $elem".indent($depth * 4);
    ++$depth;
}

sub end-element($, $elem)
{
    --$depth;
    say "close $elem".indent($depth * 4);
}

Just wire it up with some regular NativeCallery:

sub XML_ParserCreate(Str --> OpaquePointer)               is native('expat') { ... }
sub XML_ParserFree(OpaquePointer)                         is native('expat') { ... }
sub XML_Parse(OpaquePointer, Buf, int32, int32 --> int32) is native('expat') { ... }

my $xml = q:to/XML/;
    <calendar>
        <advent day="21">
            <topic title="NativeCall Bits and Pieces"/>
        </advent>
    </calendar>
    XML

my $parser = XML_ParserCreate('UTF-8');
XML_SetElementHandler($parser, &start-element, &end-element);

my $buf = $xml.encode('UTF-8');
XML_Parse($parser, $buf, $buf.elems, 1);

XML_ParserFree($parser);

And magically, Expat will call our Perl 6 subroutines that will print the expected output:

open calendar
    open advent
        open topic
        close topic
    close advent
close calendar

So callbacks are pretty easy in the end. You can see a more involved example involving pretty-printing XML here.

C++

Trying to call into a C++ library isn’t as straight-forward as using C, even if you aren’t dealing with objects or anything fancy. Take this simple library we’ll call cpptest, which can holler a string to stdout:

#include <iostream>

void holler(const char* str)
{
    std::cout << str << "!\n";
}

When you try to unsuspectingly call this function with NativeCall:

sub holler(Str) is native('cpptest') { ... }
holler('Hello World');

You get a nasty error message like Cannot locate symbol 'holler' in native library 'cpptest.so'! Why can’t Perl see the function right in front of its face?

Well, C++ allows you to create multiple functions with the same name, but different parameters, kinda like multi in Perl 6. You can’t actually have identical names in a native library though, so the compiler instead mangles the function names into something that includes the argument and return types. Since I compiled the library with g++ -g, I can get the symbols back out of it:

$ nm cpptest.so | grep holler
0000000000000890 T _Z6hollerPKc

So somehow _Z6hollerPKc stands for “a function called holler that takes a const char* and returns void. Alright, so if we now tell NativeCall to use that weird gobbledegook as the function name instead:

sub holler(Str) is native('cpptest') is symbol('_Z6hollerPKc') { ... }

It works, and we get C++ hollering out Hello World!, as expected… if the libary was compiled with g++. The name mangling isn’t standardized in any way, and different compilers do produce different names. In Visual C++ for example, the name would be something like ?holler@@ZAX?BPDXZ instead.

The proper solution is to wrap your function like so:

extern "C"
{
    void holler(const char* str)
    {
        std::cout << str << "!\n";
    }
}

This will export the function name like C would as a non-multi function, which is standardized for all compilers. Now the original Perl 6 program above works correctly and hollers without needing strange symbol names.

You still can’t directly call into classes or objects like this, which you probably would want to do when you’re thinking about NativeCalling into C++, but wrapping the methods works just fine:

#include <vector>

extern "C"
{
    std::vector<int>* intvec_new() { return new std::vector<int>(); }
    void intvec_free(std::vector<int>* vec) { delete v; }
    // etc. pp.
}

There’s a more involved example again.

Some C++ libraries already provide a C wrapper like that, but in other cases you’ll have to write your own. Check out LibraryMake, which can help you compile native code in your Perl 6 modules. There’s also FFI::Platypus::Lang::CPP for Perl 5, which lets you do calls to C++ in a more direct fashion.

Update on 2015-12-22: as tleich points out in the comments, there is an is mangled attribute for mangling C++ function names. So you might be able to call the pure C++ function after all and have NativeCall mangle it for you like your compiler would do – if your compiler is g++ or Microsoft Visual C++:

sub holler(Str) is native('cpptest') is mangled { ... }
holler('Hello World');

It doesn’t seem to be working for me though and fails with a don't know how to mangle symbol error. I’ll amend this post again if I can get it running.

Update on 2015-12-23: the NativeCall API has changed (thanks to jczeus for pointing it out) and now automatically adds a lib prefix to library names. The code changed from is native('libexpat') to is native('expat'). It will also complain that a version should be added to the library name, but I don’t want to weld this code to an exact version of the used libraries.

Day 19 – Introspection

Perl 6 is an Object-Oriented Programming Language. Well, really, it’s multi-paradigm; but one of the options (that’s used throughout its core) is definitily Object Orientation.

However, as the (current) perl6.org tells us, Perl 6 supports “generics, roles and multiple dispatch”, which are some very nice features, and already covered in other advent calendar posts.

But the one we’re going to take a look at today is the MOP. “MOP” stands for “Meta-Object Protocol”. It means that, instead of objects, classes, etc defining the language; they’re actually a part you can change (or just look at) from the user’s side of things.

Indeed, in Perl 6, you can add methods to a type, remove some, wrap methods, augment classes with more capabilities (OO::Actors and OO::Monitors are two such examples), or you could totally redefine it (and, say, use a Ruby-like object system. Such an example here).

But today, instead, we’re gonna look at the first part: Introspection. Looking at a type after it’s been built, getting to know it, and use these informations.

The module we’re going to build together is based on a need for the Sixcheck module (a QuickCheck-like module): generate some random data for a type, then feed that data to the function we’re testing, and check some post-condition.

So, we write our first version:

my %special-cases{Mu} =
  (Int) => -> { (1..50).pick },
  (Str) => -> { ('a'..'z').pick(50).join('') },
;
sub generate-data(Mu:U \t) {
  %special-cases{t} ?? %special-cases{t}() !! t.new;
}
generate-data(Int);

okay, that’s the first version we wrote. We note a few things:

  • We specify the key type for %special-cases. That’s because, by default, the type is Str. Obviously, we don’t want our types to be stringified. What we actually do is specify they’re going to be subtypes of “Mu” (which is at the top of the types “food chain”)
  • We put parentheses around Int and Str, to avoid stringification.
  • We use the :U in the function’s argument’s type. That means the value has to be undefined. Type objects (as in Int, Str, etc) are undefined, so it serves us well (a different unknown value you’ve probably seen is Nil).
  • Type objects really are… Objects, like any other object (more on that later).
    That’s why we can call .new on them, like here. It’s the same as directly calling Int.new, for example (that’s useful for consistency, and autovivification).
  • We provide a fallback for Int and Str, because calling Int.new and Str.new (0 and “”) would not give us any randomness in the data we create
  • Perl 6 automatically returns the last expression in a function, so we didn’t have to put a return there.

That’s our code to generate the data, fair and square. But we’re gonna need to generate much more than those simple examples.

The least we need to support is classes with properties: we want to look at the list of attributes, generate data for their type, and feed them to the constructor (the only classes we support right now are those with a parameterless constructor).

We need to be able to look inside a class. What we’re going to reach for, in Perl 6 terms, is the Meta-Object Protocol (or MOP for short). First off, let’s define a sample data class for our, huh, blog (the author is sorry for such a lack of imagination).

class Article {
  has Str $.title;
  has Str $.content;
  has Int $.view-count;
}
# we can manually create instances this way:
Article.new(title => "Perl 6 Advent, Day 19",
            content => "Magic!",
            view-count => 0);

But we don’t want to create the article by hand. We want to pass the class Article to our generate-data function, and get an Article back (with random data inside). Let’s go back to our REPL…

> say Article.^attributes;
(Str $!title Str $!content Int $!view-count)
> say Article.^attributes[0].WHAT;
(Attribute)

If you’ve clicked on the MOP link, you shouldn’t be surprised that we get a 3-elements array. If you’re still surprised by the syntax, .^ is the meta-method call. What it means is that a.^b translates to a.HOW.b(a).

If we want to know what’s available to us, we could just ask (removing the anonymous ones):

> Attribute.^methods.grep(*.name ne '<anon>')
(compose apply_handles get_value set_value container readonly package inlined WHY set_why Str gist)
> Attribute.^attributes
Method 'gist' not found for invocant of class 'BOOTSTRAPATTR'

Whoops… Seems like this is a bit too meta. Thankfully, we can use a very nice property of Rakudo: a lot of it is written in Perl 6! To know what’s available to us, we can just look up the source:

#     has str $!name;
...
#     has Mu $!type;

We got the name for the key, and the type to generate the value. Let’s see…

> say Article.^attributes.map(*.name)
($!title $!content $!view-count)
> say Article.^attributes.map(*.type)
((Str) (Str) (Int))

Yep! Seems correct! (If you’re wondering why we get $! (private) twigils back, it’s because $. only means a getter method will be generated. The attribute itself is still private, and accessible in the class).

Now, the only thing we need to build is a loop…

my %args;
for Article.^attributes -> $attr {
 %args{$attr.name.substr(2)} = generate-data($attr.type);
}
say %args.perl;

This is an example of what could be printed:

{:content("muenglhaxrvykfdjzopqbtwisc"), :title("rfpjndgohmasuwkyzebixqtvcl"), :view-count(45)}

You get different results each time you run your code (I don’t think it’ll produce an article worth reading, however…).

Only thing left to do is pass them to Article‘s constructor:

say Article.new(|%args);

(the prefix pipe | allows us to pass %args as named arguments, instead of a single positional argument). Again, you should get something like this printed:

Article.new(title => "kyvphxqmejtuicrbsnfoldgzaw", content => "jqbtcyovxlngpwikdszfmeuahr", view-count => 26)

Yay! We managed to create an Article instance “blindly”, without knowing nothing special about Article at all. Our code can be used to generate data for any constructor that expects its class attributes to be passed. Done!

Before I go, I need to remind you of one things: with great power comes a lot of “WTF”. If you’re only taking a peek like we do here, and gathering information, it should be all fine. But don’t go around and change the internals of Rakudo’s classes, because no one knows what’d happen then :-). As always, have a moderate amount to fun introspecting.

PS: Left as an exercise to the reader is the recursive implementation, move the loop to generate-data, so that we could add a User $.author attribute to Article, and get this one constructed as well. Good luck!

Day 18 – Sized, Typed, Shaped

If you’ve used Perl 5, you probably know PDL (Perl Data Language), the library to manipulate highly efficient data structures.

In Perl 6, we do not (yet) have full-fledged support for PDL (or a PDL-like module), but we do have support for specific features: sized types, shaped variables (and typed arrays).

The first feature, sized types, are low-level types that are composed of a generic low-level type name, and the number of bytes they use. Such an example is int1, int64 (aka int on 64-bit machines), buf8 (a “normal” byte buffer). If you’re interesting in reading the spec, it’s all in S09.

The second feature, shaped variables (still in S09), allow us to specify the fixed size of an array:

my @dwarves[7];

This means the only available indices are 0 to 6 (inclusive). This is an example from the Perl 6’s REPL (Read-Eval-Print-Loop. The input lines start with “>”):

> my @dwarves[7];
[(Any) (Any) (Any) (Any) (Any) (Any) (Any)]
> @dwarves[0] = 'Sleepy';
Sleepy
> @dwarves[7];
Index 7 for dimension 1 out of range (must be 0..6)

The last line of input mentions “dimension 1”. Indeed, Perl 6’s shaped variables can be multi-dimensional (see S09):

> my @ints[4;2];
[[(Any) (Any)] [(Any) (Any)] [(Any) (Any)] [(Any) (Any)]]
> @ints[4;3]
Index 3 for dimension 2 out of range (must be 0..1)
> @ints[4;1]
Index 4 for dimension 1 out of range (must be 0..3)
> @ints[0;0] = 1;
1
> @ints
[[1 (Any)] [(Any) (Any)] [(Any) (Any)] [(Any) (Any)]]

The first output shows us that Perl 6 filled our array with Any at first: this means the whole array is “empty”, and that it can store anything (since Any is a supertype of IntStr, etc).

We might not want that: our array should contain integers (Int). Instead, we can declare a typed array:

> my Int @a[8] = 0..7;
[0 1 2 3 4 5 6 7]
> @a[8] = 8;
Index 8 for dimension 1 out of range (must be 0..7)
> @a[0].WHAT.say # print the type
(Int)

This is all very useful in itself, but there’s a special property we can benefit from, by using everything presented in this post together: contiguous storage!

Natively-typed shaped arrays in Perl 6 are specced to take contiguous memory, meaning we get a very good memory layout, and use little memory

> my int @a[2;4] = 1..4, 5..8;
[[1 2 3 4] [5 6 7 8]]
> @a[0;0]++;
2
> @a.WHAT.say
(array[int])

Because we know how much memory each element takes, and how many elements we have, we can very easily calculate the array upfront, once and for all – or even calculate the size necessary to store the elements: we know that int, on our 64-bit computers (that’s int64), we need 2*4*8 bytes for our 2*4 array.

Wrapping up; that’s 64 bytes total (…almost), and we get to still write Perl 6! We can use the operations we know and love (even meta/hyper-operators) and al – they’re still arrays – with our cool performance boost. Amazing!

(note about the size: we need to add some overhead for the runtime; on MoarVM, that’s something like 16 bytes for GC header, 16 bytes for the dimensions list. That’s a grand total of 16+16+64=96 bytes. pretty cool!)

Day 16 – Yak Shaving for Fun and Profit (or How I Learned to Stop Worry and Love Perl 6)

History

A little over a year and a half ago I decided to write a radio station management application in Perl 5, I won’t bore you with the detailed reasons but it involved hubris and not liking having to modify an existing application that was written in multiple languages that I didn’t enjoy working with.   Most of the required parts were available on CPAN and the requirements were clear, so it was going along, albeit slowly.

Anyway in the early part of this year a couple of things coincided to make me consider that there would be more value in making the application in Perl 6:  the probability of a release before Christmas 2015 tied in with my original estimate that it would take approximately a year to finish the application, the features of Perl 6 that I  was aware of would lead to a nice neat design, and frankly it seemed like a cool idea to get in there with a large, possibly useful, application right as Perl 6 was beginning to enter the mainstream. I largely stopped working on the Perl 5 version and started looking at what I needed to be able to make it in Perl 6.  Of course this would prove to be even more hubristic and deluded than the original idea, but I didn’t know that at the time.

Entering the Yak Farm

It was immediately clear that I was going to have to write a lot more code for myself: at the turn of the year there were approximately 275 distributions in the Perl 6 modules ecosystem, whereas there are somewhere in the region of 30,000 modules on CPAN going back some twenty years. There were bits and pieces that I was going to need: Database access was coming along, there was a RabbitMQ client, people were working on web toolkits so some of the heavy lifting was already being worked on.

Having determined that I was going to have to write some modules it seemed that porting some of my existing CPAN modules might be a good way of getting up to speed and doing something useful into the bargain (though I’m not anticipating using any of them in the larger project.)

I thought I’d start with Linux::Cpuinfo because it was a fairly simple proposition on the face of it and I haven’t been happy with the AUTOLOAD that the Perl 5 version uses since a few months after I wrote it nearly fifteen years ago.

Straight into MOP Land

As it turns out losing the AUTOLOAD in Perl 6 was a fairly simple proposition, infact it could be replaced directly with a method called  “FALLBACK” in the class which gets called with the name of the required method as the first argument thus not requiring the not quite so nice global variable in Perl 5.  But it seemed nicer to take advantage of Perl 6’s Meta Object Protocol (MOP) and build a class based on the fields found in the /proc/cpuinfo on the fly, so I ended up with something like:

multi method cpu_class() {
    if not $!cpu_class.isa(Linux::Cpuinfo::Cpu) {
         my $class_name = 'Linux::Cpuinfo::Cpu::' ~ $!arch.tc;
         $!cpu_class := Metamodel::ClassHOW.new_type(name => $class_name);
         $!cpu_class.^add_parent(Linux::Cpuinfo::Cpu);
         $!cpu_class.^compose;
    }
    $!cpu_class;
}

And then add the fields from the data with:

submethod BUILD(:%!fields ) {
     for %!fields.keys -> $field {
         if not self.can($field) {
             self.^add_method($field, { %!fields{$field} } );
         }
     }
}

In the parent class of the newly made class, works really nicely. This is all going swimmingly, you can construct and manipulate classes using a documented interface without any external dependencies.

Losing the XS

The fact that the Perl 6 NativeCall interface allows you to bind functions defined in a dynamic library directly without requiring any external code didn’t seem to really help with the next couple of modules I chose to look at (Sys::Utmp and Sys::Lastlog) as the majority of the code in the Perl 5 XS files is actually dealing with the differences in various operating systems ideas of the respective interfaces as much as providing the XSUBs to be used by the Perl code.  As it happens this isn’t really so much of a problem as, with all the XSisms stripped out, the code can be compiled and linked to a dynamic library that can be used via NativeCall, even better someone had already made  LibraryMake that makes integrating all this into the standard build mechanisms really quite easy.  Infact it all proved so easy I made both of those modules in a few days rather than just the one that I had intended to do in the first place.

Anyhow by this point it was probably time to start on something that I might need in order to make the radio software, so I settled on Audio::Sndfile – it seemed generally useful and libsndfile is robust and well understood. And this is where the yak shaving really began to kick in. The module itself was really quite easy thanks to NativeCall despite the size of the interface and a subsequently fixed bug in the native arrays, but it occurred to me that automated testing of the module would be somewhat compromised if the dynamic library it depends on is not installed.

Testing Times

In order to facilitate the nicer testing of modules that depend on optionally installed dynamic libraries, I made LibraryCheck, though to be honest it was so easy I was surprised that no-one had made it before.  It exports a single subroutine that returns a boolean to indicate whether the specified library can be loaded or not, so in a test you might do something like:

use LibraryCheck;
if !library-exists('libsndfile') {
    skip "Won't test because no libsndfile", 10;
    exit;
}

As an aside you’ll notice that the extension to the dynamic library isn’t specified because NativeCall works that out for you (be it “.so”, “.dll”, “.dylib” or whatever.)

I’ve actually taken to putting the check in a Build.pm (which is used by panda if present to take user specified actions as part of the build):

class Build is Panda::Builder {
    method build($workdir) {
        if !library-exists('libsndfile') {
            say "Won't build because no libsndfile";
            die "You need to have libsndfile installed";
        }
        True;
     }
}

This has the effect of aborting the build before the tests are attempted, which for automated testing has the benefit of not showing false positives where the tests could not be attempted.

Of course a radio software doesn’t only need to read audio files, it really also should be able to stream audio out to listeners, so I next decided to make Audio::Libshout which binds the streaming source client library for Icecast libshout.  Not only does the testing of this depend on the dynamic library, but also requires the network service supplied by Icecast, so it would be nice to check whether the service was actually available before performing some of the tests.  So to this end I made CheckSocket which does exactly that using what is actually a fairly common pattern in tests for network services. It can be used in a similar fashion to LibraryCheck:

use CheckSocket;
if not check-socket($port, $host) {
    diag "not performing live tests as no icecast server";
    skip-rest "no icecast server";
    exit;
}

I’ve subsequently added it to at least one other author’s tests, it nicely encapsulates a pattern which would otherwise be ten or so lines of boilerplate that would have to be copied and pasted into each test file.

Enter the Gates of Trait

A feature of the implementation of libshout is that it has an initialisation function that returns an opaque “handle” pointer, that gets passed to the other functions of the library, in a sort of object oriented fashion, additionally it provides getter and setter functions for all of the parameters, these would best be modelled as read/write accessors in a Perl 6 class, but there would be a lot of boilerplate code to write these out by hand.  Having looked at a number of similar libraries I concluded that this may be a common pattern, so I wrote AccessorFacade to encapsulate the pattern.

AccessorFacade is implemented as a Perl 6 trait (I won’t explain “trait” here as a previous advent post has already done that,)  it allowed me to turn:

     sub shout_set_host(Shout, Str) returns int32 is native('libshout') { * }
     sub shout_get_host(Shout) returns Str is native('libshout') { * }

     method host() is rw {
         Proxy.new(
                    FETCH => sub ($) {
                                       shout_get_host(self);
                    },
                    STORE => sub ($, $host is copy ) {
                        explicitly-manage($host);
                        shout_set_host(self, $host);
                    }
         );
     }

Of which there may  be over a dozen or so, into:

sub shout_set_host(Shout, Str) returns int32 is native('libshout') { * } 
sub shout_get_host(Shout) returns Str is native('libshout') { * }

method host() is rw is accessor-facade(&shout_set_host, &shout_get_host) { }

Thus saving tens of lines of boilerplate, copy and paste mistakes and simplified testing. It probably took me longer to write AccessorFacade nicely after I had worked out how to do traits, than I ended up doing for Audio::Libshout. Which is a result, as next up I decided I needed to write Audio::Encode::LameMP3 in order to stream audio data that wasn’t already MP3 encoded, and it transpired that the mp3lame library also used the getter/setter pattern that AccessorFacade targetted, having that library enabled me to finish it even quicker than anticipated.

Up to my Ears in Yaks

Digital audio data is typically represented by large arrays of  numbers and with the native bindings there is a lot of creating native CArrays of the correct size, copying Perl arrays to native arrays and copying native arrays to Perl arrays and so forth, this was clearly a generally useful pattern so I created NativeHelpers::Array to collect all the common use cases, refactored all the audio modules to use the subroutines it exports and found myself in the position of having written more modules to help me make the modules that I wanted to write than the modules I actually wanted to write.  So in order to get things in balance I wrote Audio::Convert::Samplerate that used some of the helpers above.

Around this time I decided that I probably needed to concentrate on some application infrastructure requirements rather than domain specific things if I wanted to get anywhere with the original plan and started on a logging framework that I had been thinking about for a while.  I immediately concluded that I needed a singleton logger object so I wrote Staticish.

Staticish is implemented as a class trait that basically does two things: it adds a role to the class that gives it a singleton constructor that will always only return the same object (which is created the first time it is called,) and applies a role to the classes MetaClass (the .HOW,) which will cause the methods (and public accessors) of the class to be wrapped such that if the method is called on the type object of the class the singleton object will be obtained from the constructor and it will be called on that instead, so you can do something like:

use Staticish;

class Foo is Static {
    has Str $.bar is rw;
}

Foo.bar = "There you go";
say Foo.bar; # > "There you go";

It does exactly what I need it to, but I still haven’t finished that logging framework.

All at Sea with JSON

Earlier in the year I had started working on a Couch DB interface module with a view to using a document rather than a relational database in the application which is part of the reason I had been helping to make the HTTP client modules do some of the things that were needed, but another part, for me at least, was the ability to round-trip a Perl 6 class marshalled to JSON and back again, or vice versa. Half of that already existed with JSON::Unmarshal so I proceeded to make JSON::Marshal to do the opposite and then JSON::Class which provides a role such that a class has  to-json and from-json methods, all so good so far and you can do:

 use JSON::Class;

 class Something does JSON::Class {
     has Str $.foo;
 }

 my Something $something = Something.from-json('{ "foo" : "stuff" }');
 my Str $json = $something.to-json(); # -> '{ "foo" : "stuff" }'

But it needed some real world application to test it in, and fortunately this presented itself in the Perl 6 ecosystem itself: the META6.json that is crucial to the proper installation of a module. It is quite easy to mis-type if you are manually creating JSON, so META6 which can read and write the META6 files using JSON::Class and Test::META which can provide a sanity test of the meta file seemed like a good idea. Almost inevitably JSON::Unmarshal and JSON::Marshal needed to be revisited to allow custom marshallers/un-marshallers to be defined, (using traits, natch,) for certain attributes.

I had actually already started making JSON::Infer a year ago in Perl 5 for reasons that I could make an entire post about, but having already started down the JSON path it seemed easy to finish in Perl 6 and use JSON::Class in the created classes to allow the creation of JSON web service API classes easily: unfortunately JSON::Class (or rather its depencies) didn’t even survive the encounter with the test data, so I made JSON::Name so JSON object attribute who’s name wasn’t a valid Perl 6 identifier could be marshalled or unmarshalled correctly.  All fixed up this gave me the impetus to finish the WebService::Soundcloud which I had been playing with for a year or so.  I still haven’t finished the CouchDB interface.

Nowhere near to Land

So now toward the end of the year, there are 478 modules in the ecosystem and somehow I have made 29 of them and I haven’t even started to make the application that I set out to make at the beginning of the year, but I’m happy with that as hopefully I’ve made some modules that may be useful to someone else, I’ve become a confident Perl 6 programmer and the things I’ve learned have helped improve the documentation and possibly some of the code. I will finish the radio software in Perl 6 and I still know I have a lot of software to write but it becomes easier every day.  So if you’re thinking of making an application in Perl 6 yourself enjoy the journey more than you may be frustrated by not reaching the destination when you expected as you are probably travelling a path along which few have been before.

 

Day 15 – 2015 The Year of The Great List Refactor

The Great List Refactor (GLR) was probably one of the most tricky and problematic changes in terms of design and implementation that Perl 6 has undergone recently.

It’s also fairly hard to explain! But hopefully some historical background might help with this. There was much GLR discussion at the 2014 Austrian Perl Workshop which was written up by Patrick Michaud on his blog.

The problems the GLR was intended to address were those of performance and consistency in the way lists and related types operated. It was also clear that changing such basic data types would be painful.

Traditionally Perl 5 flattens such that

% perl -dE 1 -MData::Dumper

[...snip...]

DB<1> @foos=((1,2),3)

DB<2> say Dumper \@foos
$VAR1 = [
         1,
         2,
         3
        ];

and initially much Perl 6 behaviour resembled this but by the end of last year there had been increased use of non-flattening behaviour which preserves the original data structure.

However this was not consistent with many edge cases and there was use of a data type Parcel which was heavily used internally in Rakudo but by then regarded as a Bad Idea – mainly for reasons of performance.

A draft twin of the design document covering lists “Synopsis 7” was imported by Patrick in June 2015 and became the official S07 the following month.

August was a very busy month for the GLR.  Jonathan Worthington experimented with a single gist based GLR code fragment which then became a separate GLR git branch under the Rakudo repository.  Many people worked on this branch with Stefan Seifert (nine) being particularly productive.  For a while the IRC channel had two bots camelia and GLRelia (the latter tracking the GLR branch) so people could quickly try and compare the behaviour of code under the old and new systems.  There were a huge number of changes and software such as panda had to be patched to remain functional.

The Parcel data type became List (which was immutable and used round brackets) and Array was a mutable subclass of List which used square brackets.  There was no longer implicit list flattening for arrays.  Simple arrays could be flattened with the .flat method.

Pre GLR

> my @array = 1,(2,3),4

1 2 3 4

> @array.elems

4

Post GLR

my @array = 1,(2,3),4

[1 (2 3) 4]

> @array.elems

3

> my @array = (1,(2,3),4).flat

[1 2 3 4]

It was also possible to “Slip” a list into an array

> my @a = 1, (2, 3).Slip, 4

[1 2 3 4]

Also

> my @a = 1, |(2, 3), 4
[1 2 3 4]

Sequences (to be consumed once only) were introduced

> my $grep = (1..4).grep(*>2)
(3 4)
> $grep.^name
Seq

and the .cache method can be used to prevent consumption of the Seq

The Single Argument Rule

The arguments passed to iterators such as the for loop obey “the single arg rule” which is that they are treated as a single arg even if they appear multiple arguments at first glance. Generally this makes for behave as the programmer would expect apart from one Gotcha example with a trailing comma.

my @a=1,2,3;

my ($i,$j);
for (@a) {
    $i++;
}

for (@a,) { # this is actually a single element list!
    $j++;
}

say :$i.gist; # => 3
say :$j.gist; # => 1

S07 was then rewritten by Jonathan Worthington in September 2015. The net result was by then S07 had under gone many changes eg. Parcel was removed, reintroduced and finally removed again!

https://github.com/perl6/specs/blob/master/S07-lists.pod

For such a radical change to Perl 6 the GLR went remarkably smoothly within Rakudo itself.  But of course much existing Perl 6 outside of Rakudo in the ecosystem and other places had to be rewritten in order to continue to work. An example is "perl6-examples" which seem to use lists and arrays particularly heavily.

For many months the GLR had been the Bugbear of Perl 6. A bugbear which has now happily departed.

Day 14 – A nice supplies: syntactic relief for working with asynchronous data

Asynchronous data is all around us. What is it? It’s data that arrives whenever it pleases, as opposed to when we ask for it. Some common examples are:

  • GUI events
  • Data arriving over an async socket
  • Messages arriving from a message queue
  • Ticks of a timer
  • File change notifications
  • Signals

These contrast with synchronous data, which we get when we ask for it. For example, when we query a database and iterate the resulting rows, or iterate through file system entries in a directory, we block until the data we wish to have is available.

Our programming languages tend to be pretty good about synchronous data, giving us a range of syntactic constructs for producing and processing it. For example, in Perl 6 we have for loops for consuming streams of synchronously produced values, and gather blocks for producing them – perhaps doing so by in turn using for loops to consume other synchronous data sources. Inside for and gather, we’re free to use state (kept in variables), and use our comfortable range of conditionals (if, when, etc.) That is to say, we’re free to act like normal, down to earth, imperative programmers in dealing with our synchronous data. Sure, sometimes something just looks far nicer using the functional style, and we map, grep, zip, and reduce our way to the solution. But some problems are just unnatural – at least for most of us, me included – to see in that way.

In Perl 6, we have a common interface for talking about things that produce values over time – that is, synchronous data. These things do the Iterable role, and work in terms of Iterator objects. You never really see an Iterator in day to day Perl 6, and instead see the Seq type. These Seqs are the things gather blocks produce, and for loops can consume.

In the last years, we introduced supplies to Perl 6. Supplies are our common interface for talking about asynchronous data. These have, from the beginning, been rather well received. With supplies, you can grep UI events, map file system notifications, and zip whatever you fancied with ticks of a timer. And that’s just great when it’s easy to think about your problem in functional terms. But what about the rest of the time?

In the last year, we’ve also added supply/react blocks, along with the whenever asynchronous looping construct. In this post, we’ll take a look at how they might be used.

STOMP!

As I wrote the list of sources of asynchronous data at the start of this post, I realized that there was only one of them that I’d not yet done in Perl 6: working with message queues. Hmm. Two hours until midnight. Advent post needs completing before I sleep. Challenge accepted!

10 minutes later, and I’ve RabbitMQ happily installed and am reading the STOMP specification. STOMP is a simple text-oriented message protocol – that is, a text-based way to deal with message queues. Within an hour, I’d got a working STOMP client handling the absolute basics of sending messages to a queue and subscribing to a queue to get messages. So, let’s take a look at how I did it.

First, I sketched in a Stomp::Client class. I figured it would need the host and port we want to connect to, the connection credentials (yes, yes, this is not the height of security engineering), and something to hold the socket representing the connection to the server.

class Stomp::Client {
    has Str $.host is required;
    has Int $.port is required;
    has Str $.login = 'guest';
    has Str $.password = 'guest';
    has $!connection;
}

So, let’s get connected. The connect method will contain a start block, because we want it to function asynchronously. This means the method will return a Promise that the caller can await. To connect to a stomp server, you establish a TCP connection and send a CONNECT frame:

method connect() {
    start {
        my $conn = await IO::Socket::Async.connect($!host, $!port);
        await $conn.print: qq:to/FRAME/;
            CONNECT
            accept-version:1.2
            login:$!login
            passcode:$!password
            
            \0
            FRAME
        True
    }
}

But…then what? Well, then you wait for the server to either return a CONNECTED frame or an ERROR frame. Hm, seems we’ve some work to do. First is to translate the BNF from the STOMP specification into a Perl 6 grammar:

my grammar StompMessage {
    token TOP {
        <command> \n
        [<header> \n]*
        \n
        <body>
        \n*
    }
    token command {
        < CONNECTED MESSAGE RECEIPT ERROR >
    }
    token header {
        <header-name> ":" <header-value>
    }
    token header-name {
        <-[:\r\n]>+
    }
    token header-value {
        <-[:\r\n]>*
    }
    token body {
        <-[\x0]>* )> \x0
    }
}

Note it’s lexically scoped inside of our STOMP::Client class. So is the following little message data structure:

my class Message {
    has $.command;
    has %.headers;
    has $.body;
}

An IO::Socket::Async connection exposes incoming data as a Supply. There are a few things to consider:

  • A message may be spread over multiple packets
  • Two messages may be in one packet
  • The next packet may arrive while we’re processing the current one, and since Perl 6 delivers I/O notifications on the thread pool – like various other languages – we might worry about data races

In fact, the third point is a non-problem: Perl 6 promises that, unless you go out of your way to avoid it, supplies are serial, and will only have you processing one message at a time on a given message pipeline. Supplies are a tool for taming concurrency rather than introducing it.

Now, if I was writing this synchronously, and I wanted to expose a Seq of the incoming messages, I’d use a gather block, keep myself a buffer, recv each incoming chunk of data from the socket, and try to parse a new message at the start of the buffer. Whenever I got a message I’d take it. What about with supplies? Here’s how it looks:

method !process-messages($incoming) {
    supply {
        my $buffer = '';
        whenever $incoming -> $data {
            $buffer ~= $data;
            while StompMessage.subparse($buffer) -> $/ {
                $buffer .= substr($/.chars);
                if $<command> eq 'ERROR' {
                    die ~$<body>;
                }
                else {
                    emit Message.new(
                        command => ~$<command>,
                        headers => $<header>
                            .map({ ~.<header-name> => ~.<header-value> })
                            .hash,
                        body => ~$<body>
                    );
                }
            }
        }
    }
}

Taking it from the top, $incoming will be the Supply of data arriving from the socket, decoded to strings. We wrap the code in this method up in a supply block, since we’d like to return a new Supply. We declare $buffer, to hold data we’ve yet to parse. In this case, our thread safety is fairly trivial: $incoming will deliver a message at a time. But what if we were going to bring together data from many supplies? We’d still be fine: a supply block promises that, per tapping of the Supply, only one thread will ever be allowed in the code inside of the supply block at a time.

The whenever construct is a asynchronous loop. It taps the supply we specify, and the loop body will run whenever a value is emitted by that supply. In our case, that is every time we get data from the socket. We concatenate the incoming $data onto the $buffer. We then try to parse a message at the start of the buffer (subparse means that we don’t mind if we don’t reach the end of the buffer, unlike parse which will complain if it doesn’t completely parse the incoming string). When we do parse data, we lop it off the start of the buffer.

For each message we parse, we first see if it’s an ERROR frame; if so, we simply die with the message. A die inside a whenever block will propagate the error to whatever tapped the supply, so errors will not be lost. Otherwise, we’ve some kind of more interesting message; we turn the parsed data into a Message instance and emit it.

A supply block gives an “on-demand” supply. Each tap performed on the block will run the supply block, which in turn will tap each of the whenevers. That’s the right thing in most cases, but for my Stomp::Client I’d like to have various bits of code tap the same messages, to look for interesting things. One of these bits of code will look for a single CONNECTED message. So, I’ll add a $!incoming attribute to my class:

has $!incoming;

And then, after establishing the connection to the message queue server, do this:

$!incoming = self!process-messages($conn.Supply).share;

That is, I share the result of processing messages out amongst everything that is interested. And interested things will tap the $!incoming supply and filter out what they care about. For example, here’s how I create a Promise that will be kept when we receive a CONNECTED frame:

my $connected = $!incoming
    .grep(*.command eq 'CONNECTED')
    .head(1)
    .Promise;

Here is the connect method in full:

method connect() {
    start {
        my $conn = await IO::Socket::Async.connect($!host, $!port);
        $!incoming = self!process-messages($conn.Supply).share;
        
        my $connected = $!incoming
            .grep(*.command eq 'CONNECTED')
            .head(1)
            .Promise;
        await $conn.print: qq:to/FRAME/;
            CONNECT
            accept-version:1.2
            login:$!login
            passcode:$!password
            
            \0
            FRAME
        await $connected;
        $!connection = $conn;
        
        True
    }
}

Sending messages is easy:

method send($topic, $body) {
    self!ensure-connected;
    $!connection.print: qq:to/FRAME/;
        SEND
        destination:/queue/$topic
        content-type:text/plain
 
        $body\0
        FRAME
}

And subscription is implemented using another supply block, and a bit of simple filtering on the incoming message:

method subscribe($topic) {
    self!ensure-connected;
    state $next-id = 0;
    supply {
        my $id = $next-id++;
        
        $!connection.print: qq:to/FRAME/;
            SUBSCRIBE
            destination:/queue/$topic
            id:$id
 
            \0
            FRAME
        
        whenever $!incoming {
            if .command eq 'MESSAGE' && .headers<subscription> == $id {
                emit .body;
            }
        }
    }
}

And with that, we have a working STOMP client. Let’s try it:

my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
await $queue.connect();
await $queue.send("test", "hello world");
react {
    whenever $queue.subscribe('test') {
        .say;
    }
}

Here, for the first time, we encounter the react block. What’s that? Well, if you imagine that supply blocks are a little like gather – that is, used to create something that can produce values – a react block is for the places you might normally use a for loop when processing synchronous data. However, in the asynchronous world you’re usually interested in a range of things that might happen. A react block runs, sets up all the whenevers, and then blocks until either all of the supplies tapped by whenever blocks are done, or the “done” control operator is used somewhere inside of react.

A little example

Suppose that we wanted to build a simple monitoring system. Workers “sign on”, and must send us a ping every 10 seconds. If they fail to do so, then a monitor will report this. Workers can sign off when finished. Here is the worker process:

sub MAIN($name) {
    my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
    await $queue.connect();
    $queue.send('signon', $name);
    react {
        whenever Supply.interval(10, 5) {
            if <y y n>.pick eq 'y' {
                $queue.send('ping', $name);
            }
        }
    
        whenever signal(SIGINT) {
            $queue.send('signoff', $name);
            done;
        }
    }
}

We start it from the command line with a name. Every 10 seconds it sends a ping – or at least, two thirds of the time it will. And if we hit Ctrl+C the worker signs off.

Next, here’s the monitor:

my $queue = Stomp::Client.new(host => 'localhost', port => 61613);
await $queue.connect();
react {
    my %last-active;
 
    whenever $queue.subscribe('signon') -> $name {
        %last-active{$name} = now;
    }
  
    whenever $queue.subscribe('ping') -> $name {
        %last-active{$name} = now;
    }
  
    whenever $queue.subscribe('signoff') -> $name {
        %last-active{$name}:delete;
    }
 
    whenever Supply.interval(10) {
        for %last-active.kv -> $name, $last-ping {
            if now - $last-ping > 11 {
                say "No ping from $name";
            }
        }
    }
}

Note that the react block, like supply block, enforces that only a single thread at a time may be operating across the various whenever blocks, so the %last-active hash is safe. We have three subscriptions to queues, and every ten seconds look for missing pings. We give an extra second’s grace on those.

Notice how, by exposing the message queue in terms of supplies, we are able to effortlessly compose this asynchronous data with both signals and time. This is where the real power of a uniform interface to asynchronous data comes in. And, with a little syntactic support, we are no longer required to put our functional hat on to wield that power.