Day 22 – Perl6 and CPAN

Here’s the short version.

Not yet; but efforts are underway. Stay tuned! News at http://pl6anet.org. And, as always, hit up #perl6 on freenode if you want to talk about it.

Please continue below if you’ve any interest in an overly detailed advent post.

First, I’d like to point out that the imminent Christmas Release is largely not concerned with the topic of this post. I’ll leave it to someone more qualified to qualify the release in more precise terms, but suffice it to say it’s about the Perl6 language and at least one implementation. That does not include non-core ecosystem concerns such as packaging, distribution, searching, installing, testing services, linters, etc… In the Perl5 world these things are collectively known as “CPAN” and are a huge part of what makes Perl5 useful to many.

The second item I’d like to bring to your attention is that we’ve had an ecosystem solution for quite some time now. It’s basically a collection of repos hosted at github (https://github.com/perl6/ecosystem/blob/master/META.list), which can be searched (modules.perl6.org) and installed (panda or zef).
If you want to publish or use Perl6 modules now then this is the way to go. It’s probably worth noting that versioning support is not fully implemented yet.

And now onto the future! What should a Perl6 ecosystem be? This is still an open question and an active area of experimentation. Should it be based on shiny things like github and travis? This could probably be made to work by adding versioning support and a few other things to the existing ecosystem. But then there’s the issue of being dependent on entities we don’t control. Should it be built from scratch? I’m aware of one example of that: http://cpan6.org/. Or should it be based on, or even built on top of, Perl5’s CPAN?

I decided a while back I wanted to explore the last of those options.
See http://jdv79.blogspot.com/2015/10/perl6-and-cpan.html and
http://jdv79.blogspot.com/2015/10/perl6-and-cpan-metacpan-status-as-of.html. Those two posts cover it pretty well actually.

Since then all I’ve done is fix Perl6-related bugs in MetaCPAN. Things like syntax highlighting, Pod6 rendering, and of course searching more or less work. One of the better working dists is
http://hack.p6c.org:5001/release/Data-Selector since that’s the one I did the most testing with. And here’s the full listing: http://hack.p6c.org:5001/author/JDV.

Also, just a few days ago, Ranguard decided to help move us forward a bit by basically uploading the ecosystem onto CPAN under the PSIXDISTS user. Unfortunately, or fortunately depending on how you look at it, this led to the discovery of a bug in PAUSE which is not yet fixed (https://github.com/andk/pause/issues/194). It’s probably best to wait until that’s fixed before uploading any Perl6 dists to CPAN.

In summary, we only have the beginnings of PAUSE and MetaCPAN support. Once we get the installers working we’ll have a useful Perl6 CPAN that we can begin to play with. Or throw out – who knows:) Until then, use the ecosystem at http://modules.perl6.org/ for all your Perl6 module needs.

UPDATE: I just noticed the aforementioned PAUSE bug may be fixed as of about an hour ago (~1am CET here). Andk++

Day 21 – NativeCall-backs and Beyond C

One of my favorite features in Perl 6 is the NativeCall interface, because it allows gluing virtually any native library into it relatively easily. There have even been efforts to interface with other scripting languages so that you can use their libraries as well.

There have already been a pair of advent posts on NativeCall, one about the basics in 2010 and one about objectiness in 2011. So this one won’t repeat them, and will instead be about Native Callbacks and C++ libraries.

Callbacks

While C isn’t quite as good as Perl at passing around functions as data, it does let you pass around pointers to functions to use them as callbacks. It’s used extensively when dealing with event-like stuff, such as signals using signal(2).
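As a quick refresher on what “passing a pointer to a function” looks like on the native side, here is a minimal sketch using std::qsort from the standard library, which takes a comparison callback. The names compare_descending and sort_descending are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparison callback in the shape std::qsort expects: it receives untyped
// pointers to two elements and returns a negative, zero or positive value.
int compare_descending(const void* a, const void* b)
{
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs < rhs) - (lhs > rhs);   // reversed, so larger values sort first
}

// Hypothetical helper: sorts an int array from largest to smallest.
void sort_descending(int* values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    // The function name decays to a function pointer; qsort calls it back
    // whenever it needs to compare two elements.
    std::qsort(values, count, sizeof(int), compare_descending);
}
```

Calling sort_descending on an array holding 3, 1, 2 should leave it as 3, 2, 1. The library never knows what the callback does, only its signature.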

In the NativeCall docs, there’s a short quip about callbacks. But they can’t be that easy, can they?

Let’s take the Expat XML library as an example, which we want to use to parse this riveting XML document:

<calendar>
    <advent day="21">
        <topic title="NativeCall Bits and Pieces"/>
    </advent>
</calendar>

The Expat XML parser takes callbacks that are called whenever it finds an opening or closing XML tag. You tell it which callbacks to use with the following function:

XML_SetElementHandler(XML_Parser parser,
                      void (*start)(void *userdata, char *name, char **attrs),
                      void (*end)(void* userdata, char *name));

It associates the given parser with two function pointers to the start and end tag handlers. Turning this into a Perl 6 NativeCall subroutine is straight-forward:

use NativeCall;

sub XML_SetElementHandler(OpaquePointer $parser,
                          &start (OpaquePointer, Str, CArray[Str]),
                          &end   (OpaquePointer, Str))
    is native('expat') { ... }

As you can see, the function pointers turn into arguments with the & sigil, followed by their signature. The space between the name and the signature is required, but you’ll get an awesome error message if you forget.

Now we’ll define the callbacks to use. They’ll just print an indented tree of opening and closing tag names. We aren’t required to put types and names in the signature, just like in most of Perl 6, so we’ll leave them out where we can:

my $depth = 0;

sub start-element($, $elem, $)
{
    say "open $elem".indent($depth * 4);
    ++$depth;
}

sub end-element($, $elem)
{
    --$depth;
    say "close $elem".indent($depth * 4);
}

Just wire it up with some regular NativeCallery:

sub XML_ParserCreate(Str --> OpaquePointer)               is native('expat') { ... }
sub XML_ParserFree(OpaquePointer)                         is native('expat') { ... }
sub XML_Parse(OpaquePointer, Buf, int32, int32 --> int32) is native('expat') { ... }

my $xml = q:to/XML/;
    <calendar>
        <advent day="21">
            <topic title="NativeCall Bits and Pieces"/>
        </advent>
    </calendar>
    XML

my $parser = XML_ParserCreate('UTF-8');
XML_SetElementHandler($parser, &start-element, &end-element);

my $buf = $xml.encode('UTF-8');
XML_Parse($parser, $buf, $buf.elems, 1);

XML_ParserFree($parser);

And magically, Expat will call our Perl 6 subroutines that will print the expected output:

open calendar
    open advent
        open topic
        close topic
    close advent
close calendar

So callbacks are pretty easy in the end. You can see a more involved example involving pretty-printing XML here.
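For comparison, here is roughly what the same pair of handlers might look like when written natively. This is only a sketch: the handlers follow the shape shown in the XML_SetElementHandler declaration above (with const added), the tree_printer struct is a made-up name, and the depth counter travels through the userdata pointer instead of a global variable. The text is collected into a string rather than printed.

```cpp
#include <cassert>
#include <string>

// State shared by both handlers; handed around via the void* userdata slot.
struct tree_printer {
    int depth = 0;
    std::string out;
};

// Called for every opening tag; attributes are ignored in this sketch.
void start_element(void* userdata, const char* name, const char** /*attrs*/)
{
    auto* tp = static_cast<tree_printer*>(userdata);
    tp->out += std::string(tp->depth * 4, ' ') + "open " + name + "\n";
    ++tp->depth;
}

// Called for every closing tag; undoes the indentation of the matching open.
void end_element(void* userdata, const char* name)
{
    auto* tp = static_cast<tree_printer*>(userdata);
    assert(tp->depth > 0);  // a close without an open would be a parser bug
    --tp->depth;
    tp->out += std::string(tp->depth * 4, ' ') + "close " + name + "\n";
}
```

The Perl 6 version gets away with a plain outer variable for the depth, which is a good part of why it is shorter.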

C++

Trying to call into a C++ library isn’t as straight-forward as using C, even if you aren’t dealing with objects or anything fancy. Take this simple library we’ll call cpptest, which can holler a string to stdout:

#include <iostream>

void holler(const char* str)
{
    std::cout << str << "!\n";
}

When you try to unsuspectingly call this function with NativeCall:

sub holler(Str) is native('cpptest') { ... }
holler('Hello World');

You get a nasty error message like Cannot locate symbol 'holler' in native library 'cpptest.so'! Why can’t Perl see the function right in front of its face?

Well, C++ allows you to create multiple functions with the same name, but different parameters, kinda like multi in Perl 6. You can’t actually have identical names in a native library though, so the compiler instead mangles the function names into something that includes the argument and return types. Since I compiled the library with g++ -g, I can get the symbols back out of it:

$ nm cpptest.so | grep holler
0000000000000890 T _Z6hollerPKc

So somehow _Z6hollerPKc stands for “a function called holler that takes a const char* and returns void”. Alright, so if we now tell NativeCall to use that weird gobbledegook as the function name instead:

sub holler(Str) is native('cpptest') is symbol('_Z6hollerPKc') { ... }

It works, and we get C++ hollering out Hello World!, as expected… if the library was compiled with g++. The name mangling isn’t standardized in any way, and different compilers do produce different names. In Visual C++ for example, the name would be something like ?holler@@ZAX?BPDXZ instead.
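If you’re curious what a g++-style name decodes to, the compiler’s runtime can tell you. The sketch below assumes GCC or Clang, since abi::__cxa_demangle from <cxxabi.h> is specific to those toolchains and won’t exist under Visual C++. The helper name demangle is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>  // GCC/Clang only; not available in Visual C++

// Hypothetical helper: turn a g++-style mangled name back into a readable
// declaration. Returns an empty string if the name cannot be demangled.
std::string demangle(const char* mangled)
{
    assert(mangled != nullptr);
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr)
        return std::string();
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}
```

Feeding it _Z6hollerPKc should give back holler(char const*), that is, the function name followed by its parameter type.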

The proper solution is to wrap your function like so:

extern "C"
{
    void holler(const char* str)
    {
        std::cout << str << "!\n";
    }
}

This will export the function name like C would as a non-multi function, which is standardized for all compilers. Now the original Perl 6 program above works correctly and hollers without needing strange symbol names.
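One consequence worth spelling out: functions with C linkage can’t be overloaded, so if the C++ side does use overloads, each one needs its own distinctly named wrapper. A minimal sketch, with all function names invented for the example:

```cpp
#include <cassert>

// Two C++ overloads: same name, different parameter lists.
int area(int side)
{
    assert(side >= 0);  // precondition for this sketch
    return side * side;
}

int area(int width, int height)
{
    assert(width >= 0 && height >= 0);
    return width * height;
}

// C linkage has no overloading, so each overload gets its own wrapper with a
// distinct, unmangled name that can be looked up in the library by name.
extern "C"
{
    int area_square(int side)            { return area(side); }
    int area_rect(int width, int height) { return area(width, height); }
}
```

The two wrappers would then show up in the library under exactly those names, while the overloaded area functions keep their mangled ones.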

You still can’t directly call into classes or objects like this, which you probably would want to do when you’re thinking about NativeCalling into C++, but wrapping the methods works just fine:

#include <vector>

extern "C"
{
    std::vector<int>* intvec_new() { return new std::vector<int>(); }
    void intvec_free(std::vector<int>* vec) { delete vec; }
    // etc. pp.
}

There’s a more involved example again.
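To give an idea of what the “etc. pp.” might grow into, here is a slightly fuller wrapper sketch. Only intvec_new and intvec_free come from the snippet above; intvec_push, intvec_size and intvec_at are made-up names, and a real wrapper would need to decide how to report errors, since exceptions shouldn’t escape through a C interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

extern "C"
{
    // Create and destroy the vector; callers only ever hold a pointer to it.
    std::vector<int>* intvec_new() { return new std::vector<int>(); }
    void intvec_free(std::vector<int>* vec) { delete vec; }

    // One plain function per method we want to expose.
    void intvec_push(std::vector<int>* vec, int value) { vec->push_back(value); }
    std::size_t intvec_size(const std::vector<int>* vec) { return vec->size(); }

    int intvec_at(const std::vector<int>* vec, std::size_t index)
    {
        assert(index < vec->size());  // unchecked access would be undefined
        return (*vec)[index];
    }
}
```

From the calling side the vector is just an opaque pointer that gets passed back into every function, much like the Expat parser handle earlier.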

Some C++ libraries already provide a C wrapper like that, but in other cases you’ll have to write your own. Check out LibraryMake, which can help you compile native code in your Perl 6 modules. There’s also FFI::Platypus::Lang::CPP for Perl 5, which lets you do calls to C++ in a more direct fashion.

Update on 2015-12-22: as tleich points out in the comments, there is an is mangled trait for mangling C++ function names. So you might be able to call the pure C++ function after all and have NativeCall mangle it for you like your compiler would do – if your compiler is g++ or Microsoft Visual C++:

sub holler(Str) is native('cpptest') is mangled { ... }
holler('Hello World');

It doesn’t seem to be working for me though and fails with a don't know how to mangle symbol error. I’ll amend this post again if I can get it running.

Update on 2015-12-23: the NativeCall API has changed (thanks to jczeus for pointing it out) and now automatically adds a lib prefix to library names. The code changed from is native('libexpat') to is native('expat'). It will also complain that a version should be added to the library name, but I don’t want to weld this code to an exact version of the used libraries.