Day 16: The Meta spec, Distribution, and CompUnit::Repository explained-ish


This is a tool chain post. You have been warned.

In this post I’m going to explain the typical journey of a module from a request for a distribution (such as zef install My::Module) to the point where you are able to run perl6 -e 'use My::Module;'. Afterwards I’ll go over implementation details, and how they allow us to do cool things like use a .tar.gz archive or the GitHub API for loading source code.

Most of this stuff is not documented yet, but the source code for these items is thoroughly commented and not difficult to grok. Don’t expect to understand how to actually do anything new after reading this; the post is intended to give a high-level description of the design considerations and what they will allow you to do.

And I’ll be ignoring precompilation because that’s hard.

The Journey

We’ll start at zef install My::Module, but it could just as well be a URI or something like My::Module:ver<1.*>:auth<me@foo.com>. The module manager either proxies this request to an external recommendation manager (such as MetaCPAN) [1], or acts as the recommendation manager itself (grepping the current perl6 ecosystem package.json file).[2] The request returns a META6 hash for a matching distribution and likely includes some non-spec fields to hint at the download URI.

 ______                              ______________________
|client| 1==request My::Module====> |recommendation manager|
|______| <==META6 representation==2 |______________________|

The package manager then uses an appropriate Distribution object that understands the META6 fields and how to fetch the files it references. The Distribution will encapsulate all the behavior rakudo expects.

 ______                                     ___________________
|client| 3==META6 representation=========> |Distribution Lookup|
|______| <==Distribution implementation==4 |___________________|

The Distribution is passed to the rakudo core class CompUnit::Repository::Installation.install($dist) (CURI for short). CURI then saves all the files the Distribution represents to its own hierarchy and naming conventions.

 ______                                ______
|client| 5==.install(Distribution)==> | CURI |
|______|                              |______|

If you call .resolve($name) on CURI it will return a Distribution object that it creates from its own structure. .resolve($name) also has to decide what to do when multiple names match. In this way CompUnit::Repository acts as a basic recommendation manager.

 ______                                     ___________
|      | 1==.resolve('My::Module')=======> | CURI#site |
|client| <==Distribution implementation==2 |___________|
|      |                                   ____________ 
|      | 3==.install(Distribution)======> | CURI#home  |
|______|                                  |____________|

The Details

META6

{
    "name"        : "My::Module",
    "version"     : "0.001",
    "auth"        : "me@cpan.org",
    "provides"    : {
        "My::Module"      : "lib/My/Module.pm6",
        "My::Module::Foo" : "lib/My/Module/Foo.pm6"
    },
    "resources" : [
        "config.json",
        "scripts/clean.pl",
        "libraries/mylib"
    ]
}

In most cases a distribution’s metadata is stored in its root folder as META6.json, but where it comes from is irrelevant. We’ll only be interested in a few of the possible fields, which serve two different purposes:

  1. A “unique” identifier
     The identifier is the full name of a distribution, and includes the name, version, and auth (there is also api, but we’re ignoring that one). In this case it is My::Module:ver<0.001>:auth<me@cpan.org>. [3] If you were to install this distribution you could use it with:

    use My::Module:ver<0.001>:auth<me@cpan.org>; (although use My::Module; would probably suffice)

  2. File mapping
  • provides is a key/value mapping where the key is the package name and the value is a content-id (forward slash relative file path). The content-id, while being a file path, might not represent a file that exists yet.
  • resources is a list of resource content-ids included with your distribution, usually including all of the files in the resources/ directory. These files will be accessible by modules in your distribution via %?RESOURCES<$name-path>. In the example the first two items follow this pattern and would be resources/config.json and resources/scripts/clean.pl, but the last one is special. If the first path part of the content-id is “libraries/” then any path under it will have its name mangled to whatever naming convention rakudo thinks is right for the OS it is running on. This can be useful for distributions that compile/generate libs at build time and expect them to be named a certain way: libraries/mylib.dll on Windows, libraries/libmylib.so on Linux, and libraries/libmylib.dylib on OSX. This allows you to reference the library in a (probably NativeCall) module as %?RESOURCES<libraries/mylib> instead of guessing whether it’s %?RESOURCES<libraries/mylib.dll> or %?RESOURCES<libraries/libmylib.so>.
  • files (optional): this would usually be populated automatically by the Distribution, but I’ll mention it here because you can construct it manually. It is a key/value mapping of $name-path to $content-id. These may or may not be the same. Combined with provides this gives you a way to get a list of all content-ids that can be used with Distribution.content(...)
    # Before CURI.install - Usually generated by the Distribution itself
    "files": {
        "bin/my-script": "bin/my-script",
        "resources/libraries/foolib": "resources/libraries/libfoolib.so"
    }
    
    # After CURI.install
    "files": {
        "bin/my-script": "SDfDFIHIUHuhfue9f3fJ930j",
        "resources/libraries/foolib": "j98jf9fjFJLJFi3f.so"
    }
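
The two purposes above can be sketched in a few lines of Perl 6. This is only an illustration and not how rakudo itself does it: the identity sub is a hypothetical helper, and the %meta hash (including its hand-written files entry) is made up for the example.

```perl6
# A hand-written META6 hash; the "files" entry is added for illustration only.
my %meta =
    name     => 'My::Module',
    version  => '0.001',
    auth     => 'me@cpan.org',
    provides => {
        'My::Module'      => 'lib/My/Module.pm6',
        'My::Module::Foo' => 'lib/My/Module/Foo.pm6',
    },
    files    => {
        'bin/my-script' => 'bin/my-script',
    };

# Hypothetical helper: build the full name from name, version, and auth.
sub identity(%meta --> Str) {
    my $id = %meta<name>;
    $id ~= ':ver<'  ~ %meta<version> ~ '>' if %meta<version>:exists;
    $id ~= ':auth<' ~ %meta<auth>    ~ '>' if %meta<auth>:exists;
    $id;
}

# Every content-id that could be handed to Distribution.content(...)
my @content-ids = flat(%meta<provides>.values, %meta<files>.values).sort;

say identity(%meta);   # My::Module:ver<0.001>:auth<me@cpan.org>
.say for @content-ids;
```

Note that the values, not the keys, are what get passed to content: the keys are only the names by which rakudo or your code looks things up.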
    

Distribution

role Distribution {
    method meta(--> Hash) {...}
    method content($content-id --> IO::Handle) {...}
}

Distribution is the IO interface a CompUnit::Repository uses. It only needs to implement two methods:

  • method meta Access to the metadata of a distribution. This does not have to be a local file.
  • method content Given a content-id such as lib/My/Module.pm6 or libraries/mylib return an IO::Handle from which the appropriate data can be read.

When you (or your module installer) pass a Distribution to CompUnit::Repository::Installation.install($dist) it will look at $dist.meta() to figure out all the content-ids it needs to install, and then call $dist.content($content-id).slurp-rest to get the actual content. [4]
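
To make the two-method contract concrete, here is a minimal sketch of a Distribution that serves an already-parsed META6 hash plus files living under a directory. The class name Distribution::Sketch and its meta-data and prefix attributes are hypothetical, invented for this example; the core Distribution::Path already covers the directory case properly.

```perl6
# Hypothetical class: the core Distribution role only demands these two methods.
class Distribution::Sketch does Distribution {
    has %.meta-data;          # the META6 hash, however it was obtained
    has IO::Path $.prefix;    # directory that content-ids are relative to

    method meta(--> Hash) { %!meta-data }

    # Return an unopened handle; the caller decides when to open and read it.
    method content($content-id --> IO::Handle) {
        IO::Handle.new(path => $!prefix.add($content-id));
    }
}

my $dist = Distribution::Sketch.new(
    prefix    => '.'.IO,
    meta-data => {
        name     => 'My::Module',
        provides => { 'My::Module' => 'lib/My/Module.pm6' },
    },
);

say $dist.meta<name>;
say $dist.content('lib/My/Module.pm6').path;
```

Nothing here touches the disk until someone opens the handle, which is the property that lets other implementations defer a download or an archive extraction until install time.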

CompUnit::Repository

At the most basic level a CompUnit::Repository is used to store and/or lookup distributions. How the distribution is stored or loaded is up to the CompUnit::Repository.

CompUnit::Repository::Installation is unique among the core CompUnit::Repository classes in that it has an install method that takes a Distribution implementation and maps it to the file system. Currently this means changing all file names to a sha1 string. So it also returns its own implementation of Distribution that still allows us to access My::Module as $dist.content('lib/My/Module.pm6'). Its path-spec is inst#, so if you install the dependencies of a distribution to inst#local/ you could do one of:

  • perl6 -Iinst#local/ -Ilib -e 'use My::Module'
  • PERL6LIB=inst#local/ perl6 -Ilib -e 'use My::Module'

CompUnit::Repository::FileSystem, on the other hand, works with original path names, although it does not install; it only loads and resolves identities. This is what gets used when you use a module found via -I mylib or use lib "mylib" (both short for file#mylib/). If a META6.json file is found it will use the provides field to map namespaces to paths, but if there is no META6.json file (such as when you start developing a module) it will fall back to the usual Perl 5 semantics of ($name =~ s{::}{/}gr) . ".pm6".
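
In Perl 6 terms that fallback mapping looks roughly like the sketch below. The sub name name-to-path and its :ext parameter are made up for illustration; the real repository also tries more than one file extension.

```perl6
# Hypothetical helper: map a package name to a relative file path,
# the way a repository without a META6.json has to guess it.
sub name-to-path(Str $name, Str :$ext = 'pm6' --> Str) {
    $name.subst('::', '/', :g) ~ '.' ~ $ext;
}

say name-to-path('My::Module::Foo');        # My/Module/Foo.pm6
say name-to-path('My::Module', :ext<pm>);   # My/Module.pm
```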

CompUnit::Repository::AbsolutePath is different in that it represents a single module (and not an entire Distribution of modules), such as: require '/home/perl6/repos/my-module/lib/my/module.pm6'.

A gross oversimplification of the interface:
role CompUnit::Repository {
    method id()  { ... }

    method path-spec() { ... } # file#, inst#, etc

    method need(CompUnit::DependencySpecification $spec, |c --> CompUnit) { ... }
    
    method load(|) { ... }
}

Cool stuff

Distribution implementations (non-core)

If you were to pass the Distribution below to CompUnit::Repository::Installation.install($dist) it would make an HTTP request for each source file (found in the META6, which is also fetched with an HTTP request) and save the content to the final installation path:

use Distribution::Common::Remote::Github;

my $github-dist = Distribution::Common::Remote::Github.new(
    user   => "zoffixznet",
    repo   => "perl6-CoreHackers-Sourcery", # XXX: No [missing] dependencies
    branch => "master",
);

say "Source code: " ~ $github-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($github-dist) ?? 0 !! 1;

Similarly you could pipe data from running a command such as tar: [5]

use Distribution::Common::Tar;
use Net::HTTP::GET;
use File::Temp;

my $distribution-uri       = 'https://github.com/zoffixznet/perl6-CoreHackers-Sourcery/archive/master.tar.gz';
my ($filepath,$filehandle) = tempfile("******", :unlink);
spurt $filepath, Net::HTTP::GET($distribution-uri).body;

my $tar-dist = Distribution::Common::Tar.new($filepath.IO);

say "Source code: " ~ $tar-dist.content('lib/CoreHackers/Sourcery.pm6').open.slurp-rest;

my $installation-cur = CompUnit::RepositoryRegistry.repository-for-name('home');
exit $installation-cur.install($tar-dist) ?? 0 !! 1;

That’s not to say you couldn’t just clone or untar the distribution and use the built-in Distribution::Path.new($path) – this simply makes other possibilities trivial to implement.

CompUnit::Repository implementations (non-core)

You can also make your own CompUnit::Repository, such as CompUnit::Repository::Tar.

use CompUnit::Repository::Tar;
use lib "CompUnit::Repository::Tar#perl6-repos/my-module.tar.gz";
use My::Module;

… which is very similar to CompUnit::Repository::FileSystem, but it uses Distribution::Common::Tar to interface with the distribution. This means you can reuse the loading code from core CompUnit::Repository::* modules with very little modification (the details won’t be covered in this post).

It would not take too much effort to use Distribution::Common::Remote::Github as the Distribution interface used when loading/resolving, giving a CloudPAN-like way to load modules.


Some ideas for modules:

  • Distribution::Dpkg – Adapter for the dpkg package format
  • Distribution::Gist – Install a distribution from a gist, because…
  • CompUnit::Repository::IPFS – InterPlanetary File System content storage backend
  • CompUnit::Repository::Temp – This CUR will self destruct in…
  • CompUnit::Repository::Tar::FatPack – Read all dependencies for an application from a single tar archive

More reading

gpw2016 Stefan Seifert – A look behind the curtains – module loading in Perl 6

Synopsis 22: Distributions, Recommendations, Delivery and Installation (non-authoritative)

Slightly less basic Perl6 module management


  1. Transformations on the identity may need to be made before sending it to a recommendation manager. MetaCPAN may not understand :ver<1.*>, but it’s only a matter of representing that as an Elasticsearch parameter.
  2. The recommendation engine also gets to determine what to return for My::Module when it has both My::Module:auth<foo> and My::Module:auth<bar> indexed, so it may become a best practice to declare the auth when you use a module.
  3. It should be noted that auth does not tell you where a module should be downloaded from. For instance: My::Module:auth<github:me> does not mean “download My::Module from github.com/me” – it’s nothing more than an additional identifying part, which is why using an email address is a better example. That exact identity might be found on GitHub, CPAN, or a darkpan. Recommendation managers could choose to only index distributions that use an auth they can verify.
  4. method content doesn’t actually constrain the return value to an IO::Handle, but it does expect it to act like one. This was done so that a socket, while not an IO::Handle, could still be used with a thin wrapper, allowing resources to be fetched at the moment they are to be installed.
  5. This is a lie, it actually extracts to a temporary file for files under resources/ but only because %?RESOURCES<...> has to return an IO::Path (instead of an IO::Handle). Without this constraint the temporary file is not needed.
