Day 17 – Perl 6 from 30,000 feet

Many people have heard of Perl 6, especially in the greater Perl community.  However, Perl 6 has a complicated ecosystem which can be a littled daunting, so as a newcomer to the Perl 6 community myself, I thought I would share what I’ve learned.

How do I install Perl 6?

It’s simple; you can just download one of the existing implementations of the language (as Perl 6 is a specification), build it, and install it! There are several implementations out there right now, in various states of completion. Rakudo is an implementation that targets Parrot, and is the implementation that I will discuss most in this post. Niecza is another implementation that targets the CLR (the .NET runtime). For more information on these implementations and on other implementations, please see Perl 6 Compilers. Perl 6 is an ever-evolving language, and any compiler that passes the official test suite can be considered a Perl 6 implementation.

You mentioned “Parrot”; what’s that?

Parrot is a virtual machine that is designed to run dynamically typed languages. Along with the virtual machine, it includes tools for generating virtual machine code from intermediate languages (named PIR and PASM), as well as a suite of tools to make writing compilers easier.

What is Rakudo written in?

Rakudo itself is written primarly in Perl 6, with some bits of C for some of the lower-level operations, like binding method arguments and adding additional opcodes to the Parrot VM. It may seem strange to implement a Perl 6 compiler in Perl 6 itself; Rakudo uses NQP for building itself.

What’s NQP?

NQP (or Not Quite Perl 6) is an implementation of Perl 6 that is focused on creating compilers for the Parrot Compiler Toolkit. It is currently focused on targetting Parrot, but in the future, it may support various compilation targets, so you will be able to use Rakudo to compile your Perl 6 programs to Parrot opcodes, a JVM class file, or perhaps Javascript so you can run it in the browser. NQP is written in NQP, and uses a pre-compiled version of NQP to compile itself.

I hope that this information was useful to you, dear reader, and that it helps to clarify the different pieces of the Perl 6 ecosystem. As I learn more about each piece, I intend to write blog posts that will hopefully help others to get started contributing to Perl 6!


Day 16 – Operator precedence

All the precedence men

As I was taking a walk today, I realized one of the reasons why I like Perl. Five as well as six. I often hear praise such as “Perl fits the way I think”. And I have that feeling too sometimes.

If I were the president (or prime minister, as I’m Swedish), and had a bunch of advisers, maybe some of them would be yes-men, trying to give me advice that they think I will want to hear, instead of advice that would be useful to me. Some languages are like that, presenting us with an incomplete subset of the necessary tools. The Perl languages, if they were advisers, wouldn’t be yes-men. They’d give me an accurate view of the world, even if that view would be a bit messy and hairy sometimes.

Which, I guess, is why Perl five and six are so often used in handling messy data and turning it into something useful.

To give a few specific examples:

  • Perl 5 takes quotes and quoting very seriously. Not just strings but lists of strings, too. (See the qw keyword.) Perl 6 does the same, but takes quoting further. See see the recent post on quoting.
  • jnthn shows in yesterday’s advent post that Perl 6 takes compiler phases seriously, and allows us to bundle together code that belongs together conceptually but not temporally. We need to do this because the world is gnarly and running a program happens in phases.
  • Grammars in Perl 6 are not just powerful, but in some sense honest, too. They don’t oversimplify the task for the programmer, because then they would also limit the expressibility. Even though grammars are complicated and intricate, they should be, because they describe a process (parsing) that is complicated and intricate.


Perl is known for its many operators. Some would describe it as an “operator-oriented” language. Where many other language will try to guess how you want your operators to behave on your values, or perhaps demand that you pre-declare all your types so that there’ll be no doubt, Perl 6 carries much of the typing information in its operators:

my $a = 5;
my $b = 6;

say $a + $b;      # 11 (numeric addition)
say $a * $b;      # 30 (numeric multiplication)

say $a ~ $b;      # "56" (string concatenation)
say $a x $b;      # "555555" (string repetition)

say $a || $b;     # 5 (boolean disjunction)
say $a && $b;     # 6 (boolean conjunction)

Other languages will want to bunch together some of these for us, using the + operator for both numeric addition and string concatenation, for example. Not so Perl. You’re meant to choose yourself, because the choice matters. In return, Perl will care a little less about the types of the operands, and just deliver the appropriate result for you.

“The appropriate result” is most often a number if you used a numeric operator, and a string if you used a string operator. But sometimes it’s more subtle than that. Note that the boolean operators above actually preserved the numbers 5 and 6 for us, even though internally it treated them both as true values. In C, if we do the same, C will unhelpfully “flatten” these results down to the value 1, its spelling of the value true. Perl knows that truthiness comes in many flavors, and retains the particular flavor for you.

Operator precedence

“All operators are equal, but some operators are more equal than others.” It is when we combine operators that we realize that the operators have different “tightness”.

say 2 * 3 + 1;      # 7, because (2 * 3) + 1
say 1 + 2 * 3;      # 7, because 1 + (2 * 3), not 9

We can always be 100% explicit and surround enough of our operations with parentheses… but when we don’t, the operators seem to order themselves in some order, which is not just simple left-to-right evaluation. This ordering between operators is what we refer to as “precedence”.

No doubt you were taught in math class in school that multiplications should be evaluated before additions in the way we see above. It’s as if factors group together closer than terms do. The fact that this difference in precedence is useful is backed up by centuries of algebra notation. Most programming languages, Perl 6 included, incorporates this into the language.

By the way, this difference in precedence is found between other pairs of operators, even outside the realm of mathematics:

      Additive (loose)    Multiplicative (tight)
      ================    ======================
number      +                       *
string      ~                       x
bool        ||                      &&

It turns out that they make as much sense for other types as they do for numbers. And group theory bears this out: these other operators can be seen as a kind of addition and multiplication, if we squint.

Operator precedence parser

Deep in the bowels of the Perl 6 parser sits a smaller parser which is very good at parsing expressions. The bigger parser which parses your Perl 6 program is a really good recursive-descent parser. It works great for creating syntax trees out of the larger program structure. It works less well on the level of expressions. Essentially, what trips up a recursive-descent parser is that it always has to create AST nodes for all the possible precedence levels, whether they’re present or not.

So this smaller parser is an operator-table parser. It knows what to do with each type of operator (prefix, infix, postfix…), and kind of weaves all the terms and operators into a syntax tree. Only the precedence levels actually used show up in the tree.

The optable parser works by comparing each new operator to the top operator on a stack of operators. So when it sees an expression like this:

$x ** 2 + 3 * $x - 5

it will first compare ** against + and decide that the former is tighter, and thus $x ** 2 should be put together into a small tree. Later, it compares + against *, and decides to turn 3 * $x into a small tree. It goes on like this, eventually ending up with this tree structure:

 +-- infix:<+>
      +-- infix:<**>
      |    +-- term:<$x>
      |    +-- term:<2>
      +-- infix:<*>
           +-- term:<3>
           +-- term:<$x>

Because leaf nodes are evaluated first and the root node last, this tree structure determines the order of evaluation for the expression. The order ends up being the same as if the expression had these parentheses:

(($x ** 2) + (3 * $x)) - 5

Which, again, is what we’ve learned to expect.


Another factor also governs how these invisible parentheses are to be distributed: operator associativity. It’s the concern of how the operator combines with multiple copies of itself, or other sufficiently similar operators on the same precedence level.

Some examples serve to explain the difference:

$x = $y = $z;     # becomes $x = ($y = $z)
$x / $y / $z;     # becomes ($x / $y) / $z

In both of these cases, we may look at the way the parentheses are doled out, and say “well, of course”. Of course we must first assign to $y and only then to $x. And of course we first divide by $y and only then by $z. So operators naturally have different associativity.

The optable parser compares not just the precedence of two operators but also, when needed, their associativity. And it puts the parentheses in the right place, just as above.

User-defined operators

Now we come back to Perl not being a yes-man, and working hard to give you the appropriate tools for the job.

Perl 6 allows you to define operators. See my post from last year on the details of how. But it also allows you to specify precedence and associativity of each new operator.

As you specify a new operator, a new Perl 6 parser is automatically constructed for you behind the scenes, which contains your new operator. In this sense, the optable parser is open and extensible. And Perl 6 gives you exactly the same tools for talking about precedence and associativity as the compiler itself uses internally.

Perl treats you like a grown-up, and expects you to make good decisions based on a thorough understanding of the problem space. I like that.

Day 15 – Phasers set to stun

When writing programs, it’s important not only to separate the concerns that need separating, but also to try and keep related things close to each other. This gives the program a sense of cohesion, and helps to avoid the inevitable problems that arise when updating one part of a program demands an update in another far-away part. One especially tricky problem can be when the things we want to do are distributed over time. This can cause us to move related things apart in order to get them to happen at the times we want.

Phasers in Perl 6 help you keep related concepts together in your code, while also indicating that certain aspects of them should happen at different points during the lifetime of the current program, invocation or loop construct. Let’s take a look at some of them.


One of the things I had most fun writing in Perl 6 recently was the debugger. There are various things that need a little care. For example, the debugger needs to look out for exceptions and, when they are thrown, give the user a prompt to let them debug why the exception was thrown. However, there is also a feature where, at the prompt, you can evaluate an expression. The debugger shouldn’t re-enter itself if this expression throws, so we need to keep track of if we’re already showing the prompt. This meant setting and clearing a flag. Thing is, the prompt method is relatively lengthy; it has a given/when to identify the various different commands. I could, of course, have set the prompt flag at the start and cleared it at the end. But that would have spread out the concern of maintaining the flag. Here’s what I did instead:

method issue_prompt($ctx, $cur_file) {
    ENTER $in_prompt = True;
    LEAVE $in_prompt = False;

    # Lots of stuff here

This ensures the flag is set when we enter the method, cleared when we leave the method – and lets me keep the two together.


We’re writing a small utility and want to log what happens as we run it. Time wise, we want to:

  • Open the log file at the start of the program, creating it if needed and overwriting an existing one otherwise
  • Write log entries at various points during the program’s execution
  • Close the log file at the end

Those three actions are fairly spread out in time, but we’d like to collect them together. This time, the INIT and END phasers come to the rescue.

sub log($msg) {
    my $fh = INIT open("logfile", :w);
    END $fh.close;

Here, we use INIT to perform an action at program start time. It turns out that INIT also keeps around the value produced by the expression following it, meaning it can be used as an r-value. This means we have the file handle available to us, and can write to it during the program. Then, at the END of the program, we close the file handle. All of these have block forms, should you wish to do something more involved:

sub log($msg) {
    my $fh = INIT open("logfile", :w);
    END {
        $fh.say("Ran in {now - INIT now} seconds");

Note the second use of INIT in this example, to compute and remember the program start time so we can use it in the subtraction later on.


These phasers work with loops. They fire the first time the loop body executes, at the end of every loop body execution, and after the last loop body execution. FIRST and LAST are especially powerful in so far as they let us move code that wants to special-case the first and last time the loop body runs inside of the loop construct itself. This makes the relationship between these bits of code and the loop especially clear, and lessens the chance somebody moves or copies the loop and forgets the related bits it has.

As an example, let’s imagine we are rendering a table of scores from a game. We want to write a header row, and also do a little ASCII art to denote the start and end of the table. Furthermore, we’d like to keep track of the best score each time around the loop, and then at the end print out the best score. Here’s how we could write it.

for %scores.kv -> $player, $score {
    FIRST say "Score\tPlayer";
    FIRST say "-----\t------";
    LAST  say "-----\t------";

    NEXT (state $best_score) max= $score;
    LAST say "BEST SCORE: $best_score";

    say "$score\t$player";

Notice how we keep the header/footer code together, as well as being able to keep the best score tracking code together. It’s also all inside the loop, making its relationship to the loop clear. Note how the state variable also comes in useful here. It too is a construct that lets us keep a variable scoped inside a block even if its usage spans multiple invocations of the block.


These are variants of LEAVE that trigger conditional on the block being successful (KEEP) or not (UNDO). A successful block completes without unhandled exceptions and returns a defined value. An unsuccessful block exits due to an exception or because it returns an undefined value. Say we were processing a bunch of files and want to build up arrays of successful files and failed files. We could write something like:

sub process($file) {
    KEEP push @success, $file;
    UNDO push @failure, $file;

    my $fh = open($file);
    # ...

There are probably a bunch of transaction-like constructs that can also be very neatly implemented with these two.

And there’s more!

While I’ve covered a bunch of the phasers here, there are some others. For example, there’s also BEGIN, which lets you do some computation at compile time. Hopefully, though, this set of examples gives you some inspiration in how phasers can be used effectively, as well as a better grasp of the motivation for them. Bringing related things together and setting unrelated things apart is something we need to think carefully about every day as developers, and phasers help us keep related concerns together, even if they should take place at different phases of our program’s execution.

Day 14 – Primal Needs

Our brains are hard-wired to look for patterns, even where none exist. So, it’s no surprise that as soon as mankind started counting things, he would look for patterns in numbers. One group of numbers that have resisted the pattern matching capabilities of the human brain are the so-called “prime numbers”. These are numbers that can only be evenly divided by 1 or themselves–they have no other factors.

But you knew that already, so why am I talking about prime numbers instead of Perl 6? Because, just like our ancestors, the people that created Perl 6 and continue to shape it to be around for the next 100 years or more find prime numbers interesting. So interesting, in fact, that the language specification was modified to include a routine for determining whether or not a number is prime.


At first, implmementations of this prime-number-finder were pure Perl 6 and took advantage of other features of the language such as ranges and junctions. An example implementation is shown below:

    sub is-prime($n) { $n %% none 2..sqrt $n }

This implementation checks to see that none of numbers from 2 to the square root of $n will evenly divide $n. If this is the case, then the number is prime.

While the above implementation works fine, it is a little slow and it does suffer a little redundancy in the numbers it checks. For instance, if you know a number isn’t evenly divisible by 2, there’s no need to check if it’s evenly divisible by 4, yet the above algorithm does so anyway.


An improvement on the algorithm is to only check if the I between 2 and the square root of the number evenly divide the number. But … but … that’s like like defining a word in terms of itself. Thanks to ubiquitous lazy evaluation in Perl 6, that’s entirely possible. Here’s an implementation:

    my @primes := 2, 3, 5, -> $p { ($p+2, $p+4 ... &is-prime)[*-1] } ... *;
    sub is-prime($n) { $n %% none @primes ...^  * > sqrt $n }

The array @primes is an infinite, lazily evaluated sequence of numbers starting with 2, 3, and 5. The next number in the sequence is generated by creating a new sequence of odd numbers that start from the last odd number and continue until we reach a prime. That prime is the next number in the sequence. But how do we know if it’s a prime? We check with our handy C subroutine that actually uses the lazy list of primes up to the square root of the number we’re testing to see if any of them are factors.

There’s a kind of mutual recursion going on here where the @primes array effectively memoizes the primes we’ve seen so far. But … then there’s the problem that @primes will continue to grow as you check bigger and bigger numbers for prime-ness. Can we do better?

Indeed we can.

Gamma: Rabin-Miller test

Well … maybe we can. It depends on your idea of “better”. The Rabin-Miller primality test is probabalistic in nature. It doesn’t require storing an ever increasing cache of prime numbers to test if they are factors of the potential prime, but there is a chance that it will tell you that a number is prime when it actually isn’t. The good news is that we can adjust the odds so that we are reasonably confident that the number is prime. Here’s an implementation (taken from

sub expmod(Int $a is copy, Int $b is copy, $n) {
	my $c = 1;
	repeat while $b div= 2 {
		($c *= $a) %= $n if $b % 2;
		($a *= $a) %= $n;

subset PrimeCandidate of Int where { $_ > 2 and $_ % 2 };

my Bool multi sub is-prime(Int $n, Int $k)            { return False; }
my Bool multi sub is-prime(2, Int $k)                 { return True; }
my Bool multi sub is-prime(PrimeCandidate $n, Int $k) {
	my Int $d = $n - 1;
	my Int $s = 0;

	while $d %% 2 {
		$d div= 2;

	for (2 ..^ $n).pick($k) -> $a {
		my $x = expmod($a, $d, $n);

		next if $x == 1 or $x == $n - 1;

		for 1 ..^ $s {
			$x = $x ** 2 mod $n;
			return False if $x == 1;
			last if $x == $n - 1;
		return False if $x !== $n - 1;

	return True;

The third multi variant of is-prime with the signature (PrimeCandidate $n, Int $k) is where all of the magic happens. This multi is only triggered when the prime candidate ($n) is an odd number because of the definition of the PrimeCandidate type.

First, we factor out the powers of 2 from $n - 1. Since $n is an odd number, $n - 1 is even and so has at least one factor of 2. What we end up with is an odd number and some power-of-2 factors of $n - 1. We then use those factors to see if a random sample of $k numbers less than $n are congruent to the square roots of unity modulo $n (expmod handles the modular exponentiation). We repeat this for all of the powers of 2 we factored out of the original number. Fermat’s little theorem says that if we find any number where the congruence does not hold, then the number can not be prime.

The probability that this method will select a composite number as prime is based on how many numbers less than $n we choose to sample. If we select $k numbers to try, the probability is 4 ** -$k. By choosing to sample more numbers, we can quickly decrease the odds of a false positive to a negligible amount.

Wrap up

But … most people don’t really have to worry about the implementation details of is-prime. Not only have is-prime and expmod been added to the Perl 6 specification, but actual implementations (ala Rabin-Miller) have been added to the Rakudo and Niecza Perl 6 compilers. So, if you want to test your new cryptographic algorithm and need some large prime numbers, or if you’re developing a new random number generator and need some candidates for the modulus, or maybe you’re developing a new hashing algorithm, Perl 6 has a built-in is-prime that can help.

Day 13 – Bags and Sets

Over the years, I’ve written many variations on this code:

my %words;
for slurp.comb(/\w+/).map(*.lc) -> $word {

(Aside: slurp.comb(/\w+/).map(*.lc) does the standard Perl trick of reading files specified on the command line or standard in, goes through the data for words, and makes them lowercase.)

Perl 6 introduces two new Associative types for dealing with this sort of functionality. KeyBag is drop-in replacement for Hash in this sort of case:

my %words :=;
for slurp.comb(/\w+/).map(*.lc) -> $word {

Why would you prefer KeyBag over Hash in this case, considering that it’s a bit more code? Well, it does a better job of saying what you mean, if what you want is a positive Int-valued Hash. It actually enforces this as well:

> %words{"the"} = "green";
Unhandled exception: Cannot parse number: green

That’s Niecza’s error; Rakudo’s is less clear, but the important point is you get an error; Perl 6 detects that you’ve violated your contract and complains.

And KeyBag has a couple more tricks up its sleeve. First, four lines to initialize your KeyBag isn’t terribly verbose, but Perl 6 has no trouble getting it down to one line:

my %words :=\w+/).map(*.lc)); does its best to turn whatever it is given into the contents of a KeyBag. Given a List, each of the elements is added to the KeyBag, with the exact same result of our earlier block of code.

If you don’t need to modify the bag after its creation, then you can use Bag instead of KeyBag. The difference is Bag is immutable; if %words is a Bag, then %words{$word}++ is illegal. If immutability is okay for your application, then you can make the code even more compact:

my %words := bag slurp.comb(/\w+/).map(*.lc);

bag is a helper sub that just calls on whatever you give it. (I’m not sure why there is no equivalent keybag sub.)

Bag and KeyBag have a couple more tricks up their sleeve. They have their own versions of .roll and .pick which weigh their results according to the given values:

> my $bag = bag "red" => 2, "blue" => 10;
> say $bag.roll(10);
> say $bag.pick(*).join(" ");
blue blue blue blue blue blue red blue red blue
blue red blue blue red blue blue blue blue blue blue blue

This wouldn’t be too hard to emulate using a normal Array, but this version would be:

> $bag = bag "red" => 20000000000000000001, "blue" => 100000000000000000000;
> say $bag.roll(10);
> say $bag.pick(10).join(" ");
blue blue blue blue red blue red blue blue blue
blue blue blue red blue blue blue red blue blue

They also work with all the standard Set operators, and have a few of their own as well. Here’s a simple demonstration:

sub MAIN($file1, $file2) {
    my $words1 = bag slurp($file1).comb(/\w+/).map(*.lc);
    my $words2 = set slurp($file2).comb(/\w+/).map(*.lc);
    my $unique = ($words1 (-) $words2);
    for $unique.list.sort({ -$words1{$_} })[^10] -> $word {
        say "$word: { $words1{$word} }";

Passed two filenames, this makes a Bag from the words in the first file, a Set from the words in the second file, uses the set difference operator (-) to compute the set of words which are only in the first file, sorts those words by their frequency of appearance, and then prints out the top ten.

This is the perfect point to introduce Set. As you might guess from the above, it works much like Bag. Where Bag is a Hash from Any to positive Int, Set is a Hash from Any to Bool::True. Set is immutable, and there is also a mutable KeySet.

Between Set and Bag we have a very rich collection of operators:

Operation Unicode “Texas” Result Type
is an element of (elem) Bool
is not an element of !(elem) Bool
contains (cont) Bool
does not contain !(cont) Bool
union (|) Set or Bag
intersection (&) Set or Bag
set difference (-) Set
set symmetric difference (^) Set
subset (<=) Bool
not a subset !(<=) Bool
proper subset (<) Bool
not a proper subset !(<) Bool
superset (>=) Bool
not a superset !(>=) Bool
proper superset (>) Bool
not a proper superset !(>) Bool
bag multiplication (.) Bag
bag addition (+) Bag

Most of these are self-explanatory. Operators that return Set promote their arguments to Set before doing the operation. Operators that return Bag promote their arguments to Bag before doing the operation. Operators that return Set or Bag promote their arguments to Bag if at least one of them is a Bag or KeyBag, and to Set otherwise; in either case they return the type promoted to.

Please note that while the set operators have been in Niecza for some time, they were only added to Rakudo yesterday, and only in the Texas variations.

A bit of a word may be needed for the different varieties of unions and intersections of Bag. The normal union operator takes the max of the quantities in either bag. The intersection operator takes the min of the quantities in either bag. Bag addition adds the quantities from either bag. Bag multiplication multiplies the quantities from either bag. (There is some question if the last operation is actually useful for anything — if you know of a use for it, please let us know!)

> my $a = bag <a a a b b c>;
> my $b = bag <a b b b>;

> $a (|) $b;
bag("a" => 3, "b" => 3, "c" => 1)

> $a (&) $b;
bag("a" => 1, "b" => 2)

> $a (+) $b;
bag("a" => 4, "b" => 5, "c" => 1)

> $a (.) $b;
bag("a" => 3, "b" => 6)

I’ve placed my full set of examples for this article and several data files to play with on Github. All the sample files should work on the latest very latest Rakudo from Github; I think all but and should work with the latest proper Rakudo releases. Meanwhile those two scripts will work on Niecza, and with any luck I’ll have the bug stopping the rest of the scripts from working there fixed in the next few hours.

A quick example of getting the 10 most common words in Hamlet which are not found in Much Ado About Nothing:

> perl6 bin/ data/Hamlet.txt data/Much_Ado_About_Nothing.txt
ham: 358
queen: 119
hamlet: 118
hor: 111
pol: 86
laer: 62
oph: 58
ros: 53
horatio: 48
clown: 47

Day 12 – Exceptions

Sometimes things go horribly wrong, and the only thing you can do is not to go on. Then you throw an exception.

But of course the story doesn’t end there. The caller (or the caller’s caller) must somehow deal with the exception. To do that in a sensible manner, the caller needs to have as much information as possible.

In Perl 6, exceptions should inherit from the type Exception, and by convention they go into the X:: namespace.

So for example if you write a HTTP client library, and you decide that an exception should be thrown when the server returns a status code starting with 4 or 5, you could declare your exception class as

 class X::HTTP is Exception {
     has $.request-method;
     has $.url;
     has $.status;
     has $.error-string;

     method message() {
         "Error during $.request-method request"
         ~ " to $.url: $.status $.error-string";

And throw an exception as

     request-method  => 'GET',
     url             => '',
     status          => 404,
     error-string    => 'Not found',

The error message then looks like this:

 Error during GET request to 404 Not found

(line wrapped for the benefit of small browser windows).

If the exception is not caught, the program aborts and prints the error message, as well as a backtrace.

There are two ways to catch exceptions. The simple Pokemon style “gotta catch ’em all” method catches exception of any type with try:

 my $result = try do-operation-that-might-die();
 if ($!) {
     note "There was an error: $!";
     note "But I'm going to go on anyway";

Or you can selectively catch some exception types and handle only them, and rethrow all other exceptions to the caller:

 my $result =  do-operation-that-might-die();
     when X::HTTP {
         note "Got an HTTP error for URL $_.url()";
         # do some proper error handling
     # exceptions not of type X::HTTP are rethrown

Note that the CATCH block is inside the same scope as the one where the error might occur, so that by default you have access to all the interesting varibles from that scope, which makes it easy to generate better error messages.

Inside the CATCH block, the exception is available as $_, and is matched against all when blocks.

Even if you don’t need to selectively catch your exceptions, it still makes sense to declare specific classes, because that makes it very easy to write tests that checks for proper error reporting. You can check the type and the payload of the exceptions, without having to resort to checking the exact error message (which is always brittle).

But Perl 6 being Perl, it doesn’t force you to write your own exception types. If you pass a non-Exception objects to die(), it simply wraps them in an object of type X::AdHoc (which in turn inherits from Exception), and makes the argument available with the payload method:

    sub I-am-fatal() {
        die "Neat error message";
    try I-am-fatal();
    say $!;             # Neat error message;
    say $!.perl;        # => "Neat error message")

To find out more about exception handling, you can read the documentation of class Exception and Backtrace.

Day 11 – Parrot threads

Editors note: I, rurban, does know almost nothing about threads. Any errors are probably mine. I just tested them, fixed some deadlocks, added the numcpu code and merged the threads branch to master.

Parrot now supports fast and lightweight OS threads, based on Nat Tucks’s initial GSoC work together with Andrew “whiteknight” Whitworth on green threads and finally Stefan Seifert’s extension to true parallel OS threads as hybrid threads.
See and


Parrot always supported “threads”, i.e. concurrency models over the last years, but we identified various problems with the particular designs and were continuously improving them. In our case without changing the API too much, as the pdd25 concurrency spec is pretty high level, describing the various models parrot should support, and also pretty low-level in describing the two PMC’s which export the threads API, the Task and the Scheduler classes.

Being born at a time when Perl 6 still looked much more similar to Perl 5 than it does nowadays, Parrot’s threading support initially was very close to Perl’s ithreads model. Previous attempts to change this into the more conventional model of data shared by default or implementing new technologies like STM “Software Transactional Memory” failed. For example Parrot has never supported running multiple threads and having garbage collection at the same time.

In the year 2005 development of faster Central Processing Units (CPUs) shifted from increased speed of a single core to adding more cores. Modern processors contain up to 12 cores with even mobile phones having up to four. To utilize a modern CPU’s power, code needs to be run in parallel. In UNIX (and thus Perl) tradition, this is accomplished using multiple processes being a good solution for many use cases. For many others like auto threading of hyper operators in Perl 6, the cost of process setup and communication would be prohibitively high except for very large data sets.

During the years of back and forth and failed attempts of adding threading support to Parrot, the Perl 6 specification evolved to a point where the largest parts of the language were covered and its features were implemented in the compilers. The lack of concurrency primitives in Parrot however prevents any progress in the area of concurrency support.


Green threads were used to simplify the implementation of a nearly lock free multithreading implementation. 

Parrot supports now native Win32 threads and POSIX threads. Win32 alarm, sleep and premption is unified with POSIX, it is handled on a common timer thread.

Parrot creates at startup a thread pool of --numthreads threads, which defaults to the number of available CPU cores. Activating a new thread at runtime causes no run-time penalties, until the number of cores is utilized. When a user starts a new task, the scheduler first looks for an idle thread. If one can be found, the task is scheduled on the thread’s interpreter. If more tasks are started than the maximum number of threads, the tasks are distributed evenly among the running interpreters. This is effectively an implementation of the N:M threading model.

Green threads

Our GSOC student Nan “Chandor” Tuck worked in summer 2010 on green threads.

What I have working now is a pre-emptively scheduled green threads system for Parrot that allows programs to be written in a concurrent style. Individual green threads can do basic blocking file input without stopping other threads from running. These logical threads are accessed using the Task API that I described a couple weeks ago. This functionality makes Parrot similarly powerful at threading as the standard version of Ruby or Python: Threads can do pretty much everything except run at the same time.

What was missing from this green threads branch was true parallel execution in OS threads, one global_interpreter structure that is shared and protected by locks or other concurrent access rules and many local_interpreters that run simultaneously in separate OS threads.

OS threads

From Fall 2011 to Summer 2012 Stefan “nine” Seifert implemented true OS threads on top of green threads to finally allow true parallel execution of Tasks, to implement blocking IO, and to give perl6 some more advantages over perl5.

The lightweight “green” threads are used as messages in a system where reading shared variables is allowed but only the one owner thread may write to it. That’s why we call it hybrid threads.

Why is multithreading support so difficult to implement?

Low level programming languages like C provide only the bare necessities, leaving the responsibility for preventing data corruption and synchronization entirely to the user. A high-level language like Perl 6 on the other hand provides complex and compound data types, handles garbage collection and a very dynamic object system. Even seemingly simple things like a method call can become very complex. In a statically typed programming language the definition of a class is immutable. Thus, calling a method on an object contains just the steps of determining the object’s class, fetching the required method from this class and calling it. Calling the same method again can then even omit the first two steps since their results cannot change.

In a dynamic language, the object may change its class at runtime. The inheritance hierarchy of the class may be changed by adding or removing parent classes. Methods may be added to or removed from classes (or objects) at runtime and even the way how to find a method of a class may change. So a simple method call results in the following steps:

    ·  determining the class of the object,
    ·  determining the method resolution method of the class,
    ·  finding the actual method to call,
    ·  calling the method.

These steps have to be repeated for every method call, because the results may change any time. In a threaded environment, a thread running in parallel may change the underlying data and meta data in between those sequences and even between those steps. As a consequence, this meta data has to be protected from corruption introducing the need for locks in a performance critical area.

Many interpreters for dynamic languages like Python or Ruby handle this problem by using a global interpreter lock (GIL) to effectively serialize all operations. This is a proven and reliable way but leaves much of the hardware’s potential unused.


In Java, the user is responsible for preventing concurrency issues. The language provides synchronization primitives like mutexes, but the interpreter (the Java Virtual Machine, JVM) does not protect the consistency of the provided data structures. The class library provides the user with high-level data structures explicitly designed for multithreaded scenarios.

Java version 1.1 used green threads to support multithreaded execution of Java programs. Green threads are threads simulated by the virtual machine (VM) but unable to use more than one CPU core for processing. Version 1.2 introduced native Operating System (OS) threading support which since has become the standard way to do multithreading in Java.


The CPython implementation of the Python runtime uses a Global Interpreter Lock (GIL) to protect its internal consistency. This is a single lock taken whenever the interpreter executes Python bytecode. Because of this lock, only one thread can execute bytecode at any time so all built-in types and the object model are implicitly type safe. The drawback is that Python code cannot benefit from having multiple CPU cores available. However I/O operations and calls to external libraries are executed without holding the GIL, so in applications with multiple I/O bounded threads, there may still be a performance benefit from using multithreading.

To run Python code in parallel, multiple processes have to be used. The multiprocessing module provides support for spawning processes exposed through an API similar to the threading module. Since processes may not directly access other processes’ memory, the multiprocessing module provides several means of communication between processes: Queues, Pipes and shared memory support.


Much of Parrot’s previous threading related code has been removed to clean up the code and improve performance. Since the existing threading support was known to be unreliable and seriously flawed, this was no trade off. The final parts were removed by the merging of the kill_threads branch on September, 21st 2011.

In 2010, Nat Tuck began working on a green_threads branch during his Google Summer of Code internship. The feature got prototyped using pure PIR and then implemented in Parrot’s core. He got it to work in simple cases and started to work on OS thread support but the internship ended before the code was ready to be merged into the master branch. The code lay dormant until the work on hybrid threads in the threads branch started in 2011.

In Parrot, green threads are called Tasks. Each task is assigned a fixed amount of execution time. After this time is up a timer callback sets a flag which is checked at execution of every branch operation. Since the interpreter’s state is well defined at this point, its internal consistency is guaranteed. The same holds for the GC. Since task preemption is only done while executing user-level code, the GC can do its work undisturbed and without the need for measures like locking. Since user-level code is allowed to disable the scheduler, it can be guaranteed to run undisturbed through critical sections.

The scheduler is implemented as a PMC type. This allows the user to subclass this PMC thus allowing fine-grained control over the scheduling policy. Features, a user could add this way would be for example giving different priorities to tasks or implementing the possibility to suspend and resume a task.

Shared data

Cross-thread writes to shared variables may endanger the internal consistency of the interpreter. Traditionally, the solution to this problem is the use of locks of varying granularity. Fine-grained locking allows code to run in parallel but taking and releasing locks costs performance. It not only increases the instruction count and memory accesses but it also forces the CPU cores to coordinate and thus communicate. Even a seemingly simple operation like an atomic increment can take two orders of magnitude longer than a normal increment. While the gain through being able to utilize multiple CPU cores may offset this cost, it is still impacting the common case of having only a single thread running.

Too coarse locking on the other hand would reduce scalability and the performance gains through parallel execution by having threads wait for extended periods for locks to become available. In the extreme case of having a global interpreter lock it would effectively serialize all computations costing much of the benefits of using threads in the first place.

The other problem with locking is the possibility of introducing deadlocks. For example, two functions F1 and F2 both use two resources A and B protected by locks. If F1 first locks A and then tries to lock B while F2 has already locked B and is now trying to lock A, the program would come to a halt. Both functions would be left waiting for the other to unlock the resource which will never happen. With fine-grained locking, the possibilities for such bugs grow quickly. At the same time, it is easy to miss a case where a lock would be appropriate leading to difficult to diagnose corruption bugs.

The solution for these problems implemented in hybrid threads is to sidestep them altogether by disallowing write access to shared variables. The programmer (or in most cases the compiler) is obliged to declare a list of all shared variables before a newly created task is started. The interpreter would then create proxy objects for these variables which the task can use to access the data. These proxies contain references to the original objects. They use these references to forward all reading vtable functions to the originals. Write access on the other hand would lead to a runtime error.

In other words, all data is owned by the thread creating it and only the owner may write to it. Other threads have only read access.

For threads to be able to communicate with their creators and other threads, they still need to write to shared variables. This is where green threads come into play. Since green threads are light weight, it is feasible for a thread to create a task just for updating a variable. This task is scheduled on the interpreter owning this variable. To reduce latency, the task is flagged to run immediately. The data-owning interpreter will preempt the currently running task and process the new write task. Put another way, the data-owning interpreter is told what to write to its variables, so other threads don’t have to.


Proxies are the arbiters between threads. They are the only means for a thread to access another thread’s data and are implemented by the Proxy PMC type.

Proxy has default implementations for all functions, writing functions raise a cant_do_write_method runtime exception.  If a method returns a PMC from the target’s interp, another proxy object has to be created and wrapped around it so it can be safely returned to the caller.


The Sub PMC represents executable subroutines. A Sub does not only contain the code to execute but also the context in which to execute the code such as visible globals and namespaces.  If a proxy to such a Sub were created and invoke called on it, the code would access this context directly since it belongs to the same interp as the proxied Sub itself.  Thus, an operation like get_global fetches a global from an unproxied namespace and an unproxied global is be put into the target register.  Since this is happening while running invoke on the original Sub, Proxy cannot intercept the call and create a Proxy for the result.

This is the reason why Parrot_thread_create_proxy does not create a Proxy for a Sub but uses Parrot_thread_create_local_sub to create a copy on the thread’s interp with proxies for all PMC attributes.

Writing to shared variables

To write to shared variables, a thread creates a task and schedules it on the data owning interpreter. An example task looks like this:

    .sub write_to_variable
         .param pmc variable
         variable = 1

This is a subroutine with just one parameter. The variable passed as this parameter is the one the task should write to. In this case the constant value 1 would be written to the variable. In PIR, an assignment to a PMC gets translated to a method call. In this case, the set_integer_native is called changing the variable’s value. Since PMCs are passed by reference, it is the original variable which gets written to.

Code to create the task looks like:

    1    write_task = new ['Task']
    2    setattribute write_task, 'code', write_to_variable
    3    setattribute write_task, 'data', shared_variable
    4    interp.'schedule_proxied'(write_task, shared_variable)

Line 1 creates a new task object. The example subroutine is used for the task’s code attribute. shared_variable is used for data. At this point, shared_variable is actually the proxy object created for the shared integer PMC. The interpreter object contains a schedule_proxied method which is used to schedule the write_task on the thread owning the original variable.

schedule_proxied uses Parrot_thread_create_local_task which in this case detects that the data given as parameter for the task’s code is actually a proxy already and unwraps the proxied object. Parrot_cx_schedule_immediate is then used to make the data owning interpreter execute the task as soon as possible.

To protect a critical section, preemption can be disabled so the critical section runs uninterrupted:

    1 .sub swap_variables
    2     .param pmc a, b
    3     .local temp
    4     disable_preemption
    5     temp = a
    6     a = b
    7     b = temp
    8     enable_preemption
    9 .end


Using tasks to write to shared variables makes such actions inherently asynchronous. This is not always what is needed by the implemented algorithm. For example, when the shared variable is a lock, processing should continue as soon as it’s acquired. The wait operation is used to wait for a task’s completion. The waiting task is added to the waited for task’s waiters list and preempted immediately. When a task finishes, all the tasks in the waiters list are scheduled again for execution. Since for each task a local copy is created on the target thread, the running task not only checks its own waiters list but also its partner’s.

If a task on the main thread was waiting for a task on another thread to finish and no other tasks are in the scheduler’s queue on the main thread, the main thread exits if no alarms are pending. To prevent this unintended exit, all tasks are added to the scheduler’s foreign_tasks list when they are scheduled on other threads. To end the program with other threads still running, an explicit exit operation has to be used.


Preliminary benchmarks have shown Parrot’s performance to be within an order of magnitude of that of an optimized implementation in Perl 5.

Since Parrot does not yet offer the user any synchronization primitives, locks had to be implemented using a shared variable which is written to only by the main thread. Replacing this primitive method with a native semaphore implementation would probably reduce runtime to a small fraction.

Runtime comparison for matrix multiplication

                singlethreaded  computation      multithreaded   computation
    1. run          28.522 s       19.530 s        17.543 s          8.478 s
    2. run          28.427 s       19.463 s        17.320 s          8.283 s
    3. run          28.200 s       19.235 s        17.489 s          8.473 s
    average         28.383 s       19.409 s        17.451 s          8.411 s

This test implements matrix multiplication using four threads. For simplicity the second matrix only has one column. The program is written in the Winxed programming language. Winxed is a low-level language with Javascript like syntax and the possibility to include sections of PIR code verbatim making it possible to try experimental opcodes while writing more readable and concise code than with PIR alone. The complete source code is available in examples/threads/matrix_part.winxed

The program consists of the parts initialization, computation and verification. Computation is parallelized using four tasks each calculating one fourth of the result vector. Runtime is compared to a simple singlethreaded implementation. Run times were measured using the time command and are recorded in the above table.

As can be seen, the multithreaded implementation gives an average speedup of 2.31 for the computation and 1.61 in total.

Runtime comparison for Mandelbrot set calculation

                 singlethreaded  1 thread    2 threads   4 threads    8 threads
    1. run           89.931 s    89.978 s    45.813 s     24.028 s     17.445 s
    2. run           89.707 s    89.871 s    45.906 s     24.048 s     17.695 s
    3. run           90.318 s    89.839 s    45.951 s     24.049 s     17.573 s
    average          89.985 s    89.896 s    45.890 s     24.042 s     17.571 s
    speedup           1.000        1.001       1.959       3.739        5.116

The complete source code is available in examples/pir/mandel.pir

Calculating an image of the Mandelbrot set is a common benchmark for multithreading implementations since calculations of points are independent of each other and are thus easily parallelizable. A simple implementation of the escape time algorithm written in Winxed has been used to determine scalability properties of the threading implementation. The image is split into lines which are calculated alternatedly by a configured number of tasks. Run times were measured using the time command on an Intel Core i7 3770K processor with 16 GiB RAM running openSUSE 12.1 and are recorded in the Mandelbrot table. As can be seen, the implementation scales nearly linearly up to four threads reflecting the CPU’s four physical cores. Using eight threads, the speedup is only 1.368 compared to four threads but this seems to be more a limitation of the hardware than the implementation.


On IRC and on the mailing list some detailed questions have been asked.

See here: