Day 21 – Collatz Variations

The Collatz sequence is one of those interesting “simple” math problems that I’ve run into a number of times. Most recently a blog post on programming it in Racket showed up on Hacker News. As happens so often, I instantly wanted to implement it in Perl 6.

sub collatz-sequence(Int $start) { 
    $start, { when * %% 2 { $_ / 2 }; when * !%% 2 { 3 * $_ + 1 }; } ... 1;

sub MAIN(Int $min, Int $max) {
    say [max] ($min..$max).map({ +collatz-sequence($_) });        

This is a very straightforward implementation of the Racket post’s max-cycle-length-range as a stand-alone p6 script. collatz-sequence generates the sequence using the p6 sequence operator. Start with the given number. If it is divisible by two, do so: when * %% 2 { $_ / 2 }. If it is not, multiply by three and add 1: when * !%% 2 { 3 * $_ + 1 }. Repeat this until the sequence reaches 1.

MAIN(Int $min, Int $max) sets up our main function to take two integers. Many times I don’t bother with argument types in p6, but this provides a nice feedback for users:

> perl6 blue red
Usage: <min> <max> 

The core of it just maps the numbers from $min to $max (inclusive) to the length of the sequence (+collatz-sequence) and then says the max of the resulting list ([max]).

Personally I’m a big fan of using the sequence operator for tasks like this; it directly represents the algorithm constructing the Collatz sequence in a simple and elegant fashion. On the other hand, you should be able to memoize the recursive version for a speed increase. Maybe that would give it an edge over the sequence operator version?

Well, I was wildly wrong about that.

sub collatz-length($start) {
    given $start {
        when 1       { 1 }
        when * !%% 2 { 1 + collatz-length(3 * $_ + 1) } 
        when * %% 2  { 1 + collatz-length($_ / 2) } 

sub MAIN($min, $max) {
    say [max] ($min..$max).map({ collatz-length($_) });        

This recursive version, which makes no attempt whatsoever to be efficient, is actually better than twice as fast as the sequence operator version. In retrospect, this makes perfect sense: I was worried about the recursive version making a function call for every iteration, but the sequence version has to make two, one to calculate the next iteration and the other to check and see if the ending condition has been reached.

Well, once I’d gotten this far, I thought I’d better do things correctly. I wrote two framing scripts, one for timing all the available scripts, the other for testing them to make sure they work!

my @numbers = 1..200, 10000..10200;

sub MAIN(Str $perl6, *@scripts) {
    my %results;
    for @scripts -> $script {
        my $start = now;
        qqx/$perl6 $script { @numbers }/;
        my $end = now;

        %results{$script} = $end - $start;

    for %results.pairs.sort(*.value) -> (:key($script), :value($time)) {
        say "$script: $time seconds";

This script takes as an argument a string that can be used to call a Perl 6 executable and a list of scripts to run. It runs the scripts using the specified executable, and times them using p6’s now function. It then sorts the results into order and prints them. (A similar script I won’t post here tests each of them to make sure they are returning correct results.)

In the new framework, the Collatz script has changed a bit. Instead of taking a min and a max value and finding the longest Collatz sequence generated by a number in that range, it takes a series of numbers and generates and reports the length of the sequence for each of them. Here’s the sequence operator script in its full new version:

sub collatz-length(Int $start) { 
    +($start, { when * %% 2 { $_ / 2 }; when * !%% 2 { 3 * $_ + 1 }; } ... 1);

sub MAIN(*@numbers) {
    for @numbers -> $n {
        say "$n: " ~ collatz-length($n.Int);

For the rest of the scripts I will skip the MAIN sub, which is exactly the same in each of them.

Framework established, I redid the recursive version starting from the new sequence operator code.

sub collatz-length(Int $n) {
    given $n {
        when 1       { 1 }
        when * %% 2  { 1 + collatz-length($_ div 2) }
        when * !%% 2 { 1 + collatz-length(3 * $_ + 1) }

The sharp-eyed will notice this version is different from the first recursive version above in two significant ways. This time I made the argument Int $n, which instantly turned up a bit of a bug in all implementations thus far: because I used $_ / 2, most of the numbers in the sequence were actually rationals, not integers! This shouldn’t change the results, but is probably less efficient than using Ints. Thus the second difference about, it now uses $_ div 2 to divide by 2. This version remains a great improvement over the sequence operator version, running in 4.7 seconds instead of 13.3. Changing when * !%% 2 to a simple default shaves another .3 seconds off the running time.

Once I started wondering how much time was getting eaten up by the when statements, rewriting that bit using the ternary operator was an obvious choice.

sub collatz-length(Int $start) { 
    +($start, { $_ %% 2 ?? $_ div 2 !! 3 * $_ + 1 } ... 1);

Timing results: Basic sequence 13.4 seconds. Sequence with div 11.5 seconds. Sequence with div and ternary 9.7 seconds.

That made me wonder what kind of performance I could get from a handcoded loop.

sub collatz-length(Int $n is copy) {
    my $length = 1;
    while $n != 1 {
        $n = $n %% 2 ?? $n div 2 !! 3 * $n + 1;

That’s by far the least elegant of these, I think, but it gets great performance: 3 seconds.

Switching back to the recursive approach, how about using the ternary operator there?

sub collatz-length(Int $n) {
    return 1 if $n == 1;
    1 + ($n %% 2 ?? collatz-length($n div 2) !! collatz-length(3 * $n + 1));

This one just edges out the handcoded loop, 2.9 seconds.

Can we do better than that? How about memoization? is cached is supposed to be part of Perl 6; neither implementation has it yet, but last year’s Advent calendar has a Rakudo implementation that still works. Using the last version changed to sub collatz-length(Int $n) is cached { works nicely, but takes 3.4 seconds to execute. Apparently the overhead of caching slows it down a bit. Interestingly, the non-ternary recursive version does speed up with is cached, from 4.4 seconds to 3.6 seconds.

Okay, instead of using a generic memoization, how about hand-coding one?

sub collatz-length(Int $n) {
    return 1 if $n == 1;
    state %lengths;
    return %lengths{$n} if %lengths{$n}:exists;
    %lengths{$n} = 1 + ($n %% 2 ?? collatz-length($n div 2) !! collatz-length(3 * $n + 1));

Bingo! 2.7 seconds.

I’m sure there are lots of other interesting approaches for solving this problem, and encourage people to send them in. In the meantime, here’s my summary of results so far:

Script Rakudo Niecza
bin/ 2.5 1.7
bin/ 3 1.7
bin/ 3.1 1.7
bin/ 3.2 N/A
bin/ 3.5 N/A
bin/ 4.4 1.8
bin/ 4.9 1.9
bin/ 9.9 3.3
bin/ 11.6 3.5
bin/ 13.5 3.8

The table was generated from

Day 13 – Bags and Sets

Over the years, I’ve written many variations on this code:

my %words;
for slurp.comb(/\w+/).map(*.lc) -> $word {

(Aside: slurp.comb(/\w+/).map(*.lc) does the standard Perl trick of reading files specified on the command line or standard in, goes through the data for words, and makes them lowercase.)

Perl 6 introduces two new Associative types for dealing with this sort of functionality. KeyBag is drop-in replacement for Hash in this sort of case:

my %words :=;
for slurp.comb(/\w+/).map(*.lc) -> $word {

Why would you prefer KeyBag over Hash in this case, considering that it’s a bit more code? Well, it does a better job of saying what you mean, if what you want is a positive Int-valued Hash. It actually enforces this as well:

> %words{"the"} = "green";
Unhandled exception: Cannot parse number: green

That’s Niecza’s error; Rakudo’s is less clear, but the important point is you get an error; Perl 6 detects that you’ve violated your contract and complains.

And KeyBag has a couple more tricks up its sleeve. First, four lines to initialize your KeyBag isn’t terribly verbose, but Perl 6 has no trouble getting it down to one line:

my %words :=\w+/).map(*.lc)); does its best to turn whatever it is given into the contents of a KeyBag. Given a List, each of the elements is added to the KeyBag, with the exact same result of our earlier block of code.

If you don’t need to modify the bag after its creation, then you can use Bag instead of KeyBag. The difference is Bag is immutable; if %words is a Bag, then %words{$word}++ is illegal. If immutability is okay for your application, then you can make the code even more compact:

my %words := bag slurp.comb(/\w+/).map(*.lc);

bag is a helper sub that just calls on whatever you give it. (I’m not sure why there is no equivalent keybag sub.)

Bag and KeyBag have a couple more tricks up their sleeve. They have their own versions of .roll and .pick which weigh their results according to the given values:

> my $bag = bag "red" => 2, "blue" => 10;
> say $bag.roll(10);
> say $bag.pick(*).join(" ");
blue blue blue blue blue blue red blue red blue
blue red blue blue red blue blue blue blue blue blue blue

This wouldn’t be too hard to emulate using a normal Array, but this version would be:

> $bag = bag "red" => 20000000000000000001, "blue" => 100000000000000000000;
> say $bag.roll(10);
> say $bag.pick(10).join(" ");
blue blue blue blue red blue red blue blue blue
blue blue blue red blue blue blue red blue blue

They also work with all the standard Set operators, and have a few of their own as well. Here’s a simple demonstration:

sub MAIN($file1, $file2) {
    my $words1 = bag slurp($file1).comb(/\w+/).map(*.lc);
    my $words2 = set slurp($file2).comb(/\w+/).map(*.lc);
    my $unique = ($words1 (-) $words2);
    for $unique.list.sort({ -$words1{$_} })[^10] -> $word {
        say "$word: { $words1{$word} }";

Passed two filenames, this makes a Bag from the words in the first file, a Set from the words in the second file, uses the set difference operator (-) to compute the set of words which are only in the first file, sorts those words by their frequency of appearance, and then prints out the top ten.

This is the perfect point to introduce Set. As you might guess from the above, it works much like Bag. Where Bag is a Hash from Any to positive Int, Set is a Hash from Any to Bool::True. Set is immutable, and there is also a mutable KeySet.

Between Set and Bag we have a very rich collection of operators:

Operation Unicode “Texas” Result Type
is an element of (elem) Bool
is not an element of !(elem) Bool
contains (cont) Bool
does not contain !(cont) Bool
union (|) Set or Bag
intersection (&) Set or Bag
set difference (-) Set
set symmetric difference (^) Set
subset (<=) Bool
not a subset !(<=) Bool
proper subset (<) Bool
not a proper subset !(<) Bool
superset (>=) Bool
not a superset !(>=) Bool
proper superset (>) Bool
not a proper superset !(>) Bool
bag multiplication (.) Bag
bag addition (+) Bag

Most of these are self-explanatory. Operators that return Set promote their arguments to Set before doing the operation. Operators that return Bag promote their arguments to Bag before doing the operation. Operators that return Set or Bag promote their arguments to Bag if at least one of them is a Bag or KeyBag, and to Set otherwise; in either case they return the type promoted to.

Please note that while the set operators have been in Niecza for some time, they were only added to Rakudo yesterday, and only in the Texas variations.

A bit of a word may be needed for the different varieties of unions and intersections of Bag. The normal union operator takes the max of the quantities in either bag. The intersection operator takes the min of the quantities in either bag. Bag addition adds the quantities from either bag. Bag multiplication multiplies the quantities from either bag. (There is some question if the last operation is actually useful for anything — if you know of a use for it, please let us know!)

> my $a = bag <a a a b b c>;
> my $b = bag <a b b b>;

> $a (|) $b;
bag("a" => 3, "b" => 3, "c" => 1)

> $a (&) $b;
bag("a" => 1, "b" => 2)

> $a (+) $b;
bag("a" => 4, "b" => 5, "c" => 1)

> $a (.) $b;
bag("a" => 3, "b" => 6)

I’ve placed my full set of examples for this article and several data files to play with on Github. All the sample files should work on the latest very latest Rakudo from Github; I think all but and should work with the latest proper Rakudo releases. Meanwhile those two scripts will work on Niecza, and with any luck I’ll have the bug stopping the rest of the scripts from working there fixed in the next few hours.

A quick example of getting the 10 most common words in Hamlet which are not found in Much Ado About Nothing:

> perl6 bin/ data/Hamlet.txt data/Much_Ado_About_Nothing.txt
ham: 358
queen: 119
hamlet: 118
hor: 111
pol: 86
laer: 62
oph: 58
ros: 53
horatio: 48
clown: 47

Day 17: Gtk Mandelbrot

Two years ago today, the Advent post was on making Mandelbrot sets in Perl 6. At the time, they were in black and white, slow to produce, Rakudo was prone to crashing, and the only user interface thing you could control was how big the resulting PPM file was.

As they say, that was then. This is now.

Full Mandelbrot set

The new script is 423 lines of Perl 6 code — targeted at Niecza, threaded, and using the GtkSharp library. It allows you to move and resize the windows, zoom in (left mouse button, drag across image to define zoom boundaries), create Julia set images (right click on a Mandelbrot set image), increase the number of iterations (press ‘m’), and output a PPM file for a window (press ‘s’).

The threading doesn’t actually improve performance on my MacBook Pro (still looking into why) but it does make the script much more responsive.

It would be far too long to go through all the code, but lets hit on the highlights. The core is almost unchanged:

        sub julia(Complex $c, Complex $z0) {
            my $z = $z0;
            my $i;
            loop ($i = 0; $i < $max_iterations; $i++) {
                if $z.abs > 2 {
                    return $i + 1;
                $z = $z * $z + $c;
            return 0;

It’s named julia instead of mandel now, because it is more general. If you call it with $z0 set to 0, it calculates the same thing the old mandel did. Allowing $z0 to vary allows you to calculate Julia sets as well.

The code around it is very different, though! Stefan O’Rear wrote the threading code, using Niecza’s Threads library, which is a thin wrapper on C#’s threading libraries, and probably not very close to what Perl 6’s built-in threading will look like when it is ready to go. He establishes a WorkQueue with a list of the work that needs to be done, and then starts N running threads, where N comes from the environment variable THREADS if it is present, and the reported processor count otherwise:

for ^(%*ENV<THREADS> // CLR::System::Environment.ProcessorCount) {{ });
} is pretty simple:

    method run() {
        loop {
            my $item = self.shift;
            next if $item.cancelled;

This is an infinite loop that starts by getting the next WorkItem off the queue, checks to see if it has been cancelled, and if it hasn’t, calls the .run Callable attribute and then the mark-done method.

The WorkItems on the queue look like this:

class WorkItem {
    has Bool $!done = False;
    has Bool $!cancelled = False;

    has Callable &.run;
    has Callable &.done-cb;

    method is-done() { WorkQueue.monitor.lock({ $!done }) }
    method mark-done() {
        &.done-cb.() unless WorkQueue.monitor.lock({ $!done++ })

    method cancelled() { WorkQueue.monitor.lock({ $!cancelled }) }
    method cancel() { WorkQueue.monitor.lock({ $!cancelled = True }) }

Each WorkItem has two flags, $!done and $!cancelled, and two Callable attributes, &.run, already mentioned as what is called by, and &.done-cb, which is the callback function to be called when the &.run method finishes.

The two methods (for now) we use in our WorkItem are relatively simple:

            sub row() {
                my $row = @rows[$y];
                my $counter = 0;
                my $counter_end = $counter + 3 * $width;
                my $c = $ur - $y * $delta * i;

                while $counter < $counter_end {
                    my $value = $is-julia ?? julia($julia-z0, $c) !! julia($c, 0i);
                    $row.Set($counter++, @red[$value % 72]);
                    $row.Set($counter++, @green[$value % 72]);
                    $row.Set($counter++, @blue[$value % 72]);
                    $c += $delta;

            sub done() {
                Application.Invoke(-> $ , $ {
                    $.draw-area.QueueDrawArea(0, $y, $width, 1);

            my $wi = => &row, done-cb => &done);
            push @.line-work-items, $wi;

As you might expect, row calculates one line of the set we are working on. It may look like it is using global variables, but these subs are actually local to the FractalSet.start-work method and the variables are local to it from there. The done invokes a Gtk function noting that a portion of the window needs to be redrawn (namely the portion we just calculated).

The above block of code is called once for each row of the fractal window being generated, which has the effect of queuing up all of the fractal to be handled as there are available threads.

Moving upward in the code’s organization, each fractal window we generate is managed by an instance of the FractalSet class.

class FractalSet {
    has Bool $.is-julia;
    has Complex $.upper-right;
    has Real $.delta;
    has Int $.width;
    has Int $.height;
    has Int $.max_iterations;
    has Complex $.c;
    has @.rows;
    has @.line-work-items;
    has $.new-upper-right;

    has $.draw-area;

$.is-julia and $.max_iterations are self-explanatory. $.upper-right is the fixed complex number anchoring the image. $.delta is the amount of change in the previous number per-pixel; we assume the pixels are square. $.width and $.height are the size of the window in pixels. $.c only has meaning for Julia sets, where it is the value $c in the equation $new-z = $z * $z + $c. @.rows the pixel information generated by the row sub above; @.line-work-items saves a reference to all of the WorkItems generating those rows. $.new-upper-right is temporary used during the zoom mouse operation. $.draw-area is the Gtk.DrawingArea for the related window.

Once all that is set up, the rest of the code is pretty straightforward. The Gtk windowing code is set up in

    method build-window()
        my $index = +@windows;

        my $window =$.is-julia ?? "julia $index" !! "mandelbrot $index");
        $window.Resize($.width, $.height);  # TODO: resize at runtime NYI

        my $event-box =;

        my $drawingarea = $.draw-area =;


We store a global array @windows tracking all the FractalSets in play. Each of the different objects here gets the "Id" data set to this set’s index into the @windows array so we can easily look up the FractalSet from callback functions. The rest of the method is just plugging the right callback into each component — simple conceptually but it took this Gtk novice a lot of work figuring it all out.

As an example, consider the KeyReleaseEvent callback, which responds to presses on the keyboard.

sub KeyReleaseEvent($obj, $args) {
    my $index = $obj.GetData("Id").ToInt32();
    my $set = @windows[$index];
    given $args.Event.Key {
        when 'm' | 'M' {
        when 's' | 'S' {

First we lookup the index into @windows, then we get the $set we’re looking at. Then we just call the appropriate FractalSet method, for instance

    method increase-max-iterations() {
        $.max_iterations += 32;

.stop-work cancels all the pending operations for this FractalSet, then we bump up the number of iterations, and then we .start-work again to queue up a new set of rows with the new values.

The full source code is here. As of this writing it agrees with the code here, but this is an active project, and probably will change again in the not-too-distant future. Right now my biggest goals are figuring out how to get the threading to actually improve performance on my MacBook Pro and cleaning up the code. Both suggestions and questions are welcome.

Day 9: Contributing to Perl 6

This time, instead of sharing some cool feature of Perl 6, I’d like to talk about how easy it is to contribute usefully to the project. So I’m going to walk you through the process of making a change to Niecza. It does require a bit of domain knowledge (which the fine folks on #perl6 will be happy to help you with) but it’s definitely not rocket science. It’s not even particularly deep computer science, for the most part.

A few days ago, Radvendii asked on #perl6 if there was a round function in the core. The correct answer is “There should be one”, and it lead to a couple of bug fixes in Rakudo. But it got me to thinking — is Niecza supporting round (and its relatives ceiling, floor, and truncate) correctly?

Perl 6 has a huge suite of tests to see if an implementation is conforming to the spec, including a file for the round tests, S32-num/rounders.t. My first step then was to check the spectests currently being run by Niecza. Just like in Rakudo, this is stored in a file named t/ So

Wynne:niecza colomon$ grep round t/ 
Wynne:niecza colomon$ 

Okay, clearly we’re not running the S32-num/rounders.t test file. (Note, in case you’re getting confused — the links in this post are to the latest versions of the files, which include all the changes I made writing this post.) That’s a sign that something is not properly supported yet. So let’s go ahead and run it by hand to see what happens. Both Niecza and Rakudo use a fudging process, allowing you to mark the bits of a test file that don’t work yet in a particular compiler. So we need to use a special fudging tool to run the code:

Wynne:niecza colomon$ t/fudgeandrun t/spec/S32-num/rounders.t
not ok 1 - floor(NaN) is NaN
# /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 16
#    Failed test
#           got: -269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824

That’s followed by about 15 similar errors, then

Unhandled exception: Unable to resolve method truncate in class Num
  at /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 34 (mainline @ 32) 
  at /Users/colomon/tools/niecza/lib/CORE.setting line 2224 (ANON @ 2) 
  at /Users/colomon/tools/niecza/lib/CORE.setting line 2225 (module-CORE @ 58) 
  at /Users/colomon/tools/niecza/lib/CORE.setting line 2225 (mainline @ 1) 
  at <unknown> line 0 (ExitRunloop @ 0) 

Okay, so that’s at least two errors that need fixing.

We’ll go in order here, even though it means tackling what is most likely the most complicated error first. (If you do think this part of the problem is too hard to tackle, please skip ahead, because the last few improvements I made really were incredibly easy to do.) Opening src/CORE.setting, we find the following definition for round:

sub round($x, $scale=1) { floor($x / $scale + 0.5) * $scale }

Okay, so the real problem is in floor:

sub floor($x) { Q:CgOp { (floor {$x}) } }

What the heck does Q:CgOp mean? It means floor is actually implemented in C#. So we open up lib/Builtins.cs and search for floor, eventually finding public static Variable floor(Variable a1). I won’t print the full source code here, because it is on the long side, with a case for each of the different number types. We’re only interested in the floating point case here:

        if (r1 == NR_FLOAT) {
            double v1 = PromoteToFloat(r1, n1);
            ulong bits = (ulong)BitConverter.DoubleToInt64Bits(v1);
            BigInteger big = (bits & ((1UL << 52) - 1)) + (1UL << 52);
            int power = ((int)((bits >> 52) & 0x7FF)) - 0x433;
            // note: >>= has flooring semantics for signed values
            if ((bits & (1UL << 63)) != 0) big = -big;
            if (power > 0) big <<= power;
            else big >>= -power;
            return MakeInt(big);

We don’t actually need to understand how all that works to fix this problem. The important bit is the PromoteToFloat line, which sets v1 to the floating point value which is the input to our floor. If we add a trap right after that, it should fix this bug. A quick C# websearch shows me that Double has member functions IsNaN, IsNegativeInfinity, and IsPositiveInfinity. Looking a bit around the Niecza source shows there is a MakeFloat function for returning floating point values. Let’s try:

if (Double.IsNaN(v1) || Double.IsNegativeInfinity(v1) || Double.IsPositiveInfinity(v1)) {
    return MakeFloat(v1);

One quick call to make later, I can try the test file again:

Wynne:niecza colomon$ t/fudgeandrun t/spec/S32-num/rounders.t
ok 1 - floor(NaN) is NaN
ok 2 - round(NaN) is NaN
ok 3 - ceiling(NaN) is NaN
not ok 4 - truncate(NaN) is NaN
# /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 19
#    Failed test
#           got: -269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824

Progress! Apparently truncate uses a separate method, so we’ll have to fix it separately.

sub truncate($x) { $x.Int }
method Int() { Q:CgOp { (coerce_to_int {self}) } }
    public static Variable coerce_to_int(Variable a1) {
        int small; BigInteger big;
        return GetAsInteger(a1, out small, out big) ?
            MakeInt(big) : MakeInt(small);

Oooo, this is perhaps a little bit trickier. Still a basic variant on the previous method, grabbing boilerplate code from a nearby function:

        int r1;
        P6any o1 = a1.Fetch();
        P6any n1 = GetNumber(a1, o1, out r1);

        if (r1 == NR_FLOAT) {
            double v1 = PromoteToFloat(r1, n1);
            if (Double.IsNaN(v1) || Double.IsNegativeInfinity(v1) || Double.IsPositiveInfinity(v1)) {
                return MakeFloat(v1);

I skipped the HandleSpecial2 bit in the boilerplate, because I’m never quite sure how that works. Luckily, we have the spectests to check and see if I have broken something by doing this.

Now the first 15 tests in rounders.t pass, leaving us with the

Unhandled exception: Unable to resolve method truncate in class Num

error. That should be easy to handle! If we go back to lib/CORE.setting and search for ceiling, we see it appears two times: in the catch-all base class Cool and as a stand-alone sub. If we look at the neighboring subs, we see floor, ceiling, round, and truncate are all defined. If we look in Cool, however, only floor, ceiling, and round defined. That’s the source of our trouble!

The method definitions of the others in Cool are really simple; all they do is forward to the sub versions. It’s very easy to add a truncate that does that:

    method truncate() { truncate self }

And poof! This time when we run rounders.t, we pass all 108 tests.

At this point we’ve got three things left to do. First, now that rounders.t passes, we need to add it to t/ The list of tests there is ordered, so I just find the S32-num section and add S32-num/rounders.t in alphabetical order.

Next I will commit all the changes to my copy of the git repo. (I won’t explain how to do that, there are lots of git tutorials on the web.) Then I run make spectest to make sure I haven’t broken anything with these changes. (Hmm… actually a few TODO passing, bugs elsewhere that this patch has fixed! Oh, and one test broken, but it’s one which we were only passing by accident before, so I won’t feel bad about fudging it.)

Once that is done, you need to send the patch on to the Niecza developers; I believe the easiest way to do this is via github.

I’ve got one more little change to make that popped into my head while I was working on this. One naive way of implementing, say floor would be to convert the input into a floating point value (a Num in Perl 6) and then do Num.floor. That doesn’t work for all numbers, however, as most of the other number types are capable of storing numbers larger than will fit in a standing floating point double. So we probably need tests in the test suite to check for these cases. Let’s add them.

The tests in rounders.t are weirdly organized for my taste. But hey, we can always add our tests at the bottom.

    my $big-int = 1234567890123456789012345678903;
    is $big-int.floor, $big-int, "floor passes bigints unchanged";
    is $big-int.ceiling, $big-int, "ceiling passes bigints unchanged";
    is $big-int.round, $big-int, "round passes bigints unchanged";
    is $big-int.truncate, $big-int, "truncate passes bigints unchanged";

That passes okay in Niecza. (Probably out of courtesy we should check it on Rakudo as well and fudge it appropriately to make sure we’re not breaking their spectest!) We need to remember to add the count of new tests to the plan at the top of the test file. And then we can push that fix to github as well.

In conclusion, contributing to Perl 6 is easy. Anyone who tries writing Perl 6 code and reports problems they have to #perl6 is helping in a very real way. If you can write even fairly simple Perl 6 code, then you can write useful spec tests. It’s only marginally harder to write new methods for the setting in Perl 6. And even when you have to get down and dirty and start dealing with the language the compiler is implemented in, it’s still quite possible to do useful work without any deep understanding of how the compiler works.

Day 1: Catching Up With Perl 6

When we started the Perl 6 Advent Calendar back in 2009, Rakudo was really the only game in town if you wanted to play with Perl 6. But Perl 6 was intended from the start to be a language with multiple implementations, and at the moment there are four different Perl 6 implementations of interest. Because there are so many implementations, I’m not going to give instructions for getting each; instead I’m linking to those instructions.

The most stable and complete implementation is Rakudo Star. This is currently based on the last major revision of Rakudo. It’s been frozen since July, and so lags a bit behind the current Perl 6 spec. It’s slow. But it’s also pretty reliable.

The current Rakudo development version is called “Nom”. It’s full of great improvements over the last Rakudo Star release, notably native types, improved performance, and a much better metamodel. (For example, check out the Grammar::Tracer module, which takes advantage of the new metamodel to add regex tracing in just 44 lines of code.) It’s not quite ready for prime time yet, as it still misses some features that work in Rakudo Star, but progress has been incredible, and it’s quite possible a new Rakudo Star based on Nom will be released during this month.

Stefan O’Rear’s Niecza was just a fledging compiler during last year’s Advent calendar, but it’s a serious contender these days. Built to run on the CLR (.NET and Mono), it is relatively zippy, implements a significant portion of Perl 6, and works easily with existing CLR libraries.

Lastly, ingy and Mäsak have plans afoot to revive Pugs, the original Perl 6 implementation in Haskell. So far they’ve just got it building again on current Haskell compilers, but the long-term goal is to get it running on the spec tests again and bring it closer to the current spec.

Which implementation should you use? If you’re looking for a stable, fairly complete Perl 6, Rakudo Star is it. If you just want to explore the language, try Rakudo Nom — you will probably run into bugs, but it’s significantly more advanced than Rakudo Star, and exposing the bugs is a big help to Rakudo’s development. If you have an idea which would benefit from being able to use CLR libraries, Niecza is fantastic. There’s a handy comparison chart of the different features available.

Personally, I have all three of these installed on my machine, and have different projects underway on each of them.

Finally, please don’t hesitate to ask for help, either in the comments here or on the #perl6 IRC channel on Freenode. The Perl 6 community is very friendly.

Perl 6 Advent Calendar 2011

For the third year in a row, we are going to post something about Perl 6 every day until Christmas. This post will serve as a table of contents for the entire month.

Day 1: Catching Up With Perl 6
Day 2: Grammar::Tracer and Grammar::Debugger
Day 3: Buffers and Binary IO
Day 4: Traits — Meta Data with Character
Day 5: The Flip-Flop operator
Day 6: Tetris on Niecza
Day 7: Adventures in writing a simple grammar profiler
Day 8: Lexicality and Optimizability
Day 9: Contributing to Perl 6
Day 10: Documenting Perl 6
Day 11: Privacy and OOP
Day 12: Exploratory Parsing with Perl 6
Day 13: Bailador — A small Dancer clone
Day 14: Meta-programming: what, why and how
Day 15: Something Exceptional
Day 16: Where Have All The References Gone?
Day 17: Gtk Mandelbrot
Day 18: The view from the inside: using meta-programming to implement Rakudo
Day 19: Abstraction and why it’s good
Day 20: Paired up Hashes
Day 21: Native libraries, native objects
Day 22: Operator overloading, revisited
Day 23: Idiomatic Perl 6
Day 24: Subs are Always Better in multi-ples
Day 25: Merry Christmas!

Day 18 – ABC Module

Instead of focusing on a particular feature of Perl 6 today, I’d like to take you on a quick tour of the ABC module. I think it’s a nice example of some of the strengths of Perl 6.

ABC Notation is a simple text file format designed to make it easy to support musical notation. It’s widely used in the world of traditional dance music because it is very lightweight and more than powerful enough to support notating jigs and reels. Here’s an example in honor of the season:

T:While Shepherds Watched Their Flocks
O:Green's Harbour, Newfoundland
K:A major
E|:[M:5/4] A A/B/ B2 A|B c/B/ A2 A/B/|
[M:6/4]c d/c/ B2 B2|[M:4/4] A3 E|AB AG|
FE FE|AB AG|F2 F2|E2 G2|A/G/ F/E/ DF|
[1 [M:6/4] E C/B,/ A,3 E:|[2 [M:5/4] E C/B,/ A,3|]

I won’t get into the details — here’s a tutorial if you’d like to know more — but the structure of the file is simple. The first section is the header, with general information about the tune. The remainder is the tune itself. This one is a bit more complicated than many because of all the embedded time signature changes, like [M:6/4].

I was always surprised there wasn’t a CPAN module for handling ABC notation, and even seriously considered writing one myself at one point. But parsing ABC is a complicated process, and I gave up in frustration before getting very far.

Enter Perl 6 and its grammars. About 60 lines of simple regexes is all that is needed to parse most of the tunes I am interested in. (Several of the more complicated features of ABC, like lyrics and multi-staff music are not implemented yet.) Here’s a snatch of it:

    regex basenote { <[a..g]+[A..G]> }
    regex octave { "'"+ | ","+ }
    regex accidental { '^' | '^^' | '_' | '__' | '=' }
    regex pitch { <accidental>? <basenote> <octave>? }

Compare that to the model BNF grammar for ABC:

basenote ::= %x43 / %x44 / %x45 / %x46 / %x47 / %x41 / %x42 / %x63 / %x64 / %x65 / %x66 / %x67 / %x61 / %x62 ; CDEFGABcdefgab
octave ::= 1*"'" / 1*"," 
accidental ::= "^" / "^^" / "_" / "__" / "=" 
pitch ::= [accidental] basenote [octave]

It’s clearly a very straightforward translation process.

By default, parsing with a Perl 6 grammar just gives you a basic Match object. Adding an object to specify actions to go with the grammar allows you to easily process the information as it is parsed. A simple example is

    method rest($/) {

Whenever the rest regex fires, it returns a new ABC::Rest object. The constructor is passed the string form of the rest_type regex, and an ABC::Duration object created by the action for note_length.

Speaking of durations, another feature of Perl 6 comes very handy here. Durations are represented exactly using the rational number Rat type. If we didn’t have them available, we’d have to code up something similar by hand in order to handle things like triplets whose durations cannot be exactly represented by a floating point number.

So far there is only one significant application using these tools — the script included in the ABC module. It converts ABC files to the Lilypond music notation format. Lilypond is a very powerful open source music notation system which produces gorgeous sheet music output. This is a great way to print out good looking music notation from ABC files. (I know there is an abc2ly program included with Lilypond, but last time I checked its output looked tragically bad. is already working well enough that I’m hoping to produce a book of sheet music using it in 2011.) So let me leave you with the PDF of the above carol, produced using these tools, Rakudo Perl 6, and Lilypond.

Day 11 – Markov Sequence

On Day 4, I teased with mention of non-numeric sequences. Today I’d like to explore one such sequence, based on the common idea of using Markov chains on text. We will determine the next letter in the sequence randomly based on the previous two letters. The distribution follows the patterns contained in a model text, so that the result approximates the language of the model.

use v6;
use List::Utils;

my $model-text = $*;
$model-text .=subst(/<[_']>/, "", :global);
$model-text .=subst(/<-alpha>+/, " ", :global);

my %next-step;
for sliding-window($model-text.comb, 3) -> $a, $b, $c {
    %next-step{$a ~ $b}{$c}++;

my $first = $model-text.substr(0, 1);
my $second = $model-text.substr(1, 1);
my @chain := $first, $second, -> $a, $b { %next-step{$a ~ $b}.roll.key } ... *;
say @chain.munch(80);

After the initial use statements, the code divides neatly into three sections. The first section inputs the model text and gets rid of the non-alphabetic characters. Line 4 uses slurp to read standard input ($*IN) into a single string, and lc makes it all lowercase. The first subst (line 5) removes all underscores and apostrophes from the text. The second (line 6) replaces each string of non-alphabetic characters with a single space.

The second section combines the sliding-window sub from List::Utils with a bit of the good old Perl hash magic. (You can get List::Utils using neutro.)

$model-text.comb splits the text into individual characters.

sliding-windows iterates through a list, giving you the next N (3, in this case) elements in the list starting with each element in the list. (That is, you get the 1st, 2nd, and 3rd elements, then the 2nd, 3rd, and 4th elements, then the 3rd, 4th, and 5th elements, etc.) In this case we use it to get every set of three consecutive characters in the text.

In that loop, we construct a hash table of hash tables. The keys to the outer are the first two of those three consecutive characters. The keys to the inner are the third consecutive character, and its value is how many times that character follows the first two. So, for instance, if I feed the lyrics to the Aqualung album into this program, then %next-step{"qu"} looks like this:

{"a" => 5, "e" => 2}

That is to say, if you have “q” and “u”, they are followed by “a” five times (presumably all in the name Aqualung) and “e” twice (“requests” and “question”).

The third section of the code, then uses the knowledge we’ve just accumulated to build the sequence. First we get the first and second characters of the model space, just so we can start with a pair of letters we know for sure will have a letter coming after them. Then we construct a sequence which starts with those two, and uses -> $a, $b { %next-step{$a ~ $b}.roll } as a generator. The generator uses the previous two characters in the sequence to look up the appropriate frequency hash for the next character. The roll method randomly returns one of the keys of that hash, weighted by the value. (In the “qu” example above, you could think of this as rolling a seven-sided die, five of whose faces say “a” and two “e”.) If there is no known character which follows the previous two (for instance, if the last two characters in the model text are a pair unique in the text and you reach them both in order), then an undefined value is returned, which stops the sequence. We get the first 80 characters from the sequence using the munch method, chosen because it is well-behaved if the sequence terminates early.

Running the script on the lyrics to Aqualung produces sequences like
“t carealven thead you he sing i withe and upon a put saves pinsest to laboonfeet” and “t steall gets sill a creat ren ther he crokin whymn the gook sh an arlieves grac”. (Why does it start with “t “? My Aqualung lyrics file starts with some ASCII-art apparently attempting to imitate the original liner notes, and when we removed the non-alphabetic characters it boils down to “t “.)

Note that nowhere here does this script make assumptions about the character set it is working with. Anything that Perl 6 recognizes as alphabetic can be processed this way. If you feed it the standard “Land der Berge” file that p6eval uses as stdin, you’ll get strings like “laß in ber bist brüften las schören zeites öst froher land der äckerzeichöne lan”. (Fingers crossed that WordPress and your browser handle the non-ASCII characters correctly!)

One word of warning: As I was finishing this, the #perl6 channel raised the question of what Hash.roll should actually do. Right now in Rakudo it has the functionality of the (not yet implemented) KeyBag.roll method. Once it is implemented, KeyBag could be substituted if Hash.roll ends up spec’d differently.

There is a simple alternative which works today, however. If you change to %next-step{$a ~ $b}{$c}++ to %next-step.push($a ~ $b, $c), %next-step will be constructed as a Hash of Arrays. Each array will list all the characters $c which appear after $a and $b, with each distinct character repeated the number of times it appears in the file. This naturally acts as a weighting for .roll to use in the sequence generator.

I need to give a big thank you to Moritz Lenz, who was a huge help cleaning up and simplifying this script.

Day 4 – The Sequence Operators

Last year, there was a brief tease of the sequence operator (tweaked slightly to be correct after a year’s worth of changes to the spec):

my $even-numbers  := 0, 2 ... *;    # arithmetic seq
my $odd-numbers   := 1, 3 ... *;
my $powers-of-two := 1, 2, 4 ... *; # geometric seq

This now works in Rakudo:

> my $powers-of-two := 1, 2, 4 ... *; 1;
> $powers-of-two[^10]
1 2 4 8 16 32 64 128 256 512

(Note: All the code examples in this post have been run in Rakudo’s REPL, which you can reach by running the perl6 executable with no command line arguments. Lines that start with > I what I typed; the other lines are Rakudo’s response, which is generally the value of the last expression in the line. Because the variable $powers-of-two is an infinite lazy list, I’ve added 1; at the end of the line, so the REPL prints that instead of going into an infinite loop.)

We need to trim the infinite list so that Rakudo doesn’t spend an infinitely long time calculating it. In this case, I used [^10], which is a quick way of saying “Give me the first ten elements.” (Note that when you bind a lazy list to an array variable like this, values which have been calculated are remembered; it’s a quick form of memoization.)

The sequence operator ... is a very powerful tool for generating lazy lists. The above examples just start to hint at what it can do. Given one number, it just starts counting up from that number (unless the terminal end of the sequence is a lower number, in which case it counts down). Given two numbers to start a sequence, it will treat it as an arithmetic sequence, adding the difference between those first two numbers to the last number generated to generate the next one. Given three numbers, it checks to see if they represent the start of an arithmetic or a geometric sequence, and will continue it.

Of course, many interesting sequences are neither arithmetic nor geometric, in which case you need to explicitly provide the sub to generate the next number in the sequence:

> my $Fibonacci := 0, 1, -> $a, $b { $a + $b } ... *; 1;
> $Fibonacci[^10]
0 1 1 2 3 5 8 13 21 34

The -> $a, $b { $a + $b } there is a pointy block (ie a lambda function) which takes two arguments and returns their sum. The sequence operator figures out how many arguments the block takes, and passes the needed arguments from the end of the sequence so far to generate the next number in the sequence. And so on, forever.

Or not forever. So far all these examples have had the Whatever star on the right hand side, which means “There is no terminating condition.” If you instead have a number there, the list will terminate when that number is exactly reached.

> 1, 1.1 ... 2
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2
> 1, 1.1 ... 2.01
... Rakudo spins its wheels, because this is an infinite list ...
> (1, 1.1 ... 2.01)[^14]
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3

The first one of those terminates naturally, but the second one missed the terminator and kept right on going. The result is an infinite list, so I limited it to the first 14 elements so that we could see what it was doing.

Those of you with backgrounds doing floating point math are probably sputtering about the dangers of assuming that adding .1 repeatedly will add up to exactly 2. In Perl 6, that’s not quite such an issue because it will use Rat (ie fractional) math where possible. But the general point is still very solid. If I want to find all the Fibonacci numbers below 10000, needing to know exactly the number to stop on is a big hassle. Luckily, just as you can use a block to specify how to generate the next element in a sequence, you can also use one to test to see whether the sequence should end yet:

> 0, 1, -> $a, $b { $a + $b } ... -> $a { $a > 10000 };
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946

The pointy block -> $a { $a > 10000 } creates a block which takes one argument, and returns true when that argument is greater than 10000; just the test we want.

Except we were looking for all the Fibonacci less than 10000. We generated that plus the first Fibonacci number greater than 10000. When passed a block as a termination test, the sequence operator returns all its elements until that block returns true, then it returns that last element and stops. But there is alternative form of the sequence operator that will do the trick:

> 0, 1, -> $a, $b { $a + $b } ...^ -> $a { $a > 10000 };
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765

Switching from ... to ...^ means the resulting list does not include the first element for which the termination test returned true.

Two side notes on this. This is actually a long-winded way of specifying these sequences in Perl 6. I don’t have space to explain Whatever Closures here, but this post from last year talks about them. Using them, you can rewrite that last sequence as

> 0, 1, * + * ...^ * > 10000;
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 676

It’s up to you whether or not you think this is clearer; there’s more than one way to do it.

Also, the left-hand-side of the sequence operator can be any list, even lazy ones. This means you can easily use a terminating block to get a limited portion of an existing lazy list:

> my $Fibonacci := 0, 1, * + * ... *; 1;
> $Fibonacci ...^ * > 10000
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
> $Fibonacci[30]

(I stuck the last check there just to demonstrate that $Fibonacci still goes on past 10000.)

This only begins to scratch the surface of what sequences can do. For more information, see “List infix precedence” in the spec, and scroll down to the sequence operator. (Though note that it is still not completely implemented! It is an extremely complex operator.)

One particular twist I’d like to leave you with: the sequence operator is not constrained to working with numeric values. If you explicitly specify your own generator, you can make a sequence out of any type at all. But I’d like to leave that for a future Advent present…

Day 17: Making Snowmen

I started out planning this day to be about complex numbers in Perl 6. But after I thought about it a bit, I decided that the complex number implementation is so straightforward explaining it would make a pretty boring gift. Instead, let’s explore the Mandelbrot set, which will let us do a bit of complex math, look at pretty pictures, and hint at some advanced features of Perl 6, too.

Without further ado, here’s the first version of the script:

use v6;

my $height = @*ARGS[0] // 31;
my $width = $height;
my $max_iterations = 50;

my $upper-right = -2 + (5/4)i;
my $lower-left = 1/2 - (5/4)i;

sub mandel(Complex $c) {
    my $z = 0i;
    for ^$max_iterations {
        $z = $z * $z + $c;
        return 1 if ($z.abs > 2);
    return 0;

sub subdivide($low, $high, $count) {
    (^$count).map({ $low + ($_ / ($count - 1)) * ($high - $low) });

say "P1";
say "$width $height";

for subdivide($, $, $height) -> $re {
    my @line = subdivide($re + ($, $re + 0i, ($width + 1) / 2).map({ mandel($_) });
    my $middle = @line.pop;
    (@line, $middle, @line.reverse).join(' ').say;

So, lines 3-5 set up the pixel size of the graphic we will create. @*ARGS is the new name of the command line argument array. The // operator is the new “defined” operator; it returns its first argument if that argument is defined, its second otherwise. In other words, line 3 sets $height to be the first argument on the command line, or 31 if no such argument was set. $width is set equal to $height — the code is set up to generate a square graphic right now, but the variables are set apart for ease of future hacking. $max_iterations sets how many times the core Mandelbrot loop will iterate before it concludes a point is in the set. (Because we’re relying on the symmetry of the Mandelbrot set, $width must be odd.)

Lines 7-8 set the boundaries of our image on the complex plane. Introducing the imaginary component of a number is as simple as giving a number (or numeric expression) followed by i; this creates a number of the Complex type. Complex math works pretty much the way you would expect it to, for example, (as we see here) adding a Complex to an Int or a Rat yields another Complex.

Lines 10-17, then, are the core Mandelbrot function. To quickly explain, a complex number c is in the Mandrelbrot set if the equation z = z * z + c (with initial z of 0) stays bounded as we iterate the equation. This function implements exactly that in Perl 6. We set up a loop to iterate $max_iterations times. It is known that once |z| grows bigger than 2 it will not stay bounded, so we use $z.abs > 2 to check for that condition. If it is true, we leave the loop early, returning 1 from the function to indicate the corresponding pixel should be black. If the loop finishes the number of iterations without exceeding those bounds, we return 0 for the color white.

Lines 19-21 are a simple helper function to return a list of a simple arithmetic progression from $low to $high with $count elements. Note that $low and $high have no specified type, so any type (or even pair of types) that the basic arithmetic operators will work on will work here. (In this script, we use it first for Num, and then for Complex.)

Lines 23-24 print the header for the header for a PBM file.

Lines 26-30 print the actual image data. $ is the real part of the complex number $upper-right, and $ is the imaginary part. The loop iterates over the real part of the range. Inside the loop, we subdivide again along the imaginary part to generate a list of the complex values we are interested in examining for one half of this row of the image. We then run that list through the mandel function using map, generating a list of 0s and 1s for half of the row, including the midpoint.

We do it this way because the Mandelbrot set is symmetric about the imaginary axis. So we then pop that midpoint, and make a new list which is the old list (minus the midpoint), the midpoint, and the list (minus the midpoint) reversed. We then feed that to join to make a string for the entire line, and finally say to print it out.

Note that doing it this way rotates the Mandelbrot set 90 degrees from the way it is normally displayed, giving us a lovely snowman shape:
White on black Mandelbrot set

With the current Rakudo, this is quite slow, and prone to crash randomly depending on the size of the image you are generating. However, imagine for a minute that happy future day when Rakudo is not only snappy, but handles automatic hyperoperator threading as well. At that point, it will be easy to make a parallel version of this script by changing the map call to a hyperoperator.

There’s only one tricky bit: there’s no way to have a hyperoperator call a normal sub. They only call class methods and operators. So, as a first approximation which works in current Rakudo, we can “augment” the Complex class to have a .mandel.


augment class Complex {
    method mandel() {
        my $z = 0i;
        for ^$max_iterations {
            $z = $z * $z + self;
            return 1 if ($z.abs > 2);
        return 0;

for subdivide($, $, $height) -> $re {
    my @line = subdivide($re + ($, $re + 0i, ($width + 1) / 2)>>.mandel;
    my $middle = @line.pop;
    (@line, $middle, @line.reverse).join(' ').say;

The only difference to mandel is it is now a method, and the role of the former $c argument is taken by self. Then instead of map({mandel($_)}) we use the hyperoperator.

As I said, this version works now. But personally, I’m a little uncomfortable augmenting an existing class like that; it feels dirty to my old C++ instincts. We can avoid it by turning mandel into an operator:

sub postfix:<☃>(Complex $c) {
    my $z = 0i;
    for ^$max_iterations {
        $z = $z * $z + $c;
        return 1 if ($z.abs > 2);
    return 0;

for subdivide($, $, $height) -> $re {
    my @line = subdivide($re + ($, $re + 0i, ($width + 1) / 2)>>☃;
    my $middle = @line.pop;
    (@line, $middle, @line.reverse).join(' ').say;

This takes advantage of Perl 6’s Unicode ability to have a little fun, defining the operator using the snowman symbol. This ☃ operator works fine in Rakudo today, but alas the >>☃ hyperoperator does not work yet.

Thanks to Moritz and TimToady for suggests and help with this code. Two versions of the script (one full color) are up at github, if you’d like to play around with them.

Update (4/18/2010): I’ve ported the color version of the script at github to the latest version of Rakudo. It’s quite slow, and uses phenomenal amounts of memory, but unlike the previous version it is rock-solid stable. Here’s a 1001×1001 full color Mandelbrot set that took it 14 hours and 6.4 GB of memory to compute.