Day 21 – Collatz Variations

The Collatz sequence is one of those interesting “simple” math problems that I’ve run into a number of times. Most recently a blog post on programming it in Racket showed up on Hacker News. As happens so often, I instantly wanted to implement it in Perl 6.

```sub collatz-sequence(Int \$start) {
\$start, { when * %% 2 { \$_ / 2 }; when * !%% 2 { 3 * \$_ + 1 }; } ... 1;
}

sub MAIN(Int \$min, Int \$max) {
say [max] (\$min..\$max).map({ +collatz-sequence(\$_) });
}
```

This is a very straightforward implementation of the Racket post’s `max-cycle-length-range` as a stand-alone p6 script. `collatz-sequence` generates the sequence using the p6 sequence operator. Start with the given number. If it is divisible by two, do so: `when * %% 2 { \$_ / 2 }`. If it is not, multiply by three and add 1: `when * !%% 2 { 3 * \$_ + 1 }`. Repeat this until the sequence reaches 1.

`MAIN(Int \$min, Int \$max)` sets up our main function to take two integers. Many times I don’t bother with argument types in p6, but this provides a nice feedback for users:

```> perl6 collatz.pl blue red
Usage:
collatz.pl <min> <max>
```

The core of it just maps the numbers from `\$min` to `\$max` (inclusive) to the length of the sequence (`+collatz-sequence`) and then says the max of the resulting list (`[max]`).

Personally I’m a big fan of using the sequence operator for tasks like this; it directly represents the algorithm constructing the Collatz sequence in a simple and elegant fashion. On the other hand, you should be able to memoize the recursive version for a speed increase. Maybe that would give it an edge over the sequence operator version?

Well, I was wildly wrong about that.

```sub collatz-length(\$start) {
given \$start {
when 1       { 1 }
when * !%% 2 { 1 + collatz-length(3 * \$_ + 1) }
when * %% 2  { 1 + collatz-length(\$_ / 2) }
}
}

sub MAIN(\$min, \$max) {
say [max] (\$min..\$max).map({ collatz-length(\$_) });
}
```

This recursive version, which makes no attempt whatsoever to be efficient, is actually better than twice as fast as the sequence operator version. In retrospect, this makes perfect sense: I was worried about the recursive version making a function call for every iteration, but the sequence version has to make two, one to calculate the next iteration and the other to check and see if the ending condition has been reached.

Well, once I’d gotten this far, I thought I’d better do things correctly. I wrote two framing scripts, one for timing all the available scripts, the other for testing them to make sure they work!

```my @numbers = 1..200, 10000..10200;

sub MAIN(Str \$perl6, *@scripts) {
my %results;
for @scripts -> \$script {
my \$start = now;
qqx/\$perl6 \$script { @numbers }/;
my \$end = now;

%results{\$script} = \$end - \$start;
}

for %results.pairs.sort(*.value) -> (:key(\$script), :value(\$time)) {
say "\$script: \$time seconds";
}
}
```

This script takes as an argument a string that can be used to call a Perl 6 executable and a list of scripts to run. It runs the scripts using the specified executable, and times them using p6’s `now` function. It then sorts the results into order and prints them. (A similar script I won’t post here tests each of them to make sure they are returning correct results.)

In the new framework, the Collatz script has changed a bit. Instead of taking a min and a max value and finding the longest Collatz sequence generated by a number in that range, it takes a series of numbers and generates and reports the length of the sequence for each of them. Here’s the sequence operator script in its full new version:

```sub collatz-length(Int \$start) {
+(\$start, { when * %% 2 { \$_ / 2 }; when * !%% 2 { 3 * \$_ + 1 }; } ... 1);
}

sub MAIN(*@numbers) {
for @numbers -> \$n {
say "\$n: " ~ collatz-length(\$n.Int);
}
}
```

For the rest of the scripts I will skip the `MAIN` sub, which is exactly the same in each of them.

Framework established, I redid the recursive version starting from the new sequence operator code.

```sub collatz-length(Int \$n) {
given \$n {
when 1       { 1 }
when * %% 2  { 1 + collatz-length(\$_ div 2) }
when * !%% 2 { 1 + collatz-length(3 * \$_ + 1) }
}
}
```

The sharp-eyed will notice this version is different from the first recursive version above in two significant ways. This time I made the argument `Int \$n`, which instantly turned up a bit of a bug in all implementations thus far: because I used `\$_ / 2`, most of the numbers in the sequence were actually rationals, not integers! This shouldn’t change the results, but is probably less efficient than using `Int`s. Thus the second difference about, it now uses `\$_ div 2` to divide by 2. This version remains a great improvement over the sequence operator version, running in 4.7 seconds instead of 13.3. Changing ` when * !%% 2` to a simple `default` shaves another .3 seconds off the running time.

Once I started wondering how much time was getting eaten up by the `when` statements, rewriting that bit using the ternary operator was an obvious choice.

```sub collatz-length(Int \$start) {
+(\$start, { \$_ %% 2 ?? \$_ div 2 !! 3 * \$_ + 1 } ... 1);
}
```

Timing results: Basic sequence 13.4 seconds. Sequence with `div` 11.5 seconds. Sequence with `div` and ternary 9.7 seconds.

That made me wonder what kind of performance I could get from a handcoded loop.

```sub collatz-length(Int \$n is copy) {
my \$length = 1;
while \$n != 1 {
\$n = \$n %% 2 ?? \$n div 2 !! 3 * \$n + 1;
\$length++;
}
\$length;
}
```

That’s by far the least elegant of these, I think, but it gets great performance: 3 seconds.

Switching back to the recursive approach, how about using the ternary operator there?

```sub collatz-length(Int \$n) {
return 1 if \$n == 1;
1 + (\$n %% 2 ?? collatz-length(\$n div 2) !! collatz-length(3 * \$n + 1));
}
```

This one just edges out the handcoded loop, 2.9 seconds.

Can we do better than that? How about memoization? `is cached` is supposed to be part of Perl 6; neither implementation has it yet, but last year’s Advent calendar has a Rakudo implementation that still works. Using the last version changed to `sub collatz-length(Int \$n) is cached {` works nicely, but takes 3.4 seconds to execute. Apparently the overhead of caching slows it down a bit. Interestingly, the non-ternary recursive version does speed up with `is cached`, from 4.4 seconds to 3.6 seconds.

```sub collatz-length(Int \$n) {
return 1 if \$n == 1;
state %lengths;
return %lengths{\$n} if %lengths{\$n}:exists;
%lengths{\$n} = 1 + (\$n %% 2 ?? collatz-length(\$n div 2) !! collatz-length(3 * \$n + 1));
}
```

Bingo! 2.7 seconds.

I’m sure there are lots of other interesting approaches for solving this problem, and encourage people to send them in. In the meantime, here’s my summary of results so far:

 Script Rakudo Niecza bin/collatz-recursive-ternary-hand-cached.pl 2.5 1.7 bin/collatz-recursive-ternary.pl 3 1.7 bin/collatz-loop.pl 3.1 1.7 bin/collatz-recursive-ternary-cached.pl 3.2 N/A bin/collatz-recursive-default-cached.pl 3.5 N/A bin/collatz-recursive-default.pl 4.4 1.8 bin/collatz-recursive.pl 4.9 1.9 bin/collatz-sequence-ternary.pl 9.9 3.3 bin/collatz-sequence-div.pl 11.6 3.5 bin/collatz-sequence.pl 13.5 3.8

The table was generated from timing-table-generator.pl.

Day 13 – Bags and Sets

Over the years, I’ve written many variations on this code:

```my %words;
for slurp.comb(/\w+/).map(*.lc) -> \$word {
%words{\$word}++;
}
```

(Aside: `slurp.comb(/\w+/).map(*.lc)` does the standard Perl trick of reading files specified on the command line or standard in, goes through the data for words, and makes them lowercase.)

Perl 6 introduces two new Associative types for dealing with this sort of functionality. `KeyBag` is drop-in replacement for `Hash` in this sort of case:

```my %words := KeyBag.new;
for slurp.comb(/\w+/).map(*.lc) -> \$word {
%words{\$word}++;
}
```

Why would you prefer `KeyBag` over `Hash` in this case, considering that it’s a bit more code? Well, it does a better job of saying what you mean, if what you want is a positive `Int`-valued `Hash`. It actually enforces this as well:

```> %words{"the"} = "green";
Unhandled exception: Cannot parse number: green
```

That’s Niecza’s error; Rakudo’s is less clear, but the important point is you get an error; Perl 6 detects that you’ve violated your contract and complains.

And `KeyBag` has a couple more tricks up its sleeve. First, four lines to initialize your `KeyBag` isn’t terribly verbose, but Perl 6 has no trouble getting it down to one line:

```my %words := KeyBag.new(slurp.comb(/\w+/).map(*.lc));
```

`KeyBag.new` does its best to turn whatever it is given into the contents of a `KeyBag`. Given a `List`, each of the elements is added to the `KeyBag`, with the exact same result of our earlier block of code.

If you don’t need to modify the bag after its creation, then you can use `Bag` instead of `KeyBag`. The difference is `Bag` is immutable; if `%words` is a `Bag`, then `%words{\$word}++` is illegal. If immutability is okay for your application, then you can make the code even more compact:

```my %words := bag slurp.comb(/\w+/).map(*.lc);
```

`bag` is a helper sub that just calls `Bag.new` on whatever you give it. (I’m not sure why there is no equivalent `keybag` sub.)

`Bag` and `KeyBag` have a couple more tricks up their sleeve. They have their own versions of `.roll` and `.pick` which weigh their results according to the given values:

```> my \$bag = bag "red" => 2, "blue" => 10;
> say \$bag.roll(10);
> say \$bag.pick(*).join(" ");
blue blue blue blue blue blue red blue red blue
blue red blue blue red blue blue blue blue blue blue blue
```

This wouldn’t be too hard to emulate using a normal `Array`, but this version would be:

```> \$bag = bag "red" => 20000000000000000001, "blue" => 100000000000000000000;
> say \$bag.roll(10);
> say \$bag.pick(10).join(" ");
blue blue blue blue red blue red blue blue blue
blue blue blue red blue blue blue red blue blue
```

They also work with all the standard `Set` operators, and have a few of their own as well. Here’s a simple demonstration:

```sub MAIN(\$file1, \$file2) {
my \$words1 = bag slurp(\$file1).comb(/\w+/).map(*.lc);
my \$words2 = set slurp(\$file2).comb(/\w+/).map(*.lc);
my \$unique = (\$words1 (-) \$words2);
for \$unique.list.sort({ -\$words1{\$_} })[^10] -> \$word {
say "\$word: { \$words1{\$word} }";
}
}
```

Passed two filenames, this makes a `Bag` from the words in the first file, a `Set` from the words in the second file, uses the set difference operator `(-)` to compute the set of words which are only in the first file, sorts those words by their frequency of appearance, and then prints out the top ten.

This is the perfect point to introduce `Set`. As you might guess from the above, it works much like `Bag`. Where `Bag` is a `Hash` from `Any` to positive `Int`, `Set` is a `Hash` from `Any` to `Bool::True`. `Set` is immutable, and there is also a mutable `KeySet`.

Between `Set` and `Bag` we have a very rich collection of operators:

 Operation Unicode “Texas” Result Type is an element of ∈ (elem) Bool is not an element of ∉ !(elem) Bool contains ∋ (cont) Bool does not contain ∌ !(cont) Bool union ∪ (|) Set or Bag intersection ∩ (&) Set or Bag set difference (-) Set set symmetric difference (^) Set subset ⊆ (<=) Bool not a subset ⊈ !(<=) Bool proper subset ⊂ (<) Bool not a proper subset ⊄ !(<) Bool superset ⊇ (>=) Bool not a superset ⊉ !(>=) Bool proper superset ⊃ (>) Bool not a proper superset ⊅ !(>) Bool bag multiplication ⊍ (.) Bag bag addition ⊎ (+) Bag

Most of these are self-explanatory. Operators that return `Set` promote their arguments to `Set` before doing the operation. Operators that return `Bag` promote their arguments to `Bag` before doing the operation. Operators that return `Set` or Bag promote their arguments to `Bag` if at least one of them is a `Bag` or `KeyBag`, and to `Set` otherwise; in either case they return the type promoted to.

Please note that while the set operators have been in Niecza for some time, they were only added to Rakudo yesterday, and only in the Texas variations.

A bit of a word may be needed for the different varieties of unions and intersections of `Bag`. The normal union operator takes the max of the quantities in either bag. The intersection operator takes the min of the quantities in either bag. Bag addition adds the quantities from either bag. Bag multiplication multiplies the quantities from either bag. (There is some question if the last operation is actually useful for anything — if you know of a use for it, please let us know!)

```> my \$a = bag <a a a b b c>;
> my \$b = bag <a b b b>;

> \$a (|) \$b;
bag("a" => 3, "b" => 3, "c" => 1)

> \$a (&) \$b;
bag("a" => 1, "b" => 2)

> \$a (+) \$b;
bag("a" => 4, "b" => 5, "c" => 1)

> \$a (.) \$b;
bag("a" => 3, "b" => 6)
```

I’ve placed my full set of examples for this article and several data files to play with on Github. All the sample files should work on the latest very latest Rakudo from Github; I think all but `most-common-unique.pl` and `bag-union-demo.pl` should work with the latest proper Rakudo releases. Meanwhile those two scripts will work on Niecza, and with any luck I’ll have the bug stopping the rest of the scripts from working there fixed in the next few hours.

```> perl6 bin/most-common-unique.pl data/Hamlet.txt data/Much_Ado_About_Nothing.txt
ham: 358
queen: 119
hamlet: 118
hor: 111
pol: 86
laer: 62
oph: 58
ros: 53
horatio: 48
clown: 47
```

Day 17: Gtk Mandelbrot

Two years ago today, the Advent post was on making Mandelbrot sets in Perl 6. At the time, they were in black and white, slow to produce, Rakudo was prone to crashing, and the only user interface thing you could control was how big the resulting PPM file was.

As they say, that was then. This is now.

The new `gtk-mandelbrot.pl` script is 423 lines of Perl 6 code — targeted at Niecza, threaded, and using the GtkSharp library. It allows you to move and resize the windows, zoom in (left mouse button, drag across image to define zoom boundaries), create Julia set images (right click on a Mandelbrot set image), increase the number of iterations (press ‘m’), and output a PPM file for a window (press ‘s’).

The threading doesn’t actually improve performance on my MacBook Pro (still looking into why) but it does make the script much more responsive.

It would be far too long to go through all the code, but lets hit on the highlights. The core is almost unchanged:

```        sub julia(Complex \$c, Complex \$z0) {
my \$z = \$z0;
my \$i;
loop (\$i = 0; \$i < \$max_iterations; \$i++) {
if \$z.abs > 2 {
return \$i + 1;
}
\$z = \$z * \$z + \$c;
}
return 0;
}
```

It’s named `julia` instead of `mandel` now, because it is more general. If you call it with `\$z0` set to `0`, it calculates the same thing the old `mandel` did. Allowing `\$z0` to vary allows you to calculate Julia sets as well.

The code around it is very different, though! Stefan O’Rear wrote the threading code, using Niecza’s Threads library, which is a thin wrapper on C#’s threading libraries, and probably not very close to what Perl 6’s built-in threading will look like when it is ready to go. He establishes a `WorkQueue` with a list of the work that needs to be done, and then starts N running threads, where N comes from the environment variable THREADS if it is present, and the reported processor count otherwise:

```for ^(%*ENV<THREADS> // CLR::System::Environment.ProcessorCount) {
}
```

`WorkQueue.run` is pretty simple:

```    method run() {
loop {
my \$item = self.shift;
next if \$item.cancelled;
\$item.run.();
\$item.mark-done;
}
}
```

This is an infinite loop that starts by getting the next `WorkItem` off the queue, checks to see if it has been cancelled, and if it hasn’t, calls the .run `Callable` attribute and then the `mark-done` method.

The `WorkItem`s on the queue look like this:

```
class WorkItem {
has Bool \$!done = False;
has Bool \$!cancelled = False;

has Callable &.run;
has Callable &.done-cb;

method is-done() { WorkQueue.monitor.lock({ \$!done }) }
method mark-done() {
&.done-cb.() unless WorkQueue.monitor.lock({ \$!done++ })
}

method cancelled() { WorkQueue.monitor.lock({ \$!cancelled }) }
method cancel() { WorkQueue.monitor.lock({ \$!cancelled = True }) }
}
```

Each `WorkItem` has two flags, `\$!done` and `\$!cancelled`, and two `Callable` attributes, `&.run`, already mentioned as what is called by `WorkQueue.run`, and `&.done-cb`, which is the callback function to be called when the `&.run` method finishes.

The two methods (for now) we use in our WorkItem are relatively simple:

```            sub row() {
my \$row = @rows[\$y];
my \$counter = 0;
my \$counter_end = \$counter + 3 * \$width;
my \$c = \$ur - \$y * \$delta * i;

while \$counter < \$counter_end {
my \$value = \$is-julia ?? julia(\$julia-z0, \$c) !! julia(\$c, 0i);
\$row.Set(\$counter++, @red[\$value % 72]);
\$row.Set(\$counter++, @green[\$value % 72]);
\$row.Set(\$counter++, @blue[\$value % 72]);
\$c += \$delta;
}
}

sub done() {
Application.Invoke(-> \$ , \$ {
\$.draw-area.QueueDrawArea(0, \$y, \$width, 1);
});
}

my \$wi = WorkItem.new(run => &row, done-cb => &done);
WorkQueue.push(\$wi);
push @.line-work-items, \$wi;
```

As you might expect, `row` calculates one line of the set we are working on. It may look like it is using global variables, but these subs are actually local to the `FractalSet.start-work` method and the variables are local to it from there. The `done` invokes a Gtk function noting that a portion of the window needs to be redrawn (namely the portion we just calculated).

The above block of code is called once for each row of the fractal window being generated, which has the effect of queuing up all of the fractal to be handled as there are available threads.

Moving upward in the code’s organization, each fractal window we generate is managed by an instance of the `FractalSet` class.

```class FractalSet {
has Bool \$.is-julia;
has Complex \$.upper-right;
has Real \$.delta;
has Int \$.width;
has Int \$.height;
has Int \$.max_iterations;
has Complex \$.c;
has @.rows;
has @.line-work-items;
has \$.new-upper-right;

has \$.draw-area;
```

`\$.is-julia` and `\$.max_iterations` are self-explanatory. `\$.upper-right` is the fixed complex number anchoring the image. `\$.delta` is the amount of change in the previous number per-pixel; we assume the pixels are square. `\$.width` and `\$.height` are the size of the window in pixels. `\$.c` only has meaning for Julia sets, where it is the value `\$c` in the equation `\$new-z = \$z * \$z + \$c`. `@.rows` the pixel information generated by the `row` sub above; `@.line-work-items` saves a reference to all of the `WorkItem`s generating those rows. `\$.new-upper-right` is temporary used during the zoom mouse operation. `\$.draw-area` is the `Gtk.DrawingArea` for the related window.

Once all that is set up, the rest of the code is pretty straightforward. The Gtk windowing code is set up in `FractalSet.build-window`:

```    method build-window()
{
my \$index = +@windows;
@windows.push(self);
self.start-work;

my \$window = Window.new(\$.is-julia ?? "julia \$index" !! "mandelbrot \$index");
\$window.SetData("Id", SystemIntPtr.new(\$index));
\$window.Resize(\$.width, \$.height);  # TODO: resize at runtime NYI

my \$event-box = GtkEventBox.new;
\$event-box.SetData("Id", SystemIntPtr.new(\$index));

my \$drawingarea = \$.draw-area = GtkDrawingArea.new;
\$drawingarea.SetData("Id", SystemIntPtr.new(\$index));

\$window.ShowAll;
}
```

We store a global array `@windows` tracking all the `FractalSet`s in play. Each of the different objects here gets the `"Id"` data set to this set’s index into the `@windows` array so we can easily look up the `FractalSet` from callback functions. The rest of the method is just plugging the right callback into each component — simple conceptually but it took this Gtk novice a lot of work figuring it all out.

As an example, consider the `KeyReleaseEvent` callback, which responds to presses on the keyboard.

```sub KeyReleaseEvent(\$obj, \$args) {
my \$index = \$obj.GetData("Id").ToInt32();
my \$set = @windows[\$index];

given \$args.Event.Key {
when 'm' | 'M' {
\$set.increase-max-iterations;
}
when 's' | 'S' {
\$set.write-file;
}
}
}
```

First we lookup the index into `@windows`, then we get the `\$set` we’re looking at. Then we just call the appropriate `FractalSet` method, for instance

```    method increase-max-iterations() {
self.stop-work;
\$.max_iterations += 32;
self.start-work;
}
```

`.stop-work` cancels all the pending operations for this FractalSet, then we bump up the number of iterations, and then we `.start-work` again to queue up a new set of rows with the new values.

The full source code is here. As of this writing it agrees with the code here, but this is an active project, and probably will change again in the not-too-distant future. Right now my biggest goals are figuring out how to get the threading to actually improve performance on my MacBook Pro and cleaning up the code. Both suggestions and questions are welcome.

Day 9: Contributing to Perl 6

This time, instead of sharing some cool feature of Perl 6, I’d like to talk about how easy it is to contribute usefully to the project. So I’m going to walk you through the process of making a change to Niecza. It does require a bit of domain knowledge (which the fine folks on `#perl6` will be happy to help you with) but it’s definitely not rocket science. It’s not even particularly deep computer science, for the most part.

A few days ago, Radvendii asked on `#perl6` if there was a `round` function in the core. The correct answer is “There should be one”, and it lead to a couple of bug fixes in Rakudo. But it got me to thinking — is Niecza supporting `round` (and its relatives `ceiling`, `floor`, and `truncate`) correctly?

Perl 6 has a huge suite of tests to see if an implementation is conforming to the spec, including a file for the `round` tests, `S32-num/rounders.t`. My first step then was to check the spectests currently being run by Niecza. Just like in Rakudo, this is stored in a file named `t/spectest.data`. So

```Wynne:niecza colomon\$ grep round t/spectest.data
Wynne:niecza colomon\$
```

Okay, clearly we’re not running the `S32-num/rounders.t` test file. (Note, in case you’re getting confused — the links in this post are to the latest versions of the files, which include all the changes I made writing this post.) That’s a sign that something is not properly supported yet. So let’s go ahead and run it by hand to see what happens. Both Niecza and Rakudo use a fudging process, allowing you to mark the bits of a test file that don’t work yet in a particular compiler. So we need to use a special fudging tool to run the code:

```Wynne:niecza colomon\$ t/fudgeandrun t/spec/S32-num/rounders.t
1..108
not ok 1 - floor(NaN) is NaN
# /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 16
#    Failed test
#           got: -269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824
```

That’s followed by about 15 similar errors, then

```Unhandled exception: Unable to resolve method truncate in class Num
at /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 34 (mainline @ 32)
at /Users/colomon/tools/niecza/lib/CORE.setting line 2224 (ANON @ 2)
at /Users/colomon/tools/niecza/lib/CORE.setting line 2225 (module-CORE @ 58)
at /Users/colomon/tools/niecza/lib/CORE.setting line 2225 (mainline @ 1)
at <unknown> line 0 (ExitRunloop @ 0)
```

Okay, so that’s at least two errors that need fixing.

We’ll go in order here, even though it means tackling what is most likely the most complicated error first. (If you do think this part of the problem is too hard to tackle, please skip ahead, because the last few improvements I made really were incredibly easy to do.) Opening `src/CORE.setting`, we find the following definition for `round`:

```sub round(\$x, \$scale=1) { floor(\$x / \$scale + 0.5) * \$scale }
```

Okay, so the real problem is in `floor`:

```sub floor(\$x) { Q:CgOp { (floor {\$x}) } }
```

What the heck does `Q:CgOp` mean? It means `floor` is actually implemented in C#. So we open up `lib/Builtins.cs` and search for `floor`, eventually finding `public static Variable floor(Variable a1)`. I won’t print the full source code here, because it is on the long side, with a case for each of the different number types. We’re only interested in the floating point case here:

```        if (r1 == NR_FLOAT) {
double v1 = PromoteToFloat(r1, n1);
ulong bits = (ulong)BitConverter.DoubleToInt64Bits(v1);
BigInteger big = (bits & ((1UL << 52) - 1)) + (1UL << 52);
int power = ((int)((bits >> 52) & 0x7FF)) - 0x433;
// note: >>= has flooring semantics for signed values
if ((bits & (1UL << 63)) != 0) big = -big;
if (power > 0) big <<= power;
else big >>= -power;
return MakeInt(big);
}
```

We don’t actually need to understand how all that works to fix this problem. The important bit is the `PromoteToFloat` line, which sets `v1` to the floating point value which is the input to our floor. If we add a trap right after that, it should fix this bug. A quick C# websearch shows me that `Double` has member functions `IsNaN`, `IsNegativeInfinity`, and `IsPositiveInfinity`. Looking a bit around the Niecza source shows there is a `MakeFloat` function for returning floating point values. Let’s try:

```if (Double.IsNaN(v1) || Double.IsNegativeInfinity(v1) || Double.IsPositiveInfinity(v1)) {
return MakeFloat(v1);
}
```

One quick call to `make` later, I can try the test file again:

```Wynne:niecza colomon\$ t/fudgeandrun t/spec/S32-num/rounders.t
1..108
ok 1 - floor(NaN) is NaN
ok 2 - round(NaN) is NaN
ok 3 - ceiling(NaN) is NaN
not ok 4 - truncate(NaN) is NaN
# /Users/colomon/tools/niecza/t/spec/S32-num/rounders.t line 19
#    Failed test
#           got: -269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824
```

Progress! Apparently truncate uses a separate method, so we’ll have to fix it separately.

```sub truncate(\$x) { \$x.Int }
method Int() { Q:CgOp { (coerce_to_int {self}) } }
```
```    public static Variable coerce_to_int(Variable a1) {
int small; BigInteger big;
return GetAsInteger(a1, out small, out big) ?
MakeInt(big) : MakeInt(small);
}
```

Oooo, this is perhaps a little bit trickier. Still a basic variant on the previous method, grabbing boilerplate code from a nearby function:

```        int r1;
P6any o1 = a1.Fetch();
P6any n1 = GetNumber(a1, o1, out r1);

if (r1 == NR_FLOAT) {
double v1 = PromoteToFloat(r1, n1);
if (Double.IsNaN(v1) || Double.IsNegativeInfinity(v1) || Double.IsPositiveInfinity(v1)) {
return MakeFloat(v1);
}
}
```

I skipped the `HandleSpecial2` bit in the boilerplate, because I’m never quite sure how that works. Luckily, we have the spectests to check and see if I have broken something by doing this.

Now the first 15 tests in `rounders.t` pass, leaving us with the

```Unhandled exception: Unable to resolve method truncate in class Num
```

error. That should be easy to handle! If we go back to `lib/CORE.setting` and search for `ceiling`, we see it appears two times: in the catch-all base class `Cool` and as a stand-alone sub. If we look at the neighboring subs, we see `floor`, `ceiling`, `round`, and `truncate` are all defined. If we look in `Cool`, however, only `floor`, `ceiling`, and `round` defined. That’s the source of our trouble!

The method definitions of the others in `Cool` are really simple; all they do is forward to the sub versions. It’s very easy to add a `truncate` that does that:

```    method truncate() { truncate self }
```

And poof! This time when we run `rounders.t`, we pass all 108 tests.

At this point we’ve got three things left to do. First, now that `rounders.t` passes, we need to add it to `t/spectest.data`. The list of tests there is ordered, so I just find the `S32-num` section and add `S32-num/rounders.t` in alphabetical order.

Next I will commit all the changes to my copy of the git repo. (I won’t explain how to do that, there are lots of git tutorials on the web.) Then I run `make spectest` to make sure I haven’t broken anything with these changes. (Hmm… actually a few TODO passing, bugs elsewhere that this patch has fixed! Oh, and one test broken, but it’s one which we were only passing by accident before, so I won’t feel bad about fudging it.)

Once that is done, you need to send the patch on to the Niecza developers; I believe the easiest way to do this is via github.

I’ve got one more little change to make that popped into my head while I was working on this. One naive way of implementing, say `floor` would be to convert the input into a floating point value (a Num in Perl 6) and then do `Num.floor`. That doesn’t work for all numbers, however, as most of the other number types are capable of storing numbers larger than will fit in a standing floating point double. So we probably need tests in the test suite to check for these cases. Let’s add them.

The tests in `rounders.t` are weirdly organized for my taste. But hey, we can always add our tests at the bottom.

```{
my \$big-int = 1234567890123456789012345678903;
is \$big-int.floor, \$big-int, "floor passes bigints unchanged";
is \$big-int.ceiling, \$big-int, "ceiling passes bigints unchanged";
is \$big-int.round, \$big-int, "round passes bigints unchanged";
is \$big-int.truncate, \$big-int, "truncate passes bigints unchanged";
}
```

That passes okay in Niecza. (Probably out of courtesy we should check it on Rakudo as well and fudge it appropriately to make sure we’re not breaking their spectest!) We need to remember to add the count of new tests to the plan at the top of the test file. And then we can push that fix to github as well.

In conclusion, contributing to Perl 6 is easy. Anyone who tries writing Perl 6 code and reports problems they have to `#perl6` is helping in a very real way. If you can write even fairly simple Perl 6 code, then you can write useful spec tests. It’s only marginally harder to write new methods for the setting in Perl 6. And even when you have to get down and dirty and start dealing with the language the compiler is implemented in, it’s still quite possible to do useful work without any deep understanding of how the compiler works.

Day 1: Catching Up With Perl 6

When we started the Perl 6 Advent Calendar back in 2009, Rakudo was really the only game in town if you wanted to play with Perl 6. But Perl 6 was intended from the start to be a language with multiple implementations, and at the moment there are four different Perl 6 implementations of interest. Because there are so many implementations, I’m not going to give instructions for getting each; instead I’m linking to those instructions.

The most stable and complete implementation is Rakudo Star. This is currently based on the last major revision of Rakudo. It’s been frozen since July, and so lags a bit behind the current Perl 6 spec. It’s slow. But it’s also pretty reliable.

The current Rakudo development version is called “Nom”. It’s full of great improvements over the last Rakudo Star release, notably native types, improved performance, and a much better metamodel. (For example, check out the Grammar::Tracer module, which takes advantage of the new metamodel to add regex tracing in just 44 lines of code.) It’s not quite ready for prime time yet, as it still misses some features that work in Rakudo Star, but progress has been incredible, and it’s quite possible a new Rakudo Star based on Nom will be released during this month.

Stefan O’Rear’s Niecza was just a fledging compiler during last year’s Advent calendar, but it’s a serious contender these days. Built to run on the CLR (.NET and Mono), it is relatively zippy, implements a significant portion of Perl 6, and works easily with existing CLR libraries.

Lastly, ingy and Mäsak have plans afoot to revive Pugs, the original Perl 6 implementation in Haskell. So far they’ve just got it building again on current Haskell compilers, but the long-term goal is to get it running on the spec tests again and bring it closer to the current spec.

Which implementation should you use? If you’re looking for a stable, fairly complete Perl 6, Rakudo Star is it. If you just want to explore the language, try Rakudo Nom — you will probably run into bugs, but it’s significantly more advanced than Rakudo Star, and exposing the bugs is a big help to Rakudo’s development. If you have an idea which would benefit from being able to use CLR libraries, Niecza is fantastic. There’s a handy comparison chart of the different features available.

Personally, I have all three of these installed on my machine, and have different projects underway on each of them.

Finally, please don’t hesitate to ask for help, either in the comments here or on the `#perl6` IRC channel on Freenode. The Perl 6 community is very friendly.

Day 18 – ABC Module

Instead of focusing on a particular feature of Perl 6 today, I’d like to take you on a quick tour of the ABC module. I think it’s a nice example of some of the strengths of Perl 6.

ABC Notation is a simple text file format designed to make it easy to support musical notation. It’s widely used in the world of traditional dance music because it is very lightweight and more than powerful enough to support notating jigs and reels. Here’s an example in honor of the season:

```X:1
T:While Shepherds Watched Their Flocks
M:5/4
L:1/4
O:Green's Harbour, Newfoundland
K:A major
E|:[M:5/4] A A/B/ B2 A|B c/B/ A2 A/B/|
[M:6/4]c d/c/ B2 B2|[M:4/4] A3 E|AB AG|
FE FE|AB AG|F2 F2|E2 G2|A/G/ F/E/ DF|
[1 [M:6/4] E C/B,/ A,3 E:|[2 [M:5/4] E C/B,/ A,3|]
```

I won’t get into the details — here’s a tutorial if you’d like to know more — but the structure of the file is simple. The first section is the header, with general information about the tune. The remainder is the tune itself. This one is a bit more complicated than many because of all the embedded time signature changes, like `[M:6/4]`.

I was always surprised there wasn’t a CPAN module for handling ABC notation, and even seriously considered writing one myself at one point. But parsing ABC is a complicated process, and I gave up in frustration before getting very far.

Enter Perl 6 and its grammars. About 60 lines of simple regexes is all that is needed to parse most of the tunes I am interested in. (Several of the more complicated features of ABC, like lyrics and multi-staff music are not implemented yet.) Here’s a snatch of it:

```    regex basenote { <[a..g]+[A..G]> }
regex octave { "'"+ | ","+ }
regex accidental { '^' | '^^' | '_' | '__' | '=' }
regex pitch { <accidental>? <basenote> <octave>? }
```

Compare that to the model BNF grammar for ABC:

```basenote ::= %x43 / %x44 / %x45 / %x46 / %x47 / %x41 / %x42 / %x63 / %x64 / %x65 / %x66 / %x67 / %x61 / %x62 ; CDEFGABcdefgab
octave ::= 1*"'" / 1*","
accidental ::= "^" / "^^" / "_" / "__" / "="
pitch ::= [accidental] basenote [octave]
```

It’s clearly a very straightforward translation process.

By default, parsing with a Perl 6 grammar just gives you a basic `Match` object. Adding an object to specify actions to go with the grammar allows you to easily process the information as it is parsed. A simple example is

```    method rest(\$/) {
make ABC::Rest.new(~\$<rest_type>,
\$<note_length>.ast);
}
```

Whenever the `rest` regex fires, it returns a new `ABC::Rest` object. The constructor is passed the string form of the `rest_type` regex, and an `ABC::Duration` object created by the action for `note_length`.

Speaking of durations, another feature of Perl 6 comes very handy here. Durations are represented exactly using the rational number Rat type. If we didn’t have them available, we’d have to code up something similar by hand in order to handle things like triplets whose durations cannot be exactly represented by a floating point number.

So far there is only one significant application using these tools — the `abc2ly.pl` script included in the ABC module. It converts ABC files to the Lilypond music notation format. Lilypond is a very powerful open source music notation system which produces gorgeous sheet music output. This is a great way to print out good looking music notation from ABC files. (I know there is an `abc2ly` program included with Lilypond, but last time I checked its output looked tragically bad. `abc2ly.pl` is already working well enough that I’m hoping to produce a book of sheet music using it in 2011.) So let me leave you with the PDF of the above carol, produced using these tools, Rakudo Perl 6, and Lilypond.