Day 10 – Don’t quote me on it…

Day 10 – Don’t quote me on it…

In many areas, Perl 6 provides you with a range of sane defaults for the common cases along with the power to do something a little more interesting when you need it. Quoting is no exception.

The Basics

The two most common quoting constructs are the single and double quotes. Single quotes are simplest: they let you quote a string and just about the only “magic” they provide is being able to stick a backslash before a single quote, which escapes it. Since backslash has this special meaning, you can write an explicit backslash with \\. However, you don’t even need to do that, since any other backslashes just pass on straight through. Here’s some examples.

> say 'Everybody loves Magical Trevor'
Everybody loves Magical Trevor
> say 'Oh wow, it\'s backslashed!'
Oh wow, it's backslashed!
> say 'You can include a \\ like this'
You can include a \ like this
> say 'Nothing like \n is available'
Nothing like \n is available
> say 'And a \ on its own is no problem'
And a \ on its own is no problem

Double quotes are, naturally, twice as powerful. :-) They support a range of backslash escapes, but more importantly they allow for interpolation. This means that variables and closures can be placed within them, saving you from having to use the concatenation operator or other string formatting constructs so often. Here are some simple examples.

> say "Ooh look!\nLine breaks!"
Ooh look!
Line breaks!
> my $who = 'Ninochka'; say "Hello, dear $who"
Hello, dear Ninochka
> say "Hello, { prompt 'Enter your name: ' }!"
Enter your name: Jonathan
Hello, Jonathan!

The second example shows the interpolation of a scalar, and the third shows how closures can be placed inside double quoted strings also. The value the closure produces will be stringified and interpolated into the string. But what about all the other sigils besides “$”? The rule is that you can interpolate all of them, but only if they are followed by some kind of postcircumfix (that is, an array or hash subscript, parentheses to make an invocation, or a method call). In fact, you can put all of these on a scalar too.

> my @beer = <Chimay Hobgoblin Yeti>;
Chimay Hobgoblin Yeti
> say "First up, a @beer[0]"
First up, a Chimay
> say "Then @beer[1,2].join(' and ')!"
Then Hobgoblin and Yeti!
> say "Tu je &prompt('Ktore pivo chces? ')"
Ktore pivo chces? Starobrno
Tu je Starobrno

Here you can see interpolation of an array element, a slice that we then call a method on and even a function call. The postcircumfix rule happily means that we don’t go screwing up your email address any more.

> say "Please spam me at"
Please spam me at

Choose Your Own Delimiters

The single and double quotes are suitable for a bunch of cases, but what if you want to use a bunch of single or double quotes inside the string? Escaping them would rather suck. Thing is, you could probably make that argument about any choice of quoting characters. So instead of making the choice for you, Perl 6 lets you pick. The q and qq quote constructs expect to be followed by a delimiter. If it’s something with a matching closer, it will look for that (for example, if you use an opening curly then your string is terminated by a closing curly; note that there’s only a finite set of these, and no, it doesn’t include having a comet be terminated by a snowman). Otherwise it looks for the same thing to terminate the string. Note you can also use multi-character openers and closers too (but only by repeating the same character). Otherwise, the q gives you the same semantics as single quotes, and qq gives you the same semantics as double quotes.

> say q{C'est la vie}
C'est la vie
> say q{{Unmatched } and { are { OK } in { here}}
Unmatched } and { are { OK } in { here
> say qq!Lottery results: {(1..49).roll(6).sort}!
Lottery results: 12 13 26 34 36 46


All of the quoting constructs demonstrated so far allow you to include multiple lines of content. However, for that there’s usually a better way: here documents. There can be started with either q or qq, and then with the :to adverb being used to specify the string we expect to find, on a line of its own, at the end of the quoted text. Let’s see how this works, illustrated by a touching story.

print q:to/THE END/
    Once upon a time, there was a pub. The pub had
    lots of awesome beer. One day, a Perl workshop
    was held near to the pub. The hackers drank
    the pub dry. The pub owner could finally afford
    a vacation.

The output of this script is as follows:

Once upon a time, there was a pub. The pub had
lots of awesome beer. One day, a Perl workshop
was held near to the pub. The hackers drank
the pub dry. The pub owner could finally afford
a vacation.

Notice how the text is not indented like in the program source. Heredocs remove indentation automatically, up to the indentation level of the terminator. If we’d used qq, we could have interpolated things into the heredoc. Note that this is all implemented by using the indent method on strings, but if your string doesn’t do any interpolation we do the call to indent at compile time as an optimization.

You can also have multiple heredocs, and even call methods on the data that will be located in the heredoc (note the call to lines in the following program).

my ($input, @searches) = q:to/INPUT/, q:to/SEARCHES/.lines;
    Once upon a time, there was a pub. The pub had
    lots of awesome beer. One day, a Perl workshop
    was held near to the pub. The hackers drank
    the pub dry. The pub owner could finally afford
    a vacation.

for @searches -> $s {
    say $input ~~ /$s/
        ?? "Found $s"
        !! "Didn't find $s";

The output of this program is:

Found beer
Didn't find masak
Found vacation
Didn't find whisky

Quote Adverbs for Custom Quoting Constructs

The single and double quote semantics, also available through q and qq, cover most cases. But what if you have a situation where you want to, say, interpolate closures but not scalars? This is where quote adverbs come in. They allow you to turn certain quoting features on and off. Here’s an example.

> say qq:!s"It costs $10 to {<eat nom>.pick} here."
It costs $10 to eat here.

Here, we use the semantics of qq, but then turn off scalar interpolation. This means we can write the price without worrying about it trying to interpolate the 11th capture of the last regex. Note that this is just using the standard colonpair syntax. If you want to start from a quote construct that supports basically nothing, and then just turn on some options, you can use the Q construct.

> say Q{$*OS\n&sin(3)}
> say Q:s{$*OS\n&sin(3)}
> say Q:s:b{$*OS\n&sin(3)}
> say Q:s:b:f{$*OS\n&sin(3)}

Here we start with a featureless quoting construct, then turn on extra features: first scalar interpolation, then backslash escapes, then function interpolation. Note that we could have chosen any delimiter we wished too.

Quote Constructs are Languages

Finally, it’s worth mentioning that when the parser enters a quoting construct, really it is switching to parsing a different language. When we build up quoting constructs from adverbs, really this is just mixing extra roles into the base quoting language to turn on extra features. For the curious, here’s how Rakudo does it. Whenever we hit a closure or some other interpolation, the language is temporarily switched back to the main language. This is why you can do things like:

> say "Hello, { prompt "Enter your name: " }!"
Enter your name: Jonathan
Hello, Jonathan!

And the parser doesn’t get terribly confused about the fact that the closure being interpolated contains another double quoted string. That is, we’re parsing the main language, then slip into a quoting language, then recurse into the main language again, and finally recurse into the quoting language again to parse the string in the closure in the string in the program. It’s like the Perl 6 parser wants to give us all matryoshka dolls for Christmas. :-)

Day 9 – Longest Token Matching

Day 9 – Longest Token Matching

Perl 6 regular expressions prefer to match the longest alternative when possible.

say "food and drink" ~~ / foo | food /;   # food

This is in contrast to Perl 5, which would prefer the first alternative above, and produce the match “foo”.

You can still get the first-alternative behavior if you want; it’s tucked away in the slightly longer alternation operator ||:

say "food and drink" ~~ / foo || food /;  # foo

…And that’s it! That’s Longest Token Matching. ☺ Short post.

“Huh, wait!” I hear you exclaim, in a desperate attempt to make the daily Perl 6 Advent goodness last a bit longer. “Why is Longest Token Matching such a big deal? Who would ever be so obsessed with long tokens?”

I’m glad you asked. As it turns out, Longest Token Matching (or LTM for short) plays very well with our intuition about how things should be parsed. If you’re creating a language, you want people to be able to declare a variable forest_density without the mention of this variable clashing with the syntax of for loops. LTM will see to that.

I like “strange consistencies” — when distal parts of a language design turn out to have commonalities that make the language feel more uniform. There is that kind of consistency here, between classes and grammars. Perl 6 basically exploits that consistency to the max. Let me briefly map out what I mean.

We’re all used to writing classes at this point. From a birds-eye view, they look like this:

class {

Grammars have a suspiciously similar structure:

grammar {

(The keywords are actually regex, token and rule, but when we talk about them as a group, we just call them “rules”.)

We’re also used to being able to derive classes into subclasses (class B is A), and add or override methods in a way which produces a nice mix of old and new behavior. Perl 6 provides multi methods which even allow you to add new methods of the same name, and the old ones won’t be overridden, they’ll just all try to match alongside the new methods. The dispatch is handled by a (usually autogenerated) proto method that dispatches to all eligible candidates.

What does all this have to do with grammars and rules? Well, it turns out that first off, you can derive new grammars from old ones. It works the same as deriving classes. (In fact, under the hood it’s exactly the same mechanism. Grammars are classes with a different metaclass object.) New rules will override old rules just like you’d expect with methods.

S05 has a cute example with parsing of letters, and deriving the grammar to parse formal letters:

    grammar Letter {
         rule text     { <greet> $<body>=<line>+? <close> }
         rule greet    { [Hi|Hey|Yo] $<to>=\S+? ',' }
         rule close    { Later dude ',' $<from>=.+ }
         token line    { \N* \n}

     grammar FormalLetter is Letter {
         rule greet { Dear $<to>=\S+? ',' }
         rule close { Yours sincerely ',' $<from>=.+ }

The derived FormalLetter overrides greet and close, but not line.

But what about all the goodness with multi methods? Could we define some kind of “proto rule” that would allow us to have several rules in a grammar with the same name but different bodies? For example, we might want to parse a language with a rule term, but there are many different terms: strings, numbers… and maybe the numbers can be decimal or binary or octal or hexadecimal…

Perl 6 grammars can contain a proto rule, and then you can define and redefine a rule with the same name as many times as you want. And now we’re back full circle with the / foo | food / alternation from the start of the article. All those rules you write with the same name compile down to one big alternation. Not only that — rules which call other rules, some of them possibly proto rules, all of that will be “flattened” out into one big LTM alternation. In practice that means that all the possible things a term can be are tried out all at once, on equal footing. Neither alternative wins because you happened to define it before the others. An alternative wins because it is the longest.

The strange consistency resides in the fact that in the call-a-method side of things, the most specific method wins, and “most specific” has to with signature narrowness. The better the types in the signature describe the arguments coming in, the more specific the method.

In the parse-with-a-rule side of things, the most specific rule wins, but here “most specific” has to do with parse success. The better the rule can describe what comes next in the text, the more specific the rule.

And that’s strangely consistent, because on the surface methods and rules look like quite different beasts.

We really believe we have something going with this whole principle of deriving a grammar and getting a new language. LTM is right at the center of that because it allows new rules and old to intermix in a fair and predictable way. It’s a kind of meritocracy: rules win not based on whether they’re young or old, but based on whether they are able to parse the text well.

In fact, the Perl 6 compiler itself works this way. It parses your program using a Perl 6 grammar, and that grammar is derivable… whenever you declare a new operator in your program, a new grammar is derived for you. The parsing of your operator is added as a new rule in the new grammar, and the new grammar is given the task of parsing the rest of your program. Your new operator will win against similar but shorter ones, and lose against similar but longer ones.

Day 8 – Panda package manager

Day 8 – Panda package manager

Perl 6 is not just the language. While without modules it can do more than Perl 5, modules can make life easier. About two years ago neutro was discussed on this blog. I’m not going to talk about it, as it’s deprecated today.

Today, the standard way of installing modules is the panda utility. If you’re using Rakudo Star, you should have it already installed (try the panda command in console to check it). After running it and waiting a few seconds, you should see help for the panda utility.

$ panda
  panda [--notests] [--nodeps] install [ ...] -- Install the specified modules
  panda [--installed] [--verbose] list -- List all available modules
  panda update -- Update the module database
  panda info [ ...] -- Display information about specified modules
  panda search  -- Search the name/description

As you can see, it doesn’t have many options (it’s actually similar to RubyGems or cpanminus in its simplicity). You can see the current list of modules at Perl 6 Modules page. Let’s say you would want to parse an INI file. First, you can find module for it using the search command.

$ panda search INI
JSON::Tiny               *          A minimal JSON (de)serializer
Config::INI              *          .ini file parser and writer module for
                                    Perl 6
MiniDBI                  *          a subset of Perl 5 DBI ported to Perl 6
                                    to use while experts build the Real Deal
Class::Utils             0.1.0      Small utilities to help with defining

Config::INI is module you want. Other modules were found because my query wasn’t specific enough and found “ini” in other words (minimal, MiniDBI and defining). Config::INI isn’t part of Rakudo Star, so you have to install it.

Panda installs modules globally when you writing access to the installation directory, locally otherwise. Because of that you can use panda even when Perl 6 is installed globally without installing modules like local::lib, like you have to in Perl 5.

$ panda install Config::INI
==> Fetching Config::INI
==> Building Config::INI
Compiling lib/Config/
Compiling lib/Config/INI/
==> Testing Config::INI
t/00-load.t .... ok
t/01-parser.t .. ok
t/02-writer.t .. ok
All tests successful.
Files=3, Tests=55, 3 wallclock secs ( 0.04 usr 0.00 sys + 2.38 cusr 0.14 csys = 2.56 CPU)
Result: PASS
==> Installing Config::INI
==> Succesfully installed Config::INI

After the module has been installed, you can update it as easily – by installing it. Currently panda cannot automatically upgrade modules, but after a module has been updated (you can watch repositories on GitHub to know when it happens – every module is available on GitHub), you can easily upgrade it by reinstalling the module.

When a module was installed, you can check if it works by trying to use it. This is a sample script that can be used to convert INI file into a Perl 6 data structure.

#!/usr/bin/env perl6
use Config::INI;
multi sub MAIN($file) {
    say '# your INI file as seen by Perl 6';
    say Config::INI::parse_file($file).perl;
Day 7 – MIME::Base64 – On encoded strings

Day 7 – MIME::Base64 – On encoded strings

parrot MIME::Base64 FixedIntegerArray: index out of bounds!

Ronaldxs created the following parrot ticket #813 4 months ago:

“Was playing with p6 MIME::Base64 and utf8 sampler page when I came across this. It seems that the parrot MIME Base64 library can’t handle some UTF-8 characters as demonstrated below.”

.sub go :main
    load_bytecode 'MIME/Base64.pbc'

    .local pmc enc_sub
    enc_sub = get_global [ "MIME"; "Base64" ], 'encode_base64'

    .local string result_encode
    result_encode = enc_sub(utf8:"\x{203e}")

    say result_encode

FixedIntegerArray: index out of bounds!
current instr.: 'parrot;MIME;Base64;encode_base64'
pc 163 (runtime/parrot/library/MIME/Base64.pir:147)

called from Sub 'go' pc 11 (die_utf8_base64.pir:8)

This was interesting, because parrot strings store the encoding information in the string. The user does not need to store the string encoding information somewhere else as in perl5, nor have to do educated guesses about the encoding. parrot supports ascii, latin1, binary, utf-8, ucs-2, utf-16 and ucs-4 string encodings natively.
So we thought we the hell cannot parrot handle simple utf-8 encoded strings?

As it turned out, the parrot implementation of MIME::Base64, which can be shared to all languages which use parrot as VM, stored the character codepoints for each character as array of integers. On multibyte encodings such as UTF-8 this leads to different data held in memory than a normal multibyte string which is encoded as the byte buffer and the additional encoding information.

Internal string representations

For example an overview of different internal string representations for the utf-8 string “\x{203e}”:

perl5 strings:

len=3, utf-8 flag, "\342\200\276" buf=[e2 80 be]

parrot strings:

len=1, bufused=3, encoding=utf-8, buf=[e2 80 be]

The Unicode tables:

U+203E	‾	e2 80 be	OVERLINE

gdb perl5

Let’s check it out:

$ gdb --args perl -e'print "\x{203e}"'
(gdb) start
(gdb) b Perl_pp_print
(gdb) c
(gdb) n 

.. until if (!do_print(*MARK, fp))

(gdb) p **MARK
$1 = {sv_any = 0x404280, sv_refcnt = 1, sv_flags = 671106052, sv_u = {
      svu_pv = 0x426dd0 "‾", svu_iv = 4353488, svu_uv = 4353488, 
      svu_rv = 0x426dd0, svu_array = 0x426dd0, svu_hash = 0x426dd0, 
      svu_gp = 0x426dd0, svu_fp = 0x426dd0}, ...}

(gdb) p Perl_sv_dump(*MARK)
ALLOCATED at -e:1 for stringify (parent 0x0); serial 301
SV = PV(0x404280) at 0x4239a8
  REFCNT = 1
  PV = 0x426dd0 "\342\200\276" [UTF8 "\x{203e}"]
  CUR = 3
  LEN = 16
$2 = void

(gdb) x/3x 0x426dd0
0x426dd0:	0xe2	0x80	0xbe

We see that perl5 does store the utf-8 flag, but not the length of the string, the utf8 length (=1), only the length of the buffer (=3).
Any other multi-byte encoded string, such as UCS-2 is stored differently. We suppose as utf-8.

We are already in the debugger, so let’s try the different cmdline argument.

(gdb) run -e'use Encode; print encode("UCS-2", "\x{203e}")'
  The program being debugged has been started already.
  Start it from the beginning? (y or n) y
Breakpoint 2, Perl_pp_print () at pp_hot.c:712
712	    dVAR; dSP; dMARK; dORIGMARK;

(gdb) p **MARK

$3 = {sv_any = 0x404b30, sv_refcnt = 1, sv_flags = 541700, sv_u = {
    svu_pv = 0x563a50 " >", svu_iv = 5651024, svu_uv = 5651024, 
    svu_rv = 0x563a50, svu_array = 0x563a50, svu_hash = 0x563a50, svu_gp = 0x563a50, 
    svu_fp = 0x563a50}, ...}

(gdb) p Perl_sv_dump(*MARK)
ALLOCATED at -e:1 by return (parent 0x0); serial 9579
SV = PV(0x404b30) at 0x556fb8
  REFCNT = 1
  PV = 0x563a50 " >"
  CUR = 2
  LEN = 16
$4 = void

(gdb) x/2x 0x563a50
0x563a50:	0x20	0x3e

But we don’t see the UTF8 flag in encode(“UCS-2”, “\x{203e}”), just the simple ascii string ” >”, which is the UCS-2 representation of [20 3e].
Because ” >” is perfectly representable as non-utf8 ASCII string.
UCS-2 is much much nicer than UTF-8, is has a fixed size, it is readable, Windows uses it, but it cannot represent all Unicode characters.

Encode::Unicode contains this nice cheatsheet:

Quick Reference
                Decodes from ord(N)           Encodes chr(N) to...
       octet/char BOM S.P d800-dfff  ord > 0xffff     \x{1abcd} ==
  UCS-2BE       2   N   N  is bogus                  Not Available
  UCS-2LE       2   N   N     bogus                  Not Available
  UTF-16      2/4   Y   Y  is   S.P           S.P            BE/LE
  UTF-16BE    2/4   N   Y       S.P           S.P    0xd82a,0xdfcd
  UTF-16LE    2/4   N   Y       S.P           S.P    0x2ad8,0xcddf
  UTF-32        4   Y   -  is bogus         As is            BE/LE
  UTF-32BE      4   N   -     bogus         As is       0x0001abcd
  UTF-32LE      4   N   -     bogus         As is       0xcdab0100
  UTF-8       1-4   -   -     bogus   >= 4 octets   \xf0\x9a\af\8d

gdb parrot

Back to parrot:

If you debug parrot with gdb you get a gdb pretty-printer thanks to Nolan Lum, which displays the string and encoding information automatically.
In perl5 you have to call Perl_sv_dump with or without the my_perl as first argument, if threaded or not. With a threaded perl, e.g. on Windows you’d need to call p Perl_sv_dump(my_perl, *MARK).
In parrot you just ask for the value and the formatting is done with a gdb pretty-printer plugin.
The string length is called strlen (of the encoded string), the buffer size is called bufused.

Even in a backtrace the string arguments are displayed abbrevated like this:

#3  0x00007ffff7c29fc4 in utf8_iter_get_and_advance (interp=0x412050, str="utf8:� [1/2]", 
    i=0x7fffffffdd00) at src/string/encoding/utf8.c:551
#4  0x00007ffff7a440f6 in Parrot_str_escape_truncate (interp=0x412050, src="utf8:� [1/2]",
    limit=20) at src/string/api.c:2492
#5  0x00007ffff7b02fb3 in trace_op_dump (interp=0x412050, code_start=0x63a1c0, pc=0x63b688)
    at src/runcore/trace.c:450

[1/2] means strlen=1 bufused=2
Each non-ascii or non latin-1 encoded string is printed with the encoding prefix.
Internally the encoding is of course a index or pointer in the table of supported encodings.

You can set a breakpoint to utf8_iter_get_and_advance and watch the strings.

(gdb) r t/library/mime_base64u.t
Breakpoint 1, utf8_iter_get_and_advance (interp=0x412050, str="utf8:\\x{00c7} [8/8]", 
                i=0x7fffffffcd40) at src/string/encoding/utf8.c:544
(gdb) p str
$1 = "utf8:\\x{00c7} [8/8]"
(gdb) p str->bufused 
$3 = 8
(gdb) p str->strlen
$4 = 8
(gdb) p str->strstart
$5 = 0x5102d7 "\\x{00c7}"

This is escaped. Let’s advance to a more interesting utf8 string in this test, i.e. until str=”utf8:Ā [1/2]”
You get the members of a struct with tab-completion, i.e. press <TAB> after p str->

(gdb) p str->
_buflen    _bufstart  bufused    encoding   flags      hashval    strlen     strstart
(gdb) p str->strlen
$9 = 8

(gdb) dis 1
(gdb) b utf8_iter_get_and_advance if str->strlen == 1
(gdb) c
Breakpoint 2, utf8_iter_get_and_advance (interp=0x412050, str="utf8:Ā [1/2]", 
                i=0x7fffffffcd10) at src/string/encoding/utf8.c:544
544	    ASSERT_ARGS(utf8_iter_get_and_advance)

(gdb) p str->strlen
$10 = 1
(gdb) p str->strstart
$11 = 0x7ffff7faeb58 "Ā"
(gdb) x/2x str->strstart
0x7ffff7faeb58:	0xc4	0x80
(gdb) p str->encoding
$12 = (const struct _str_vtable *) 0x7ffff7d882e0
(gdb) p *str->encoding

$13 = {num = 3, name = 0x7ffff7ce333f "utf8", name_str = "utf8", bytes_per_unit = 1,
  max_bytes_per_codepoint = 4, to_encoding = 0x7ffff7c292b0 <utf8_to_encoding>, chr =
  0x7ffff7c275c0 <unicode_chr>, equal = 0x7ffff7c252e0 <encoding_equal>, compare =
  0x7ffff7c254e0 <encoding_compare>, index = 0x7ffff7c25690 <encoding_index>, rindex
  = 0x7ffff7c257a0 <encoding_rindex>, hash = 0x7ffff7c25a20 <encoding_hash>, scan =
  0x7ffff7c29380 <utf8_scan>, partial_scan = 0x7ffff7c29460 <utf8_partial_scan>, ord
  = 0x7ffff7c297e0 <utf8_ord>, substr = 0x7ffff7c25de0 <encoding_substr>, is_cclass =
  0x7ffff7c26000 <encoding_is_cclass>, find_cclass =
  0x7ffff7c260e0 <encoding_find_cclass>, find_not_cclass =
  0x7ffff7c26220 <encoding_find_not_cclass>, get_graphemes =
  0x7ffff7c263d0 <encoding_get_graphemes>, compose =
  0x7ffff7c27680 <unicode_compose>, decompose = 0x7ffff7c26450 <encoding_decompose>,
  upcase = 0x7ffff7c27b20 <unicode_upcase>, downcase =
  0x7ffff7c27be0 <unicode_downcase>, titlecase = 0x7ffff7c27ca0 <unicode_titlecase>,
  upcase_first = 0x7ffff7c27d60 <unicode_upcase_first>, downcase_first =
  0x7ffff7c27dc0 <unicode_downcase_first>, titlecase_first =
  0x7ffff7c27e20 <unicode_titlecase_first>, iter_get =
  0x7ffff7c29c40 <utf8_iter_get>, iter_skip = 0x7ffff7c29d60 <utf8_iter_skip>,
  iter_get_and_advance = 0x7ffff7c29eb0 <utf8_iter_get_and_advance>,
  iter_set_and_advance = 0x7ffff7c29fd0 <utf8_iter_set_and_advance>}


$ perl -MMIME::Base64 -lE'$x="20e3";$s="\x{20e3}";
  printf "0x%s\t%s=> %s",$x,$s,encode_base64($s)'
Wide character in subroutine entry at -e line 1.

Oops, I’m clearly a unicode perl5 newbie. Does my term not understand utf-8?

$ echo $TERM

No, it should. encode_base64 does not understand unicode.
perldoc MIME::Base64
“The base64 encoding is only defined for single-byte characters. Use the Encode module to select the byte encoding you want.”

Oh my! But it is just perl5. It just works on byte buffers, not on strings.
perl5 strings can be utf8 and non-utf8. Why on earth an utf8 encoded string is disallowed and only byte buffers of unknown encodings are allowed goes beyond my understanding, but what can you do. Nothing. base64 is a binary only protocol, based on byte buffers. So we decode it manually to byte buffers. The Encode API for decoding is called encode.

$ perl -MMIME::Base64 -MEncode -lE'$x="20e3";$s="\x{20e3}";
  printf "0x%s\t%s=> %s",$x,$s,encode_base64(encode('utf8',$s))'
Wide character in printf at -e line 1.
0x20e3	=> 4oOj

This is now the term warning I know. We need -C

$ perldoc perluniintro

$ perl -C -MMIME::Base64 -MEncode -lE'$x="20e3";$s="\x{20e3}";
  printf "0x%s\t%s=> %s",$x,$s,encode_base64(encode('utf8',$s))'
0x20e3	=> 4oOj

Over to rakudo/perl6 and parrot:

$ cat >m.pir << EOP
.sub main :main
    load_bytecode 'MIME/Base64.pbc'
    $P1 = get_global [ "MIME"; "Base64" ], 'encode_base64'
    $S1 = utf8:"\x{203e}"
    $S2 = $P1(s1)
    say $S1
    say $S2

$ parrot m.pir
FixedIntegerArray: index out of bounds!
current instr.: 'parrot;MIME;Base64;encode_base64'
                pc 163 (runtime/parrot/library/MIME/Base64.pir:147)

The perl6 test, using the parrot library, from

$ git clone git://
Cloning into 'perl6-Enc-MIME-Base64'...

$ PERL6LIB=perl6-Enc-MIME-Base64/lib perl6 <<EOP
use Enc::MIME::Base64;
say encode_base64_str("\x203e");

> use Enc::MIME::Base64;
> say encode_base64_str("\x203e");
FixedIntegerArray: index out of bounds!

The pure perl6 workaround:

$ PERL6LIB=perl6-Enc-MIME-Base64/lib perl6 <<EOP
use PP::Enc::MIME::Base64;
say encode_base64_str("\x203e");

> use PP::Enc::MIME::Base64;
> say encode_base64_str("\x203e");

Wait. perl6 creates a different enoding than perl5?
What about coreutils base64 command.

$ echo -n "‾" > m.raw
$ od -x m.raw
0000000 80e2 00be
$ ls -al m.raw
-rw-r--r-- 1 rurban rurban 3 Dec  6 10:23 m.raw
$ base64 m.raw

[80e2 00be] is the little-endian version of [e2 80 be], 3 bytes, flipped.
Ok, at least base64 agrees with perl6, and I must have made some encoding mistake with perl5.

Back to debugging our parrot problem:

parrot unlike perl6 has no debugger yet. So we have to use gdb, and we need to know in which function the error occured. We use the parrot -t trace flag, which is like the perl5 debugging -Dt flag, but it is always enabled, even in optimized builds.

$ parrot --help
    -t --trace [flags] 
$ parrot --help-debug
--trace -t [Flags] ...
    0001    opcodes
    0002    find_method
    0004    function calls

$ parrot -t7 m.pir
009f band I9, I2, 63         I9=0 I2=0 
00a3 set I10, P0[I5]         I10=0 P0=FixedIntegerArray=PMC(0xff7638) I5=[2063]
016c get_results PC2 (1), P2 PC2=FixedIntegerArray=PMC(0xedd178) P2=PMCNULL
016f finalize P2             P2=Exception=PMC(0x16ed498)
0171 pop_eh
lots of error handling
0248 callmethodcc P0, "print" P0=FileHandle=PMC(0xedcca0) 
FixedIntegerArray: index out of bounds!

We finally see the problem, which matches the run-time error.

00a3 set I10, P0[I5]         I10=0 P0=FixedIntegerArray=PMC(0xff7638) I5=[2063]

We want to set I10 to the I5=2063’th element in the FixedIntegerArray P0, and the array is not big enough.

After several hours of analyzing I came to the conclusion that the parrot library MIME::Base64 was wrong by using ord of every character in the string. It should use a bytebuffer instead.
Which was fixed with commit 3a48e6. ord can return integers > 255, but base64 can only handle chars < 255.

The fixed parrot library was now correct:

$ parrot m.pir

But then the tests started failing. I spent several weeks trying to understand why the parrot testsuite was wrong with the mime_base64 tests, the testdata came from perl5. I came up with different implementation hacks which would match the testsuite, but finally had to bite the bullet, changing the tests to match the implementation.

And I had to special case the tests for big-endian, as base64 is endian agnostic. You cannot decode a base64 encoded powerpc file on an intel machine, when you use multi-byte characters. And utf-8 is even more multi-byte than ucs-2. I had to accept the fact the big-endian will return a different encoding. Before the results were the same. The tests were written to return the same encoding on little and big-endian.


The first reason why I wrote this blog post was to show how to debug into crazy problems like this, when you are not sure if the core implementation, the library, the spec or the tests are wrong. It turned out, that the library and the tests were wrong.
You saw how easily you could use gdb to debug into such problems, as soon as you find out a proper breakpoint.

The internal string representations looked like this:

MIME::Base64 internally:

len=1, encoding=utf-8, buf=[3e20]

and inside the parrot imcc compiler the SREG

len=8, buf="utf-8:\"\x{203e}\""

parrot is a register based runtime, and a SREG is the string representation of the register value. Unfortunately a SREG cannot hold the encoding info yet, so we prefix the encoding in the string, and unquote it back. This is not the reason why parrot is still slower than the perl5 VM. I benchmarked it. parrot still uses too much sprintf’s internally and the encoding quote/unquoting counts only for a 4th of the time of the sprintf gyrations.
And parrot function calls are awfully slow and de-optimized.

The second reason is to explain the new decode_base64() API, which only parrot – and therefore all parrot based languages like rakudo – now have got.

decode_base64(str, ?:encoding)

“Decode a base64 string by calling the decode_base64() function.
This function takes as first argument the string to decode, as optional second argument the encoding string for the decoded data.
It returns the decoded data.

Any character not part of the 65-character base64 subset is silently ignored.
Characters occurring after a ‘=’ padding character are never decoded.”

So decode_base64 got now a second optional encoding argument. The src string for encode_base64 can be any encoding and is automatically decoded to a bytebuffer. You can easily encode an image or unicode string without any trouble, and for the decoder you can define the wanted encoding beforehand. The result can be the encoding binary or utf-8 or any encoding you prefer, no need for additional decoding of the result. The default encoding of the decoded string is either ascii, latin-1 or utf-8. parrot will upgrade the encoding automatically.

You can compare the new examples of pir against the perl5 version:


.sub main :main
    load_bytecode 'MIME/Base64.pbc'

    .local pmc enc_sub
    enc_sub = get_global [ "MIME"; "Base64" ], 'encode_base64'

    .local string result_encode
    # GH 814
    result_encode = enc_sub(utf8:"\x{a2}")
    say   "encode:   utf8:\"\\x{a2}\""
    say   "expected: wqI="
    print "result:   "
    say result_encode

    # GH 813
    result_encode = enc_sub(utf8:"\x{203e}")
    say   "encode:   utf8:\"\\x{203e}\""
    say   "expected: 4oC+"
    print "result:   "
    say result_encode



use MIME::Base64 qw(encode_base64 decode_base64);
use Encode qw(encode);

my $encoded = encode_base64(encode("UTF-8", "\x{a2}"));
print  "encode:   utf-8:\"\\x{a2}\"  - ", encode("UTF-8", "\x{a2}"), "\n";
print  "expected: wqI=\n";
print  "result:   $encoded\n";
print  "decode:   ",decode_base64("wqI="),"\n\n"; # 302 242

my $encoded = encode_base64(encode("UTF-8", "\x{203e}"));
print  "encode:   utf-8:\"\\x{203e}\"  -> ",encode("UTF-8", "\x{203e}"),"\n";
print  "expected: 4oC+\n";
print  "result:   $encoded\n"; # 342 200 276
print  "decode:   ",decode_base64("4oC+"),"\n";

for ([qq(a2)],[qq(c2a2)],[qw(203e)],[qw(3e 20)],[qw(1000)],[qw(00c7)],[qw(00ff 0000)]){
    $s = pack "H*",@{$_};
    printf "0x%s\t=> %s", join("",@{$_}), encode_base64($s);


use Enc::MIME::Base64;
say encode_base64_str("\xa2");
say encode_base64_str("\x203e");
Day 6 – Lexical Imports

Day 6 – Lexical Imports

Perl 6 is built on lexical scopes. Variables, subroutines, constants and even types are looked up lexically first, and subroutines are only looked up in lexical scopes.

So it is only fitting that importing symbols from modules is also done into lexical scopes. I often write code such as

    use v6;

    # the main functionality of the script
    sub deduplicate(Str $s) {
        my %seen;
        $s.comb.grep({!%seen{ .lc }++}).join;

    # normal call
    multi MAIN($phrase) {
        say deduplicate($phrase)

    # if you call the script with --test, it runs its unit tests
    multi MAIN(Bool :$test!) {
        # imports &plan, &is etc. only into the lexical scope
        use Test;
        plan 2;
        is deduplicate('just some words'),
            'just omewrd', 'basic deduplication';
        is deduplicate('Abcabd'),
            'Abcd', 'case insensitivity';

This script removes all but the first occurrence of each character given on the command line:

    $ perl6 deduplicate 'Duplicate character removal'
    Duplicate hrmov

But if you call it with the --test option, it runs its own unit tests:

    $ perl6 deduplicate --test
    ok 1 - basic deduplication
    ok 2 - case insensitivity

Since the testing functions are only necessary in a part of the program — in a lexical scope, to be more precise –, the use statement is inside that scope, and limits the visibility of the imported symbols to this scope. So if you try to use the is function outside the routine in which Test is used, you get a compile-time error.

Why, you might ask? From the programmer's perspective, it reduces risk of (possibly unintended and unnoticed) name clashes the same way that lexical variables are safer than global variables.

From the point of view of language design, the combination of lexical importing, runtime-immutable lexical scopes and lexical-only lookup of subroutines allows resolving subroutine names at compile time, which again allows neat stuff like detecting calls to undeclared functions, compile-time type checking of arguments, and other nice optimizations.

But subroutines are only the tip of the iceberg. Perl 6 has a very flexible syntax, which you can modify with custom operators and macros. Those too can be exported, and imported into lexical scopes. Which means that language modifications are also lexically by default. So you can safely load any language-modifying extension, without running into danger that a library you use can't cope with it — the library doesn't even see the language modification.

So ultimately, lexical importing is another facet of encapsulation.

Day 5 – A Perl 6 Debugger

Day 5 – A Perl 6 Debugger

There’s much more to the developer experience of a language than its design, features and implementations. While the language and its implementations are perhaps the thing developers will spend most time with, the overall experience will also involve interaction with the community, reading documentation, using modules and employing various development tools. Thus, it’s important that Perl 6 make progress on these fronts too. Over the past year, we’ve taken some good steps forward in these areas; there’s now, the module ecosystem has grown, and the module installation tooling has improved. Another big step forward with regards to tooling – and the topic of this post – is that an interactive Perl 6 debugger is now available.

Running With The Debugger

The debugger has been included with the last few Rakudo * releases. If you have one of those, you’re all set. Just run perl6-debug instead of perl6. It takes the same set of options, so if your normal invocation involves, for example, using the -I flag to set the include path for modules, it’ll Just Work Like Usual. Of course, what happens next is entirely different. The debugger will show you each module it is loading, followed by placing you at the first interesting statement of the program, highlighted in yellow.


Note how it takes care to put you on the first line that actually does something, skipping the my statement above it (it’s getting increasingly smart about this).

The Basics

Hitting enter allows you to single-step through the program. At any point, you can look at variables, call methods on variables, or even evaluate expressions.


If you want to move statement by statement, but never descend into a function call or method call, type an s, followed by enter. To step out of the current sub (that is, run until it returns then break in its caller), use so to step out. To run the program until it hits an exception, just use r. Even at the point you get an exception, you can still access variables to try and dissect what went wrong.


One final variant, rt, will run until an exception is thrown, but handled. You’ll break at the point of the throw. This means you’re not disadvantaged in the debugger if you took care to handle exceptions well in your program; you can still break when they are thrown and use the debugger to help understand why. :-)


Sometimes, you know exactly where the juicy stuff happens in your program that you wish to debug. If only you could just run until you got there. Turns out you can – that’s what breakpoints are for. We can add one, use r to run, and it will stop where we placed the breakpoint.


Note that you don’t have to type out the full name of the file you want to put the breakpoint in; any unambiguous substring of the name of a file that is loaded will be sufficient.

I won’t cover them here, but there are also tracepoints, which instead of breaking will log the value of an expression each time a certain place in the program is hit. Later, you can display the log. It’s like adding print statements, but without the print statement going in your code, removing the risk of them accidentally making it into a commit (‘cus we’ve all done that one, right? :-))

Regex and Grammar Debugging

When the debugger detects you are in a regex or grammar, it offers a little extra help. As well as allowing you to single-step your way through the regex, atom by atom, it also displays the match text, indicating what has been matched so far.


Here, you can see that the pattern already successfully matched SELECT, and is now looking for a literal * or will try to call the field list rule. If in a regex, which may backtrack, the match position  jumps backwards when backtracking happens, so you can understand the backtracking behavior of the pattern.

Yes, Perl 5 Regexes Too!

Rakudo has some support for the :P5 adverb on regexes, which allows use of the Perl 5 regex syntax. Here the debugger is used in REPL mode (where you enter an expression, then can immediately debug it) to explore the difference between alternations in Perl 5 and Perl 6 (in Perl 5 they go left to right, in Perl 6 they have longest token matching semantics, such that it tries the thing that will match most characters first).


The debugger in REPL mode is great for exploring and understanding how things will execute (and as such can serve as a learning or teaching aid). Another use is for debugging modules without having to write a test script; just write a use statement in the debugger or you can even supply the module using the -M command line flag and the debugger will load it!

What About Funky Stuff, Like Macros?

That is, macros, BEGIN time, eval, and those other things where your Perl 6 program does the time warp again, doing a bit of runtime at compile time or compiling some more stuff at runtime. The debugger is built for it. If a macro is applied, the debugger will place you in it. Notice below how we’re still in the process of loading the second file, and did not get to the third yet – we really are debugging at BEGIN time!


Any lines of code that are stripped out by the macro are simply never hit at runtime. And what about statements in quasi blocks? The debugger will take you there, so you not only know what macro was applied, but can dig into exactly what it does too.


Just as bits of runtime happening at compile time work out fine, any code that gets eval‘d at runtime is also compiled with debug hooks, meaning that you can step straight into it and debug the evaluated code.

Written in Perl 6 and NQP!

You might think that writing a debugger must involve all kinds of low-level hackery. In fact, that’s not the case. The debug hooks mechanism is written in NQP, and the command line user interface is written in Perl 6. This is significant from a couple of angles. The first is the fact that we can write something like this without breaking the encapsulation of the compiler, but instead just by subclassing the Grammar, Actions and Compiler objects and twiddling with the AST. In fact, the debugger was built without any changes being required to Rakudo as it already existed. This provides important feedback on our compiler architecture – this time, very positive feedback. Things are extensible in the ways they were designed to be. The second is that writing so much of it in Perl 6 is a healthy bit of dogfooding – using the product in order to build further products. My hope is that, since most of what people would want to change is actually written in the Perl 6 part, it will feel quite hackable by the community at large.

And What Of Future Plans?

Various features are still to come: conditional breakpoints, dumping tracepoint output to a file, showing the path taken through a grammar to get to the current point, and various bits of configurability. The command line interface is nice, but of course having some extra options would be even nicer. I’m interested in a web-based interface, but also in integration with tools like Padre. There’s some work afoot on a common protocol for these things, which could make such integration possible without having to re-invent too many wheels. In the meantime, having an interactive debugger which is aware of and works well with a wide range of Perl 6 language features is a solid step forward. Happy debugging, and feature ideas (or patches ;-)) are welcome; here’s the GitHub repo!

Day 4 – Having Fun with Rakudo and Project Euler

Day 4 – Having Fun with Rakudo and Project Euler

Rakudo, the leading Perl6 implementation, is not perfect, and performance is a particularly sore subject. However, the pioneer does not ask ‘Is it fast?’, but rather ‘Is it fast enough?’, or perhaps even ‘How can I help to make it faster?’.

To convince you that Rakudo can indeed be fast enough, we’ll take a shot at a bunch of Project Euler problems. Many of those involve brute-force numerics, and that’s something Rakudo isn’t particularly good at right now. However, that’s not necessarily a show stopper: The less performant the language, the more ingenious the programmer needs to be, and that’s where the fun comes in.

All code has been tested with Rakudo 2012.11.

We’ll start with something simple:

Problem 2

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

The solution is beautifully straight-forward:

    say [+] grep * %% 2, (1, 2, *+* ...^ * > 4_000_000);

Runtime: 0.4s

Note how using operators can lead to code that’s both compact and readable (opinions may vary, of course). We used

  • whatever stars * to create lambda functions
  • the sequence operator (in its variant that excludes the right endpoint) ...^ to build up the Fibonacci sequence
  • the divisible-by operator %% to grep the even terms
  • reduction by plus [+] to sum them.

However, no one forces you to go crazy with operators – there’s nothing wrong with vanilla imperative code:

Problem 3

What is the largest prime factor of the number 600,851,475,143?

An imperative solution looks like this:

    sub largest-prime-factor($n is copy) {
        for 2, 3, *+2 ... * {
            while $n %% $_ {
                $n div= $_;
                return $_ if $_ > $n;

    say largest-prime-factor(600_851_475_143);

Runtime: 2.6s

Note the is copy trait, which is necessary as Perl6 binds arguments read-only by default, and that integer division div is used instead of numeric division /.

Nothing fancy going on here, so we’ll move along to

Problem 53

How many, not necessarily distinct, values of nCr, for 1 ≤ n ≤ 100, are greater than one-million?

We’ll use the feed operator ==> to factor the algorithm into separate steps:

    [1], -> @p { [0, @p Z+ @p, 0] } ... * # generate Pascal's triangle
    ==> (*[0..100])()                     # get rows up to n = 100
    ==> map *.list                        # flatten rows into a single list
    ==> grep * > 1_000_000                # filter elements exceeding 1e6
    ==> elems()                           # count elements
    ==> say;                              # output result

Runtime: 5.2s

Note the use of the Z meta-operator to zip the lists 0, @p and @p, 0 with +.

The one-liner generating Pascal’s triangle has been stolen from Rosetta Code, another great resource for anyone interested in Perl6 snippets and exercises.

Let’s do something clever now:

Problem 9

There exists exactly one Pythagorean triplet for which a + b + c = 1000. Find the product abc.

Using brute force will work (solution courtesy of Polettix), but it won’t be fast (~11s on my machine). Therefore, we’ll use a bit of algebra to make the problem more managable:

Let (a, b, c) be a Pythagorean triplet

    a < b < c
    a² + b² = c²

For N = a + b + c it follows

    b = N·(N - 2a) / 2·(N - a)
    c = N·(N - 2a) / 2·(N - a) + a²/(N - a)

which automatically meets b < c.

The condition a < b gives the constraint

    a < (1 - 1/√2)·N

We arrive at

    sub triplets(\N) {
        for 1..Int((1 - sqrt(0.5)) * N) -> \a {
            my \u = N * (N - 2 * a);
            my \v = 2 * (N - a);

            # check if b = u/v is an integer
            # if so, we've found a triplet
            if u %% v {
                my \b = u div v;
                my \c = N - a - b;
                take $(a, b, c);

    say [*] .list for gather triplets(1000);

Runtime: 0.5s

Note the declaration of sigilless variables \N, \a, …, how $(…) is used to return the triplet as a single item and .list – a shorthand for $_.list – to restore listy-ness.

The sub &triplets acts as a generator and uses &take to yield the results. The corresponding &gather is used to delimit the (dynamic) scope of the generator, and it could as well be put into &triplets, which would end up returning a lazy list.

We can also rewrite the algorithm into dataflow-driven style using feed operators:

    constant N = 1000;

    1..Int((1 - sqrt(0.5)) * N)
    ==> map -> \a { [ a, N * (N - 2 * a), 2 * (N - a) ] } \
    ==> grep -> [ \a, \u, \v ] { u %% v } \
    ==> map -> [ \a, \u, \v ] {
        my \b = u div v;
        my \c = N - a - b;
        a * b * c
    ==> say;

Runtime: 0.5s

Note how we use destructuring signature binding -> […] to unpack the arrays that get passed around.

There’s no practical benefit to use this particular style right now: In fact, it can easily hurt performance, and we’ll see an example for that later.

It is a great way to write down purely functional algorithms, though, which in principle would allow a sufficiently advanced optimizer to go wild (think of auto-vectorization and -threading). However, Rakudo has not yet reached that level of sophistication.

But what to do if we’re not smart enough to find a clever solution?

Problem 47

Find the first four consecutive integers to have four distinct prime factors. What is the first of these numbers?

This is a problem where I failed to come up with anything better than brute force:

    constant $N = 4;

    my $i = 0;
    for 2..* {
        $i = factors($_) == $N ?? $i + 1 !! 0;
        if $i == $N {
            say $_ - $N + 1;

Here, &factors returns the number of prime factors. A naive implementations looks like this:

    sub factors($n is copy) {
        my $i = 0;
        for 2, 3, *+2 ...^ * > $n {
            if $n %% $_ {
                repeat while $n %% $_ {
                    $n div= $_
        return $i;

Runtime: unknown (33s for N=3)

Note the use of repeat while … {…}, the new way to spell do {…} while(…);.

We can improve this by adding a bit of caching:

    BEGIN my %cache = 1 => 0;

    multi factors($n where %cache{$n}:exists) { %cache{$n} }
    multi factors($n) {
        for 2, 3, *+2 ...^ * > sqrt($n) {
            if $n %% $_ {
                my $r = $n;
                $r div= $_ while $r %% $_;
                return %cache{$n} = 1 + factors($r);
        return %cache{$n} = 1;

Runtime: unknown (3.5s for N=3)

Note the use of BEGIN to initialize the cache first, regardless of the placement of the statement within the source file, and multi to enable multiple dispatch for &factors. The where clause allows dynamic dispatch based on argument value.

Even with caching, we’re still unable to answer the original question in a reasonable amount of time. So what do we do now? We cheat and use Zavolaj – Rakudo’s version of NativeCall – to implement the factorization in C.

It turns out that’s still not good enough, so we refactor the remaining Perl code and add some native type annotations:

    use NativeCall;

    sub factors(int $n) returns int is native('./prob047-gerdr') { * }

    my int $N = 4;

    my int $n = 2;
    my int $i = 0;

    while $i != $N {
        $i = factors($n) == $N ?? $i + 1 !! 0;
        $n = $n + 1;

    say $n - $N;

Runtime: 1m2s (0.8s for N=3)

For comparison, when implementing the algorithm completely in C, the runtime drops to under 0.1s, so Rakudo won’t win any speed contests just yet.

As an encore, three ways to do one thing:

Problem 29

How many distinct terms are in the sequence generated by ab for 2 ≤ a ≤ 100 and 2 ≤ b ≤ 100?

A beautiful but slow solution to the problem can be used to verify that the other solutions work correctly:

    say +(2..100 X=> 2..100).classify({ .key ** .value });

Runtime: 11s

Note the use of X=> to construct the cartesian product with the pair constructor => to prevent flattening.

Because Rakudo supports big integer semantics, there’s no loss of precision when computing large numbers like 100100.

However, we do not actually care about the power’s value, but can use base and exponent to uniquely identify the power. We need to take care as bases can themselves be powers of already seen values:

    constant A = 100;
    constant B = 100;

    my (%powers, %count);

    # find bases which are powers of a preceeding root base
    # store decomposition into base and exponent relative to root
    for 2..Int(sqrt A) -> \a {
        next if a ~~ %powers;
        %powers{a, a**2, a**3 ...^ * > A} = a X=> 1..*;

    # count duplicates
    for %powers.values -> \p {
        for 2..B -> \e {
            # raise to power \e
            # classify by root and relative exponent
            ++%count{p.key => p.value * e}

    # add +%count as one of the duplicates needs to be kept
    say (A - 1) * (B - 1) + %count - [+] %count.values;

Runtime: 0.9s

Note that the sequence operator ...^ infers geometric sequences if at least three elements are provided and that list assignment %powers{…} = … works with an infinite right-hand side.

Again, we can do the same thing in a dataflow-driven, purely-functional fashion:

    sub cross(@a, @b) { @a X @b }
    sub dups(@a) { @a - @a.uniq }

    constant A = 100;
    constant B = 100;

    2..Int(sqrt A)
    ==> map -> \a { (a, a**2, a**3 ...^ * > A) Z=> (a X 1..*).tree } \
    ==> reverse()
    ==> hash()
    ==> values()
    ==> cross(2..B)
    ==> map -> \n, [\r, \e] { (r) => e * n } \
    ==> dups()
    ==> ((A - 1) * (B - 1) - *)()
    ==> say();

Runtime: 1.5s

Note how we use &tree to prevent flattening. We could have gone with X=> instead of X as before, but it would make destructuring via -> \n, [\r, \e] more complicated.

As expected, this solution doesn’t perform as well as the imperative one. I’ll leave it as an exercise to the reader to figure out how it works exactly ;)

That’s it

Feel free to add your own solutions to the Perl6 examples repository under euler/.

If you’re interested in bioinformatics, you should take a look at Rosalind as well, which also has its own (currently only sparsely populated) examples directory rosalind/.

Last but not least, some solutions for the Computer Language Benchmarks Game – also known as the Debian language shootout – can be found under shootout/.

You can contribute by sending pull requests, or better yet, join #perl6 on the Freenode IRC network and ask for a commit bit.

Have the appropriate amount of fun!