Day 09 – Hashes and pairs

December 9, 2013 by

Hashes are nice. They can work as a kind of “poor man’s objects” when creating a class seems like just too much ceremony for the occasion.

my $employee = {
    name => 'Fred',
    age => 51,
    skills => <sweeping accounting barking>,
};

Perl (both 5 and 6) takes hashes seriously. So seriously, in fact, that there’s a dedicated sigil for them, and by using it you can omit the braces in the above code:

my %employee =
    name => 'Fred',
    age => 51,
    skills => <sweeping accounting barking>,
;

Note that Perl 6, just as Perl 5, allows you to use barewords for hash keys. People coming from Perl 5 seem to expect that we’ve dropped this feature. I don’t know why, but I suspect that as much as they like the ability, they also feel that it’s secretly “dirty” or “wrong” somehow, and thus they just assume that hash keys need to be quoted in Perl 6. Don’t worry, fivers, you can omit the quotes there without any feelings of guilt!

Another nice thing is that final comma. Yes, Perl allows a comma even after the last hash entry. This makes rearranging lines later a lot less of a hair-pulling experience, because all lines have a final comma. So a big win for maintainability, and not a lot of extra bookkeeping for the Perl 6 parser.

Hashes make great “configuration objects”, too. You want to pass some options into a routine somewhere, but the options (for reasons of future compatibility, perhaps) need to be an open set.

my %options =
    rpm => 440,
    duration => 60,
;
$centrifuge.start(%options);

Actually, we have two options with that last line. Either we pass in the whole hash like that, and the method in the centrifuge class will need to look like this:

method start(%options) {
    # probably need to start by unpacking options here
    # ...
}

Or we decide to “gut” the hash as we pass it in, effectively turning it into a bunch of named arguments:

$centrifuge.start( |%options );  # means :rpm(440), :duration(60)

In which case the method declaration will have to look like this instead:

method start(:$rpm!, :$duration!) {
    # ...
}

(In this case, we probably want to put in those exclamation marks, to make those named parameters obligatory. Unless we’re fine with providing some of them with a default, such as :$duration = 120.)
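
To make the difference between obligatory and defaulted named parameters concrete, here is a minimal sketch. The Centrifuge class and the string it returns are made up for illustration; only the signature style comes from the text above, and I'm assuming a reasonably current Rakudo.

```perl6
# Hypothetical stand-in for the centrifuge class; not from the original post.
class Centrifuge {
    # :$rpm! is obligatory; :$duration falls back to 120 when omitted.
    method start(:$rpm!, :$duration = 120) {
        "spinning at $rpm rpm for $duration seconds"
    }
}

my %options = rpm => 440;

# The hash flattens into named arguments; duration takes its default.
say Centrifuge.new.start(|%options);                  # spinning at 440 rpm for 120 seconds

# An explicit named argument can sit next to the flattened hash.
say Centrifuge.new.start(|%options, :duration(60));   # spinning at 440 rpm for 60 seconds
```

Leaving out rpm altogether makes the call die, which is the whole point of the exclamation mark.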

The “gut” operator prefix:<|> is really called “flattening” or “interpolation”. I really like how, in Perl 6, arrays flatten into positional parameters, and hashes flatten into named parameters. Decades after the fact, it gives some extra rationale to making arrays and hashes special enough to have their own sigils.

my @args = "Would you like fries with that?", 15, 5;
say substr(|@args);    # fries

my %details = :year(1969), :month(7), :day(20),
              :hour(20), :minute(17);
my $moonlanding = DateTime.new( |%details );

This brings us neatly into my next point: hash entries in Perl 6 really try to look like named parameters. They aren’t, of course, they’re just keys and values in a hash. But they try really hard to look the same. So much so that we even have *two* syntaxes for writing a hash entry. We have the fat-arrow syntax:

my %opts = blackberries => 42;

And we have the named argument syntax:

my %opts = :blackberries(42);

They each have their pros and cons. One of the nice things about the latter syntax is that it mixes nicely with variables, and (in case your variables are fortunately named) eliminates a bit of redundancy:

my $blackberries = 42;
my %opts = :$blackberries;   # means the same as :blackberries($blackberries)

We can’t do that in the fat-arrow syntax, not without repeating the word blackberries. And no-one likes to do that.

So hash entries (a key plus a value) really become more of a thing in Perl 6 than they ever were in Perl 5. In Perl 5 they’re a bit of a visual hack, since the fat-arrow is just a synonym for the comma, and hashes are initialized through lists.

In Perl 6, hash entries are syntactically pulled into visual pills through the :blackberries(42) syntax, and even more so through the :$blackberries syntax. Not only that, but we’re passing hashes into routines entry by entry, making the entries stand out a bit more.

In the end, we give up and realize that we care a bunch about those hash entries as units, so we give them a name: Pair. A hash is an (unordered) bunch of Pair objects.

So when you’re saying this:

say %employee.elems;

And the answer comes back “3”… that’s the number of Pair objects in the hash that were counted.
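
If you want to see those Pair objects for yourself, a small sketch like the following should do, assuming a current Rakudo. The trimmed-down %employee hash is just for illustration.

```perl6
my %employee = name => 'Fred', age => 51;

# .pairs hands back the hash entries as Pair objects;
# sorting by key gives a stable order for printing.
for %employee.pairs.sort(*.key) -> $entry {
    say $entry.^name, ' ', $entry.key, ' => ', $entry.value;
}
# Pair age => 51
# Pair name => Fred

# Both entry syntaxes build the same kind of Pair.
say (:blackberries(42)) eqv (blackberries => 42);   # True
```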

But in the end, Pair objects even turn out to have a sort of independent existence, graduating from their role as hash constituents. For example, you can treat them as cons pairs and simulate Lisp lists with them:

my $lisp-list = 1 => 2 => 3 => Nil;  # it's nice that infix:<< => >> is right-associative

And then, as a final trick, let’s dynamically extend the Pair class to recognize arbitrary cadr-like method calls. (Note that .^add_fallback is not in the spec and currently Rakudo-only.)

Pair.^add_fallback(
    -> $, $name { $name ~~ /^c<[ad]>+r$/ },  # should we handle this? yes, if /^c<[ad]>+r$/
    -> $, $name {                            # if it turned out to be our job, this is what we do
        -> $p {
            $name ~~ /^c(<[ad]>*)(<[ad]>)r$/;        # split out last 'a' or 'd'
            my $r = $1 eq 'a' ?? $p.key !! $p.value; # choose key or value
            $0 ne '' ?? $r."c{$0}r"() !! $r;               # maybe recurse
        } 
    }
);

$lisp-list.caddr.say;    # 3

Whee!

Day 08 – Array-based objects

December 8, 2013 by

With the advent of a fully-realized object system in Perl 6, complete with formal attributes and accessors, hash-based objects are no longer the de-facto standard. Indeed, the underlying representation for the vast majority of classes (including Arrays and Hashes) is something called “P6Opaque”. This is because such complexities have been moved into the guts of the implementation, which provides a consistent interface to various internal data types transparently. Today’s post isn’t about underlying representations, though. The whole point is for us to be able to take advantage of the new arrangement by spending our time worrying about practical intentions instead of data type semantics.

So, what are we talking about, and if nearly everything is just “opaque”, what makes an array look and act like an array? Arrays (and hashes too) are in many ways just another class. Specifically, the array class is named “Array” with a capital A. We can subclass it, add our own methods and properties, and it will still behave as any array, because it still is one. Many language elements which used to be a largely static part of Perl syntax have been reimplemented as first-class citizens such as grammars and roles. Creating an array-based class truly is as simple as:

class Vector is Array {}

Binding and assignment

It doesn’t look like much yet, but we get a lot for that short bit of code. As might be expected from a Perl array, Vector is an auto-sizing sequence of writable scalars which are indexed by sequential integers. Our Vector class inherits Array’s constructor, which takes positional arguments. It does the Positional role, too, so it is followed by square brackets for subscripting and can be bound to containers declared with the “@” sigil, like so:

my @vec := Vector.new(1, 2, 3);

Note the “:=” binding instead of “=” assignment. Without the colon, a positional container will be created as an empty Array whose contained values are then set to the list after the “=”. In other words:

my @vec = Vector.new(1, 2, 3); # WRONG: assignment instead of binding
say @vec.WHAT; # (Array) instead of (Vector)

because it is the same as:

my @vec := Array.new;
@vec[] = Vector.new(1, 2, 3);

which isn’t what we want, of course. Since “$” means “any type” in Perl 6, as opposed to “a single value” as in Perl 5, we can also use scalar containers for our array-based objects, like any other object. No consideration of binding versus assignment is necessary in this case:

my $vec = Vector.new(1, 2, 3);
say $vec.WHAT; # (Vector)

Methods

Now we know how to create and store our array-based objects in a couple of ways. But our class doesn’t actually do anything useful, so we add some methods:

class Vector is Array {
    method magnitude () {
        sqrt [+] self »**» 2 
    }

    method subtract (@vec) {
        self.new( self »-« @vec )
    }
}

We have a couple of simple vector operations. The subtract method takes any positional object as its argument, including Arrays and Vectors, and returns another Vector to allow method chaining:

my @position := Vector.new(1, 2);
my @destination = 4, 6;
my $distance = @position.subtract(@destination).magnitude;
    # $distance = 5

In practical reality we would of course do various sanity checks such as type and dimension comparisons. Needlessly complicating the example serves no purpose, though. This is working pretty well so far. For us, at least.
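
For the curious, a dimension check might look roughly like this. It is a sketch on plain arrays rather than on the Vector class, and checked-subtract is a name invented here, not part of the class above.

```perl6
# Hypothetical helper working on plain arrays, not the Vector class itself.
sub checked-subtract(@a, @b) {
    die "Dimension mismatch: {@a.elems} vs {@b.elems}"
        unless @a.elems == @b.elems;
    @a »-« @b
}

my @diff = checked-subtract([1, 2], [4, 6]);
say @diff;                     # [-3 -4]
say sqrt [+] @diff »**» 2;     # 5

# Mismatched dimensions are rejected with a readable message.
try checked-subtract([1, 2], [1, 2, 3]);
say $!.message;                # Dimension mismatch: 2 vs 3
```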

TIMTOWTDI

Many programmers, however, dislike the idea of treating an object as an array or vice-versa, for various reasons. They would rather have their objects all look like opaque scalars, and access their values via ordinary-looking method calls instead of subscripts. We might not share this view (we are, after all, writing an Array-based class), but imposing that singular vision on all of our users would be decidedly un-Perlish. And we’re halfway there. Our Vector can already be stored in a scalar, as discussed before:

my $vec = Vector.new(0, 1);
say $vec.magnitude; # 1

It’s not a bad facade, for an array. But let’s see if we can do better, so we don’t alienate our more traditional users. What about the values? A normal Perl 6 object would expose those via lvalue accessors, so let’s add a few of our own:

class Vector is Array {
    method x () is rw { self[0] }
    method y () is rw { self[1] }
    method z () is rw { self[2] }

    method magnitude () {
        sqrt [+] self »**» 2 
    }

    method subtract (@vec) {
        self.new( self »-« @vec )
    }
}

Here we’re simply defining lvalue accessors with “is rw”, which means their return value can be assigned to with “=” like normal variables instead of through arguments to the accessor:

my $vec = Vector.new(1, 2);
$vec.z = 3;
say $vec; # 1 2 3

Customizing the Constructor

Only one step remains to complete our somewhat seamless illusion: honoring those names if they are passed to the constructor. To accomplish this, we’ll write our own constructor to turn named arguments into positional ones, and pass them on to Array’s constructor implementation positionally:

class Vector is Array {
    method new (*@values is copy, *%named) {
        @values[0] = %named<x> if %named<x> :exists;
        @values[1] = %named<y> if %named<y> :exists;
        @values[2] = %named<z> if %named<z> :exists;

        nextwith(|@values);
    }

    method x () is rw { self[0] }
    method y () is rw { self[1] }
    method z () is rw { self[2] }

    method magnitude () {
        sqrt [+] self »**» 2 
    }

    method subtract (@vec) {
        self.new( self »-« @vec )
    }
}

Don’t forget to declare any parameters which you make changes to as “is copy”, as in the constructor above. Otherwise you run the risk of causing unintended changes to your users’ variables or having your routine just plain die when it tries to perform the assignment.
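
Here is the effect of “is copy” in isolation, as a small sketch. The routine names are invented for the example, and I'm assuming a current Rakudo.

```perl6
# With "is copy" the routine works on its own private copy.
sub bump($n is copy) {
    $n++;
    $n
}

my $count = 1;
say bump($count);        # 2
say $count;              # 1, the caller's variable is untouched

# The same goes for positional (array) parameters.
sub zero-first(@values is copy) {
    @values[0] = 0;
    @values
}

my @mine = 1, 2, 3;
say zero-first(@mine);   # [0 2 3]
say @mine;               # [1 2 3]
```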

Seeing is Believing

With the final piece in place, we can now treat our Vector class as a “normal” class with named properties:

my $vec = Vector.new(:x(1), :y(2));
$vec.z = 3;

as an array:

my @vec := Vector.new(1, 2);
@vec.push(3);

or even a mixture of the two, if you just plain want to do it how you feel like at any given moment, and don’t care what anyone else thinks:

my $vec = Vector.new(:y(3), 0);
$vec.x++;
$vec[1]--;
$vec.z = 3;
.say for @$vec;
    # 1
    # 2
    # 3

Closing Remarks

That concludes our demonstration for today, but our Vector class is still far from complete. If you’re looking for something to sharpen your Perl 6 teeth on, you could add methods for more of the basic vector operations, and generally experiment with your working model. Optimize methods with internal caching in private attributes. Export a convenience sub for construction (something like vector(1, 2, 3) instead of Vector.new(1, 2, 3)). Or maybe you want x, y, and z to be actual attributes with all of the extra semantics and meta goodness thereof, in which case you would probably do something in your constructor along the lines of binding the attributes to the array slots, or vice-versa. Or try something different based on a hash. Go wild. -Ofun, as the mantra goes.

Here’s hoping that on day 8 (a number thought by some to variously represent building, power, balance, and new beginnings), you’ve received the gift of inspiration, if nothing else, to look long and hard at all the new ways Perl 6 allows you to bend and blend traditionally rigid design patterns along with more specialized approaches. We have illustrated in our explorations that this power, as with any, can create ugliness and danger if not exercised with due care. It has also been made apparent, however, that wise application of this level of flexibility can sometimes better suit the practical goals, personal thinking style, and human nature of yourself and/or users of your code, and can be accomplished without introducing unreasonable complexity.

For a complete implementation of vectors in Perl 6, including overloaded operators for extra-sugary, very math-looking expressions, see Math::Vector, one of a growing number of modules listed on modules.perl6.org and easily installed via panda.

Day 07 – Bagging the changes in the Set specification

December 7, 2013 by

In 2012, colomon++ implemented most of the then Set / Bag specification in Niecza and Rakudo, and blogged about it last year.

Then in May this year, it became clear that there was a flaw in the implementation that prohibited creating Sets of Sets easily. In June, colomon++ re-implemented Sets and Bags in Niecza using the new views on the spec. And I took it upon myself to port these changes to Rakudo. And I was in for a treat (in the Bertie Bott’s Every Flavour Beans kind of way).

Texan versus (non-ASCII) Unicode

Although the Set/Bag modules were written in Perl 6, there were some barriers to conquer: it was not as simple as a copy-paste operation. First of all, all Set operators were implemented in their Unicode form in Niecza foremost, with the non-Unicode (ASCII) versions (aka Texan versions) implemented as a dispatcher to the Unicode version. At the time, I was mostly developing in Rakudo on Parrot. And Parrot has performance issues with parsing code that contains non-ASCII characters (at any place in the code, even in comments). Therefore, the old implementation of Sets and Bags in Rakudo only had the Texan versions of the operators. So I had to carefully figure out which Unicode operator in the Niecza code (e.g. ⊆) matched which Texan operator (namely (<=)) in the Rakudo code, and make the necessary changes.

Then I decided: well, why don’t we have Unicode operators for Sets and Bags in Rakudo too? I mentioned this on the #perl6 channel, and I think it was jnthn++ who pointed out that there should be a way to define Unicode operators without actually having to use Unicode characters. After trying this, and having jnthn++ fix some issues in that area, it was indeed possible.

So how does that look?

  # U+2286 SUBSET OF OR EQUAL TO
  only sub infix:<<"\x2286">>($a, $b --> Bool) {
      $a (<=) $b;
  }

One can only say: Yuck! But it gets the job done. So now one can write:

  $ perl6 -e 'say set( <a b c> ) ⊆ set( <a b c d> )'
  True

Note that Parcels (such as <a b c>) are automatically upgraded to Sets by the set operators. So one can shorten this similar statement to:

  $ perl6 -e 'say <a b c> ⊆ <a b d>'  # note missing c
  False

Of course, using the Unicode operator in Rakudo comes at the expense of an additional subroutine call. But that’s for some future optimizer to take care of.

Still no bliss

But alas, the job was still not done. The implementation using Hash in Rakudo still would not allow Sets within Sets. It would look like it worked, but that was only because the stringification of a Set was used as a key in another set. So, when you asked for the elements of such a Set of Sets, you would get strings back, rather than Sets.

Rakudo allows objects to be used as keys (and still remain objects), by mixing in the TypedHash role into a Hash, so that .keys returns objects, rather than strings. Unfortunately, using the TypedHash role is only completely functional for user programs, not when building the Perl 6 internals using Perl 6, as is done in the core settings. Bummer.

However, the way TypedHash is implemented could also be used for implementing Sets and Bags. For Sets, there is simply an underlying Hash, in which the key is the .WHICH value of the thing in the set, and the value is the thing. For Bags, that became a little more involved, but not a lot: again, the key is the .WHICH of the thing, and the value is a Pair with the thing as the key, and the count as the value. So, after refactoring the code to take this into account as well, it seemed I could finally consider this job finished (at least for now).
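
To give an idea of what “keyed by .WHICH” buys you, here is a toy version of the idea. ToySet and its method names are invented for this sketch and have nothing to do with the actual code in the Rakudo setting.

```perl6
# A toy illustration only; the real Set implementation is more involved.
class ToySet {
    has %!things;    # key: .WHICH of the thing, value: the thing itself

    method add($thing) {
        %!things{ $thing.WHICH } = $thing;
        self
    }
    method things { %!things.values }
    method elems  { %!things.elems }
}

# 42 and "42" have different .WHICH values, so both survive as themselves.
my $set = ToySet.new.add(42).add("42").add(42);
say $set.elems;                        # 2
say $set.things.map(*.^name).sort;     # (Int Str)

# A plain hash stringifies its keys, so the two collapse into one entry.
my %plain;
%plain{42}   = 1;
%plain{"42"} = 2;
say %plain.elems;                      # 1
```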

It’s all about values

Then, the nitty gritty of the Set spec struck again.

  "they are subject to === equality rather than eqv equality"

What does that mean?

  $ perl6 -e 'say <a b c> === <a b c>'
  False
  $ perl6 -e 'say <a b c> eqv <a b c>'
  True

In other words, because any Set consisting of <a b c> is identical to any other Set consisting of <a b c>, you would expect:

  $ perl6 -e 'say set(<a b c>) === set(<a b c>)'
  False

to return True (rather than False).

Fortunately, there is some specification that comes to our aid. It’s just a matter of creating a .WHICH method for Sets and Bags, and we’re set. So now:

  $ perl6 -e 'say set(<a b c>) === set(<a b c>)'
  True

just works, because:

  $ perl6 -e 'say set(<a b c>).WHICH; say set(<a b c>).WHICH'
  Set|Str|a Str|b Str|c
  Set|Str|a Str|b Str|c

shows that both sets, although defined separately, are really the same.
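
The same comparison as a small script, for those who prefer that over one-liners. This is a sketch assuming a current Rakudo; the exact .WHICH strings may well differ from the ones shown above, so the script only compares them.

```perl6
my $first  = set <a b c>;
my $second = set <c b a>;             # same elements, different order

say $first === $second;               # True
say $first.WHICH eq $second.WHICH;    # True

# The bare lists are equivalent, but they are not the same object.
say <a b c> eqv <a b c>;              # True
say <a b c> === <a b c>;              # False
```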

Oh, and some other spec changes

In October, Larry invoked rule #2 again, but those changes were mostly just names. There’s a new immutable Mix and mutable MixHash, but you could consider those just as Bags with floating point weights, rather than unsigned integer weights. Creating Mix was mostly a Cat license job.

Conclusion

Sets, Bags, Mixes, and their mutable counterparts SetHash, BagHash and MixHash, are now first class citizens in Perl 6 Rakudo. So, have fun with them!

Day 06 – Parsing and generating recurring dates

December 6, 2013 by

There are a lot of events that are scheduled on particular days of the week each month, for example the regular Windows Patch Day on the second Tuesday of each month, or in Perl 6 land the Rakudo Perl 6 compiler release, which is scheduled for two days after the Parrot release day, which again is scheduled for the third Tuesday of the month.

So let's write something that calculates those dates.

The specification format I have chosen looks like 3rd tue + 2 for the Rakudo release date, that is, two days after the 3rd Tuesday of each month (note that this isn't always the same as the 3rd Thursday).

Parsing it isn't hard with a simple grammar:

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

As you can see, everything except the day of the week is optional, so sun would simply be the first Sunday of the month, and 2 sun - 1 the Saturday before the second Sunday of the month.

Now it's time to actually turn this specification into a data structure that does something useful. And for that, a class wouldn't be a bad choice:

my %dow = (mon => 1, tue => 2, wed => 3, thu => 4,
        fri => 5, sat => 6, sun => 7);

class DateSpec {
    has $.day-of-week;
    has $.count;
    has $.offset;

    multi method new(Str $s) {
        my $m = DateSpec::Grammar.parse($s);
        die "Invalid date specification '$s'\n" unless $m;
        self.bless(
            :day-of-week(%dow{lc $m<day-of-week>}),
            :count($m<count> ?? +$m<count>[0] !! 1),
            :offset( ($m<sign> eq '-' ?? -1 !! 1)
                    * ($m<offset> ?? +$m<offset> !! 0)),
        );
    }

We only need three pieces of data from those date specification strings: the day of the week, whether the 1st, 2nd, 3rd, etc. is wanted (here named $.count), and the offset. Extracting them is a wee bit fiddly, mostly because so many pieces of the grammar are optional, and because the grammar allows a space between the sign and the offset, which means we can't use the Perl 6 string-to-number conversion directly.

There is a cleaner but longer method of extracting the relevant data using an actions class.
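
For illustration, such an actions class might look roughly like this. It is a sketch, not the code behind the post: DateSpec::Actions and the hash it builds are my own invention, the grammar and %dow are repeated so the snippet stands alone, and I'm assuming a current Rakudo, where an optional capture is a single match rather than a list.

```perl6
my %dow = (mon => 1, tue => 2, wed => 3, thu => 4,
        fri => 5, sat => 6, sun => 7);

grammar DateSpec::Grammar {
    rule TOP {
        [<count><.quant>?]?
        <day-of-week>
        [<sign>? <offset=count>]?
    }
    token count { \d+ }
    token quant { st | nd | rd | th }
    token day-of-week { :i
        [ mon | tue | wed | thu | fri | sat | sun ]
    }
    token sign { '+' | '-' }
}

# Hypothetical actions class: builds a plain hash from the parse tree.
class DateSpec::Actions {
    method TOP($/) {
        # A missing sign counts as '+'.
        my $sign = ($<sign> // '+') eq '-' ?? -1 !! 1;
        make %(
            day-of-week => %dow{ $<day-of-week>.lc },
            count       => $<count>  ?? +$<count>          !! 1,
            offset      => $<offset> ?? $sign * $<offset>  !! 0,
        );
    }
}

my $m = DateSpec::Grammar.parse('3rd Tue + 2', :actions(DateSpec::Actions));
say $m.made<count>;          # 3
say $m.made<day-of-week>;    # 2
say $m.made<offset>;         # 2
```

The constructor could then pass the values from .made straight on to bless, at the cost of one more class to maintain.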

The closing } is missing because the class doesn't do anything useful yet, and that functionality should be added first. The most basic operation is to find the specified date in a given month. Since Perl 6 has no built-in type for months, we use a Date object where the .day is one, that is, a Date object for the first day of the month.

    method based-on(Date $d is copy where { .day == 1}) {
        ++$d until $d.day-of-week == $.day-of-week;
        $d += 7 * ($.count - 1) + $.offset;
        return $d;
    }

The algorithm is quite simple: Proceed to the next date (++$d) until the day of week matches, then advance as many weeks as needed, plus as many days as needed for the offset. Date objects support addition and subtraction of integers, and the integers are interpreted as number of days to add or subtract. Handy, and exactly what we need here. (The API is blatantly copied from the Date::Simple Perl 5 module).
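
The date arithmetic the method relies on can be tried out on its own. Here is a small sketch that walks through the 3rd tue + 2 case for December 2013 by hand, assuming a current Rakudo.

```perl6
my $d = Date.new(2013, 12, 1);
say $d.day-of-week;            # 7, so the month starts on a Sunday

# Step forward one day at a time until we hit a Tuesday (day 2).
++$d until $d.day-of-week == 2;
say $d;                        # 2013-12-03, the first Tuesday

# Two more weeks for the 3rd Tuesday, plus the offset of two days.
$d += 7 * (3 - 1) + 2;
say $d;                        # 2013-12-19

# Subtracting an integer steps backwards by that many days.
say $d - 2;                    # 2013-12-17
```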

Another handy convenience method to implement is next, which returns the next date matching the specification, on or after a reference date.

    method next(Date $d = Date.today) {
        my $month-start = $d.truncated-to(month);
        my $candidate   = $.based-on($month-start);
        if $candidate ge $d {
            return $candidate;
        }
        else {
            return $.based-on($month-start + $month-start.days-in-month);
        }
    }
}

Again there's no rocket science involved: try the date based on the month of $d, and if that's before $d, try again, but with the next month as base.

Time to close the class :-).

So, when is the next Rakudo release? And the next Rakudo release after Christmas?

my $spec = DateSpec.new('3rd Tue + 2');
say $spec.next;
say $spec.next(Date.new(2013, 12, 25));

Output:

2013-12-19
2014-01-23

The code works fine on Rakudo with both the Parrot and the JVM backend.

Happy recurring hollidates!

Day 05 – Changes in specification and operational fallout

December 4, 2013 by

Perl 6 has become much more stable in the past year. There have however been some potentially disrupting changes to the Perl 6 specification, followed by implementation changes to adhere to that new spec.

bless() changes

One of the most visible changes is the removal of an object candidate in bless(). If you wanted to call bless() yourself in your code, rather than supplying your own BUILD() method, you needed to provide an object candidate as the first parameter. Over the years, this turned out to basically always be * (as in Whatever). Which is pretty useless and an obstacle for future optimisations. So TimToady invoked rule #2 to remove that first parameter.

The changes to calls to bless() in the core setting were implemented by moritz++. For those Perl 6 modules in the wild, a warning was added:

 Passing an object candidate to Mu.bless is deprecated

The first parameter would then be removed and execution would continue.

This has the disadvantage of generating a warning every time you create an object of that class with the deprecated call to bless(). So there must be a better way to do this!

Enter “is DEPRECATED”

It turns out there is a better way. Already in June 2012, pmichaud++ added an “is DEPRECATED” routine trait that did nothing until earlier this year when I decided to add some functionality to it. Initially it was just a warn, but that just had the same annoying quality as the warning with bless().

Since the idea behind the “is DEPRECATED” trait was not specced yet, I figured I could turn it any way I wanted, as long as the #perl6 crowd would forgive me. So I re-used an idea I had had at a former $work, years ago. Instead of warning at the moment a transgression is spotted, it feels better, especially for these types of deprecations, to just remember where these transgressions take place. Only when the program is finished, report the transgressions that were spotted (on STDERR).

One of the other standard methods that has been deprecated in Perl 6, is ucfirst(). One should use the tc() (for “title case”) method instead. So what happens if you do call ucfirst()? That is easily demonstrated with a one-liner:

$ perl6 -e 'say "foo".ucfirst; say "done"'
Foo
done
Saw 1 call to deprecated code during execution.
================================================================================
Method ucfirst (from Cool) called at:
  -e, line 1
Please use 'tc' instead.
--------------------------------------------------------------------------------
Please contact the author to have these calls to deprecated code adapted,
so that this message will disappear!

After this had been live in the repo for a while, and spectested, and since nobody on #perl6 complained, I decided to spec this behavior not only for routines, but also for attributes and classes. Unfortunately, the latter ones have not been implemented yet (although you can already specify the traits). But there is a patch -p1 coming up, which should give me some quality time to look at this.

So why is bless(*) not properly DEPRECATED

Indeed. Why? Simply because I missed it. So I just fixed this: nothing like blog-driven development! So this one-liner now says:

$ perl6-p -e 'class A { method new { self.bless(*) } }; say A.new'
A.new()
Saw 1 call to deprecated code during execution.
================================================================================
Method bless (from Mu) called at:
  -e, line 1
Please use a call to bless without initial * parameter instead.
--------------------------------------------------------------------------------
Please contact the author to have these calls to deprecated code adapted,
so that this message will disappear!

So how do you specify the text?

The “is DEPRECATED” trait currently takes one parameter: the string to be shown between “Please use” and “instead”. If you don’t specify anything, the text “something else” will be assumed. Since that is not entirely useful, it is advised to always specify a text that makes sense in that context. Additional parameters may be added in the future to allow for more customisation, but so far they have not been needed.
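
In your own code that could look something like the following sketch. The two routine names are invented for the example.

```perl6
# Hypothetical routines: the new API, and the old name kept alive for a while.
sub new-greet($name) { "Hello, $name!" }

# The string ends up between "Please use" and "instead" in the report.
sub old-greet($name) is DEPRECATED('new-greet') {
    new-greet($name)
}

say old-greet('world');   # Hello, world!
```

The call itself behaves as before; the report about the deprecated call only shows up on STDERR when the program is done.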

Conclusion

Perl 6 will continue to evolve. Changes will still be made. To not break early adopters’ code, any non-compatible changes in the implementation of Perl 6 can be marked as deprecated without interfering with the execution of existing programs much. Perl 6 module authors can do the same should they feel the need to change the API of their modules.

Day 04 — Heredocs, Theredocs, Everywheredocs docs

December 4, 2013 by

So let’s say you’ve got a bit of documentation to print out, a help statement perhaps. You could use an ordinary string, but it always looks like something you really shouldn’t be doing.

    sub USAGE {
        say "foobar Usage:
    ./foobar <args> <file>

    Options:

    ...
    ";
    }

Perl 6 has a much better idea for you, fortunately: heredocs! They work a bit differently from Perl 5, and are now invoked using the adverb :heredoc on quoting constructs:

    say q:heredoc/END/;
    Hello world!
    END

When you use :heredoc, the contents of the string are no longer the final contents; they become the string that signifies the end of a heredoc. q"END" results in the string "END", q:heredoc"END" results in everything before the next END to appear on its own line.

You will have also noticed that heredocs only start on the next possible line for them to start, not immediately after the construct closes. That semicolon after the construct never gets picked up as part of a heredoc, don’t worry :) .

The :heredoc adverb is nice, but it seems a bit long, doesn’t it? Luckily it has a short form, :to, which is much more commonly used. So that’s what we’ll be using through the rest of the post.

    say q:to"FIN";
    Hello again.
    FIN

You can use any sort of string for the delimiter, so long as there’s no leading whitespace in it. A null delimiter (q:to//) is fine too, it just means you end the heredoc with two newlines, effectively a blank line.

And yes, delimiters need to be on their own line. This heredoc never ends:

    say q:to"noend";
    HELLO WORLD noend

A note about indentation: look at this heredoc

    say q:to[finished];
      Hello there
        everybody
    finished

Which of those three heredoc lines decides how much whitespace is removed from the beginning of each line (and thus sets the base level of indentation)? It’s the line with the end delimiter, “finished” in the last example. Lines with more indentation than the delimiter will appear indented by however much extra space they use, and lines with less indentation will be as indented as the delimiter, with a warning about the issue.

(Tabs are considered to be 8 spaces long, unless you change $?TABSTOP. This usually doesn’t matter unless you mix spaces and tabs for indentation anyway though.)

It doesn’t matter how much the delimiter indentation is, all that matters is indentation relative to the delimiter. So these are all the same:

    say q:to/END/;
    HELLO
      WORLD
    END
    say q:to/END/;
        HELLO
          WORLD
        END
    say q:to/END/;
                   HELLO
                     WORLD
                   END
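
You don’t have to take my word for it. Here is a small sketch that stores two differently indented heredocs in variables (the names are just for this example) and compares them:

```perl6
my $flush = q:to/END/;
HELLO
  WORLD
END

my $indented = q:to/END/;
        HELLO
          WORLD
        END

# Only indentation relative to the end delimiter survives.
say $flush eq $indented;               # True
say $flush eq "HELLO\n  WORLD\n";      # True
```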

One other thing to note is that what quoting construct you use will affect how the heredoc contents are parsed, so

    say q:to/EOF/;
    $dlrs dollars and {$cnts} cents.
    EOF

Interpolates nothing,

    say q:to:c/EOF/;
    $dlrs dollars and {$cnts} cents.
    EOF

Interpolates just {$cnts} (the :c adverb allows for interpolation of just closures), and

    say qq:to/EOF/;
    $dlrs dollars and {$cnts} cents.
    EOF

Interpolates both $dlrs and {$cnts}.

Here’s the coolest part of heredocs: using more than one at once! It’s easy too, just use more than one heredoc quoting construct on the line!

    say q:to/end1/, qq:to/end2/, Q:to/end3/;
    This is q.\\Only some backslashes work though\t.
    $sigils don't interpolate either.
    end1
    This is qq. I can $interpolate-sigils as well as \\ and \t.
    Neat, yes?
    end2
    This is Q. I can do \\ no \t such $things.
    end3

Which, assuming you’ve defined $interpolate-sigils to hold the string "INTERPOLATE SIGILS", prints out

    This is q.\Only some backslashes work though\t.
    $sigils don't interpolate either.
    This is qq. I can INTERPOLATE SIGILS as well as \ and   .
    Neat, yes?
    This is Q. I can do \\ no \t such $things.

After every end delimiter, the next heredoc to look for its contents starts.

Of course, indentation of different heredocs will help whenever you have to stack a bunch of them like this.

    say qq:to/ONE/, qq:to/TWO/, qq:to/THREE/, qq:to/ONE/;
    The first one.
    ONE
        The second one.
        TWO
    The third one.
    THREE
        The fourth one.
        ONE

Which outputs:

    The first one.
    The second one.
    The third one.
    The fourth one.

(And yes, you don’t have to come up with a unique end delimiter every time. That could have been four q:to/EOF/ statements and it’d still work.)
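
A quick sketch of that, collecting three heredocs that all end in EOF into an array (the name @parts is just for this example):

```raku
# Three heredocs on one line, all sharing the same end delimiter.
my @parts = q:to/EOF/, q:to/EOF/, q:to/EOF/;
one
EOF
two
EOF
three
EOF

say @parts.elems;    # 3
print @parts.join;   # one, two and three, each on its own line
```

Each EOF closes the heredoc currently being read, and the next one picks up right after it.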

One final note you should be aware of when it comes to heredocs. Like the rest of Perl 6 (barring a couple of small exceptions), heredocs are read using one-pass parsing (this means your Perl 6 interpreter won’t re-read or skip ahead to better understand the code you wrote). For heredocs this means Perl 6 will just wait for a newline to start reading heredoc data, instead of looking ahead to try and find the heredoc.

As long as the heredoc contents and the statement that introduces the heredoc are part of the same compilation unit, everything’s fine. In addition to what you’ve seen so far, you can even do stuff like this:

    sub all-info { return q:to/END/ }
    This is a lot of important information,
    and it is carefully formatted.
    END

(If you didn’t put the brace on the same line, it would be part of the heredoc, and then you’d need another brace on a line after END.)

However, things like BEGIN blocks start compiling before normal code, so trying that last one with a BEGIN block fails:

    BEGIN { say q:to/END/ }
    This is only the BEGINning.
    END

You have to put the heredoc inside the BEGIN block, with the quoting construct, in order to place them in the same compilation unit.

    BEGIN {
        say q:to/END/;
        This is only the BEGINning.
        END
    }

That’s it for heredocs! When should you use them? I would say whenever you need to type a literal newline (by hitting Enter) into the string. Help output from the USAGE sub is probably the most common case. The one at the beginning could easily (and more readably) be written as

    sub USAGE {
        say q:to"EOHELP";
            foobar Usage:
            ./foobar <args> <file>

            Options:

            ...
            EOHELP
    }

Day 03 – Rakudo Perl 6 on the JVM

December 3, 2013 by

There have been a number of exciting developments for Perl 6 during 2013. In this post, we’ll take a look at one of them in some detail: running Perl 6 on the JVM (Java Virtual Machine).

Why the JVM?

There are many reasons for a language to have an implementation that targets the JVM. Here are some that drove us to bring Perl 6 to this platform.

  • The JVM is a stable, widely deployed, trusted-in-the-enterprise platform. There are places where they don’t mind which language you write it in, but they do care that it can run on the JVM.
  • The JVM has been very well optimized over the years. It most certainly isn’t fast to get started – but for long running things it typically performs well.
  • These days, the JVM is most certainly not just for Java. In fact, the commitment to run other languages – including those very different to Java – is serious. For example, there’s now a yearly JVM Language Summit, and the invokedynamic instruction and infrastructure were added in JDK7 and are being improved in JDK8. Since Perl 6 is a gradually typed language, a VM that can play host to both static and dynamic languages is a good fit. Furthermore, every other major dynamic language is on the JVM. So, why not Perl too?
  • The JVM has widely used, well exercised support for concurrent, parallel and asynchronous programming. A wide range of primitives are available. Given that, before this year, the Perl 6 story in these areas was rather weak as far as implementation went, being on the JVM would provide an opportunity for fast prototyping and exploration, to help drive things forward.

But…another Perl 6 implementation?!

Implementing Perl 6 is a large undertaking – as those of us who got sucked into the process along the way have discovered. Many languages have got to the JVM by having a JVM-specific implementation of the language: JRuby, Jython, Nashorn, etc. For Perl 6, we’ve taken a different path.

The Rakudo Perl 6 compiler may only have targeted Parrot for much of its life, but those designing it have had VM portability in mind for a good while. Furthermore, the basic architecture has always been to have strongly isolated compilation stages, communicating by well-defined data structures. This put Rakudo in a good place to gain a JVM backend – at least, in theory. Over the course of the last year, what we hoped would work out well in theory has played out very nicely in practice.

The vast majority of the Rakudo codebase is not in any way VM-specific. Better still, the bits that need to change most often and that undergo most active development are almost always VM-independent. Many developers working on Rakudo test their changes against a single backend, and it’s relatively uncommon to find breakage on the other backend as a result. That said, we have automated daily spectest runs to catch any regressions.

Status

First, let’s consider the specification test suite. You might think at this point I’d mention how close Rakudo on JVM is to passing the number of specification tests that Rakudo on Parrot does. In fact, we need to do it the other way around these days: Rakudo on Parrot passes 99.64% of the spectests that Rakudo on the JVM does. “Huh,” you might think. “How’d the JVM backend come out ahead?” The answer is relatively simple: the JVM backend runs a bunch of concurrency tests that we don’t run on the Parrot backend. There actually are a small number of tests (tens rather than hundreds) that only pass on Rakudo on Parrot, largely due to “interesting” edge-case behaviors that have yet to be hunted down. However, these days the vast majority of programs run unmodified on both.

In the wider ecosystem, things are not quite so polished yet. Panda, the module installer, runs on Rakudo on the JVM. However, a number of modules depend on the NativeCall library, for calling into native code. The NativeCall porting effort is very much underway; last time I looked at it I could do basic things, like calling simple Win32 APIs. But it’s not all the way there yet. This is, however, really the last major missing piece. Considering that this time last year we couldn’t run Perl 6 on the JVM at all, we’ve come a very long way.

Is it faster?

Well, it depends. For quick one liners and short-running scripts? No, startup will kill you. For something long running? Yes, usually it’s faster, and sometimes it’s significantly faster (perhaps five times or even forty times). And that’s before we’ve really done a great deal of optimization work on the JVM backend; the focus thus far has largely been “make it work”.

Can I call Java libraries?

Yes, but… :-) We do have some basic interop support in place already. Here’s an example:

use java::util::zip::CRC32:from<java>;

my $crc = CRC32.new();
for 'Hello, Java'.encode('utf-8') {
    $crc.'method/update/(B)V'($_);
}
say $crc.getValue();

It doesn’t look all that bad until you hit the method call in the loop. What’s that funny method/update/(B)V thing about? In Java you can statically overload methods. When there’s no overloading, we quite happily give you a short name. When there are several overloads, for now you need to use the JVM’s method descriptors to indicate the desired one. We’ll improve that, and many other aspects of interop, over the coming months. In summary, it’s often quite possible to call code from Java libraries today; it’s just not pleasant yet.

The future

Much has been done, yet of course there’s still plenty to do. Once NativeCall support is in shape, we’ll be able to add the JVM as an option to the Rakudo Star distribution release (for now, it’s only available in compiler releases – or fresh from Git, of course). Beyond that, the main areas of focus will be convergence, Java interop and performance. Given that this year took us from zero JVM support to Rakudo on JVM being the implementation passing the most spectests, it’s exciting to think where we’ll be in another year from now.

Day 02 – The humble type object

December 2, 2013 by

For some inscrutable reason, we have defined a Dog class in today’s post.

class Dog {
    has $.name;
}

Don’t ask me why — maybe we’re writing software for a kennel? Maybe we’re writing software for dogs? “Teach your dog how to type!” Clever dogs can do up to 10 words a minute, with surprisingly few typos.

Anyway. Having a Dog class gives us the dubious pleasure of being able to create dogs out of thin air and passing them to functions. No surprise there.

sub check(Dog $d) {
    say "Yup, that's a dog for sure.";
}

my Dog $dog .= new(:name<Fido>);
check($dog);     # Yup, that's a dog for sure.

But where there might be some surprise — if you haven’t gotten used to the idea of Perl 6’s type objects yet — is that you can also do this:

check(Dog);      # Yup, that's a dog for sure.

What did we just do there?

We didn’t pass a dog to the function, we passed Dog to the function. And the function accepts it and says it’s a dog, too. So, Dog is a dog. Huh.

What’s going on here is that in Perl 6 when we declare a class Dog like we just did, the word Dog ends up having two meanings:

  • The class Dog that we declared.
  • The type object Dog, kind of a patron saint for all Dog objects ever instantiated.

Cramming two concepts into one word like this seems like a recipe for failure. But it turns out it works surprisingly well.

Before we look at the reasons for using type objects, let’s find out a bit more about what they are.

say Dog;          # (Dog)
say Dog.name;     # ERROR: Cannot look up attributes in a type object
say ?Dog;         # False
say defined Dog;  # False

So, in summary, the Dog type object identifies itself as (Dog), it refuses to have its attribute inspected, it boolifies to False, and it’s not defined. Contrast this with an instance of Dog:

say $dog;         # Dog.new(name => "Fido")
say $dog.name;    # Fido
say ?$dog;        # True
say defined $dog; # True

An instance is everything the type object isn’t: it knows how to output itself, it will happily tell you its name, and it’s both True and defined. Nice.

(Being undefined is only almost a surefire way to identify the type object. Someone could have gone through the trouble of making their instance object undefined. As they say in the industry, you don’t have to be a type object to be undefined… but it helps!)
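
Here is a minimal sketch of such an undefined instance. The Ghost class is invented for this example, and it relies on two assumptions about Rakudo: that a class may override the .defined method, and that .DEFINITE reports whether something is an instance regardless of what .defined says.

```raku
class Ghost {
    # Overriding .defined makes every instance claim to be undefined.
    method defined { False }
}

my $ghost = Ghost.new;

say defined $ghost;    # False
say $ghost.DEFINITE;   # True  (it is an instance all the same)
say Ghost.DEFINITE;    # False (this one really is the type object)
```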

And now, as promised, the Top Five Reasons Type Objects Work Surprisingly Well:

  1. Classes actually make sense as objects in the program. There’s this idea that classes have to haughtily refuse to play among the rest of the values in a program, that they have to somehow be like gods looking down on the instances from a Parthenon of class-hood. Perl 5 kind of has it like that. But both Ruby and Python show that classes can behave as more or less normal objects. In Perl 6, Dog is also what you get if you do $dog.WHAT.
  2. It fits quite well with the whole smartmatching thing. So what $dog ~~ Dog actually means is something like “hey, Dog type object, does this $dog look anything like you?”. The type object doesn’t just sit there, it does useful things like smartmatching.
  3. Another thing: the whole reason that line, my Dog $dog .= new(:name<Fido>);, works as it does is because we end up calling .new on the type object. So here’s what that line does in slow motion. It desugars to a declaration and an assignment. The declaration is my Dog $dog; and so, because the $dog variable needs to start out with some undefined value, it starts out with Dog, the type object. The assignment then is simply $dog.=new, which is short for $dog = $dog.new. Conveniently, because the type object Dog is an object of the type Dog, it has a .new method (inherited from Mu in this case) that knows how to construct dogs.
  4. A little detail from that last point, which actually turns out to be a rather big deal: Perl 6 doesn’t really have undef like Perl 5 does. It turned out that undef wasn’t a really good fit with a type system; undef gets in everywhere, and doesn’t really have a type at all. (A bit like Java’s null which is known to have caused people no end of suffering.) So what Perl 6 has instead are these typed undefined values, namely — you guessed it — the type objects. If you declare a variable my Int $i, then $i will start out as undefined, that is, containing the type object Int.
  5. Not only do you sometimes want to call .new on the type object, sometimes you have other methods which don’t require an instance. (These kinds of methods are sometimes known as static methods in some languages, and class methods in other languages. Some languages have both of these, and they’re different, just to be confusing.) Again, the type object comes to the rescue here, sort of acts like an instance so that you can call your method on it, and then once again fades into the background. For example, if the class Dog had a method bark { say "woof" } then the Dog type object would be able to bark just as well as actual dog instances. (But the type object still refuses to tell you its .name, or any of its attributes.)
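
Several of these points are easy to check for yourself. The following is a small sketch rather than anything definitive; it reuses the Dog class from above and adds the bark method mentioned in point 5.

```raku
class Dog {
    has $.name;
    method bark { say "woof" }
}

my Dog $dog;                 # declared, but nothing assigned yet
say $dog === Dog;            # True: the variable starts out as the type object

my Int $i;
say $i === Int;              # True: same story for Int

$dog .= new(:name<Fido>);    # short for $dog = $dog.new(:name<Fido>)
say $dog.WHAT === Dog;       # True: .WHAT hands back the type object
say $dog ~~ Dog;             # True: the type object does the smartmatching

Dog.bark;                    # woof (no instance needed)
$dog.bark;                   # woof
```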

So that’s type objects for you. They’re sitting in a convenient semantic spot halfway between the class and its instances, sometimes representing one end of the spectrum, sometimes the other.

One thing before we part ways today. It doesn’t happen often, but sometimes you do want to be able to tell type objects and real instances apart, for example when accepting parameters in a function:

multi sniff(Dog:U $dog) {
    say "a type object, of course"
}
multi sniff(Dog:D $dog) {
    say "definitely a real dog instance"
}

sniff Dog;    # a type object, of course
sniff $dog;   # definitely a real dog instance

Here, :U stands for “undefined” and :D for “defined”. (And that, dear friends, is how we got a smiley into the design of Perl 6. Programming language designers, take heed.) As I mentioned parenthetically before, it’s actually possible to be an undefined object without being a type object. For these special occasions, we have :T, but this modifier isn’t implemented in Rakudo as of this writing. (Though moritz++ informs me that, in Rakudo, :U currently has the semantics that :T should have.)
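
The same modifiers also work outside of multis. As a sketch, here is a plain sub that insists on a real instance; the names pat, $fido and $nobody are made up for this example, and the type object is passed through a variable so that the failure happens at run time, where try can catch it.

```raku
class Dog { has $.name }

# Dog:D accepts only real instances, never the type object.
sub pat(Dog:D $d) { "Patting {$d.name}" }

my $fido   = Dog.new(:name<Fido>);
my $nobody = Dog;                   # the type object, stored in a variable

say pat($fido);                     # Patting Fido

my $outcome = try pat($nobody);     # binding fails, try swallows the error
say $outcome.defined;               # False
```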

Let’s just end this post with maybe the corniest one-liner ever to see the light of day in #perl6:

$ perl6 -e 'say (my @a = "Hip " x 2), @a.^name, "!"'
Hip Hip Array!

Day 01 – The State of Perl 6 in 2013

December 1, 2013 by

Welcome to the 2013 Perl 6 advent calendar!

In Perl 6 land, 2013 will be remembered as the year that brought proper concurrency support.

But I'm getting ahead of myself.

There is also sad news. Niecza, the Perl 6 compiler on the CLR (.NET/Mono) platform, and the Perl 6 compiler with the best runtime characteristics, had its last release in March. Since then there were a few maintenance patches and new built-in types and routines, but little in terms of actual compiler features.

A little later, Rakudo gained support to run on the Java Virtual Machine. There are still some bits missing, most notably support for the native call interface, but all in all it works quite well, passes more than 99.9% of the tests that Rakudo on Parrot passes, and has two key advantages: it is much faster at run time, and has proper concurrency/parallelism support.

Jonathan Worthington prototyped and implemented this concurrency support, and later specified it in S17, which again led to lots of improvements. Stay tuned for more advent calendar posts on the JVM and concurrency/parallelism topics.

Another big piece of news this year was the revelation of MoarVM, a virtual machine designed to run Perl 6. With the JVM's high startup time and Parrot being mostly unmaintained and having lots of unsolved problems, there is a niche to be filled. NQP, the "Not Quite Perl" Perl 6 compiler used to bootstrap Rakudo, already runs on MoarVM; Rakudo support for MoarVM is on its way, and progressing well so far.

There was also lots of progress in terms of built-in types like Set and Bag, and IO::Path for handling path and directory objects.

As a developer and early adopter, I find Perl 6 to be pleasant to work with. In 2013 it has gotten easier to use, due to better error reporting and improved IO.

Perl 6 Advent Calendar 2013: Table of Contents

November 30, 2013 by

This post serves as a table of contents for the 2013 Perl 6 advent calendar. Links to new posts will appear here during the course of this month.

See also: table of contents for 2012, 2011, 2010, 2009.

