Author Archive

Day 19 – Snow white and the seven conditionals

December 19, 2014

Perl 6 is a big language. It’s rather unapologetic about it. The vast, frightening, beautiful (and slightly dated) periodic table of operators hammers this point home. The argument for having a plethora of operators is to cover a lot of semantic ground, to let you freely express yourself with More Than One Way To Do It. The argument against is fear: “aaargh, so many!”

But ’tis the season, and love conquers fear. Today I went searching for conditionals in the language; ways to say “if p then q” in Perl 6. All this in order to take you on a guided tour of conditionals.

Here’s what we are looking for: a language construct with an antecedent (“the ‘if’ part”) and a consequent (“the ‘then’ part”). (They may also have an optional ‘else’ part, but under my arbitrary search criteria, I don’t care.)

I went looking, and I found seven of them. Now for the promised tour.

if

An oldie but a goodie.

if prompt("Should I exult? ") eq "yes" {
    say "yay!";
}

Unlike in Perl 5, you don't have to put parentheses around the antecedent expression. (Cue Python programmers rolling their eyes, unimpressed.) There are elsif and else clauses that you can add if you need it.

I won't count unless separately, as all it does is reverse the logical value of the antecedent.

when

The when statement also qualifies. It's a conditional statement, but with a preference to matching against the topic variable, $_.

given prompt("What is your age? ") {
    when * < 15 { do-not-let-into-bar() }
    when * < 25 { ask-for-ID() }
    when * > 80 { suggest-bingo-around-the-corner() }
}

The semantic "bonus" with this statement is that the end of a when block implicitly triggers a succeed (known as break to traditionalists), which will leave the surrounding contextualizer ($_-binding) block, if any. Or, in plain language, you don't have to put in explicit breaks in your Perl 6 switch statements.

The when statement doesn't have an opposite version. I have sometimes campaigned for whenn't, but with surprisingly (or rather, predictably) little uptake. Ah well; there will be modules.

&&

"Now hold your horses!", I hear you say. "That ain't no conditional!" And you would be right. But, according to the criteria of the search, it is, you see. It has an antecedent and a consequent. The consequent only gets evaluated if the antecedent is truthy.

if @stuff && @stuff[0] eq "interesting" {
    # ...
}

This behavior is called "short-circuiting", and Perl 6 is only one of a long list of languages that work this way.

I won't count ||, because it's the dual of &&. It also short-circuits, but it bails early on truthy values instead of falsy ones.

and

If we count &&, surely we should count its low-precedence cousin, and.

open("file.txt")
    and say "Yup, it opened";

Some people use && and and interchangeable, or prefer only one of them. I'm a little bit more restrictive, and tend to use && inside of expressions, and and only between statements.

I won't count or, again because it's the dual of and.

not ?&

A non-entrant in our tour which nontheless deserves an honorable mention is the ?& operator. It's a "boolifying &&".

But besides always returning a boolean value (rather than the rightmost truthy value, as && and and do), it also doesn't short-circuit. Which makes it useless as a conditional.

sub f1 { say "first"; False }
sub f2 { say "second" }

f1() && f2();   # says "first"
f1() ?& f2();   # says "first", "second"

Therefore, there are two reasons I'm not counting ?|.

not & either

We have another honorable mention. In Perl 5, the & operator deals in bitmasks (just as in a lot of C-derived languages), but in Perl 6 this operator has been stolen for a higher purpose: junctions.

f1() & f2();    # says "first", "second" in some order

The old bitmask operator (now spelled +&, by the way) doesn't short-circuit, and never did in any language I'm aware of. This new junctional operator also doesn't short-circuit. In fact, the semantics it conveys is one of quantum computer "simultaneous evaluation". Actual threading semantics is not guaranteed, currently does not happen in Rakudo — and may not be worth the overhead for most small calculations anyway — but the message is clear: hie the hence, short-circuiting.

//

The most questionable conditional on my list, this one nevertheless made the cut. It's a version of ||, but instead of testing for truthiness, it tests for undefinedness. Other than that, it's very conditional.

my $name = %names{$id} // "<no name given>";

This operator is notable for sharing a first place (with say) as "most liked feature side-ported to Perl 5". (Just don't mention smartmatch to a Perl 5 person.)

The dual of //, for evaluating a consequent only if the antecedent is a defined value, is spelled ⅋⅋, and of course, it does not exist and never did. *bright neuralizer flash*

?? !!

Loved and known by many names — ternary operator, trinary operator, conditional operator — this operator defies grammatical categories but is clearly a conditional, too. It even has an else part.

my $opponent = $player == White ?? Black !! White;

Almost every language out there spells this operator as ? : and it takes a while to get used to the new spelling in Perl 6. (The reason was that both ? and : were far too short and valuable to waste on this operator, so it got "demoted" to a longer spelling.)

If you ask me, ?? !! is an improvement once you see past the historical baggage: all of the other pure boolean operators already identify themselves by doubling a symbolic character like that. It blends in better.

You didn't hear it from me, but if anyone ever feels like creating an unless/else version of this operator, you could do a lot worse than choosing ¿¿ ¡¡ for the job. Ah, Unicode.

andthen

Finally, a late entry, andthen also works as a conditional. Just when you thought Perl 6 had you semantically covered with && and and for ordinary boolean values, and // for undefined values...

...it opens up a third door. The andthen operator only evaluates its right-hand statement if the left-hand statement succeeded.

shut-down-reactor()
    andthen don-hazmat-suit()
    andthen enter-chamber()
;

Succeeding means returning something defined and not failing along the way. Structually, the andthen operator has been compared to Haskell's monadic bind: success of earlier statements guards execution of later ones. Though it's not implemented yet in Rakudo, you're even supposed to be able to pass the successful value from the left-hand statement to the right-hand one. (It ends up in $_.)

I won't count orelse, even though it also short-circuits. Though it's worth noting that the right-hand side of orelse ends up with the $! value from the left-hand side.

Signoff

Thank you for taking the tour. As you can see, there really is More Than One Way to write a condition in Perl 6. Some of them will turn up more often in everyday code, and some less often. I use and only rarely, and I've yet to use an andthen in real code. The others all crop up fairly regularly, and all in their own favorite contexts.

Some languages give you a sparse diet of the bare necessities, claiming that it's good for you to have fewer choices. Perl is not one of those languages. Instead, you get lots of choice, lots of freedom, and it's up to you to do great things with it. Of course, we also try to make sure that the constructs are there make sense, are in good taste, and feel consistent with the rest of the language. The end result is a pleasant language with plenty of rope to shoot yourself in the foot with.

Merry conditional Christmas!

Day 17 – Three strange loops

December 17, 2014

For a long time, I’ve been fascinated by strange loops.

hands drawing each other

The concept “strange loop” revels in the circular definition. Something is the way it is because it’s defined in terms of itself. I think fascination for this concept is related to fascination with the infinite. Chicken-egg conundrums. Grandfather paradoxes. Catch-22s. (“Catches-22″?) Newcomb’s paradox. Gödel’s incompleteness theorems. Something about causal feedback loops tickle our brains.

The concept has taken root not just in art or in brainteasers. When we talk of “bootstrapping compilers” or “metacircular evaluators”, we mean exactly this type of activity. To make a compiler bootstrapping means to make it able to compile itself by (re-)writing it in the target language. An evaluator is metacircular if it implements features not in terms of a helper subsystem, but in terms of themselves. And people do this not just because it’s a cool trick, but because it opens up possibilities. Arbitrary limits in programming stop us from doing what we want, and the limit between “host system” and “guest system” is arbitrary.

In other words, if you’re seriously interested in extensibility, then you’re also likely interested in strange loops. That’s why you see these ideas floated around in the Lisp communities a lot. And that’s why compilers sometimes take the bootstrapping route. (Another happy example is JavaScript, where some language extension proposals happens through their metacircular interpreter, aptly named Narcissus.)

Perl 6 has three strange loops. Grammar, OO, and program elements. It’s in various stages of closing those.

Grammar

Status: not quite closed yet

Perl 6 grammars are famously powerful and allows the programmer to specify entire languages. It stands to reason that a Perl 6 implementation would want to eat its own dogfood and use a Perl 6 grammar to specify the syntax of the language itself. And behold, Rakudo and Niecza do exactly that. (Pugs doesn’t, but among its future versions is the milestone of being self-hosting.)

Having this bootstrapping in place gives us a high confidence in our grammar engine. We know that it’s powerful enough to parse all the Perl 6 we pass it every day. This is also what allows us to write most of Rakudo in Perl 6, making it easier for others to contribute.

I say “not quite closed yet” because the access from user code into the Perl 6 grammar has some ways to go still.

Benefits: tooling, slangs, macros

Something like PPI should be trivial once we close the loop completely. The compiler already has access to all that information, and we just need to get it out into user space. Slangs can either extend the existing Perl 6 grammar, or start from scratch and provide a completely different syntax. In either case, good communication between old and new grammars is required for e.g. variables to be visible across languages. Macros need to be able to effortlessly go from the textual form of something to AST form.

OO

Status: closed

The system underlying all objects in the Perl 6 runtime is called 6model. An object was created from a class, a class was created from a “class metaobject”. After just a few more steps you end up with something called a “knowhow” — this is the end of the line, because by all appearances the knowhow created itself.

6model is our object-oriented implementation of object orientation. It’s very nice and it allows us to extend OO when we need it.

Benefits: make your own OO rules

You will find most examples of the kind of OO extension that’s posssible through jnthn’s modules. Grammar::Debugger extends what it means to call a grammar rule. Test::Mock similarly keep track of how and when methods are called. The recent modules OO::Monitors and OO::Actors both provide new class metaobjects, basically giving us new rules for OO. They also go one step further and provide definitional keywords for monitors and actors, respectively.

Program elements

Status: just discovered

Macros can generate code, but in some cases they also analyze or extend code. That’s the idea anyway. What’s been stopping us so far in realizing this in Perl 6 is that there hasn’t been a standard way to talk about Perl 6 program elements. What do I mean by “program element” anyway? Let’s answer that by example.

Say you have a class definition in your program. Maybe you have a fancy editor, with refactoring capability. The editor is certainly aware of the class definition and can traverse/manipulate it according to the rules of the language. In order for it to do that, it needs to be able to represent the class definition as an object. That object is a program element. It’s different from the class metaobject; the former talks about its relation to the program text, and the latter talks about its relation to the OO system.

<PerlJam> masak: "program elements" reads like
          "DOM for Perl" to me.
<masak> yep.

Macros are headed in the same way as such a refactoring editor. By handling program elements, you can analyze and extend user code at compile time inside macros. The API of the program elements give the user the ability to extend the Perl 6 compiler itself from library code and user code.

The work on this has only just started. My progress report so far is this: Perl 6 has a lot of node types. :) Having a rich language that doesn't take the Lisp route of almost-no-syntax also means that the API of the program elements becomes quite extensive. But it will also open up some exciting doors.

Benefits: macros, slangs

Parting words

Part of the reason why I like Perl 6 is that it has these strange loops. Little by little, year by year, Perl 6 lifts itself up by the bootstraps. There's still some work left to close some of these loops. I've been hopefully waiting for a long time to be able to do Perl 6 parsing from user code. And I'm eager to provide more of the program element loop so that we can write Perl 6 that writes Perl 6 from inside of our macros.

Mostly I'm excited to see people come up with things that none of us have thought of yet, but that are made possible, even easy, by embedding these strange loops into the language.

Have a loopy Christmas.

Day 7 – that’s how we .roll these days

December 7, 2014

This is a public service announcement.

Since Wolfman++ wrote his blog post about .pick back in the year two thousand nine (nine years after the distant future), we’ve dropped the :replace flag, and we’ve gained a .roll method. (Actually, the change happened in September 2010, but no-one has advent blogged about it.)

So this is the new way to think about the two methods:

.pick    like picking pebbles out of an urn, without putting them back
.roll    like rolling a die

I don't remember how I felt about it back in 2010, but nowadays I'm all for keeping these methods separate. Also, it turns out, I use .roll($N) a whole lot more. In that light, I'm glad it isn't spelled .pick($N, :replace).

Just to give a concrete example:

> say <Rock Paper Scissor>.roll
Rock                    # good ol' Rock

.roll is equivalent to .roll(1). When we're only interested in one element, .roll and .pick work exactly the same since replacement doesn't enter into the picture. It's a matter of taste which one to use.

I just want to show two more tricks that .roll knows about. One is that you can do .roll on a Bag:

> my $bag = (white-ball => 9, black-ball => 1).Bag;
bag(white-ball(9), black-ball)
> say $bag.roll         # 90% chance I'll get white-ball...
white-ball              # ...told ya

Perfect for all your weighted-randomness needs.

The other trick is that you can .roll on an enumeration:

> enum Dwarfs <Dipsy Grumpy Happy Sleepy Snoopy Sneezy Dopey>
> Dwarfs.roll
Happy                   # me too!
> enum Die <⚀ ⚁ ⚂ ⚃ ⚄ ⚅>
> Die.roll
⚄
> Die.roll(2)
⚀ ⚀                     # snake eyes!
> Die.roll(5).sort
⚀ ⚀ ⚁ ⚂ ⚃

The most frequent use I make of this is the idiom Bool.roll for randomly tossing a coin in my code.

if Bool.roll {
    # heads, I win
}
else {
    # tails, you lose
}

Bool is an enum by spec, even though it isn't one in Rakudo (yet) for circularity saw reasons.

> say Die.HOW
Perl6::Metamodel::EnumHOW.new()
> say Bool.HOW
Perl6::Metamodel::ClassHOW.new()    # audience goes "awwww!"

This has been a public service announcement. Enjoy your holidays, and have a merry Perl 6.

Day 3 – cap your junctions

December 3, 2014

“With great power comes great responsibility.” These days I muse on how this applies not just to movie superheroes, but to programming language features as well.

It happens all over the place. Database connection? Awesome. Well, until you become the victim of an SQL injection or a SELECT N+1 problem. Regular expressions? Also awesome… until someone uses it to parse HTML and gets a visitation from elder non-beings from beyond the fabric of reality. Ouch.

Of course, we still like our powerful features. But we need to learn to contain them, to harness them. The awesome should work for us and the purpose we have set it to. It should not work for attackers, elder non-beings, or Murphy. We need to make sure the awesome doesn’t leak beyond its designated boundaries.

In 2009, Matthew Walton blogged about junctions. His explanation of what makes junctions exciting is great, and I agree with it. Today I would simply like to add the following point: junctions are so awesome that they should be contained.

Let’s demonstrate the problem real quickly.

> my @list = 3..7
3 4 5 6 7
> my $answer = all(@list) > 0;

Quick, what value ends up in $answer? Well, all the values in @list are
indeed greater than 0, so…

all(True, True, True, True, True)

Aw, dang. Not again.

Just to be clear, this is by spec, and correctly implemented and everything. But unless you ask specifically for Perl 6 to collapse the junction, it just keeps on going. Keeps on going, like a juggernaut, through walls and oceans, propagating through your program whether you want it to or not.

What you have to do then is to collapse the junction, putting a cap on all that awesome. The junction is useful as long as you’re doing the junctive computation itself (in this case all(@list) > 0), but immediately after that, we’ll want to collapse back into the normal, sane, non-junctive world:

> ?$answer      # symbolic prefix
True
> so $answer    # listop prefix
True
> $answer.Bool  # coerce method
True

Moritz points out that the whole “collapsing” metaphor comes from quantum physics, where something collapses as it’s measured. The analogy holds up; after we’re done junctioning around with our “quantum” superposition, we want to get a single boolean value back out. We do that by “measuring” — boolifying — the junction.

I will hasten to add that the most common use of junctions doesn’t suffer from this problem, namely junctions in if statements:

if all(@list) > 0 {
...
}

The if statement itself provides a natural cap to the junction. It does its own boolification of it under the hood, but more importantly, the junction value doesn’t stick around. We only see the effects of the boolean value of the junction.

No, the real risk comes when you’re a library writer, and you provide routines that happen to compute things using junctions:

sub all-positive(@list) {
    all(@list) > 0;     # DANGER
}

That’s where you want to remember the old tired adage. With great power… comes… right, that’s right. So you cap your junction before you let it run out into the wild and do untold structural damage to people, buildings, cattle, and those cute fluffy dogs that would otherwise suffer Puppy Death By Junction.

sub all-positive(@list) {
    so all(@list) > 0;
}

Yes. That’s better.

If you declare your return type, you can get a runtime error for leaving out the `so`.

sub all-positive(@list --> Bool) {
    all(@list) > 0;
}

say all-positive([1, 2, 3]); # Type check failed for return value;
                             # expected 'Bool' but got 'Junction'

Maybe that’s an argument for declaring return types for your routines. And maybe some day we’ll catch that one at compile-time, too.

Cap your junctions as early as possible.
— masak’s Rule of Responsible Junction Use

The longest I can remember legitimately keeping junctions around in an uncollapsed state was when storing junctions of regexes in a hash as part of this password-analyzing script. Even that’s not an exception to the above rule, though — note that the first thing that happens to the junctions in the subsequent code is that they become the rhs in a smartmatch statement, and then get capped by an if statement. So even here, they don’t escape out into the wild.

Code responsibly. Cap your junctions.

Day 22 – A catalogue of operator types

December 22, 2013

Perl 6 has a very healthy relationship to operators. "An operator is just a funnily-named subroutine", goes the slogan in the community.

In practice, it means you can do this:

sub postfix:<!>($N) {
    [*] 2..$N;
}
say 6!;    # 720

Yeah, that’s the prototypical factorial example. I promise I won’t do it any more in this post. I swear the only reason we don’t have factorial as a standard operator in the language, is so that we can impress people by defining it.

Anyway, so a postfix operator ! is really just a "funnily-named" subroutine postfix:<!>. Similarly, we have prefix and infix operators, like prefix:<-> and infix:<*>. They’re all just subroutines. I wrote about that before, so I’m not going to hammer on that point.

Operators have different precedence (like infix:<*> binds tighter than infix:<+>)…

$x + $y * $z   # compiler sees $x + ($y * $z)
$x * $y + $z   # compiler sees ($x * $y) + $z

…and different associativity (like infix:</> associates to the left but infix:<=> associates to the right).

$x / $y / $z   # compiler sees ($x / $y) / $z
$x = $y = $z   # compiler sees $x = ($y = $z)

But I wrote about that before too, at quite some length, so I’m not going to revisit that topic.

No, today I just want to talk about the operator categories in general. I think Perl 6 does a nice job of describing the operator types themselves. I don’t see that in many other languages.

Here are all the different types:

 type            position           syntax
======          ==========         ========

prefix          before a term        !X
infix           between two terms    X ! Y
postfix         after a term         X!

circumfix       around               [X]
postcircumfix   after & around       X[Y]

A lot of other languages give you the ability to define your own operators. Many of them provide approaches which are hackish afterthoughts, and only allow you to override existing operators. Some other languages do approach the problem head-on, but end up simplifying the language to decrease the complexity of defining new operators.

Perl 6 approaches, and solves, the problem, head-on. You get all of the above operators, you can refine or override old ones, and you can talk within the language about things like precedence and associativity.

All that is rather nice. But, as usual, Perl 6 goes one step further and starts classifying metaoperators, too.

What’s a metaoperator? That’s our name for when you can extend an operator in a certain way. For example, many languages allow some kind of not-equal operator like infix:<!=>. Perl 6 has another one for string non-equality: infix:<ne>. And then maybe you want to do smart non-matching: infix:<!~~>. And so on — pretty soon you catch on to the pattern: you may, at one point or another, want to invert the result of any boolean matcher. So that’s what Perl 6 provides.

Here are a few examples:

normal op     negated op
=========     ==========
eq            !eq     ( synonym of ne )
~~            !~~
<             !<      ( synonym of >= )
before        !before

Because this particular metaoperator attaches itself before an infix operator, it gets the name infix_prefix_meta_operator:<!>. I was going to say "and yes, you can add your own user-defined metaoperators too", but currently no implementation supports that. Maybe sometime in the future.

There are many other categories of metaoperators. For example the "multiplication reducer" [*] that we used in the factorial example at the top (which takes a list of numbers and multiplies them all together) is really an infix:<*> surrounded by the metaop prefix_circumfix_meta_operator:sym<[ ]>. (It’s "prefix" because it goes before the list of numbers.)

Luckily, we don’t have meta-meta-ops. Already thinking about the metaops is quite a challenge sometimes. But what I like about Perl 6 and the approach it takes to grammar and operators is this: someone did think about it, long and hard. The resulting system has a pleasing simplicity to it.

Perl 6 is very well-suited for building parsers for languages. As one of its neatest tricks, it takes this expressive power and directs it towards the task of defining itself. As a Perl 6 user, you’re given not just an excellent toolbox, but a well-organized workshop to adapt and extend to fit your needs.

Day 15 – Numbers and ways of writing them

December 15, 2013

Consider the humble integer.

my $number-of-dwarfs = 7;
say $number-of-dwarfs.WHAT;   # (Int)

Integers are great. (Well, many of them are.) Computers basically run on integers. Also, God made the integers, and everything else is basically cheap counterfeit knock-offs, smuggled into the Platonic realm when no-one’s looking. If they’re so important, we’d expect to have lots of good ways of writing them in any respectable programming language.

Do we have lots of ways of writing integers in Perl 6? We do. (Does that make Perl 6 respectable? No, because it’s a necessary but not sufficient condition. Duh.)

One thing we might want to do, for example, is write our numbers in other number bases. The need for this crops up now and then, for example if you’re stranded on Binarium where everyone has a total of two fingers:

my $this-is-what-we-humans-call-ten = 0b1010;   # 'b' for 'binary', baby
my $and-your-ten-is-our-two = 0b10;

Or you may find yourself on the multi-faceted world of Hexa X, whose inhabitants (due to an evolutionary arms race involving the act of tickling) sport 16 fingers each:

my $open-your-mouth-and-say-ten = 0xA;    # 'x' is for, um, 'heXadecimal'
my $really-sixteen = 0x10;

If you’re really unlucky, you will find yourself stuck in a file permissions factory, with no other means to signal the outer world than by using base 8:

my $halp-i'm-stuck-in-this-weird-unix-factory = 0o644;

(Fans of other C-based languages may notice the deviation from tradition here: for your own sanity, we no longer write the above number as 0644 — doing so rewards you with a stern (turn-offable) warning. Maybe octal numbers were once important enough to merit a prefix of just 0, but they ain’t no more.)

Of course, just these bases are not enough. Sometimes you will need to count with a number of fingers that’s not any of 2, 8, or 16. For those special occasions, we have another nice syntax for you:

say :3<120>;       # 15 (== 1 * 3**2 + 2 * 3**1 + 0 * 3**0)
say :28<aaaargh>;  # 4997394433
say :5($n);        # let me parse that in base 5 for you

Yes, that’s the dear old pair syntax, here hijacked for base conversions from a certain base. You will recognize this special syntax by the fact that it uses colonpairs (:3<120>), not the "fat arrow" (3 => 120), and that the key is an integer. For the curious, the integer goes up to 36 (at which point we’ve reached ‘z’ in the alphabet, and there’s no natural way to make up more symbols for "digits").

I once used :4($_) to translate DNA into proteins. Perl 6 — fulfilling your bioinformatics dreams!

Oh, and sometimes you want to convert into a certain base, essentially taking a good-old-integer and making it understandable for a being with a certain amount of fingers. We’ve got you covered there, too!

say 0xCAFEBABE;            # 3405691582
say 3405691582.base(16);   # CAFEBABE

Moving on. Perl 6 also has rationals.

my $tenth = 1/10;
say $tenth.WHAT;   # (Rat)

Usually computers (and programming languages) are pretty bad at representing numbers such as a tenth, because most of them are stranded on the planet Binarium. Not so Perl 6; it stores your rational numbers exactly, most of the time.

say (0, 1/10 ... 1);            # 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
say (1/3 + 1/6) * 2;            # 1
say 1/10 + 1/10 + 1/10 == 0.3;  # True (\o/)

The rule here is, if you give some number as a decimal (0.5) or as a ratio (1/2), it will be represented as a Rat. After that, Perl 6 does its darnedest to strike a balance between representing numbers without losing precision, and not losing too much performance in tight loops.

But sometimes you do reach for those lossy, precision-dropping, efficient numbers:

my $pi = 3.14e0;
my $earth-mass = 5.97e24;  # kg
say $earth-mass.WHAT;      # (Num)

You get floating-point numbers by writing things in the scientific notation (3.14e0, where the e0 means "times ten to the power of 0"). You can also get these numbers by doing some lossy calculation, such as exponentiation.

Do keep in mind that these are not exact, and will get you into trouble if you treat them as exact:

say (0, 1e-1 ... 1);             # misses the 1, goes on forever
say (0, 1e-1 ... * >= 1);        # 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 (oops)
say 1e0/10 + 1/10 + 1/10 == 0.3; # False (awww!)

So, don’t do that. Don’t treat them as exact, that is.

You’re supposed to be able to write lists of numbers really neatly with the good old Perl 6 quote-words syntax:

my @numbers = <1 2.0 3e0>;
say .WHAT for @numbers;          # (IntStr) (RatStr) (NumStr)

But this, unfortunately, is not implemented in Rakudo yet. (It is implemented in Niecza.)

If you’re into Mandelbrot fractals or Electrical Engineering, Perl 6 has complex numbers for you, too:

say i * i == -1;            # True
say e ** (i * pi) + 1;      # 0+1.22464679914735e-16i

That last one is really close to 0, but the exponentiation throws us into the imprecise realm of floating-point numbers, and we lose a tiny bit of precision. (But what’s a tenth of a quadrillionth between friends?)

my $googol = EVAL( "1" ~ "0" x 100 );
say $googol;                            # 1000…0 (exactly 100 zeros)
say $googol.WHAT;                       # (Int)

Finally, if you have a taste for the infinite, Perl 6 allows handing, passing, and storage of Inf as a constant, and it compares predictably against numbers:

say Inf;              # Inf
say Inf.WHAT;         # (Num)
say Inf > $googol;    # True
my $max = -Inf;
$max max= 5;          # (max= means "store only if greater")
say $max;             # 5

Perl 6 meets your future requirements by giving you alien number bases, integers and rationals with excellent precision, floating point and complex numbers, and infinities. This is what has been added — now go forth and multiply.

Day 09 – Hashes and pairs

December 9, 2013

Hashes are nice. They can work as a kind of “poor man’s objects” when creating a class seems like just too much ceremony for the occasion.

my $employee = {
    name => 'Fred',
    age => 51,
    skills => <sweeping accounting barking>,
};

Perl (both 5 and 6) takes hashes seriously. So seriously, in fact, that there’s a dedicated sigil for them, and by using it you can omit the braces in the above code:

my %employee =
    name => 'Fred',
    age => 51,
    skills => <sweeping accounting barking>,
;

Note that Perl 6, just as Perl 5, allows you to use barewords for hash keys. People coming from Perl 5 seem to expect that we’ve dropped this feature — I don’t know why, but I suspect that as much as they like the ability, they also feel that it’s secretly “dirty” or “wrong” somehow, and thus they just assume that hash keys need to be quotes in Perl 6. Don’t worry, fivers, you can omit the quotes there without any feelings of guilt!

Another nice thing is that final comma. Yes, Perl allows a comma even after the last hash entry. This makes rearranging lines later a lot less of a hair-pulling experience, because all lines have a final comma. So a big win for maintainability, and not a lot of extra bookkeeping for the Perl 6 parser.

Hashes make great “configuration objects”, too. You want to pass some options into a routine somewhere, but the options (for reasons of future compatibility, perhaps) need to be an open set.

my %options =
    rpm => 440,
    duration => 60,
;
$centrifuge.start(%options);

Actually, we have two options with that last line. Either we pass in the whole hash like that, and the method in the centrifuge class will need to look like this:

method start(%options) {
    # probably need to start by unpacking options here
    # ...
}

Or we decide to “gut” the hash as we pass it in, effectively turning it into a bunch of named arguments:

$centrifuge.start( |%options );  # means :rpm(440), :duration(60)

In which case the method declaration will have to look like this instead:

method start(:$rpm!, :$duration!) {
    # ...
}

(In this case, we probably want to put in those exclamation marks, to make those named parameters obligatory. Unless we’re fine with providing some of them with a default, such as :$duration = 120.)

The “gut” operator prefix:<|> is really called “flattening” or “interpolation”. I really like how, in Perl 6, arrays flatten into positional parameters, and hashes flatten into named parameters. Decades after the fact, it gives some extra rationale to making arrays and hashes special enough to have their own sigils.

my @args = "Would you like fries with that?", 15, 5;
say substr(|@args);    # fries

my %details = :year(1969), :month(7), :day(16),
              :hour(20), :minute(17);
my $moonlanding = DateTime.new( |%details );

This brings us neatly into my next point: hash entries in Perl 6 really try to look like named parameters. They aren’t, of course, they’re just keys and values in a hash. But they try really hard to look the same. So much so that we even have *two* syntaxes for writing a hash entry. We have the fat-arrow syntax:

my %opts = blackberries => 42;

And we have the named argument syntax:

my %opts = :blackberries(42);

They each have their pros and cons. One of the nice things about the latter syntax is that it mixes nicely with variables, and (in case your variables are fortunately named) eliminates a bit of redundancy:

my $blackberries = 42;
my %opts = :$blackberries;   # means the same as :blackberries($blackberries)

We can’t do that in the fat-arrow syntax, not without repeating the word blackberries. And no-one likes to do that.

So hash entries (a key plus a value) really become more of a thing in Perl 6 than they ever were in Perl 5. In Perl 5 they’re a bit of a visual hack, since the fat-arrow is just a synonym for the comma, and hashes are initialized through lists.

In Perl 6, hash entries are syntactically pulled into visual pills through the :blackberries(42) syntax, and even more so through the :$blackberries syntax. Not only that, but we’re passing hashes into routines entry by entry, making the entries stand out a bit more.

In the end, we give up and realize that we care a bunch about those hash entries as units, so we give them a name: Pair. A hash is an (unordered) bunch of Pair objects.

So when you’re saying this:

say %employee.elems;

And the answer comes back “3”… that’s the number of Pair objects in the hash that were counted.

But in the end, Pair objects even turn out to have a sort of independent existence, graduating from their role as hash constituents. For, example, you can treat them as cons pairs and simulate Lisp lists with them:

my $lisp-list = 1 => 2 => 3 => Nil;  # it's nice that infix:<< => >> is right-associative

And then, as a final trick, let’s dynamically extend the Pair class to recognize arbitrary cadr-like method calls. (Note that .^add_fallback is not in the spec and currently Rakudo-only.)

Pair.^add_fallback(
    -> $, $name { $name ~~ /^c<[ad]>+r$/ },  # should we handle this? yes, if /^c<[ad]>+r$/
    -> $, $name {                            # if it turned out to be our job, this is what we do
        -> $p {
            $name ~~ /^c(<[ad]>*)(<[ad]>)r$/;        # split out last 'a' or 'd'
            my $r = $1 eq 'a' ?? $p.key !! $p.value; # choose key or value
            $0 ne '' ?? $r."c{$0}r"() !! $r;               # maybe recurse
        } 
    }
);

$lisp-list.caddr.say;    # 3

Whee!

Day 02 – The humble type object

December 2, 2013

For some inscrutable reason, we have defined a Dog class in today’s post.

class Dog {
    has $.name;
}

Don’t ask me why — maybe we’re writing software for a kennel? Maybe we’re writing software for dogs? “Teach your dog how to type!” Clever dogs can do up to 10 words a minute, with surprisingly few typos.

Anyway. Having a Dog class gives us the dubious pleasure of being able to create dogs out of thin air and passing them to functions. No surprise there.

sub check(Dog $d) {
    say "Yup, that's a dog for sure.";
}

my Dog $dog .= new(:name);
check($dog);     # Yup, that's a dog for sure.

But where there might be some surprise — if you haven’t gotten used to the idea of Perl 6’s type objects yet — is that you can also do this:

check(Dog);      # Yup, that's a dog for sure.

What did we just do there?

We didn’t pass a dog to the function, we passed Dog to the function. And the function accepts it and says it’s a dog, too. So, Dog is a dog. Huh.

What’s going on here is that in Perl 6 when we declare a class Dog like we just did, the word Dog ends up having two meanings:

  • The class Dog that we declared.
  • The type object Dog, kind of a patron saint for all Dog objects ever instantiated.

Cramming two concepts into one word like this seems like a recipe for failure. But it turns out it works surprisingly well.

Before we look at the reasons for using type objects, let’s find out a bit more about what they are.

say Dog;          # (Dog)
say Dog.name;     # ERROR: Cannot look up attributes in a type object
say ?Dog;         # False
say defined Dog;  # False

So, in summary, the Dog type object identifies itself as (Dog), it refuses to have its attribute inspected, it boolifies to False, and it’s not defined. Contrast this with an instance of Dog:

say $dog;         # Dog.new(name => "Fido")
say $dog.name;    # Fido
say ?$dog;        # True
say defined $dog; # True

An instance is everthing the type object isn’t: it knows how to output itself, it will happily tell you its name, and it’s both True and defined. Nice.

(Being undefined is only almost a surefire way to identify the type object. Someone could have gone through the trouble of making their instance object undefined. As they say in the industry, you don’t have to be a type object to be undefined… but it helps!)

And now, as promised, the Top Five Reasons Type Objects Work Surprisingly Well:

  1. Classes actually make sense as objects in the program. There’s this idea that classes have to haughtily refuse to play among the rest of the values in a program, that they have to somehow be like gods looking down on the instances from a Parthenon of class-hood. Perl 5 kind of has it like that. But both Ruby and Python show that classes can behave as more or less normal objects. In Perl 6, Dog is also what you get if you do $dog.WHAT.
  2. It fits quite well with the whole smartmatching thing. So what $dog ~~ Dog actually means is something like “hey, Dog type object, does this $fido look anything like you?”. The type object doesn’t just sit there, it does useful things like smartmatching.
  3. Another thing: the whole reason that line, my Dog $dog .= new(:name);, works as it does is because we end up calling .new on the type object. So here’s what that line does in slow motion. It desugars to a declaration and an assignment. The declaration is my Dog $dog; and so, because the $dog variable needs to start out with some undefined value, it starts out with Dog, the type object. The assignment then is simply $dog.=new, which is short for $dog = $dog.new. Conveniently, because the type object Dog is an object of the type Dog, it has a .new method (inherited from Mu in this case) that knows how to construct dogs.
  4. A little detail from that last point, which actually turns out to be a rather big deal: Perl 6 doesn’t really have undef like Perl 5 does. It turned out that undef wasn’t a really good fit with a type system; undef gets in everywhere, and doesn’t really have a type at all. (A bit like Java’s null which is known to have caused people no end of suffering.) So what Perl 6 has instead are these typed undefined values, namely — you guessed it — the type objects. If you declare a variable my Int $i, then $i will start out as undefined, that is, containing the type object Int.
  5. Not only do you sometimes want to call .new on the type object, sometimes you have other methods which don’t require an instance. (These kinds of methods are sometimes known as static methods in some languages, and class methods in other languages. Some languages have both of these, and they’re different, just to be confusing.) Again, the type object comes to the rescue here, sort of acts like an instance so that you can call your method on it, and then once again fades into the background. For example, if the class Dog had a method bark { say "woof" } then the Dog type object would be able to bark just as well as actual dog instances. (But the type object still refuses to tell you its .name, or any of its attributes.)

So that’s type objects for you. They’re sitting in a convenient semantic spot halfway between the class and its instances, sometimes representing one end of the spectrum, sometimes the other.

One thing before we part ways today. It doesn’t happen often, but sometimes you do want to be able to tell type objects and real instances apart, for example when accepting parameters in a function:

multi sniff(Dog:U $dog) {
    say "a type object, of course"
}
multi sniff(Dog:D $dog) {
    say "definitely a real dog instance"
}

sniff Dog;    # a type object, of course
sniff $dog;   # definitely a real dog instance

Here, :U stands for “undefined” and :D for “defined”. (And that, dear friends, is how we got a smiley into the design of Perl 6. Program language designers, take heed.) As I mentioned parenthetically before, it’s actually possible to be an undefined object without being a type object. For these special occasions, we have :T, but this modifier isn’t implemented in Rakudo as of this writing. (Though moritz++ informs me that, in Rakudo, :U currently has the semantics that :T should have.)

Let’s just end this post with maybe the corniest one-liner ever to see the light of day in #perl6:

$ perl6 -e 'say (my @a = "Hip " x 2), @a.^name, "!"'
Hip Hip Array!

Day 23 – Macros

December 23, 2012

Syntactic macros. The Lisp gods of yore provided humanity with this invention, essentially making Lisp a programmable programming language. Lisp adherents often look at the rest of the programming world with pity, seeing them fighting to invent wheels that were wrought and polished back in the sixties when giants walked the Earth and people wrote code in all-caps.

And the Lisp adherents see that the rest of us haven’t even gotten to the best part yet, the part with syntactic macros. We’re starting to get the hang of automatic memory management, continuations, and useful first-class functions. But macros are still absent from this picture.

In part, this is because in order to have proper syntactic macros, you basically have to look like Lisp. You know, with the parentheses and all. Lisp ends up having almost no syntax at all, making every program a very close representation of a syntax tree. Which really helps when you have macros starting to manipulate those same trees. Other languages, not really wanting to look like Lisp, find it difficult-to-impossible to pull off the same trick.

The Perl languages love the difficult-to-impossible. Perl programmers publish half a dozen difficult-to-impossible solutions to CPAN before breakfast. And, because Perl 6 is awesome and syntactic macros are awesome, Perl 6 has syntactic macros.

It is known, Khaleesi.

What are macros?

For reasons even I don’t fully understand, I’ve put myself in charge of implementing syntactic macros in Rakudo. Implementing macros means understanding them. Understanding them means my brain melts regularly. Unless it fries. It’s about 50-50.

I have this habit where I come into the #perl6 channel, and exclaiming “macros are just X!” for various values of X. Here are some samples:

  • Macros are just syntax tree manipulators.
  • Macros are just “little compilers”.
  • Macros are just a kind of templates.
  • Macros are just routines that do code substitution.
  • Macros allow you to safely hand values back and forth between the compile-time world and the runtime world.

But the definition that I finally found that I like best of all comes from scalamacros.org:

Macros are functions that are called by the compiler during compilation. Within these functions the programmer has access to compiler APIs. For example, it is possible to generate, analyze and typecheck code.

While we only cover the “generate” part of it yet in Perl 6, there’s every expectation we’ll be getting to the “analyze and typecheck” parts as well.

Some examples, please?

Coming right up.

macro checkpoint {
  my $i = ++(state $n);
  quasi { say "CHECKPOINT $i"; }
}

checkpoint;
for ^5 { checkpoint; }
checkpoint;

The quasi block is Perl 6’s way of saying “a piece of code, coming right up!”. You just put your code in the quasi block, and return it from the macro routine.

This code inserts “checkpoints” in our code, like little debugging messages. There’s only three checkpoints in the code, so the output we’ll get looks like this:

CHECKPOINT 1
CHECKPOINT 2
CHECKPOINT 2
CHECKPOINT 2
CHECKPOINT 2
CHECKPOINT 2
CHECKPOINT 3

Note that the “code insertion” happens at compile time. That’s why we get five copies of the CHECKPOINT 2 line, because it’s the same checkpoint running five times. If we had had a subroutine instead:

sub checkpoint {
  my $i = ++(state $n);
  say "CHECKPOINT $i";
}

Then the program would print 7 distinct checkpoints.

CHECKPOINT 1
CHECKPOINT 2
CHECKPOINT 3
CHECKPOINT 4
CHECKPOINT 5
CHECKPOINT 6
CHECKPOINT 7

As a more practical example, let’s say you have logging output in your program, but you want to be able to switch it off completely. The problem with an ordinary logging subroutine is that with something like:

LOG "The answer is { time-consuming-computation() }";

The time-consuming-computation() will run and take a lot of time even if LOG subsequently finds that logging was turned off. (That’s just how argument evaluation works in a non-lazy language.)

A macro fixes this:

constant LOGGING = True;

macro LOG($message) {
  if LOGGING {
    quasi { say {{{$message}}} };
  }
}

Here we see a new feature: the {{{ }}} triple-block. (Syntax is likely to change in the near future, see below.) It’s our way to mix template code in the quasi block with code coming in from other places. Doing say $message; would have been wrong, because $message is a syntax tree of the message to be logged. We need to inject that syntax tree right into the quasi, and we do that with a triple-block.

The macro conditionally generates logging code in your program. If the constant LOGGING is switched on, the appropriate logging code will replace each LOG macro invocation. If LOGGING is off, each macro invocation will be replaced by literally nothing.

Experience shows that running no code at all is very efficient.

What are syntactic macros?

A lot of things are called “macros” in this world. In programming languages, there are two big categories:

  • Textual macros. They substitute code on the level of the source code text. C’s macros, or Perl 5’s source filters, are examples.
  • Syntactic macros. They substitute code on the level of the source code syntax tree. Lisp macros are an example.

Textual macros are very powerful, but they represent the kind of power that is just as likely to shoot half your leg off as it is to get you to your destination. Using them requires great care, of the same kind needed for a janitor gig at Jurassic Park.

The problem is that textual macros don’t compose all that well. Bring in more than one of them to work on the same bit of source code, and… all bets are off. This puts severe limits on modularity. Textual macros, being what they are, leak internal details all over the place. This is the big lesson from Perl 5’s source filters, as far as I understand.

Syntactic macros compose wonderfully. The compiler is already a pipeline handing off syntax trees between various processing steps, and syntactic macros are simply more such steps. It’s as if you and the compiler were two children, with the compiler going “Hey, you want to play in my sandbox? Jump right in. Here’s a shovel. We’ve got work to do.” A macro is a shovel.

And syntactic macros allow us to be hygienic, meaning that code in the macro and code outside of the macro don’t step on each other’s toes. In practice, this is done by carefully keeping track of the macros context and the mainline’s context, and making sure wires don’t cross. This is necessary for safe and large-scale composition. Textual macros don’t give us this option at all.

Future

Both of the examples in this post work already in Rakudo. But it might also be useful to know where we’re heading with macros in the next year or so. The list is in the approximate order I expect to tackle things.

  • Un-hygiene. While hygienic macros are the sane and preferable default, sometimes you want to step on the toes of the mainline code. There should be an opt-out, and escape hatch. This is next up.
  • Introspection. In order to analyze and typecheck code, not just generate it, we need to be able to take syntax trees coming in as macro arguments, and look inside of them. There are no tools for that yet, and there’s no spec to guide us here. But I’m fairly sure people will want this. The trick is to come up with something that doesn’t tie us down to one compiler’s internal syntax-tree format. Both for the sake of compiler interoperability and future compatibility.
  • Deferred declarations. The sandbox analogy isn’t so frivolous, really. If you declare a class inside a quasi block, that declaration is limited (“sandboxed”) to within that quasi block. Then, when the code is injected somewhere in the mainline because of a macro invocation, it should actually run. Fortunately, as it happens, the Rakudo internals are factored in such a way that this will be fairly straightforward to implement.
  • Better syntax. The triple-block syntax is probably going away in favor of something better. The problem isn’t the syntax so much as the fact that it currently only works for terms. We want it to work for basically all syntactic categories. A solid proposal for this is yet to materialize, though.

With each of these steps, I expect us to find new and fruitful uses of macros. Knowing my fellow Perl 6 developers, we’ll probably find some uses that will shock and disgust us all, too.

In conclusion

Perl 6 is awesome because it puts you, the programmer, in the driver seat. Macros are simply more of that.

Implementing macros makes your brain melt. However, using them is relatively straightforward.

Day 22 – Parsing an IPv4 address

December 22, 2012

Guest post by Herbert Breunung (lichtkind).

Perl 5 brought regexes to mainstream programming and set a standard, one that is felt as relevant even in Redmond. Perl 6, of course, steps up the game by adding many new features to the regex camp, including easy-to-build grammars for your own complex parsers. But without getting too complex, you can get a lot of joy out of Perl 6’s rx (that’s how Perl 6 spells Perl 5’s qr operator, that enables you to save a Regex in a variable).

Because the Perl 6 regex syntax is less littered with exceptional cases, Larry Wall also likes to joke that he put the “regular” back into “regular expression”.

Some of the changes are:

  • most special variables are gone,
  • non-capturing groups and other grouping syntax is easier to type,
  • no more single/multi line modes,
  • x mode became default, making whitespace non-significant by default.

In summary, regexes are more regular than in Perl 5, confirming Larry’s joke. They try a bit harder to make your life easier when you need to match text. Under the hood, regexes have blossomed out into a complete sub-language within the bigger Perl 6 language. A language with its own parsing rules.

But don’t fret; not everything has changed. Some things remain the same:

/\d+/

This regex still matches one or more consecutive digits.

Similarly, if you want to capture the digits, you can do this, just like you’re used to:

/(\d+)/

You’ll find the matched digits in $0, not $1 as in Perl 5. All the special variables $0, $1, $2 are really syntactic sugar for indexing the match variable ($/[0], $/[1], $/[2]). Because indices start at 0, it makes sense for the first matched group to be $0. In Perl 5, $0 contains the name of the script or program, but this has been renamed into $*EXECUTABLE_NAME in Perl 6.

Should you be interested in getting all of the captured groups of a regex match, you can use @(), which is syntactic sugar for @($/).

The object in the $/ variable holds lots of useful information about the last match. For example, $/.from will give you the starting string position of the match.

But $0 will get us far enough for this post. We use it to extract individual features from a string.

Sometimes we want to extract a whole bunch of similar things at once. Then we can use the :g (or :global) modifier on the regex:

$_ = '1 23 456 78.9';
say .Str for m:g/(\d+)/; # 1 23 456 78 9

Note that the :g — as opposed to prior regex implementations — sits up front, right at the start of the regex. Not at the end. That way, when you read the regex from left to right, you will know from the start how the regex is doing its matching. No more end-heavy regex expressions.

Matching “all things that look like this” is so useful, that there’s even a dedicated method for that, .comb:

$str.comb(/\d+/);

If you’re familiar with .split, you can think of .comb as its cheerful cousin, matching all the things that .split discards.

Let’s tackle the matching of an IPv4 address. Coming from a Perl 5 angle, we expect to have to do something like this:

/(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/

This won’t do in Perl 6, though. First of all, the {} blocks are real blocks in a Perl 6 regex; they contain Perl 6 code. Second, because Perl 6 has lots of error handling to catch p5isms, like this, you’ll get an error saying “Unsupported use of {N,M} as general quantifier; in Perl 6 please use ** N..M (or ** N..*)”.

So let’s do that. To match between one and three digits in a Perl 6 regex, we should type:

/\d ** 1..3/

Note how the regex sublanguage re-uses parts from the main Perl 6 language. ** can be seen as a kind of exponentiation (if we squint), in that we’re taking \d “to the between-first-and-third power”. And the range notation 1..3 exists both outside and within regexes.

Using our new knowledge about the repetition quantifier, we end up with something like this:

/(\d**1..3) \. (\d**1..3) \. (\d**1..3) \. (\d**1..3)/

That’s still kinda clunky. We might end up wishing that we could use the repetition operator again, but those literal dots in between prevent us from doing that. If only we could specify repetition a given number of times and a divider.

In Perl 6 regexes, you can.

/ (\d ** 1..3) ** 4 % '.' /

The % operator here is a quantifier modifier, so it can only follow on a quantifier like * or + or **. The choice of % for this function is relatively new in Perl 6, and you may prefer to read it as “modulo”, just like in the main language. That is, “match four groups of digits, modulo literal dots in between”. Or you could think of the dots in between as the “remainder”, the separators that are left after you’ve parsed the actual elements.

Oh, and you might’ve noticed that \. changed to '.' on the way. We can use either; they mean exactly the same. In Perl 5, there isn’t a simple rule saying which symbols have a magic meaning and which ones simply signify themselves. In Perl 6, it’s easy: word characters (alphanumerics and the underscore) always signify themselves. Everything else has to be escaped or quoted to get its literal meaning.

Putting it all together, here’s how we would extract IPv4 addresses out of a string:

$_ = "Go 127.0.0.1, I said! He went to 173.194.32.32.";

say .Str for m:g/ (\d ** 1..3) ** 4 % '.' /;
# output: 127.0.0.1 173.194.32.32

Or, we could use .comb:

$_ = "Go 127.0.0.1, I said! He went to 173.194.32.32.";
my @ip4addrs = .comb(/ (\d ** 1..3) ** 4 % '.' /);

If we’re interested in individual integers, we can get those too:

$_ = "Go 127.0.0.1, I said! He went to 173.194.32.32.";
say .list>>.Str.perl for m:g/ (\d ** 1..3) ** 4 % '.' /;
# output: ("127", "0", "0", "1") ("173", "194", "32", "32")

If you want to know more, read the S05, or watch me battling with my slide deck and the English language in this presentation about regexes.


Follow

Get every new post delivered to your Inbox.

Join 43 other followers