Day 8 — How to Make, Use, and Abuse Perl 6 Subsets

Ever reached for a Perl 6 type and found it good enough, but not perfect? Perhaps, you wanted an IntEven, StrPalindrome, or YourCustomClassWhereAttrFooIsBar. Never fear, Perl 6’s subsets are here!

What Are Subsets?

You can think of them as a refinement on some type and you can use them in most places where you’d use a type, such as in type constraints. To create one, use the subset keyword, along with a where keyword specifying your refinement:

    subset Even where * %% 2;
    say 1 ~~ Even; # False
    say 2 ~~ Even; # True

The WhateverStar * is the value being checked and the %% operator checks if that value is evenly divisible by 2. We can now use our subset on the right side of a smartmatch operator to check whether a value is an even number! Pretty awesome. What else can we do with it?

How about type-constraining a variable:

    my Even $x = 42; # all good

    $x = 43; # and this?
    # Type check failed in assignment to $x; expected Even but got Int (43)
    #   in block <unit> at script.p6 line 3

Or type-constraining input and output of a routine:

    sub takes-an-even-only (Even $x) { $x² }
    sub returns-an-even-only returns Even { $^x² }

    say takes-an-even-only   42; # 1764
    say returns-an-even-only 42; # 1764

    say takes-an-even-only   43;
    # Constraint type check failed for parameter '$x'
    #   in sub takes-an-even-only at script.p6 line 2
    #   in block <unit> at script.p6 line 8

    say returns-an-even-only 43;
    # Type check failed for return value; expected Even but got Int (1849)
    #   in sub returns-an-even-only at script.p6 line 3
    #   in block <unit> at script.p6 line 13

That’s all pretty sweet, but our Even accepts strings and other weird stuff:

    say '42.0000' ~~ Even; # True
    say class { method Real { 42 } }.new ~~ Even; # True

There’s a reason for that: we never specified what type we’re making a subset of, so by default, it used Any. Let’s fix that!

Getting Typical

If you want to create a subset based on some type, specify that type with the of keyword:

    subset IntEven of Int where * %% 2;

Now, before the where refinement even runs, the value we’re checking against must first pass the Int type constraint:

    say 42 ~~ IntEven; # True
    say 43 ~~ IntEven; # False
    say '42.0000' ~~ IntEven; # False
    say class { method Real { 42 } }.new ~~ IntEven; # False

We’re not limited to numerics! Let’s try a Str:

    subset StrPalindrome of Str where {
        .flip eq $_ given .comb(/\w+/).join.lc;
    }

    say ’Madam, I'm Adam.~~ StrPalindrome; # True
    say '1 on 1'  ~~ StrPalindrome; # False

We’re now using a more complex refinement in the where clause, using a code block. Just like the WhateverCode version with the *, the block receives the value to check as its argument, which it aliases to $_ topical variable. The code block tells us whether the argument is a palindrome, by returning a truthy or falsy value.

So how far can we go with the type constraints in our subsets?

Custom Made

We can type-constrain a subset using any class we have lying around! How about this one:

    class Awesome { has $.awesomeness-level }
    my $obj1 = Awesome.new: :10000awesomeness-level;
    my $obj2 = Awesome.new: :31337awesomeness-level;

We make a class called Awesome that has a public attribute called awesomeness-level. We also create two instances of that class, setting the awesomeness-level to 10000 in $obj1 and to 31337 in $obj1. So how about a subset that checks whether awesomeness-level is a prime number? It’s just a single line of code:

    subset AwesomePrime of Awesome where .awesomeness-level.is-prime;

    say $obj1 ~~ AwesomePrime; # False
    say $obj2 ~~ AwesomePrime; # True

The where block of the subset is “thunked,” which means the compiler takes an expression and turns it into a code block for us, so we don’t have to explicitly use a codeblock here, nor do we need a WhateverStar. The value being checked is in the $_ topical variable, which is what method calls use when you don’t specify what you’re calling them on. Thus, our subset expects a thing of type Awesome and then checks whether its awesomeness-level attribute is a prime number!

By using such subsets, you can create routines that only accept your custom objecs of a specific configuration. For example, in code of an IRC::Client bot, we can create a subset of IRC::Client::Message for messages that are received from bot admins, and then register events only for such messages:

    subset BotAdmin of IRC::Client::Message where .host eq conf<bot-admins>.any;

    multi method irc-to-me (BotAdmin $e where /:i ^ 'run' $<steps>=.+ $/ ) {
        ...
    }

The subset calls the subroutine that reads configuration and provides a list of bot admin hosts against which the host of the sender of the message is checked. We’re encapsulating the logic that checks we have an acceptable object to work with, and we’re able to call that logic even before we enter our method.

So if we can do that with subsets… is there anything we can’t do?

Time for Some Heavy Abuse!

Let’s do something crazy! A subroutine that fetches a link to a website and checks whether it contains a mention of Perl 6:

    use LWP::Simple;
    sub is-perl-site { LWP::Simple.get( $^website ).contains: 'Perl 6' }

There’s nothing crazy about that, you say? Then, how about we use that subroutine as a refiner in a subset where clause:

    subset PerlWebsite where &is-perl-site;

    say 'http://perl6.party' ~~ PerlWebsite; # True
    say 'http://lolcats.com' ~~ PerlWebsite; # False

In fact, we can make a routine that only accepts URLs to websites mentioning Perl 6:

    sub ain't-taking-non-perl-stuff (PerlWebsite $url) {
        say "Why! I can already tell $url is awesome!";
    }

    ain't-taking-non-perl-stuff 'http://perl6.party';
    # Why! I can already tell http://perl6.party is awesome!

    ain't-taking-non-perl-stuff 'http://lolcats.com';
    # Constraint type check failed for parameter '$url'
    #   in sub ain't-taking-non-perl-stuff at script.p6 line 8
    #   in block <unit> at script.p6 line 15

But do you notice something off? The error message is rather poor… Let’s improve it!

What we know so far is the where clause takes some code to run and if that code’s result is falsy, the typecheck will fail. That means inside the where we can know whether or not the typecheck will fail before we even return from it. Let’s put that to use:

    sub is-perl-site {
        given LWP::Simple.get( $^website ).contains: 'Perl 6' {
            when :so  { True }
            when :not {
                say ’This ain't no website containing "Perl 6"!‘;
                False;
            }
        }
    }

    subset PerlWebsite where &is-perl-site;

In the routine that checks a website mentions Perl 6, in the case when it does :not contain a mention of Perl 6, we say a helpful message, indicating what exactly was wrong. Let’s run this:

    sub ain't-taking-non-perl-stuff (PerlWebsite $url) {
        say "Why! I can already tell $url is awesome!";
    }

    ain't-taking-non-perl-stuff 'http://perl6.party';
    # Why! I can already tell http://perl6.party is awesome!

    ain't-taking-non-perl-stuff 'http://lolcats.com';
    # This ain't no website containing "Perl 6"!
    # This ain't no website containing "Perl 6"!
    # Constraint type check failed for parameter '$url'
    #   in sub ain't-taking-non-perl-stuff at script.p6 line 16
    #   in block <unit> at script.p6 line 23

Whoa! The message printed twice. What gives?

It’s actually expected that the refinement in subsets is an inexpensive and relatively simple operation… With that expectation in mind, the parameter binder—which doesn’t know how to generate errors—simply passes its stuff through the slower code path—which does—and it’s that slower code path that runs the where code the second time, triggering our message one more time.

So yes, doing overly complex stuff in subsets is abusive. However, you can throw an exception inside the where to avoid the repetition of the message:

    sub is-perl-site {
        LWP::Simple.get( $^website ).contains: 'Perl 6'
            or die ’This ain't no website containing "Perl 6"!‘;
    }

    ...

    ain't-taking-non-perl-stuff 'http://lolcats.com';
    # This ain't no website containing "Perl 6"!
    #   in sub is-perl-site at z.p6 line 4
    #   in any accepts_type at gen/moar/m-Metamodel.nqp line 3472
    #   in sub ain't-taking-non-perl-stuff at z.p6 line 11
    #   in block <unit> at z.p6 line 18

And don’t forget to check out Subset::Helper and Subset::Common modules.

What About a Light Spanking?

There is one type of abuse cheating with subsets that can get you out of a bind: fiddling with narrowness when it comes to resolution of multi candidates.

For example, let’s say you hate humanity and you wish to change the meaning of the infix + operator on two Ints to do subtraction instead of addition. You, of course, write this:

    multi sub infix:<+> (Int $a, Int $b) { $a$b }

But as you run a sample code, over the sound of your evil laughter…

    Ambiguous call to 'infix:<+>'; these signatures all match:
    :(Int:D \a, Int:D \b --> Int:D)
    :(Int $a, Int $b)
      in block <unit> at z.p6 line 4

… the complier errors out.

You see, core language already has an infix + operator that takes two Ints! When you add one of your own, you create an ambiguity. To resolve this issue, we need to somehow create an Int that the compiler thinks is narrower than an Int, but in reality isn’t. Sounds tough? Not an issue for subsets:

    subset NarrowInt of Int where {True};
    multi sub infix:<+> (NarrowInt $a, NarrowInt $b) { $a$b }

    say 42 + 2; # 40

It worked! We created a subset of Int, so we match all of Ints, just like we wanted. In the refinement, however, we specify a single code block that always returns True, making that refinement always succeed, and making our subset accept all the values a regular Int accepts, while being narrower than a regular Int as far as multi resolution goes.

If you’re wondering why we had to use an explicit block, it’s because the where smartmatches, and the smartmatch against a True produces a warning, because it’s always true, and while that is what we want here, in most code such a construct is a mistake.

But if you’re upset about writing one-too-many characters, here’s neat trick:

    multi sub infix:<+> (Int $a where {True}, Int $b where {True}) { $a - $b }
    say 42 + 2;

You don’t need to create an explicit subset, and can stick a where clause right onto the thing you’re working with to refine just that thing. The type constraint on it will function as the of ... of a subset.

You can also type constraint a variable with a subset and still add a where clause. Or create a subset of a subset of a subset and still add… well, we’re getting carried away.

When they stop calling…

Consider this piece of wonderful code:

    class Thingie {
        multi method stuff ($ where /meows/) { say "Meow meow!"; }
    }

    class SubThingie is Thingie {
        multi method stuff ($ where /foos/) { say "Just foos..."; }
    }

    SubThingie.new.stuff: 'meows and foos'; # Just foos...

You have a class with a multi method. Along with it, you have a subclass of it with another multi method of the same name. Both have a where clause and when you call the method with input that can match either multi, the subclass’s multi gets called. But what do you do if you want to reverse that… you want the parent class’s multi to be called, if both multies matches the input.

The first solution is very simple. Just add a type constraint (we’ll use Str) in the parent class, while leaving it off in the child:

    class Thingie {
        multi method stuff (Str $ where /meows/) { say "Meow meow!"; }
    }

    class SubThingie is Thingie {
        multi method stuff ($ where /foos/) { say "Just foos..."; }
    }

    SubThingie.new.stuff: 'meows and foos'; # Meow meow!

The presence of a type constraint on the method in the parent class makes it narrower than the one in the subclass, so even though the subclass’s method can also accept the input, it’s the parent class that gets to take care of it.

However, what if we wanted the same Str type constraint on both methods? The parent class we’ll leave as is: just a normal Str type constraint. In the kid, however, we’ll use a wider subset of Any (that’s the default if you don’t specify the of, remember?), but in its where clause we’ll smartmatch against Str, to ensure the subset accepts only Strs:

    class Thingie {
        multi method stuff (Str $ where /meows/) { say "Meow meow!"; }
    }

    class SubThingie is Thingie {
        subset WiderStr where { $_ ~~ Str };
        multi method stuff (WiderStr $ where /foos/) { say "Just foos..."; }
    }

    SubThingie.new.stuff: 'meows and foos'; # Meow meow!

The result is the opposite of a cheat we made in the previous section: instead of a subset that matches a type exactly, but is narrower than it, we now created a subset that matches a type exactly, but is wider than it, as far as multi candidate resolution goes. And yes, you can just merge the two where clauses instead of creating a subset, producing:

    multi method stuff ($ where { $_ ~~ Str and $_ ~~ /foos/ }) {
        say "Just foos...";
    }

It’ll work the same.

Conclusion

Subsets are a powerful feature that lets you specify refinements on existing core and custom types. You can smartmatch against a subset to perform a check on a value, or you can use subsets to type-contraint variables, parameters, and return values.

You can use the subset keyword to create a named subset, or you can attach a refinement onto a specific variable or parameter with a where clause. Subsets can also be used to effect alternative narrowness of a parameter, to affect multi candidate resolution order.

Subsets can also be abused to perform very complex operations, but… that’s probably a bad idea.

-Ofun

3 thoughts on “Day 8 — How to Make, Use, and Abuse Perl 6 Subsets

  1. Last time I checked you could also use “is default” to a multi to disambiguate the dispatch in case of equally narrow candidates.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.