Day 22: Operator Overloading

December 22, 2009 · December 7, 2010 · Matthew Walton

Today’s gift is something I complain about a lot. Not because it’s a problem in Perl 6, but because Java doesn’t have it: operator overloading, and the definition of new operators.

Perl 6 makes it easy to overload existing operators, and to define new ones. Operators are simply specially-named multi subs, and the standard multi-dispatch rules determine which implementation is the most appropriate one to call.

A common example, which many of you may have seen before, is the definition of a factorial operator which mimics mathematical notation:

multi sub postfix:<!>(Int $n) {
  [*] 1..$n;
}

say 3!;

The naming convention for operators is quite straightforward. The first part of the name is the syntactic category, which is prefix, postfix, infix, circumfix or postcircumfix. After the colon is an angle bracket quote structure (the same kind of thing often seen constructing lists or accessing hash keys in Perl 6) which provides the actual operator. For the circumfix and postcircumfix categories, this should be a pair of bracketing characters, but all other operators take a single symbol, which may be more than one character long. In the example above, we define a postfix operator ! which operates on an integer argument.
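To make the other categories concrete, here is a small sketch of a prefix, an infix and a circumfix operator. The operator symbols (Σ, ± and the floor brackets) and the parameter names are invented for illustration and are not part of the language; the sketch assumes a reasonably recent Rakudo.

```raku
# prefix: the operator is written before its single operand
sub prefix:<Σ>(*@numbers) { [+] @numbers }

# infix: the operator sits between two operands
sub infix:<±>(Numeric $centre, Numeric $spread) {
  ($centre - $spread, $centre + $spread);
}

# circumfix: a pair of symbols, separated by a space in the name,
# which surround the operand
sub circumfix:<⌊ ⌋>(Numeric $x) { $x.floor }

say Σ (1, 2, 3);   # 6
say 10 ± 2;        # (8 12)
say ⌊3.7⌋;         # 3
```

Note that the prefix example needs parentheses around its list, because a prefix operator binds more tightly than the comma.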

You can exercise further control over the operator’s parsing by adding traits to the definition, such as tighter, equiv and looser, which let you specify the operator’s precedence in relationship to operators which have already been defined. Unfortunately, at the time of writing this is not supported in Rakudo so we will not consider it further today.
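For readers on a later Rakudo that does implement these traits, a minimal sketch of how they read might look like the following. The ⊕ and ⊗ operators are hypothetical names chosen only for this illustration, and the example assumes the `is tighter` and `is looser` traits are available in your compiler.

```raku
# ⊕ adds, but binds more tightly than multiplication
sub infix:<⊕>($a, $b) is tighter(&infix:<*>) { $a + $b }

# ⊗ multiplies, but binds more loosely than addition
sub infix:<⊗>($a, $b) is looser(&infix:<+>) { $a * $b }

say 2 * 3 ⊕ 4;   # parsed as 2 * (3 ⊕ 4), giving 14
say 2 ⊗ 3 + 5;   # parsed as 2 ⊗ (3 + 5), giving 16
```

The `is equiv` trait works the same way, placing the new operator at the same precedence level as the one named.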

If you define an operator which already exists, the new definition simply gets added to the set of multi subs already defined for that operator. For example, we can define a custom class, and then specify that its instances can be added together using a custom infix:<+>:

class PieceOfString {
  has Int $.length;
}

multi sub infix:<+>(PieceOfString $lhs, PieceOfString $rhs) {
  PieceOfString.new(:length($lhs.length + $rhs.length));
}

Obviously, real-world examples tend to be rather more complex than this, involving multiple member variables. We could also check our pieces of string for equality:

multi sub infix:<==>(PieceOfString $lhs, PieceOfString $rhs --> Bool) {
  $lhs.length == $rhs.length;
}

In which case we’re really just redispatching to one of the built-in variants of infix:<==>. At the time of writing this override of == doesn’t work properly in Rakudo.
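As a sketch of how the two overloads are meant to be used together, the following repeats the class and both operators so that it runs on its own. It assumes a recent Rakudo in which both candidates dispatch as described; the variable names are invented for the example.

```raku
class PieceOfString {
  has Int $.length;
}

# Extra candidates, added alongside the built-in ones
multi sub infix:<+>(PieceOfString $lhs, PieceOfString $rhs) {
  PieceOfString.new(:length($lhs.length + $rhs.length));
}

multi sub infix:<==>(PieceOfString $lhs, PieceOfString $rhs --> Bool) {
  $lhs.length == $rhs.length;
}

my $short  = PieceOfString.new(:length(3));
my $long   = PieceOfString.new(:length(5));
my $joined = $short + $long;

say $joined.length;                            # 8
say $joined == PieceOfString.new(:length(8));  # True
say $short == $long;                           # False
say 1 + 2;                                     # 3, built-in candidates still apply
```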

One thing you might want to do with operator overloading, but probably shouldn’t, is overload operators like prefix:<~>, the stringification operator. Why not? Well, if you do that, you won’t catch every conversion to Str. Instead, you should give your class a custom Str method, which is what would usually do the work:

use MONKEY-TYPING;

augment class PieceOfString {
  method Str {
    '-' x $.length;
  }
}

This will be called by the default definition of prefix:<~>. Methods which have the names of types are used as type conversions throughout Perl 6, and you may commonly wish to provide Str and Num for your custom types where it makes sense to do so.
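A minimal sketch of that behaviour follows. For brevity it declares the Str method directly inside the class rather than augmenting it afterwards, which is an assumption of this example and not a change to the approach above.

```raku
class PieceOfString {
  has Int $.length;

  # Type-conversion method: used whenever a Str is required
  method Str {
    '-' x $.length;
  }
}

my $s = PieceOfString.new(:length(5));

say ~$s;      # -----    (prefix:<~> calls .Str)
say "[$s]";   # [-----]  (interpolation calls .Str too)
```

Both conversions go through the same method, which is why overloading prefix:<~> alone would miss the interpolation case.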

Thus overriding prefix:<~> makes little sense, unless you actually want to change its meaning for your type. This is not to be recommended, as programmers in C++ and other languages with operator overloading will be aware. Changing the conventional semantics of an operator for a custom type rarely ends well: it confuses the users of your library and may result in some unpleasant bugs. After all, who knows what operator behaviour the standard container types are expecting? Trample on that, and you could be in a great deal of trouble.

New semantics are best left for new operators, and fortunately, as we have seen, Perl 6 allows you to do just that. Because Perl 6 source code is in Unicode, there is a great variety of characters available for use as operators. Most of them are awkward to type on a standard keyboard, so it is expected that multicharacter ASCII operators will be the most common new operators. For an example of a Unicode snowman operator, refer back to the end of Day 17.

12 thoughts on “Day 22: Operator Overloading”

wondering says:

December 22, 2009 at 01:16

In the first example (the factorial), there might be a mistake:

[+] 1..$n;

should be

[*] 1..$n;

if I followed the calendar correctly.

I enjoy reading about Perl 6 every day!

1. colomon says:
  
  December 22, 2009 at 01:29
  
  Good catch!
  
  1. colomon says:
    
    December 22, 2009 at 01:30
    
    Fixed.
Joe Z says:

December 22, 2009 at 03:33

So what are “circumfix” and “postcircumfix”?

1. carl says:
  
  December 22, 2009 at 06:58
  
  Hi, avid reader! “Circumfix” refers to operators which occur both before and after their operand. Parentheses and brackets are a good example. (The “circum-” is the same as in “circumference” or “circumvent”.)
  
  “Postcircumfix” also surround things, but besides doing that, they also come after something else. Again, parentheses and brackets are a good example. :) The parentheses in ‘(1, 2, 3)’ work differently than the parentheses in ‘foo(1, 2, 3)’, because the latter are a postcircumfix. Ditto ‘[1, 2, 3]’ and ‘@bar[1, 2, 3]’.
  
  The parser in Perl 6 works in such a way that it’s never confused about whether it expects a circumfix or a postcircumfix. The ‘tax’ we as users pay for that is that we have to make sure we get the whitespace right. Specifically, there must be no whitespace between a postcircumfix operator and the thing that precedes it. So ‘say(1, 2, 3)’ and ‘say (1, 2, 3)’ mean “print three things” and “print one thing”, respectively. For the rare cases when you really want to separate things and still have a postcircumfix, there’s something called an ‘unspace’: a backslash followed by any amount of whitespace, like this: ‘say\ (1, 2, 3)’.
  
  1. Joe Z says:
    
    December 22, 2009 at 20:32
    
    Ah, ok, that clears it up.
    
    I was pretty sure I had a handle on circumfix: I had rightly guessed the “around” connotation of the “circum-” prefix. It was “postcircumfix” that threw me off the scent, though. I was left thinking “Wait, if it’s ‘around’, how can it also be ‘after’?” After all, there isn’t a “precircumfix” or “incircumfix”. :-)
    
    So, with a postcircumfix operator, is the item that comes before also an argument to the operator? For example, given “@bar[1, 2, 3]”, is @bar also an argument to :postcircumfix<[]> ?
    
    (Here’s hoping my angle brackets didn’t get mangled, and that I didn’t goof the syntax too badly… Perl 6 syntax is rather notably different from Perl 5 in ways that still occasionally elude me.)
  2. Moritz says:
    
    December 23, 2009 at 17:44
    
    postcircumfix operators are methods, and they are called on the object being postcircumfixed, so the method can refer to that object with the keyword self.
VZ says:

December 22, 2009 at 11:23

There is a typo in the infix == definition for PieceOfString, I think: “--> Bool” should be outside of the parentheses, not inside them.

1. carl says:
  
  December 22, 2009 at 12:17
  
  Maybe you’re used to the other permissible way of specifying return types in Perl 6: RetType sub foo($a, $b, $c) { ... }.
  
  But specifying the return type inside of the signature is perfectly OK too: sub foo($a, $b, $c --> RetType) { ... }.
  
  Note the two dashes in the arrow, by the way.
  
  1. VZ says:
    
    December 22, 2009 at 12:26
    
    Sorry, looks like I put my foot in my mouth yet again. I really had no idea that this syntax was valid so I didn’t even think to double check it. Sorry again.
    
    But I do wonder what the rationale is for having “--> RetType” inside the parentheses; it seems rather counter-intuitive to me to mix it with the parameter declarations. OTOH maybe it’s again my C++ background getting through, as C++0x now allows either
    
    RetType foo(T a, U b);
    
    or
    
    [] foo(T a, U b) -> RetType;
    
    which is similar to Perl6 way.
  2. carl says:
    
    December 22, 2009 at 12:46
    
    S06 is silent on any advantages of putting the return type inside the parentheses, so I’ll resort to ad-hoc blabbering instead. :)
    
    The parentheses construct a Signature object which latches onto the routine. Having the ‘--> RetType’ inside of the parentheses means that it’s clearer that the return type forms a part of the Signature, and can be introspected along with the parameters.
    
    On the other hand, pointy blocks (see S06) show that Perl 6 sometimes employs syntax where things which are semantically ‘inside’ an object are syntactically placed outside that object. So I guess that’s not really a defendable reason.
    
    Starting from another angle, the Signature can be seen as a contract with the caller, essentially promising “if you give me $a, $b and $c of some specified types, I will return RetType (if and when I return normally)”. Seen as a contract, the Signature may as well keep the return type, representing the part of the bargain that the routine has promised to uphold.
    
    A third aspect of this, again speaking in your favour, is that Signatures are used as the LHS of list assignments in Perl 6. (A generalization which excites and surprises me.) In this case, I have no idea what a return type might mean; applying the argument from ignorance gives that it shouldn’t be there in the first place. :)
Christopher Bottoms says:

December 5, 2015 at 19:27

Hi Matthew,

Thank you so much for this post. Just to get things updated before Christmas, could you please change “MONKEY_TYPING” to “MONKEY-TYPING” so that this example will work.

Thanks so much for your Perl6 work! Merry Christmas!
