Day 23 – Unary Sort

Most languages or libraries that provide a generic sort routine allow you to specify a comparator, that is, a callback that tells the sort routine how two given elements compare. Perl is no exception.

For example, in Perl 5, which defaults to lexicographic ordering, you can request numeric sorting like this:

 use v5;
 my @sorted = sort { $a <=> $b } @values;

Perl 6 offers a similar option:

 use v6;
 my @sorted = sort { $^a <=> $^b }, @values;

The main difference is that the arguments are not passed through the global variables $a and $b, but rather as arguments to the comparator. The comparator can be anything callable, that is, a named or anonymous sub or a block. The { $^a <=> $^b } syntax is not special to sort; I have just used placeholder variables to show the similarity with Perl 5. Other ways to write the same thing are:

 my @sorted = sort -> $a, $b { $a <=> $b }, @values;
 my @sorted = sort * <=> *, @values;
 my @sorted = sort &infix:«<=>», @values;

The first one is just another syntax for writing blocks, with explicitly named parameters. The second one, * <=> *, uses * to automatically curry an argument, so each * becomes a parameter of the resulting code object. The final one directly refers to the routine that implements the <=> "spaceship" operator (which does numeric comparison).
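
Since the comparator can also be a named or anonymous sub, here is a small sketch of those two forms. The sample data and the names by-number and &descending are made up for this illustration:

```perl6
use v6;

# Sample data, invented for this illustration
my @values = 10, 9, 100, 1;

# A named sub used as the comparator (by-number is a made-up name)
sub by-number($a, $b) { $a <=> $b }
my @ascending = sort &by-number, @values;

# An anonymous sub stored in a variable, comparing in reverse order
my &descending = sub ($a, $b) { $b <=> $a };
my @reversed = sort &descending, @values;

say @ascending.join(', ');   # 1, 9, 10, 100
say @reversed.join(', ');    # 100, 10, 9, 1
```

Both subs take two parameters, so sort treats them as comparators, just like the block forms above.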

But Perl strives not only to make hard things possible, but also to make simple things easy, which is why Perl 6 offers more convenience. Looking at sorting code, one can often find that the comparator duplicates code. Here are two common examples:

 # sort words by a sort order defined in a hash:
 my %rank = a => 5, b => 2, c => 10, d => 3;
 say sort { %rank{$^a} <=> %rank{$^b} }, 'a'..'d';
 #          ^^^^^^^^^^     ^^^^^^^^^^  code duplication

 # sort case-insensitively
 say sort { $^a.lc cmp $^b.lc }, @words;
 #          ^^^^^^     ^^^^^^  code duplication

Since we love convenience and hate code duplication, Perl 6 offers a shorter solution:

 # sort words by a sort order defined in a hash:
 say sort { %rank{$_} }, 'a'..'d';

 # sort case-insensitively
 say sort { .lc }, @words;

sort is smart enough to recognize that the code object now takes only a single argument, and uses it to map each element of the input list to a new value, which it then sorts with normal cmp semantics. But it returns the original elements in the new order, not the transformed values. This is similar to the Schwartzian Transform, but very convenient since it's built in.

So the code block now acts as a transformer, not a comparator.
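
To make the transformation concrete, here is a rough sketch of the steps that a one-argument block spares you from writing by hand. The helper sort-by-key is hypothetical and only illustrates the idea; it is not meant to describe how sort is implemented internally:

```perl6
use v6;

# Hypothetical helper, for illustration only: a hand-written
# decorate / sort / undecorate pass in the style of the Schwartzian Transform.
sub sort-by-key(&key, *@items) {
    # 1. pair each element with its computed key
    my @keyed = @items.map({ [ key($_), $_ ] });
    # 2. sort the pairs by comparing the keys only
    my @ordered = @keyed.sort({ $^a[0] cmp $^b[0] });
    # 3. throw the keys away and return the original elements
    return @ordered.map({ $_[1] });
}

my @words = 'banana', 'Apple', 'cherry';   # sample data, made up

say sort-by-key({ .lc }, @words).join(', ');   # Apple, banana, cherry
say sort({ .lc }, @words).join(', ');          # same order, using the built-in unary sort
```

Note that the elements come back with their original capitalization; only the ordering was decided by the lowercased keys.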

Note that in Perl 6, cmp is smart enough to compare strings with string semantics and numbers with number semantics, so producing numbers in the transformation code generally does what you want. This implies that if you want to sort numerically, you can do that by forcing the elements into numeric context:

 my @sorted-numerically = sort +*, @list;

And if you want to sort in reverse numeric order, simply use -* instead.
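
A small sketch of the difference, using a made-up list of numeric strings. The elements are deliberately written as quoted strings so that the default sort compares them as strings:

```perl6
use v6;

# Sample data, invented for this illustration: numbers stored as strings
my @list = '10', '9', '100', '1';

say sort(@list).join(', ');       # 1, 10, 100, 9   (string comparison)
say sort(+*, @list).join(', ');   # 1, 9, 10, 100   (keys are numbers)
say sort(-*, @list).join(', ');   # 100, 10, 9, 1   (negated keys, so reversed)
```

In all three cases the result contains the original strings; +* and -* only produce the keys used for ordering.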

The unary sort is very convenient, so you might wonder why the Perl 5 folks haven't adopted it yet. The answer is that since the sort routine needs to find out whether the callback takes one or two arguments, it relies on subroutine (or block) signatures, something not (yet?) present in Perl 5. Moreover, the "smart" cmp operator, which compares numbers numerically and strings lexicographically, requires a type system, which Perl 5 doesn't have.
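
The signature information mentioned above is available to ordinary code as well. As a sketch, the .count method on a code object reports how many arguments it can take; the variable names here are made up, and this only shows the kind of introspection involved, not the exact check that sort performs:

```perl6
use v6;

my &comparator  = -> $a, $b { $a <=> $b };   # two explicit parameters
my &transformer = { .lc };                   # one implicit parameter, $_

say &comparator.count;    # 2
say &transformer.count;   # 1
```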

I strongly encourage you to try it out. But be warned: Once you get used to it, you'll miss it whenever you work in a language or with a library that lacks this feature.

5 Responses to “Day 23 – Unary Sort”

  1. Perl6: Unary Sort | Enjoying The Moment Says:

    […] via Hacker News http://perl6advent.wordpress.com/2013/12/23/day-23-unary-sort/ […]

  2. Ark-kun Says:

    In C# the .OrderBy extension method works the same. You supply the unary keySelector function and the method sorts using the extracted keys. But you can also supply a custom comparer for the keys. http://msdn.microsoft.com/ru-ru/library/system.linq.enumerable.orderby.aspx

    list.OrderBy(s => s.ToLower());

    Nice to see Perl catching up. I wish more languages had this feature.

  3. Randal L. Schwartz Says:

    Yeay! A built-in Schwartzian Transform!
