Day 18 – Sized, Typed, Shaped

Day 18 – Sized, Typed, Shaped

If you’ve used Perl 5, you probably know PDL (Perl Data Language), the library to manipulate highly efficient data structures.

In Perl 6, we do not (yet) have full-fledged support for PDL (or a PDL-like module), but we do have support for specific features: sized types, shaped variables (and typed arrays).

The first feature, sized types, are low-level types that are composed of a generic low-level type name, and the number of bytes they use. Such an example is int1, int64 (aka int on 64-bit machines), buf8 (a “normal” byte buffer). If you’re interesting in reading the spec, it’s all in S09.

The second feature, shaped variables (still in S09), allow us to specify the fixed size of an array:

my @dwarves[7];

This means the only available indices are 0 to 6 (inclusive). This is an example from the Perl 6’s REPL (Read-Eval-Print-Loop. The input lines start with “>”):

> my @dwarves[7];
[(Any) (Any) (Any) (Any) (Any) (Any) (Any)]
> @dwarves[0] = 'Sleepy';
Sleepy
> @dwarves[7];
Index 7 for dimension 1 out of range (must be 0..6)

The last line of input mentions “dimension 1”. Indeed, Perl 6’s shaped variables can be multi-dimensional (see S09):

> my @ints[4;2];
[[(Any) (Any)] [(Any) (Any)] [(Any) (Any)] [(Any) (Any)]]
> @ints[4;3]
Index 3 for dimension 2 out of range (must be 0..1)
> @ints[4;1]
Index 4 for dimension 1 out of range (must be 0..3)
> @ints[0;0] = 1;
1
> @ints
[[1 (Any)] [(Any) (Any)] [(Any) (Any)] [(Any) (Any)]]

The first output shows us that Perl 6 filled our array with Any at first: this means the whole array is “empty”, and that it can store anything (since Any is a supertype of IntStr, etc).

We might not want that: our array should contain integers (Int). Instead, we can declare a typed array:

> my Int @a[8] = 0..7;
[0 1 2 3 4 5 6 7]
> @a[8] = 8;
Index 8 for dimension 1 out of range (must be 0..7)
> @a[0].WHAT.say # print the type
(Int)

This is all very useful in itself, but there’s a special property we can benefit from, by using everything presented in this post together: contiguous storage!

Natively-typed shaped arrays in Perl 6 are specced to take contiguous memory, meaning we get a very good memory layout, and use little memory

> my int @a[2;4] = 1..4, 5..8;
[[1 2 3 4] [5 6 7 8]]
> @a[0;0]++;
2
> @a.WHAT.say
(array[int])

Because we know how much memory each element takes, and how many elements we have, we can very easily calculate the array upfront, once and for all – or even calculate the size necessary to store the elements: we know that int, on our 64-bit computers (that’s int64), we need 2*4*8 bytes for our 2*4 array.

Wrapping up; that’s 64 bytes total (…almost), and we get to still write Perl 6! We can use the operations we know and love (even meta/hyper-operators) and al – they’re still arrays – with our cool performance boost. Amazing!

(note about the size: we need to add some overhead for the runtime; on MoarVM, that’s something like 16 bytes for GC header, 16 bytes for the dimensions list. That’s a grand total of 16+16+64=96 bytes. pretty cool!)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s