Looking for a common point
Greetings!
These days I am spending some of my time working on the foundations for (revealing a possible surprise) an LDAP (Lightweight Directory Access Protocol) implementation for Perl 6.
However, it is still too early to talk about that one, so I will keep a blanket of mystery over this topic for now, as we have another one: spacecraft!
And the common point between spacecraft and LDAP is this: the LDAP specification uses a notation called ASN.1, which allows one to define an abstract type using a specific textual syntax and, with the help of ASN.1 compilers, to generate a type definition for a particular programming language. What's more, such a compiler also generates an encoder and a decoder for values of this type, so your value can be serialized into data that can, for example, be sent over a network and parsed nicely on another computer.
This makes cross-platform types in an application easy. Encoders and decoders can be generated automagically not only for one specified encoding format, but for a whole range of binary (e.g. BER, PER and others) and textual (e.g. SOAP) encoding formats.
So, in order to get things done, I had to implement at least a subset of ASN.1 in Perl 6: not the full specification, which is big, but only the features used in the LDAP specification.
‘This sounds interesting, but where are our spacecraft!?’, you may ask. It turns out that a Rocket type is the first thing you see on the ASN.1 Playground website, which gives you free access to an ASN.1 compiler that can be used as a reference!
ASN.1 and restrictions
Here is the fancy code:
World-Schema DEFINITIONS AUTOMATIC TAGS ::=
BEGIN
  Rocket ::= SEQUENCE
  {
     name      UTF8String (SIZE(1..16)),
     message   UTF8String DEFAULT "Hello World",
     fuel      ENUMERATED {solid, liquid, gas},
     speed     CHOICE
     {
        mph    INTEGER,
        kmph   INTEGER
     }  OPTIONAL,
     payload   SEQUENCE OF UTF8String
  }
END
Let’s quickly look over this definition:
- Rocket is a SEQUENCE: a group of ordered values of some types, which can be seen as a heterogeneous list/array or a class.
- Fields name and message have the UTF8String type, which is, yes, one kind of string representation in ASN.1. Field name has a length restriction applied with (SIZE(1..16)), and message has a default value specified with DEFAULT "Hello World".
- Field fuel has the ENUMERATED type: it is merely an enumeration of labels to choose from.
- Field speed is a CHOICE, a special type describing a field whose value can be of any one of the types specified. Unlike ENUMERATED, its values are not just labels. The OPTIONAL keyword means, as you can guess, that this field may be omitted from a value.
- Field payload is a SEQUENCE again, but with an element type specified (SEQUENCE OF). It means that we can have as many UTF8String values here as needed.
Here we will apply two important restrictions:
- We will use the Basic Encoding Rules (BER): rules that specify how ASN.1 types are encoded into a specific sequence of bytes. As said above, there are different formats, but we will use this one.
The Basic Encoding Rules standard is based on a thing called “TLV encoding”: a value of a type is encoded as a sequence of bytes that represents the “Tag”, “Length” and “Value” of that value. Let’s look at them more closely… in reverse order!
“Value” is the part that contains a byte representation of the value. Every type has its own encoding scheme (INTEGER is encoded differently from UTF8String, for example).
“Length” is the number of bytes in the “Value” part. This allows us to handle incremental parsing (and the usual kind too!) nicely. It can also be marked as “unknown” (the indefinite form, in BER terms), which allows us to stream data whose length is not yet known, but we will leave this aside.
“Tag” is, simply put, a byte or a number of bytes by which we can determine what type we have at hand. Its exact value is determined by a number of tagging rules (a “tagging schema”) and, for better or worse, different schemas exist.
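To make the three parts concrete, here is a minimal Perl 6 sketch of TLV encoding for two of the types used in Rocket. The helper names (length-bytes, integer-bytes, tlv) are made up for this illustration and are not part of any module; the sketch handles only non-negative integers and assumes the universal tags 0x02 for INTEGER and 0x0C for UTF8String.

```raku
# Hypothetical helpers for illustration; not taken from any published module.

# "Length": one byte for lengths below 128 (short form); otherwise a byte
# 0x80 + N followed by the length itself in N big-endian bytes (long form).
sub length-bytes(Int $len) {
    return ($len,) if $len < 128;
    my @bytes;
    my $rest = $len;
    while $rest > 0 {
        @bytes.unshift($rest +& 0xFF);
        $rest = $rest +> 8;
    }
    return (0x80 + @bytes.elems, |@bytes);
}

# "Value" of a non-negative INTEGER: big-endian bytes, with a leading zero
# byte when the top bit would otherwise make the number look negative.
sub integer-bytes(Int $n where * >= 0) {
    my @bytes = $n +& 0xFF;
    my $rest  = $n +> 8;
    while $rest > 0 {
        @bytes.unshift($rest +& 0xFF);
        $rest = $rest +> 8;
    }
    @bytes.unshift(0) if @bytes[0] >= 0x80;
    return @bytes;
}

# "Tag" + "Length" + "Value" glued together into one Blob.
sub tlv(Int $tag, @value) {
    Blob.new($tag, |length-bytes(@value.elems), |@value)
}

my $speed = tlv(0x02, integer-bytes(18000));            # bytes: 02 02 46 50
my $name  = tlv(0x0C, 'Falcon'.encode('utf-8').list);   # bytes: 0C 06 46 61 6C 63 6F 6E
say $speed;
say $name;
```

Because the length comes before the value, a reader of the byte stream always knows how many bytes to consume before the next tag starts.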
And, if you have been waiting for the second restriction for a few paragraphs already, here it is:
- We will use the IMPLICIT tagging schema here. As you can guess, an EXPLICIT tagging schema exists too, along with AUTOMATIC (which is used in the Rocket example above).
Considering this, we need to change the ASN.1 type above into this:
World-Schema DEFINITIONS IMPLICIT TAGS ::=
BEGIN
  Rocket ::= SEQUENCE
  {
     name      UTF8String (SIZE(1..16)),
     message   UTF8String DEFAULT "Hello World",
     fuel      ENUMERATED {solid, liquid, gas},
     speed     CHOICE
     {
        mph    [0] INTEGER,
        kmph   [1] INTEGER
     }  OPTIONAL,
     payload   SEQUENCE OF UTF8String
  }
END
Note that IMPLICIT TAGS is used instead of AUTOMATIC TAGS, and note the [$n]-like strings in the speed field.
Without those [$n] markers, the schema would actually be ambiguous, because mph and kmph both have the INTEGER type. So if we have read an INTEGER from a byte stream, was it an mph value or a kmph value? It makes a huge difference if we are talking about spacecraft!
To avoid this confusion, special tags are used, and here we specify which ones we want because, unlike the AUTOMATIC schema, IMPLICIT does not do it for us.
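As a rough illustration of what those tags buy us, the sketch below encodes the same number as either alternative and reads the unit back from the first byte. The names context-tag, encode-speed and decode-unit are invented for this example; it assumes tag numbers below 31 and values shorter than 128 bytes, so both the tag and the length fit into one byte each.

```raku
# Tag numbers taken from the schema: mph [0], kmph [1].
my %choice-tags = mph => 0, kmph => 1;

# A context-specific, primitive tag byte: class bits 10 (0x80) plus the
# tag number in the low five bits (this short form covers numbers 0..30).
sub context-tag(Int $number where 0..30) { 0x80 +| $number }

# With IMPLICIT tagging the context tag replaces INTEGER's own tag (0x02).
# Assumption: the value is shorter than 128 bytes, so length is one byte.
sub encode-speed(Str $unit, @integer-bytes) {
    Blob.new(context-tag(%choice-tags{$unit}), @integer-bytes.elems, |@integer-bytes)
}

# Reading the tag back tells the decoder which alternative was sent.
sub decode-unit(Blob $bytes) {
    my $number = $bytes[0] +& 0x1F;
    %choice-tags.pairs.first(*.value == $number).key
}

my @value = 0x46, 0x50;                    # 18000 as INTEGER content bytes
my $mph  = encode-speed('mph',  @value);   # bytes: 80 02 46 50
my $kmph = encode-speed('kmph', @value);   # bytes: 81 02 46 50
say decode-unit($mph);
say decode-unit($kmph);
```

The two encodings differ only in the first byte, which is exactly the information an untagged INTEGER (always 0x02) could not carry.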
Gradual building. Question answering.
So, what can we do with all that in Perl 6? Compilers may be fun, but compiling into Perl 6, in an extensible manner, with fancy features included? There has to be something simpler to play with.
Let’s say we have a script that works with spacecraft. Of course, we will need a type to represent them, specifically a class; let’s call it Rocket:
class Rocket {}
Of course, we want to know some data about it:
class Rocket {
    has $.name;
    has $.message is default("Hello World");
    has $.fuel;
    has $.speed;
    has @.payload;
}
To make our Rocket definition clearer about what is what, let’s specify some types:
enum Fuel <Solid Liquid Gas>;
class Rocket {
    has Str $.name;
    has Str $.message is default("Hello World");
    has Fuel $.fuel;
    has $.speed;
    has Str @.payload;
}
Now it starts to remind us of something…
- Str is similar to UTF8String, except we cannot leave it like that, because in ASN.1 we have not only UTF8String, but also BIT STRING, OCTET STRING and other string types.
- The Fuel enum is similar to the ENUMERATED type.
- The sigil @ of @.payload tells us it is going to be a sequence, and Str specifies the type of its elements.
But while there are some similarities, there is not enough data for us from the ASN.1 point of view. Let’s fill in those gaps step by step!
How do we know that Rocket is an ASN.1 sequence type at all?
By applying a role: class Rocket does ASNSequence.
How do we know the exact order of fields?
By implementing a stubbed method from this role: method ASN-order { <$!name $!message $!fuel $!speed @!payload> }.
How do we know that $.speed is optional?
Let’s just apply a trait to it! Traits allow us to execute custom code on parts of the program and, particularly, on Attributes. For example, an imaginary API could look like this: has $.speed is optional.
How do we know what $.speed is?
As the CHOICE type is “special”, but still a first-class one (e.g. you can make it recursive), we need a role here: ASNChoice comes to the rescue.
How do we know which type of ASN.1 string our Str type is?
Let’s just write has Str $.name is UTF8String;.
How do we specify the default value of a field?
While Perl 6 already has a built-in is default trait, the bad thing for us is that we cannot “nicely” detect it. So we have to introduce yet another custom trait that serves our purposes and applies the built-in trait too: has Str $.message is default-value("Hello World");
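For the curious, custom traits such as is optional and is default-value can be sketched as trait_mod:<is> candidates that mix a marker role into the Attribute object. This is only one possible approach, not necessarily what the final module does: the role names Optional and WithDefault, the asn-default accessor and the Probe class are invented here, and the sketch only records the metadata (it does not apply the built-in is default trait as well).

```raku
# Marker roles mixed into the Attribute object itself (hypothetical names).
role Optional    { }
role WithDefault { has $.asn-default is rw }

# Called at compile time for: has ... is optional
multi sub trait_mod:<is>(Attribute $attr, :$optional!) {
    $attr does Optional;
}

# Called at compile time for: has ... is default-value(...)
multi sub trait_mod:<is>(Attribute $attr, :$default-value!) {
    $attr does WithDefault;
    $attr.asn-default = $default-value;
}

# A throwaway class, only here to exercise the two traits.
class Probe {
    has Str $.message is default-value("Hello World");
    has     $.speed   is optional;
}

# Later, an encoder can ask each attribute what it was marked with.
for Probe.^attributes -> $attr {
    say $attr.name, ' is optional'                     if $attr ~~ Optional;
    say $attr.name, ' defaults to ', $attr.asn-default if $attr ~~ WithDefault;
}
```

Keeping the metadata on the Attribute means it can be discovered through ordinary introspection, which is what makes the default value “nicely” detectable.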
Let’s answer all those questions in a single pack:
role ASNSequence { #`[ Elves Special Magic Truly Happens Here ] }
role ASNChoice { #`[ And even here ] }
class SpeedChoice does ASNChoice {
    method ASN-choice() {
        # Description of: names, tags, types specified by this CHOICE
        { mph => (0 => Int), kmph => (1 => Int) }
    }
}
class Rocket does ASNSequence {
    has Str $.name is UTF8String;
    has Str $.message is default-value("Hello World") is UTF8String;
    has Fuel $.fuel;
    has SpeedChoice $.speed is optional;
    has Str @.payload is UTF8String;
    method ASN-order { <$!name $!message $!fuel $!speed @!payload> }
}
And a value might look something like:
my $rocket = Rocket.new(
    name => 'Falcon',
    fuel => Solid,
    speed => SpeedChoice.new((mph => 18000)),
    payload => [ "Car", "GPS" ]);
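To give a feeling for why ASN-order returns attribute names, here is a small sketch of how a role could use those names to visit the fields of an object in the declared order. The ASN-fields method and the cut-down Probe class are assumptions made for this illustration; the real role may work quite differently.

```raku
role ASNSequence {
    method ASN-order { ... }    # stub: the consuming class must supply this

    # Hypothetical helper: pair each attribute name with its current value,
    # in the order the class declared through ASN-order.
    method ASN-fields {
        self.ASN-order.map: -> $name {
            my $attr = self.^attributes.first(*.name eq $name);
            $name => $attr.get_value(self)
        }
    }
}

# A cut-down stand-in for Rocket with just two fields.
class Probe does ASNSequence {
    has Str $.name;
    has Str @.payload;
    method ASN-order { <$!name @!payload> }
}

my $probe = Probe.new(name => 'Falcon', payload => ['Car', 'GPS']);
.say for $probe.ASN-fields;
```

An encoder built on top of this would turn each visited value into its own tag, length and value bytes and concatenate the results.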
The more answers, the more questions
For this tiny example (which, on the other hand, demonstrates a number of ASN.1 features) this is practically all we need in order to use instances of this class in our application, encoding and decoding them all we want.
So what do the elves secretly do with our data? Let’s find out in the next post!