Day 15 – A Simple Web Spider With Promises

Promises, Promises

Last summer, I applied for a programming job and the interviewer asked me to write a program that would crawl a given domain, only following links in that domain, and find all pages that it referenced. I was allowed to write the program in any language, but I chose to perform the task in the Go language because that is the primary language that this company uses. This is an ideal task for concurrent programming, and Go has very good modern, if somewhat low-level concurrency support. The main work in a web spider, which will be performed as many times as there are unique anchor links discovered in the domain, is to do an HTTP GET on each page and parse the page text for new links. This task may be safely done in parallel because there is no likelihood (unless you do it very badly) that any invocation of crawling code will interfere with any other invocation of it.

The creators of Go and Perl 6 were inspired by Sir Anthony Hoare’s seminal 1978 work “Communicating Sequential Processes”, though it is notable that Perl 6 code tends to be more concise and therefore easier to tuck into a blog post. Indeed, the Go designers invariably refer to their constructs as “concurrency primitives”. The concurrent spider code in Go that I wrote for my job application came in at about 200 lines, vs rather less than half that size in Perl 6.

So let’s look at how a simple web crawler may be implemented in Perl 6. The built-in Promise class allows you to start, schedule and examine the results of asynchronous computations. All you need to do is pass a block of code to the Promise.start method, then call await, which blocks until the promise has finished executing. You may then check the status method to find out whether the promise has been Kept or Broken, and read its value from the result method.
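
To make the start, await and status cycle concrete, here is a minimal sketch that is independent of the spider. The two code blocks and the error message are made up for illustration; one of them dies on purpose so that both outcomes show up.

```raku
# Two illustrative computations; the second one fails on purpose.
my $ok  = Promise.start({ 21 * 2 });
my $bad = Promise.start({ die "no such page" });

# allof is kept once every promise has finished, whether Kept or Broken,
# so awaiting it does not rethrow the failure from $bad.
await Promise.allof($ok, $bad);

say $ok.status;          # Kept
say $ok.result;          # 42
say $bad.status;         # Broken
say $bad.cause.message;  # no such page
```

Awaiting a Broken promise directly would rethrow its exception, which is why waiting on Promise.allof and then checking each status is a convenient pattern when some of the work is expected to fail.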

You can run the code in this posting by saving it to a local file, e.g. web-spider.p6. Use zef to install HTML::Parser::XML and HTTP::UserAgent as well as IO::Socket::SSL if you wish to crawl https sites. I will warn you that SSL support seems a little ropey at present so it is best to stick to http sites. The MAIN sub in a Perl 6 program, when present, indicates a stand-alone program and this is where execution will start. The arguments to MAIN represent command line parameters. I wrote this program so that it will spider the Perlmonks site by default, but you can override that as follows:

$ perl6 web-spider.p6 [--domain=http://example.com]

Simple Perl 6 Domain Spider

use HTML::Parser::XML;
use XML::Document;
use HTTP::UserAgent;

sub MAIN(:$domain="http://www.perlmonks.org") {

    my $ua =  HTTP::UserAgent.new;
    my %url_seen;
    my @urls=($domain);

    loop {
        my @promises;
        while ( @urls ) {
            my $url = @urls.shift;
            my $p = Promise.start({crawl($ua, $domain, $url)});
            @promises.push($p);
        }
        await Promise.allof(@promises);
        for @promises.kv -> $index, $p {
            if $p.status ~~ Kept {
                my @results =  $p.result;
                for @results {
                    unless %url_seen{$_} {
                        @urls.push($_);
                        %url_seen{$_}++;
                    }
                }
            }
        }
        # Terminate if no more URLs to crawl
        if @urls.elems == 0 {
            last;
        }
    }
    say %url_seen.keys;
}

# Get page and identify urls linked to in it. Return urls.
sub crawl($ua, $domain, $url) {
    my $page = $ua.get($url);
    my $p = HTML::Parser::XML.new;
    my XML::Document $doc = $p.parse($page.content);
    # URLs to crawl
    my %todo;
    my @anchors = $doc.elements(:TAG<a>, :RECURSE);
    for @anchors -> $anchor {
        next unless $anchor.defined;
        my $href =  $anchor.attribs<href>;
        # Skip anchors that have no href attribute (e.g. named anchors)
        next unless $href.defined;

        # Convert relative to absolute urls
        if $href.starts-with('/') or $href.starts-with('?') {
            $href = $domain ~ $href;
        }

        # Get unique urls from page
        if $href.starts-with($domain) {
              %todo{$href}++;
        }
    }
    my @urls = %todo.keys;

    return @urls;
}
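
The link-filtering rules inside crawl can be tried without touching the network. The sketch below pulls them into a helper; in-domain-url and the example.com URLs are hypothetical names used only for illustration, and it assumes, as the spider does, that relative links start with / or ?.

```raku
# Hypothetical helper: return an absolute in-domain URL, or Nil to skip the link.
sub in-domain-url(Str $domain, $href) {
    # An anchor without an href attribute yields an undefined value.
    return Nil unless $href.defined;

    # Convert relative links to absolute ones, as crawl does.
    my $url = $href.starts-with('/') || $href.starts-with('?')
        ?? $domain ~ $href
        !! $href;

    # Keep only links that stay inside the domain.
    return $url.starts-with($domain) ?? $url !! Nil;
}

my @hrefs = '/about', '?node_id=3', 'http://other.org/', '/about', Any;
my @urls  = @hrefs.map({ in-domain-url('http://example.com', $_) })
                  .grep(*.defined)
                  .unique;
say @urls;   # [http://example.com/about http://example.com?node_id=3]
```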

In Conclusion

Concurrent programming will always have many pitfalls, from race conditions to resource starvation and deadlocks, but I think it’s clear that Perl 6 has gone quite some way towards making this form of programming much more accessible to everyone.

Day 14 – The Little Match Girl: Building and Testing Big Grammars in Perl 6

Perl 6 Grammars are great, but what is it like working with them in a project? Here is a bittersweet story of my experience before Christmas, and after Christmas. You can find the repository here. I do not come from a computer science background, so perhaps it will seem humble, but here are my pitfalls and triumphs as I learned Perl 6 Grammars.

The First Match

Like the Little Match Girl, our story takes place before Christmas. The Little Match Girl was tasked with selling a bundle of matchsticks on Christmas Eve (New Year’s Eve, actually; I did go back and read the story, but Christmas just fits better with Perl 6), while I was tasked with extracting annotations from Modelica models to render as vector graphics. Now, Modelica is a wonderful object-oriented modeling language, and I am going to completely gloss over it, except to mention that it has a very nice specification document (pdf) that contains a Concrete Syntax section in the Appendix. Perusing this section, I realized that the “syntactic meta symbols” and “lexical units” looked suspiciously like the Perl 6 Grammars that I had recently read a blog post about and had been anxious to try out.

Example from Modelica Concrete Syntax:

class-definition :
[ encapsulated ] class-prefixes
class-specifier

Example of Perl 6 rule:

rule class_definition {
  [<|w>'encapsulated'<|w>]? 
  <class_prefixes>
  <class_specifier>
}

It was like the Little Match Girl striking the first match, and seeing for the first time a wonderful world beyond her stark reality. A warm little stove. And then it went out.

It was so close that I plopped it into a text editor, and replaced the not Perl 6 bits with some Perl 6 bits to see if it would run. It didn’t run. I hacked away at it, I pointed TOP at different bits to tackle smaller chunks. There were whitespace symbols everywhere, regexes, tokens, rules. I was able to parse some parts, others mysteriously didn’t work. Looking back, it must have been awful. In the meantime, we hacked together a traditional regular expression to extract the annotations, and I placed my Grammar on the shelf.

The Second Match

Not long after, the Grammar::Profiler and Grammar::Debugger were published, and I was inspired to give it another go. I was granted great insights into where my rules were behaving unexpectedly. I was able to drill down through the grammar deeper than before. The second match had been lit, and I was presented with a feast. And then it went out.

In the debugger, I dove into an abyss of backtracking. The profiler ran forever as it dove down into the morass, again and again. I was able to get much farther, but eventually ran into a wall. Success seemed so close, but there were too many missing pieces, both in my own experience and in the documentation, for me to get past the wall.

The Third Match

Time passed, and Christmas came. I had a new position, with time for personal projects. I had the ever improving Grammar documentation to guide me. I had read the book Working Effectively With Legacy Code. It was enough to warrant charging the hill once more.

Object orientation

This was the biggest breakthrough for me. When I understood from the documentation that tokens, rules and regexes are just funny-looking methods, I suddenly had all of the pieces. When I got home, I immediately checked if I could override TOP, and I checked if I could put the Grammar methods into a role. Both worked delightfully, and I was in business. Rather than having one monolithic, all-or-nothing Grammar, I could break it up into chunks. This greatly improved organization and testability of the code.

One particularly bodacious thing was that I was able to neatly split the grammar up into roles corresponding to those found in the Modelica specification.

lib
----Grammar
--------Modelica
------------LexicalConventions.pm6
------------ClassDefinition.pm6
------------Extends.pm6
------------ComponentClause.pm6
------------Modification.pm6
------------Equations.pm6
------------Expressions.pm6
--------Modelica.pm6
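
As a rough sketch of what splitting a grammar into roles can look like, here is a toy version. The role names, the Tiny grammar and its rules are invented for illustration and are far simpler than the real Modelica ones.

```raku
# Each role holds one group of rules, much like one file per spec chapter.
role Lexical {
    token IDENT { <[A..Za-z_]> \w* }
}

role Equations {
    rule equation { <IDENT> '=' <IDENT> }
}

# The grammar itself only composes the roles and supplies TOP.
grammar Tiny does Lexical does Equations {
    rule TOP { ^ <equation> ';' $ }
}

say so Tiny.parse('x = y;');   # True
say so Tiny.parse('x = ;');    # False
```

Because each role is an ordinary unit of composition, a test grammar can pull in only the roles it needs.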

Unit testing: one layer at a time

Object orientation opened up a sensible scheme of unit testing, and saved me from the nonsense of ad hoc testing by passing bits of Modelica into the Grammar. You can inherit and override grammars as you would any other class. This allows you to test each rule or token separately, splitting your grammar up into bite-sized layers. You just override TOP with the rule or token to test, and override any dependencies with placeholder methods.

Definition of expression from Expressions.pm6:

rule expression {
  [
  <|w>'if'<|w> <expression> <|w>'then'<|w> <expression> [
  <|w>'elseif'<|w> <expression> <|w>'then'<|w> <expression>
  ]*
  <|w>'else'<|w> <expression>
  ]
  ||
  <simple_expression>
}

Here we see that expression depends on itself and simple_expression. In order to test, we replace the usual simple_expression rule with a placeholder. In this case it just matches the string 'simple_expression'.

Overridden test Grammar from Expressions.t:

grammar TestExpression is Grammar::Modelica {
rule TOP {^ <expression> $}
rule simple_expression { 'simple_expression' }
}
ok TestExpression.parse('simple_expression');
...

Regression testing is also much more pleasant when you can isolate the problematic portion of code, and create an overridden Grammar that targets it specifically.

<|w> is your friend

In my first efforts, trying to get things like Modelica reserved words working properly was one of the “banes of my existence”. That changed after I found the word boundary matching token <|w>. When I slap one on each side, it works, whether next to white space or a punctuation character.

From ComponentClause.pm6:

rule type_prefix {
  [<|w>[ 'flow' || 'stream' ]<|w>]?
  [<|w>[ 'discrete' || 'parameter' || 'constant' ]<|w>]?
  [<|w>[ 'input' || 'output' ]<|w>]?
}
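
A few one-liners show what the word boundary buys you. The strings are invented for illustration; the keyword is matched once without and once with <|w> on each side.

```raku
# Without a boundary, 'end' also matches at the start of a longer word.
say so 'endpoint' ~~ / ^ 'end' /;              # True

# With <|w> on both sides, the longer word is rejected ...
say so 'endpoint' ~~ / ^ <|w> 'end' <|w> /;    # False

# ... while punctuation or whitespace next to the keyword is fine.
say so 'end;'     ~~ / ^ <|w> 'end' <|w> /;    # True
say so 'x end y'  ~~ / <|w> 'end' <|w> /;      # True
```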

Token, rule and regex

There is good documentation for these now, but I, also, will briefly contribute a description of my experience. I found that rule and its :sigspace magic was the best choice most of the time. token was helpful where tight control of format was needed.

regex is for backtracking. For Modelica, I have found it to be unhelpful, likely because Modelica was designed to be a single-pass language. token and rule work in all the places where I thought I needed regex. All of my unit tests still passed after I removed the regexes, and the Grammar succeeded on four more Modelica Standard Library files. Only use regex when you need it.
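
The difference is easy to see on a toy pattern. The two grammars below are invented for illustration and differ only in the declarator used for TOP.

```raku
# The same pattern declared twice: once as a token, once as a regex.
grammar Ratcheting   { token TOP { \w+ 'bar' } }
grammar Backtracking { regex TOP { \w+ 'bar' } }

# token: \w+ swallows all of "foobar" and never gives anything back,
# so the literal 'bar' has nothing left to match.
say so Ratcheting.parse('foobar');     # False

# regex: \w+ backs off to "foo", which lets 'bar' match.
say so Backtracking.parse('foobar');   # True
```

The backtracking version succeeds here, but it is also the kind of pattern that can make a large grammar retry the same input again and again.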

End With the Beginning

Another bit that was frustrating to me was class definition syntax. Modelica uses the form some_identifier ... end some_identifier for its classes. How to ensure that the same identifier was used at the beginning and end was troublesome for me. Fortunately, Perl 6 allows you to use a capture inside the Grammar methods. The (<IDENT>) capture below populates $0, which can then be used to ensure that our long_class_specifier ends with the proper identifier.

rule long_class_specifier {
  [(<IDENT>) <string_comment> <composition> <|w>'end'<|w> $0 ]
  ||
  [<|w>'extends'<|w> (<IDENT>) <class_modification>? <string_comment> <composition> <|w>'end'<|w> $0 ]
}
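
Stripped of everything Modelica-specific, the backreference trick looks roughly like this. The Block grammar and its literal 'body' placeholder are invented for illustration.

```raku
grammar Block {
    # (\w+) populates $0; the trailing $0 must then match the same text again.
    token TOP { (\w+) \s+ 'body' \s+ 'end' \s+ $0 }
}

say so Block.parse('Pendulum body end Pendulum');  # True
say so Block.parse('Pendulum body end Spring');    # False
```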

Integration Testing: lighting all the matches at once

After my unit tests were all passing, I felt a little trepidation. Sure it can parse my contrived test cases, but how will it do with real Modelica? With trembling hand, I fed it some of Michael Tiller’s example code from his Modelica e-book. It worked! No fiddling around with subtle things that I overlooked, no funny parsing bugs or eternal backtracking. Just success.

Now, stars do occasionally align. Miracles do happen. Sufficiently clever unit tests can be remarkably good at preventing bugs. Still, I have been around the block enough times to want to verify. Recalling a presentation by Damian Conway, I decided to run it against the entire Modelica Standard Library. Not exactly all of CPAN, but 305 files is better than the mere two example models I had tried so far.

I wrote the script, pointed it at the Modelica directory, and fired it up. It churned through the library and wheezed to a stop. 150 failures. Now that is familiar territory. After several iterations, I am down to 66 failures when I run it on my parse_modelica_library branch. I just go through a file that is failing, isolate the code that is having issues, and write a regression test for it.

So, in the end the Little Match Girl lit all the rest of her bundle. Then, she died. Don’t die, but you can light all 305 matches with me, in parallel, with examples/parseThemAll.p6:

#!perl6

use v6;
use Test;
use lib '../lib';
use Grammar::Modelica;


plan 305;

sub light($file) {
  my $fh = open $file, :r;
  my $contents = $fh.slurp-rest;
  $fh.close;

  my $match = Grammar::Modelica.parse($contents);
  say $file;
  ok $match;
}

sub MAIN($modelica-dir) {
    say "directory: $modelica-dir";
    die "Can't find directory" if ! $modelica-dir.IO.d;

    # modified from the lovely docs at
    # https://docs.perl6.org/routine/dir
    my @stack = $modelica-dir.IO;
    my @files;
    while @stack {
      for @stack.pop.dir -> $path {
        # collect matching files so they can be parsed in parallel below
        @files.push: $path if $path.f && $path.extension.lc eq 'mo';
        @stack.push: $path if $path.d;
      }
    }
    # faster to do in parallel
    @files.race.map({light($_)});
}

I will see how many more I can persuade to pass before Christmas. Then perhaps I will figure out how to write some rules to build a QAST.

Merry Christmas!

Day 13 – Mining Wikipedia with Perl 6

Introduction

Hello, everyone!

Today, let me introduce how to mine Wikipedia infoboxes with Perl 6.

Wikipedia infoboxes play a very important role in Natural Language Processing, and many applications leverage them:

  • Building a Knowledge Base (e.g. DBpedia [0])
  • Ranking the importance of attributes [1]
  • Question Answering [2]

Among them, I’ll focus on the infobox extraction issues and demonstrate how to parse the sophisticated structures of the infoboxes with Grammar and Actions.

Are Grammar and Actions difficult to learn?

No, they aren’t!

You only need to know five things:

  • Grammar
    • token is the most basic one, and the one you will normally use. It does not backtrack.
    • rule is like token, but also makes whitespace in the pattern significant.
    • regex lets the match engine backtrack.
  • Actions
    • make attaches an object to the current match, to be returned when made is called on it.
    • made is called on a match and returns the object that make attached.

For more info, see: https://docs.perl6.org/language/grammars
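
Here is a minimal sketch of make and made working together. KV-Grammar, KV-Actions and the input string are invented for illustration.

```raku
grammar KV-Grammar {
    token TOP   { <key> '=' <value> }
    token key   { \w+ }
    token value { \d+ }
}

class KV-Actions {
    # Each method runs when the token of the same name has matched.
    method key($/)   { make ~$/ }    # attach the matched text
    method value($/) { make +$/ }    # attach the matched text as a number
    method TOP($/)   { make $<key>.made => $<value>.made }
}

my $pair = KV-Grammar.parse('year=2015', actions => KV-Actions).made;
say $pair;         # year => 2015
say $pair.value;   # 2015
```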

What is Infobox?

Have you ever heard the word “Infobox”?

For those who haven’t heard it, I’ll explain it briefly.

An easy way to understand Infobox is by using a real example:

[Image: the infobox of the Perl 6 article on Japanese Wikipedia]

As you can see, the infobox displays the attribute-value pairs of the page’s subject at the top-right side of the page. For example, in this one, it says the designer (ja: 設計者) of Perl 6 is Larry Wall (ja: ラリー・ウォール).

For more info, see: https://en.wikipedia.org/wiki/Help:Infobox

First Example: Perl 6

First of all, I’ll demonstrate the parsing techniques using Japanese Wikipedia rather than English Wikipedia.

The main reason is that parsing Japanese Wikipedia is my $dayjob :)

The second reason is that I want to show how easily Perl 6 can handle Unicode strings.

Then, let’s start parsing the infobox in the Perl 6 article!

The code of the article written in wiki markup is:

{{Comp-stub}}
{{Infobox プログラミング言語
|名前 = Perl 6
|ロゴ = [[Image:Camelia.svg|250px]]
|パラダイム = [[マルチパラダイムプログラミング言語|マルチパラダイム]]
|登場時期 = [[2015年]]12月25日
|設計者 = [[ラリー・ウォール]]
|最新リリース = Rakudo Star 2016.04
|型付け = [[動的型付け]], [[静的型付け]]
|処理系 = [[Rakudo]]
|影響を受けた言語 = [[Perl|Perl 5]], [[Smalltalk]], [[Haskell]], [[Ruby]]
|ライセンス = [[Artistic License 2]]
|ウェブサイト = [https://perl6.org/ Perl6.org]
}}
{{プログラミング言語}}
'''Perl 6'''(パールシックス)は、[[ラリー・ウォール]]により設計された[[オブジェクト指向]][[スクリプト言語]]である。
Perl 6は、[[2000年]]に[[Perl]]の次期メジャーバージョンとして設計が始められ、[[2015年]]12月25日に公式のPerl 6正式安定版がリリースされた。しかし、言語仕様は現在のPerl (Perl 5)と互換性がなく、既存のPerl 5のソフトウェアをPerl 6用に「アップグレード」するのは極めて困難である。したがって現在はPerl 5とPerl 6は別の言語であると考えられており、Perl 6はPerl 5の次期バージョンではないとされている。換言すれば、Perl 6はPerl 5から移行対象とはみなされていない。


There are three problematic portions of the code:

  1. There are superfluous elements after the infobox block, such as the template {{プログラミング言語}} and the lead sentence starting with '''Perl 6'''.
  2. We have to discriminate three types of tokens: anchor text (e.g. [[Rakudo]]), raw text (e.g. Rakudo Star 2016.04), weblink (e.g. [https://perl6.org/ Perl6.org]).
  3. The infobox doesn’t start at the top of the article. In this example, {{Comp-stub}} is at the top of the article.

OK, then I’ll show how to solve the above problems in the order of Grammar, Actions, Caller (i.e. the portion of the code that calls Grammar and Actions).

Grammar

The code for Grammar is:

grammar Infobox::Grammar {
    token TOP { <infobox> .+ } # (#1)
    token infobox { '{{Infobox' <.ws> <name> \n <propertylist> '}}' }
    token name { <-[\n]>+ }
    token propertylist {
        [
        | <property> \n
        | \n
        ]+
    }
    token property {
        '|' <key=.key-content> '=' <value=.value-content-list>
    }
    token key-content { <-[=\n]>+ }
    token value-content-list {
        <value-content>+
    }
    token value-content { # (#6)
        [
        | <anchortext>
        | <weblink>
        | <rawtext>
        | <delimiter>
        ]+
    }
    token anchortext { '[[' <-[\n]>+? ']]' } # (#2)
    token weblink { '[' <-[\n]>+? ']' } # (#3)
    token rawtext { <-[\|\[\]\n、\,\<\>\}\{]>+ } # (#4)
    token delimiter { [ '、' | ',' ] } # (#5)
}


  • Solution to problem 1:
    • Use .+ to match the superfluous portions after the infobox. (#1)
  • Solution to problem 2:
    • Prepare three types of tokens: anchortext (#2), weblink (#3), and rawtext (#4).
      • The tokens may be separated by a delimiter (e.g. ,), so also prepare the token delimiter. (#5)
    • Represent the token value-content as an arbitrary-length sequence of the four tokens (i.e. anchortext, weblink, rawtext, delimiter). (#6)
  • Solution to problem 3:
    • Nothing in particular to mention.

Actions

The code for Actions is:

class Infobox::Actions {
    method TOP($/) { make $<infobox>.made }
    method infobox($/) {
        make %( name => $<name>.made, propertylist => $<propertylist>.made )
    }
    method name($/) { make ~$/.trim }
    method propertylist($/) {
        make $<property>>>.made
    }
    method property($/) {
        make $<key>.made => $<value>.made
    }
    method key-content($/) { make $/.trim }
    method value-content-list($/) {
        make $<value-content>>>.made
    }
    method value-content($/) { # (#1)
        my $rawtext = $<rawtext>>>.made>>.trim.grep({ $_ ne "" });
        make %(
            anchortext => $<anchortext>>>.made,
            weblink => $<weblink>>>.made,
            rawtext => $rawtext.elems == 0 ?? $[] !! $rawtext.Array
        );
    }
    method anchortext($/) {
        make ~$/;
    }
    method weblink($/) {
        make ~$/;
    }
    method rawtext($/) {
        make ~$/;
    }
}


  • Solution to problem 2:
    • Make the value-content action return a hash with three keys: anchortext, weblink, and rawtext. (#1)
  • Solutions to problems 1 and 3:
    • Nothing in particular to mention.

Caller

The code for Caller is:

my @lines = $*IN.lines;
while @lines {
    my $chunk = @lines.join("\n"); # (#1)
    my $result = Infobox::Grammar.parse($chunk, actions => Infobox::Actions).made; # (#2)
    if $result<name>:exists {
        $result<name>.say;
        for @($result<propertylist>) -> (:$key, :value($content-list)) { # (#3)
            $key.say;
            for @($content-list) -> $content {
                $content.say;
            }
        }
    }
    shift @lines;
}


  • Solution to problem 3:
    • Read the article line by line and build a chunk that contains the lines from the current line to the last line. (#1)
    • Depending on what the parser finds:
      • If the chunk doesn’t start with the infobox, the parse fails and .made returns an undefined value. A $-sigiled variable is a convenient way to receive a value that may be undefined. (#2)
      • If the chunk starts with the infobox, .made returns a defined value. Use the @() contextualizer to iterate over the result. (#3)
  • Solutions to problems 1 and 2:
    • Nothing in particular to mention.
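
The chunk-shifting loop can be seen in isolation with a toy grammar. Marker, the BEGIN keyword and the sample lines are invented for illustration; the only point is that .parse succeeds once the chunk starts at the right line.

```raku
grammar Marker {
    # .parse is anchored at both ends, so the chunk has to start with BEGIN.
    token TOP { 'BEGIN' \n .* }
}

my @lines   = 'junk', 'more junk', 'BEGIN', 'payload';
my $skipped = 0;

while @lines {
    my $chunk = @lines.join("\n");
    if Marker.parse($chunk) {
        say "block starts after skipping $skipped line(s)";
        last;
    }
    # Drop the first line and retry with a shorter chunk.
    @lines.shift;
    $skipped++;
}
```

Re-joining and re-parsing on every iteration is simple rather than fast, which is usually acceptable for a single article.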

Running the Parser

Are you ready?
It’s time to run the 1st example!

$ perl6 parser.p6 < perl6.txt
プログラミング言語
名前
{anchortext => [], rawtext => [Perl 6], weblink => []}
ロゴ
{anchortext => [[[Image:Camelia.svg|250px]]], rawtext => [], weblink => []}
パラダイム
{anchortext => [[[マルチパラダイムプログラミング言語|マルチパラダイム]]], rawtext => [], weblink => []}
登場時期
{anchortext => [[[2015年]]], rawtext => [12月25日], weblink => []}
設計者
{anchortext => [[[ラリー・ウォール]]], rawtext => [], weblink => []}
最新リリース
{anchortext => [], rawtext => [Rakudo Star 2016.04], weblink => []}
型付け
{anchortext => [[[動的型付け]] [[静的型付け]]], rawtext => [], weblink => []}
処理系
{anchortext => [[[Rakudo]]], rawtext => [], weblink => []}
影響を受けた言語
{anchortext => [[[Perl|Perl 5]] [[Smalltalk]] [[Haskell]] [[Ruby]]], rawtext => [], weblink => []}
ライセンス
{anchortext => [[[Artistic License 2]]], rawtext => [], weblink => []}
ウェブサイト
{anchortext => [], rawtext => [], weblink => [[https://perl6.org/ Perl6.org]]}


The example we have seen may be too easy for you. Let’s take on a harder one!

Second Example: Albert Einstein

As the second example, let’s parse the infobox of Albert Einstein.

The code of the article written in wiki markup is:

{{Infobox Scientist
|name = アルベルト・アインシュタイン
|image = Einstein1921 by F Schmutzer 2.jpg
|caption = [[1921年]]、[[ウィーン]]での[[講義]]中
|birth_date = {{生年月日と年齢|1879|3|14|no}}
|birth_place = {{DEU1871}}<br>[[ヴュルテンベルク王国]][[ウルム]]
|death_date = {{死亡年月日と没年齢|1879|3|14|1955|4|18}}
|death_place = {{USA1912}}<br />[[ニュージャージー州]][[プリンストン (ニュージャージー州)|プリンストン]]
|residence = {{DEU}}<br />{{ITA}}<br>{{CHE}}<br />{{AUT}}(現在の[[チェコ]])<br />{{BEL}}<br />{{USA}}
|nationality = {{DEU1871}}、ヴュルテンベルク王国(1879-96)<br />[[無国籍]](1896-1901)<br />{{CHE}}(1901-55)<br />{{AUT1867}}(1911-12)<br />{{DEU1871}}、{{DEU1919}}(1914-33)<br />{{USA1912}}(1940-55)
| spouse = [[ミレヴァ・マリッチ]]&nbsp;(1903-1919)<br />{{nowrap|{{仮リンク|エルザ・アインシュタイン|en|Elsa Einstein|label=エルザ・レーベンタール}}&nbsp;(1919-1936)}}
| children = [[リーゼル・アインシュタイン|リーゼル]] (1902-1903?)<br />[[ハンス・アルベルト・アインシュタイン|ハンス・アルベルト]] (1904-1973)<br />[[エドゥアルト・アインシュタイン|エドゥアルト]] (1910-1965)
|field = [[物理学]]<br />[[哲学]]
|work_institution = {{Plainlist|
* [[スイス特許庁]] ([[ベルン]]) (1902-1909)
* {{仮リンク|ベルン大学|en|University of Bern}} (1908-1909)
* [[チューリッヒ大学]] (1909-1911)
* [[プラハ・カレル大学]] (1911-1912)
* [[チューリッヒ工科大学]] (1912-1914)
* [[プロイセン科学アカデミー]] (1914-1933)
* [[フンボルト大学ベルリン]] (1914-1917)
* {{仮リンク|カイザー・ヴィルヘルム協会|en|Kaiser Wilhelm Society|label=カイザー・ヴィルヘルム研究所}} (化学・物理学研究所長, 1917-1933)
* [[ドイツ物理学会]] (会長, 1916-1918)
* [[ライデン大学]] (客員, 1920-)
* [[プリンストン高等研究所]] (1933-1955)
* [[カリフォルニア工科大学]] (客員, 1931-33)
}}
|alma_mater = [[チューリッヒ工科大学]]<br />[[チューリッヒ大学]]
|doctoral_advisor = {{仮リンク|アルフレート・クライナー|en|Alfred Kleiner}}
|academic_advisors = {{仮リンク|ハインリヒ・フリードリヒ・ウェーバー|en|Heinrich Friedrich Weber}}
|doctoral_students =
|known_for = {{Plainlist|
*[[一般相対性理論]]
*[[特殊相対性理論]]
*[[光電効果]]
*[[ブラウン運動]]
*[[E=mc2|質量とエネルギーの等価性]](E=mc<sup>2</sup>)
*[[アインシュタイン方程式]]
*[[ボース分布関数]]
*[[宇宙定数]]
*[[ボース=アインシュタイン凝縮]]
*[[EPRパラドックス]]
*{{仮リンク|古典統一場論|en|Classical unified field theories}}
}}
| influenced = {{Plainlist|
* {{仮リンク|エルンスト・G・シュトラウス|en|Ernst G. Straus}}
* [[ネイサン・ローゼン]]
* [[レオ・シラード]]
}}
|prizes = {{Plainlist|
*{{仮リンク|バーナード・メダル|en|Barnard Medal for Meritorious Service to Science}}(1920)
*[[ノーベル物理学賞]](1921)
*[[マテウチ・メダル]](1921)
*[[コプリ・メダル]](1925)
*[[王立天文学会ゴールドメダル]](1926)
*[[マックス・プランク・メダル]](1929)
}}
|religion =
|signature = Albert Einstein signature 1934.svg
|footnotes =
}}
{{thumbnail:begin}}
{{thumbnail:ノーベル賞受賞者|1921年|ノーベル物理学賞|光電効果の法則の発見等}}
{{thumbnail:end}}
'''アルベルト・アインシュタイン'''<ref group="†">[[日本語]]における表記には、他に「アル{{Underline|バー}}ト・アインシュine|バー}}ト・アイン{{Underline|ス}}タイン」([[英語]]の発音由来)がある。</ref>({{lang-de-short|Albert Einstein}}<ref ɛrt ˈaɪnˌʃtaɪn}} '''ア'''ルベルト・'''ア'''インシュタイン、'''ア'''ルバート・'''ア'''インシュタイン</ref><ref group="†"taɪn}} '''ア'''ルバ(ー)ト・'''ア'''インスタイン、'''ア'''ルバ(ー)'''タ'''インスタイン</ref><ref>[http://dictionary.rein Einstein] (Dictionary.com)</ref><ref>[http://www.oxfordlearnersdictionaries.com/definition/english/albert-einstein?q=Albert+Einstein Albert Einstein] (Oxford Learner's Dictionaries)</ref>、[[1879年]][[3月14日]] – [[1955年]][[4月18日]])ツ]]生まれの[[理論物理学者]]である。


As you can see, there are five new problems here:

  1. Some of the templates
    1. contain newlines; and
    2. are nested (e.g. {{nowrap|{{仮リンク|...}}...}})
  2. Some of the attribute-value pairs are empty.
  3. Some of the value sides of the attribute-value pairs
    1. contain break tags; and
    2. consist of different types of tokens (e.g. anchortext and rawtext),
       so you need to add positional information to represent the dependency between tokens.

I’ll show how to solve the above problems in the order of Grammar, Actions.

The code of the Caller is the same as the previous one.

Grammar

The code for Grammar is:

grammar Infobox::Grammar {
    token TOP { <infobox> .+ }
    token infobox { '{{Infobox' <.ws> <name> \n <propertylist> '}}' }
    token name { <-[\n]>+ }
    token propertylist {
        [
        | <property> \n
        | \n
        ]+
    }
    token property {
        [
        | '|' <key=.key-content> '=' <value=.value-content-list>
        | '|' <key=.key-content> '=' # (#4)
        ]
    }
    token key-content { <-[=\n]>+ }
    token value-content-list {
        [
        | <value-content> <br> # (#6)
        | <value-content>
        | <br>
        ]+
    }
    token value-content-list-nl { # (#1)
        [
        | <value-content> <br> # (#7)
        | <value-content>
        | <br>
        ]+ % \n
    }
    token value-content {
        [
        | <anchortext>
        | <weblink>
        | <rawtext>
        | <template>
        | <delimiter>
        | <sup>
        ]+
    }
    token br { # (#5)
        [
        | '<br />'
        | '<br/>'
        | '<br>'
        ]
    }
    token template {
        [
        | '{{' <-[\n]>+? '}}'
        | '{{nowrap' '|' <value-content-list> '}}' # (#3)
        | '{{Plainlist' '|' \n <value-content-list-nl> \n '}}' # (#2)
        ]
    }
    token anchortext { '[[' <-[\n]>+? ']]' }
    token weblink { '[' <-[\n]>+? ']' }
    token rawtext { <-[\|\[\]\n、\,\<\>\}\{]>+ }
    token delimiter { [ '、' | ',' | '&nbsp;' ] }
    token sup { '<sup>' <-[\n]>+? '</sup>'}
}


  • Solutions to problem 1.1:
    • Create the token value-content-list-nl, which is the newline-separated version of the token value-content-list. The modified quantifier % is a handy way to represent this kind of sequence. (#1)
    • Create the token template, and in it define a sequence that represents the Plainlist template. (#2)
  • Solution to problem 1.2:
    • Let the token template call the token value-content-list. This triggers a recursive call and captures the nested structure, because the token value-content-list in turn contains the token template. (#3)
  • Solution to problem 2:
    • In the token property, define a sequence whose value side is empty (i.e. a sequence that ends with ‘=’). (#4)
  • Solutions to problem 3.1:
    • Create the token br. (#5)
    • Let the token br follow the token value-content in two tokens:
      • The token value-content-list (#6)
      • The token value-content-list-nl (#7)
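
Two of these techniques, the % quantifier and the recursive template, are easier to see on toy grammars. Plain-List, Nested and the sample strings are invented for illustration and are much simpler than the infobox grammar.

```raku
# % : one or more <item>s separated by newlines, with no trailing newline.
grammar Plain-List {
    token TOP  { <item>+ % \n }
    token item { \w+ }
}
my $list = Plain-List.parse("alpha\nbeta\ngamma");
say $list<item>.map(~*);              # (alpha beta gamma)

# Recursion: a template may contain plain text or further templates.
grammar Nested {
    token TOP      { <template> }
    token template { '{{' [ <template> | <text> ]+ '}}' }
    token text     { <-[\{\}]>+ }
}
my $tree = Nested.parse('{{nowrap|{{link|x}} y}}');
say $tree<template><template>[0].Str; # {{link|x}}
```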

Actions

The code for Actions is:

class Infobox::Actions {
    method TOP($/) { make $<infobox>.made }
    method infobox($/) {
        make %( name => $<name>.made, propertylist => $<propertylist>.made )
    }
    method name($/) { make $/.trim }
    method propertylist($/) {
        make $<property>>>.made
    }
    method property($/) {
        make $<key>.made => $<value>.made
    }
    method key-content($/) { make $/.trim }
    method value-content-list($/) {
        make $<value-content>>>.made
    }
    method value-content($/) {
        my $rawtext = $<rawtext>>>.made>>.trim.grep({ $_ ne "" });
        make %(
            anchortext => $<anchortext>>>.made,
            weblink => $<weblink>>>.made,
            rawtext => $rawtext.elems == 0 ?? $[] !! $rawtext.Array,
            template => $<template>>>.made
        );
    }
    method template($/) {
        make %(body => ~$/, from => $/.from, to => $/.to); # (#1)
    }
    method anchortext($/) {
        make %(body => ~$/, from => $/.from, to => $/.to); # (#2)
    }
    method weblink($/) {
        make %(body => ~$/, from => $/.from, to => $/.to); # (#3)
    }
    method rawtext($/) {
        make %(body => ~$/, from => $/.from, to => $/.to); # (#4)
    }
}


  • Solutions to the problem 3.2:
    • Use Match.from and Match.to to get the match starting position and the match ending position respectively when calling make. (#1 ~ #4)
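
A quick look at what from and to report, using an invented sample string. Both are character offsets into the matched text, and to points just past the last matched character.

```raku
my $text = 'ab [[link]] cd';
my $m    = $text ~~ / '[[' <-[\]]>+ ']]' /;

say ~$m;       # [[link]]
say $m.from;   # 3
say $m.to;     # 11

# The two offsets are enough to cut the match back out of the source text.
say $text.substr($m.from, $m.to - $m.from);   # [[link]]
```

Storing the offsets alongside each token makes it possible to restore the original order of tokens that end up in different lists.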

Running the Parser

It’s time to run!

$ perl6 parser.p6 < einstein.txt
Scientist
name
{anchortext => [], rawtext => [{body => アルベルト・アインシュタイン, from => 27, to => 42}], template => [], weblink => []}
image
{anchortext => [], rawtext => [{body => Einstein1921 by F Schmutzer 2.jpg, from => 51, to => 85}], template => [], weblink => []}
caption
{anchortext => [{body => [[1921年]], from => 97, to => 106} {body => [[ウィーン]], from => 107, to => 115} {body => [[講義]], from => 117, to => 123}], rawtext => [{body => , from => 96, to => 97} {body => での, from => 115, to => 117} {body => 中, from => 123, to => 124}], template => [], weblink => []}
birth_date
{anchortext => [], rawtext => [{body => , from => 138, to => 139}], template => [{body => {{生年月日と年齢|1879|3|14|no}}, from => 139, to => 163}], weblink => []}
birth_place
{anchortext => [], rawtext => [{body => , from => 178, to => 179}], template => [{body => {{DEU1871}}, from => 179, to => 190}], weblink => []}
{anchortext => [{body => [[ヴュルテンベルク王国]], from => 194, to => 208} {body => [[ウルム]], from => 208, to => 215}], rawtext => [], template => [], weblink => []}
death_date
{anchortext => [], rawtext => [{body => , from => 229, to => 230}], template => [{body => {{死亡年月日と没年齢|1879|3|14|1955|4|18}}, from => 230, to => 263}], weblink => []}
death_place
{anchortext => [], rawtext => [{body => , from => 278, to => 279}], template => [{body => {{USA1912}}, from => 279, to => 290}], weblink => []}
{anchortext => [{body => [[ニュージャージー州]], from => 296, to => 309} {body => [[プリンストン (ニュージャージー州)|プリンストン]], from => 309, to => 338}], rawtext => [], template => [], weblink => []}
residence
{anchortext => [], rawtext => [{body => , from => 351, to => 352}], template => [{body => {{DEU}}, from => 352, to => 359}], weblink => []}
{anchortext => [], rawtext => [], template => [{body => {{ITA}}, from => 365, to => 372}], weblink => []}
{anchortext => [], rawtext => [], template => [{body => {{CHE}}, from => 376, to => 383}], weblink => []}
{anchortext => [{body => [[チェコ]], from => 400, to => 407}], rawtext => [{body => (現在の, from => 396, to => 400} {body => ), from => 407, to => 408}], template => [{body => {{AUT}}, from => 389, to => 396}], weblink => []}
{anchortext => [], rawtext => [], template => [{body => {{BEL}}, from => 414, to => 421}], weblink => []}
{anchortext => [], rawtext => [], template => [{body => {{USA}}, from => 427, to => 434}], weblink => []}
nationality
{anchortext => [], rawtext => [{body => , from => 449, to => 450} {body => ヴュルテンベルク王国(1879-96), from => 462, to => 481}], template => [{body => {{DEU1871}}, from => 450, to => 461}], weblink => []}
{anchortext => [{body => [[無国籍]], from => 487, to => 494}], rawtext => [{body => (1896-1901), from => 494, to => 505}], template => [], weblink => []}
{anchortext => [], rawtext => [{body => (1901-55), from => 518, to => 527}], template => [{body => {{CHE}}, from => 511, to => 518}], weblink => []}
{anchortext => [], rawtext => [{body => (1911-12), from => 544, to => 553}], template => [{body => {{AUT1867}}, from => 533, to => 544}], weblink => []}
{anchortext => [], rawtext => [{body => (1914-33), from => 582, to => 591}], template => [{body => {{DEU1871}}, from => 559, to => 570} {body => {{DEU1919}}, from => 571, to => 582}], weblink => []}
{anchortext => [], rawtext => [{body => (1940-55), from => 608, to => 617}], template => [{body => {{USA1912}}, from => 597, to => 608}], weblink => []}
spouse
{anchortext => [{body => [[ミレヴァ・マリッチ]], from => 634, to => 647}], rawtext => [{body => , from => 633, to => 634} {body => (1903-1919), from => 653, to => 664}], template => [], weblink => []}
{anchortext => [], rawtext => [], template => [{body => {{nowrap|{{仮リンク|エルザ・アインシュタイン|en|Elsa Einstein|label=エルザ・レーベンタール}}&nbsp;(1919-1936)}}, from => 670, to => 754}], weblink => []}
children
{anchortext => [{body => [[リーゼル・アインシュタイン|リーゼル]], from => 771, to => 793}], rawtext => [{body => , from => 770, to => 771} {body => (1902-1903?), from => 793, to => 806}], template => [], weblink => []}
{anchortext => [{body => [[ハンス・アルベルト・アインシュタイン|ハンス・アルベルト]], from => 812, to => 844}], rawtext => [{body => (1904-1973), from => 844, to => 856}], template => [], weblink => []}
{anchortext => [{body => [[エドゥアルト・アインシュタイン|エドゥアルト]], from => 862, to => 888}], rawtext => [{body => (1910-1965), from => 888, to => 900}], template => [], weblink => []}
field
{anchortext => [{body => [[物理学]], from => 910, to => 917}], rawtext => [{body => , from => 909, to => 910}], template => [], weblink => []}
{anchortext => [{body => [[哲学]], from => 923, to => 929}], rawtext => [], template => [], weblink => []}
work_institution
{anchortext => [], rawtext => [{body => , from => 949, to => 950}], template => [{body => {{Plainlist|
* [[スイス特許庁]] ([[ベルン]]) (1902-1909)
* {{仮リンク|ベルン大学|en|University of Bern}} (1908-1909)
* [[チューリッヒ大学]] (1909-1911)
* [[プラハ・カレル大学]] (1911-1912)
* [[チューリッヒ工科大学]] (1912-1914)
* [[プロイセン科学アカデミー]] (1914-1933)
* [[フンボルト大学ベルリン]] (1914-1917)
* {{仮リンク|カイザー・ヴィルヘルム協会|en|Kaiser Wilhelm Society|label=カイザー・ヴィルヘルム研究所}} (化学・物理学研究所長, 1917-1933)
* [[ドイツ物理学会]] (会長, 1916-1918)
* [[ライデン大学]] (客員, 1920-)
* [[プリンストン高等研究所]] (1933-1955)
* [[カリフォルニア工科大学]] (客員, 1931-33)
}}, from => 950, to => 1409}], weblink => []}
alma_mater
{anchortext => [{body => [[チューリッヒ工科大学]], from => 1424, to => 1438}], rawtext => [{body => , from => 1423, to => 1424}], template => [], weblink => []}
{anchortext => [{body => [[チューリッヒ大学]], from => 1444, to => 1456}], rawtext => [], template => [], weblink => []}
doctoral_advisor
{anchortext => [], rawtext => [{body => , from => 1476, to => 1477}], template => [{body => {{仮リンク|アルフレート・クライナー|en|Alfred Kleiner}}, from => 1477, to => 1516}], weblink => []}
academic_advisors
{anchortext => [], rawtext => [{body => , from => 1537, to => 1538}], template => [{body => {{仮リンク|ハインリヒ・フリードリヒ・ウェーバー|en|Heinrich Friedrich Weber}}, from => 1538, to => 1593}], weblink => []}
doctoral_students
Nil
known_for
{anchortext => [], rawtext => [{body => , from => 1627, to => 1628}], template => [{body => {{Plainlist|
*[[一般相対性理論]]
*[[特殊相対性理論]]
*[[光電効果]]
*[[ブラウン運動]]
*[[E=mc2|質量とエネルギーの等価性]](E=mc<sup>2</sup>)
*[[アインシュタイン方程式]]
*[[ボース分布関数]]
*[[宇宙定数]]
*[[ボース=アインシュタイン凝縮]]
*[[EPRパラドックス]]
*{{仮リンク|古典統一場論|en|Classical unified field theories}}
}}, from => 1628, to => 1861}], weblink => []}
influenced
{anchortext => [], rawtext => [{body => , from => 1877, to => 1878}], template => [{body => {{Plainlist|
* {{仮リンク|エルンスト・G・シュトラウス|en|Ernst G. Straus}}
* [[ネイサン・ローゼン]]
* [[レオ・シラード]]
}}, from => 1878, to => 1968}], weblink => []}
prizes
{anchortext => [], rawtext => [{body => , from => 1978, to => 1979}], template => [{body => {{Plainlist|
*{{仮リンク|バーナード・メダル|en|Barnard Medal for Meritorious Service to Science}}(1920)
*[[ノーベル物理学賞]](1921)
*[[マテウチ・メダル]](1921)
*[[コプリ・メダル]](1925)
*[[王立天文学会ゴールドメダル]](1926)
*[[マックス・プランク・メダル]](1929)
}}, from => 1979, to => 2181}], weblink => []}
religion
Nil
signature
{anchortext => [], rawtext => [{body => Albert Einstein signature 1934.svg, from => 2206, to => 2241}], template => [], weblink => []}
footnotes
Nil


Conclusion

I have demonstrated some techniques for parsing infoboxes. I highly recommend writing your own parser if you get the chance to use Wikipedia as a resource for NLP. It will deepen your knowledge not only of Perl 6 but also of Wikipedia.

See you again!

Citations

[0] Lehmann, Jens, et al. “DBpedia–a large-scale, multilingual knowledge base extracted from Wikipedia.” Semantic Web 6.2 (2015): 167-195.

[1] Ali, Esraa, Annalina Caputo, and Séamus Lawless. “Entity Attribute Ranking Using Learning to Rank.”

[2] Morales, Alvaro, et al. “Learning to answer questions from wikipedia infoboxes.” Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016.

License

All of the materials from Wikipedia are licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.


Itsuki Toyota
A web developer in Japan.

Day 12 – The Year of Perl 6 Books

We can quibble all day if 2017 was the year of the Linux desktop, but there can be little doubt that it was the year of the Perl 6 book.

Perl 6 at a Glance

December 2016 brought us the ebook launch of Perl 6 at a Glance by Andrew Shitov, and then in 2017 the print version came out. It is an introduction to Perl 6 that targets programmers already familiar with another language.

It is the first of a generation of “modern” Perl 6 books. There weren’t many Perl 6 books before; the most notable was “Perl 6 and Parrot Essentials”, which was written when Perl 6 was very much a language in flux. The December 2015 release of the Perl 6 language version v6.c (and the accompanying Rakudo Perl 6 compiler) finally offers enough stability to make Perl 6 books work.

Think Perl 6

The next book released in 2017 was “Think Perl 6: How to Think Like a Computer Scientist”. It is a Perl 6 adaptation of Allen Downey’s great book Think Python: How to Think Like a Computer Scientist, lovingly ported to Perl 6 by Laurent Rosenfeld. It is available in print from O’Reilly, and freely available as an ebook from Green Tea Press. It is also available under an Open Source license in its source form (LaTeX) on GitHub.

“Think Perl 6” is an introduction to programming and computer science that happens to use Perl 6 as its primary tool. It targets the absolute beginner, and goes into a lot of detail on basic concepts such as branches, loops, variables, expressions, functions, recursion and so on.

Learning to Program with Perl 6

A book I haven’t had on my radar until it was available for purchase on Amazon was Learning to program with Perl 6: First Steps: Getting into programming without leaving the command line by JJ Merelo. You can buy it on Amazon pretty cheaply, or check it out on GitHub, where you can find a musical as bonus material.

It mostly targets beginners, and also discusses some things related to programming, like the use of GitHub, some shell features, and SSH. It is a light-hearted introduction into computing and Perl 6.

Perl 6 Fundamentals

Perl 6 Fundamentals started its life as “Perl 6 by Example”, written by Moritz Lenz, aka yours truly. (Yes, authors write about themselves in the third person. That “About the Author” section in each book? Written by the author. In third person. Weird). When Apress acquired the book, it was renamed to Perl 6 Fundamentals: A Primer with Examples, Projects, and Case Studies. It is available from everywhere that you can buy books. At least I hope so :-)

Each chapter focuses on one (at least somewhat) practical example, and uses that as an excuse to talk about various Perl 6 features, including concurrency, functional programming, grammars, and calling Python libraries through Inline::Python. (You can read the chapter about Inline::Python over at perltricks.com.) It targets programmers with previous experience, though not necessarily Perl 6 (or Perl 5) experience.

Larry Wall has kindly written a foreword for the book.

Perl 6 Deep Dive

Andrew Shitov’s second Perl 6 book, Perl 6 Deep Dive, is, as the name suggests, more comprehensive and a, well, deeper dive than “Perl 6 at a Glance”, though somewhat similar in style. With more than 350 pages, it seems to have the largest coverage of Perl 6 features of any book so far.

Using Perl 6

Guess who’s released a third Perl 6 book within one year? That’s right, Andrew Shitov again. Somebody give that man a medal! Using Perl 6 is a collection of 100 programming challenges/problems and their solutions in Perl 6. This is what Andrew wrote about it:

About a year ago, I decided to write a book about using Perl 6. Later, the plans changed and I published “Perl 6 at a Glance”, after which I wanted to write “Migrating to Perl 6” but instead wrote “Perl 6 Deep Dive” for Packt Publishing. Here and there, I was giving trainings on Perl 5, Python, and JavaScript, and was always suffering from finding a good list of easy tasks that a newcomer can use to train their skills in the language. Finally, I made up the list and solved it in Perl 6. This is the content of “Using Perl 6” — a book with solutions to 100 programming challenges, from simple to intermediate, with explanations of the Perl 6 constructions used.

Since his fourth book, Migrating to Perl 6, will be released in 2018, it doesn’t get its own section. Take that, Andy! :-) This is the “Perl 6 for Perl 5 programmers” book that people (Perl 5 people, mostly) have been asking for on IRC and some other media.

And of course I won’t mention Andrew’s kickstarter for a cookbook-style project, because that will further skew the stats. Ooops, I just did. Hmm. Well, go support the man!

Parsing with Perl 6 Regexes and Grammars

After writing a general Perl 6 book, I wanted to focus on a narrower topic. A non-representative poll on twitter confirmed my suspicion that regexes and grammars would be the best niche, and so Parsing with Perl 6 Regexes and Grammars: A Recursive Descent into Parsing was born.

It requires basic programming knowledge to read, but no prior exposure to regexes or Perl. It goes from the building blocks of regexes to fully-featured parsers, including Abstract Syntax Tree generation and error reporting. A discussion of three very different example parsers concludes the nearly 200 pages, which could have also been titled far more than you ever wanted to know about parsing with Perl 6.

Right now, the ebook version is available for purchase, and I hope that the print version will be ready by Christmas. (And I’m talking about Christmas 2017, to be sure :)

Books in the Pipeline

I’d be remiss if I didn’t point out two more books that aren’t available yet, but might be in the next months or years.

brian d foy is working on Learning Perl 6. In his last update, he shares that the first draft of the book is written, with a plan for things that need rewriting.

Gabor Szabo crowd-funded a book on Web Application Development in Perl 6 using the Bailador framework. The earlier chapters are mostly fleshed out, and the later chapters mostly exist as skeletons. Gabor expects it to be finished in 2018.

Keeping Track

The flood of Perl 6 books has made it hard for newcomers to decide which book to read, so I created the https://perl6book.com/ website that has one-line summaries, and a flow-chart for deciding which book to buy.

Even though I had input from other Perl 6 authors, it certainly reflects my biases. But, you can help to improve it!

Summary

With 7 Perl 6 books published by three major publishers in 2017, it’s been a fantastic year. I am also very happy with the diversity of the books, their target audience and styles. I hope you are too!

A final plea: If you have read any of these books, please give the author some feedback. They put incredible amounts of work into those, and feedback helps the author’s learning process and motivation. And if you liked a book, maybe even give it 5 stars on Amazon and write a line or two about why you liked it.

Day 11 — All the Stars of Perl 6, or * ** *

Today’s blog post in this year’s Perl 6 Advent Calendar is full of snowflakes. We’ll go through the constructs that employ the * character. In Perl 6, you may call it a star (or asterisk if you prefer) or a whatever, depending on the context.

Perl 6 is not a cryptic programming language; its syntax is in many aspects much more consistent than that of Perl 5. On the other side, some areas require spending time to start feeling confident in the syntax.

Let’s go through different usages of *, starting with the most simple, aiming to understand the most brain-breaking ones such as * ** *.

The first couple of usages are simple and do not require much comment:

1. Multiplication

A single star is used for multiplication. Strictly speaking, this is the infix operator infix:<*>, whose return value is Numeric.

say 20 * 18; # 360

2. Power

The double star ** is the exponentiation operator. Again, this is an infix operator, infix:<**>, which returns a Numeric result: the first value raised to the power of the second.

say pi ** e; # 22.4591577183611

* * *

The same two tokens, * or **, are also used in regexes, where they mean different things. One of the features of Perl 6 is that it can easily switch between different languages inside itself. Both regexes and grammars are examples of such inner languages, where the same symbols can mean different things from what they mean in Perl 6 itself (if they have any meaning at all there).

3. Zero or more repetitions

The * quantifier. This syntax item works similarly to its behaviour in Perl 5: it allows zero or more repetitions of the atom.

my $weather = '*****';
my $snow = $weather ~~ / ('*'*) /;
say 'Snow level is ' ~ $snow.chars; # Snow level is 5

Of course, we also see here another usage of the same character, the '*' literal.

4. Min to max repetitions

The double ** is a part of another quantifier that specifies the minimum and the maximum number of repetitions:

my $operator = '..';
say "'$operator' is a valid Perl 6 operator"
    if $operator ~~ /^ '.' ** 1..3 $/;

In this example, it is expected that the dot is repeated one, two, or three times; not less and not more.

Let’s look a bit ahead, and use a star in the role (role as in theatre, not in Perl 6’s object-oriented programming) of the Whatever symbol:

my $phrase = 'I love you......';
say 'You are so uncertain...'
    if $phrase ~~ / '.' ** 4..* /;

The upper end of the range is open, so the regex accepts any phrase that contains four or more dots in a row.
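As a quick check of the two quantifier forms, here is a minimal sketch. The helper name dots-ok and the second sample phrase are made up for illustration; they are not part of the original examples.

```raku
# Hypothetical helper: True if the whole string is one, two, or three dots.
sub dots-ok(Str $s) { so $s ~~ /^ '.' ** 1..3 $/ }

say dots-ok('..');      # True
say dots-ok('....');    # False

# Open-ended range: four or more dots in a row, anywhere in the string.
say so 'I love you......' ~~ / '.' ** 4..* /;   # True
say so 'I love you...'    ~~ / '.' ** 4..* /;   # False
```

Note that the second regex is not anchored, so it only asks whether such a run of dots occurs somewhere in the phrase.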

5. Slurpy arguments

A star before an array argument in a sub’s signature means a slurpy argument—the one that consumes separate scalar parameters into a single array.

list-gifts('chocolade', 'ipad', 'camelia', 'perl6');

sub list-gifts(*@items) {
    say 'Look at my gifts this year:';
    .say for @items;
}

Hashes also allow celebrating slurpy parameters:

dump(alpha => 'a', beta => 'b'); # Prints:
                                 # alpha = a
                                 # beta = b

sub dump(*%data) {
    for %data.kv {say "$^a = $^b"}
}

Notice that unlike Perl 5, the code does not compile if you omit the star in the function signature, as Perl 6 expects exactly what is announced:

Too few positionals passed; expected 1 argument but got 0

6. Slurpy-slurpy

The **@ form also works, but notice the difference when you pass arrays or lists.

With a single star:

my @a = < chocolade ipad >;
my @b = < camelia perl6 >;

all-together(@a, @b);
all-together(['chocolade', 'ipad'], ['camelia', 'perl6']);
all-together(< chocolade ipad >, < camelia perl6 >);

sub all-together(*@items) {
    .say for @items;
}

Here, each gift is printed on a separate line regardless of the way the argument list was passed.

With a double star:

keep-grouped(@a, @b);
keep-grouped(['chocolade', 'ipad'], ['camelia', 'perl6']);
keep-grouped(< chocolade ipad >, < camelia perl6 >);

sub keep-grouped(**@items) {
    .say for @items;
}

This time, the @items array gets two elements only, reflecting the structural types of the arguments:

[chocolade ipad]
[camelia perl6]

or

(chocolade ipad)
(camelia perl6)
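One way to see the difference at a glance is to count the elements each signature receives. This is only a sketch; the sub names flat-count and group-count are invented for the comparison.

```raku
sub flat-count(*@items)   { @items.elems }   # single star flattens the arguments
sub group-count(**@items) { @items.elems }   # double star keeps each argument whole

my @a = < chocolade ipad >;
my @b = < camelia perl6 >;

say flat-count(@a, @b);    # 4
say group-count(@a, @b);   # 2
```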

7. Dynamic scope

The * twigil introduces dynamic scope. It is easy to confuse dynamic variables with global variables, but examine the following code.

sub happy-new-year() {
    "Happy new $*year year!"
}

my $*year = 2018;
say happy-new-year();

If you omit the star, the code cannot be run:

Variable '$year' is not declared

The only way to make it correct is to move the definition of $year above the function definition. With the dynamic variable $*year, the place where the function is called defines the result. The $*year variable is not visible in the outer scope of the sub, but it is quite visible in the dynamic scope.

For the dynamic variable, it is not important whether you assign a new value to an existing variable or create a new one:

sub happy-new-year() {
    "Happy new $*year year!"
}

my $*year = 2018;
say happy-new-year();

{
    $*year = 2019;        # New value
    say happy-new-year(); # 2019
}

{
    my $*year = 2020;     # New variable
    say happy-new-year(); # 2020
}
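To underline that the caller decides what the function sees, here is a small sketch in which a second sub declares its own $*year. The sub names greeting and next-year-greeting are hypothetical.

```raku
sub greeting() { "Happy new $*year year!" }

my $*year = 2018;

sub next-year-greeting() {
    my $*year = 2019;   # visible to everything called from here
    greeting();
}

say next-year-greeting();   # Happy new 2019 year!
say greeting();             # Happy new 2018 year!
```

The inner declaration does not leak: once next-year-greeting returns, the outer $*year is in effect again.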

8. Compiler variables

A number of dynamic pseudo-constants come with Perl 6, for example:

say $*PERL;      # Perl 6 (6.c)
say @*ARGS;      # Prints command-line arguments
say %*ENV<HOME>; # Prints home directory

9. All methods

The .* postfix pseudo-operator calls all the methods with the given name that can be found for the given object, and returns a list of results. In the trivial case you get scholastically absurd code:

6.*perl.*say; # (6 Int.new)

The code with stars is a bit different from doing it simply and clearly:

pi.perl.say; # 3.14159265358979e0 (notice the scientific
             # format, unlike pi.say)

The real power of the .* postfix comes with inheritance. It helps to reveal the truth sometimes:

class Present {
    method giver() {
        'parents'
    }
}

class ChristmasPresent is Present {
    method giver() {
        'Santa Claus'
    }
}

my ChristmasPresent $present;

$present.giver.say;             # Santa Claus
$present.*giver.join(', ').say; # Santa Claus, parents

Just a star but what a difference!

* * *

Now, to the most mysterious part of the star corner of Perl 6. The next two concepts, the Whatever and the WhateverCode classes, are easy to mix up with each other. Let’s try to do it right.

10. Whatever

A single * can represent Whatever. Whatever in Perl 6 is a predefined class, which introduces some prescribed behaviour in a few useful cases.

For example, in ranges and sequences, the final * means infinity. We’ve seen an example today already. Here is another one:

.say for 1 .. *;

This one-liner has a really high energy conversion efficiency as it generates an infinite list of increasing integers. Press Ctrl+C when you are ready to move on.

The range 1 .. * is the same as 1 .. Inf. You can clearly see that if you go to the Rakudo Perl 6 sources and find the following definitions in the implementation of the Range class in the src/core/Range.pm file:

multi method new(Whatever \min,Whatever \max,:$excludes-min,:$excludes-max){
    nqp::create(self)!SET-SELF(-Inf,Inf,$excludes-min,$excludes-max,1);
}
multi method new(Whatever \min, \max, :$excludes-min, :$excludes-max) {
    nqp::create(self)!SET-SELF(-Inf,max,$excludes-min,$excludes-max,1);
}
multi method new(\min, Whatever \max, :$excludes-min, :$excludes-max) {
    nqp::create(self)!SET-SELF(min,Inf,$excludes-min,$excludes-max,1);
}

The three multi constructors describe the three cases: * .. *, * .. $n and $n .. *, which are immediately translated to -Inf .. Inf, -Inf .. $n and $n .. Inf.
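You do not have to read the Rakudo sources to confirm this. A short sketch that simply asks the ranges for their end points:

```raku
say (1 .. *).max;          # Inf
say (* .. 5).min;          # -Inf
say (1 .. *).max == Inf;   # True
```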

As a side Christmas tale, here’s a tiny excursus showing that * is not just an Inf. There were two commits to src/core/Whatever.pm:

First, on 16 September 2015, “Make Whatever.new == Inf True:”

  my class Whatever {
      multi method ACCEPTS(Whatever:D: $topic) { True }
      multi method perl(Whatever:D:) { '*' }
+     multi method Numeric(Whatever:D:) { Inf }
  }

A few weeks later, on 23 October 2015, “* no longer defaults to Inf.” This is to protect extensibility of * to other dwimmy situations:

  my class Whatever {
      multi method ACCEPTS(Whatever:D: $topic) { True }
      multi method perl(Whatever:D:) { '*' }
-     multi method Numeric(Whatever:D:) { Inf }
  }

Returning to our more practical problems, let’s create our own class that makes use of the whatever symbol *. Here is a simple example of a class with a multi-method taking either an Int value or a Whatever.

class N {
    multi method display(Int $n) {
        say $n;
    }

    multi method display(Whatever) {
        say 2000 + 100.rand.Int;
    }
}

In the first case, the method simply prints the value. The second method prints a random integer from 2000 to 2099 instead. As the only argument of the second method is Whatever, no variable is needed in the signature.

Here is how you use the class:

my $n = N.new;
$n.display(2018);
$n.display(*);

The first call echoes its argument, while the second one prints something random.

The Whatever symbol can be held as a bare Whatever. Say, you create an echo function and pass the * to it:

sub echo($x) {
    say $x;
}

echo(2018); # 2018
echo(*);    # *

This time, no magic happens, the program prints a star.

And now we are at a point where a tiny thing changes a lot.

11. WhateverCode

Finally, it’s time to talk about WhateverCode.

Take an array and print the last element of it. If you do it in the Perl 5 style, you’d type something like @a[-1]. In Perl 6, that generates an error:

Unsupported use of a negative -1 subscript
to index from the end; in Perl 6 please
use a function such as *-1

The compiler suggests using a function such as *-1. Is it a function? Yes, more precisely, a WhateverCode block:

say (*-1).WHAT; # (WhateverCode)

Now, print the second half of an array:

my @a = < one two three four five six >;
say @a[3..*]; # (four five six)

An array is indexed with the range 3..*. The Whatever star as the right end of the range means to take all the rest from the array. The type of 3..* is a Range:

say (3..*).WHAT; # (Range)

Finally, take one element less. We’ve already seen that to specify the last element a function such as *-1 must be used. The same can be done at the right end of the range:

say @a[3 .. *-2]; # (four five)

At this point, the so-called Whatever-currying happens and a Range becomes a WhateverCode:

say (3 .. *-2).WHAT; # (WhateverCode)
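A small sketch of what happens behind the subscript: when an array is indexed with a WhateverCode, the code is called with the number of elements. The explicit call (*-1)(@a.elems) below is only there to illustrate that step.

```raku
my @a = < one two three four five six >;

say @a[*-1];            # six
say (*-1)(@a.elems);    # 5
say @a[3 .. *-2];       # (four five)
```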

WhateverCode is a built-in Perl 6 class name; it can easily be used for method dispatching. Let’s update the code from the previous section and add a method variant that expects a WhateverCode argument:

class N {
    multi method display(Int $n) {
        say $n;
    }

    multi method display(Whatever) {
        say 2000 + 100.rand.Int;
    }

    multi method display(WhateverCode $code) {
        say $code(2000 + 100.rand.Int);
    }
}

Now, the star in the argument list falls into either display(Whatever) or display(WhateverCode):

N.display(2018);     # display(Int $n)

N.display(*);        # display(Whatever)

N.display(* / 2);    # display(WhateverCode $code)
N.display(* - 1000); # display(WhateverCode $code)

Once again, look at the signature of the display method:

multi method display(WhateverCode $code)

The $code argument is used as a function reference inside the method:

say $code(2000 + 100.rand.Int);

The function takes an argument but where is it going to? Or, in other words, what and where is the function body? We called the method as N.display(* / 2) or N.display(* - 1000). The answer is that both * / 2 and * - 1000 are functions! Remember the compiler’s hint about using a function such as *-1?

The star here becomes the first function argument, and thus * / 2 is equivalent to {$^a / 2}, while * - 1000 is equivalent to {$^a - 1000}.

Does it mean that $^b can be used next to $^a? Sure! Make the WhateverCode block accept two arguments. How do you indicate the second of them? Not a surprise, with another star! Let us add the fourth variant of the display method to our class:

multi method display(WhateverCode $code 
                     where {$code.arity == 2}) {
    say $code(2000, 100.rand.Int);
}

Here, the where block is used to narrow the dispatching down to select only those WhateverCode blocks that have two arguments. Having this done, two snowflakes are allowed in the method call:

N.display( * + * );
N.display( * - * );

The calls define the function $code that is used to calculate the result. So, the actual operation behind the N.display( * + * ) is the following: 2000 + 100.rand.Int.
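The equivalence between starred expressions and blocks with placeholder variables can be checked outside of the class as well. In this sketch the names &half-star and &half-block are made up for the comparison.

```raku
my &half-star  = * / 2;          # WhateverCode
my &half-block = { $^a / 2 };    # the equivalent block

say half-star(10);        # 5
say half-block(10);       # 5

say (* - 1000).arity;     # 1
say (* + *).arity;        # 2
say (* + *)(2000, 18);    # 2018
```

The arity values are what the where clause in the fourth display variant relies on.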

Need more snow? Add more stars:

N.display( * * * );
N.display( * ** * );

Similarly, the actual calculations inside are:

2000 * 100.rand.Int

and

2000 ** 100.rand.Int

Congratulations! You can now parse the * ** * construct as effortlessly as the compiler does it.

Homework

Perl 6 gave us so many Christmas gifts so far. Let’s make an exercise in return and answer the question: What does each star mean in the following code?

my @n = 
    ((0, 1, * + * ... *).grep: *.is-prime).map: * * * * *;
.say for @n[^5];

D’oh. I suggest we start transforming the code to get rid of all the stars and to use different syntax.

The * after the sequence operator ... means to generate the sequence infinitely, so replace it with Inf:

((0, 1, * + * ... Inf).grep: *.is-prime).map: * * * * *

The two stars * + * in the generator function can be replaced with a lambda function with two explicit arguments:

((0, 1, -> $x, $y {$x + $y} ... Inf).grep: 
    *.is-prime).map: * * * * *

Now, a simple syntax alteration. Replace the .grep: with a method call with parentheses. Its argument *.is-prime becomes a code block, and the star is replaced with the default variable $_. Notice that no curly braces were needed while the code was using a *:

(0, 1, -> $x, $y {$x + $y} ... Inf).grep({
    $_.is-prime
}).map: * * * * *

Finally, the same trick for .map: but this time there are three arguments for this method, thus, you can write {$^a * $^b * $^c} instead of * * * * *, and here’s the new variant of the complete program:

my @n = (0, 1, -> $x, $y {$x + $y} ... Inf).grep({
        $_.is-prime
    }).map({
        $^a * $^b * $^c
    });
.say for @n[^5];

Now it is obvious that the code prints five products of the groups of three prime Fibonacci numbers.
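To see the intermediate values, here is a sketch that keeps the prime Fibonacci numbers in a separate array before multiplying them in groups of three. The variable name @fib-primes is introduced only for this illustration.

```raku
# Prime Fibonacci numbers, generated lazily.
my @fib-primes = (0, 1, * + * ... *).grep: *.is-prime;
say @fib-primes[^6];    # (2 3 5 13 89 233)

# The original one-liner: products of consecutive groups of three.
my @n = ((0, 1, * + * ... *).grep: *.is-prime).map: * * * * *;
say @n[^2];             # (30 269581)
```

Indeed, 2 * 3 * 5 is 30 and 13 * 89 * 233 is 269581.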

Additional assignments

In textbooks, the most challenging tasks are marked with a *. Here are a couple of them for you to solve yourself.

  1. What is the difference between chdir('/') and &*chdir('/') in Perl 6?
  2. Explain the following Perl 6 code and modify it to demonstrate its advantages: .say for 1...**.

❄ ❄ ❄

That’s all for today. I hope that you enjoyed the power and expressiveness of Perl 6. Today, we talked about a single ASCII character only. Imagine how vast Perl 6’s Universe is if you take into account that the language offers the best Unicode support among today’s programming languages.

Enjoy Perl 6 today and spread the word! Stay tuned to the Perl 6 Advent Calendar; more articles are waiting for your attention, and the next one is coming tomorrow.

Andrew Shitov

Day 10 – Wrapping Rats

Going down chimneys is a dangerous business.

Chimneys can be narrow, high, and sometimes not well constructed to begin with.

This year, Santa wants to be prepared. Therefore, he is combining a chimney inspection with the delivery of presents.

A chimney inspection involves ensuring that every layer of bricks is at the correct height; i.e. that the layers of mortar are consistent, and that the bricks are also a consistent height.

For instance, for bricks that are 2¼” high, and mortar that is ⅜” thick, the sequence of measurements should look like this:

                       🎅 
                      ─██─
                       ||
 layer                                      total
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
   ⅜                                        ‾‾???
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
  2¼   ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
   ⅜                                        ‾‾5⅝
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ 
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
   ⅜                                        ‾‾3
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
  2¼   ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░ 
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
   ⅜   _____________________________________‾‾⅜ 

The plan is for the Elves to do the dangerous descent to the bottom, tape measure in hand, and then come back up, ensuring that the top of each brick layer is at precisely the correct place on the tape measure.

One particular Elf, named Elvis, has taken it upon himself to write a program to help out with the task of computing this sequence of heights.

Being lazy, Elvis did not even want to add any of the fractions above, and wanted the program to do all the work. He also did not want to exert the mental effort required to figure out the formula for the height of each layer. Luckily, he was using Perl 6, which properly turns unicode fractions into rational numbers (type Rat), and also has a sequence operator (...) which figures out arithmetic sequences based on the first few terms.

So, Elvis’ first cut at the program looked like this:

my @heights = 0, ⅜, 3, 5+⅝ ... *;

say @heights[^10].join(', ')

This gave him the first 10 heights that he needed:

0, 0.375, 3, 5.625, 8.25, 10.875, 13.5, 16.125, 18.75, 21.375
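A short sketch to confirm what the sequence operator deduced: the step between consecutive brick tops should be one brick plus one layer of mortar. The name step is introduced here just for the check.

```raku
my @heights = 0, ⅜, 3, 5+⅝ ... *;

my \step = 2 + ¼ + ⅜;             # one brick plus one layer of mortar
say step;                         # 2.625
say @heights[4] - @heights[3];    # 2.625
say @heights[4] == 8 + ¼;         # True
```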

While this was correct, it was hard to use. The tape measure had fractions of an inch, not decimals. The output Elvis really wanted was fractions.

Fortunately, he knew that using join turned the Rats into strings. Turning a Rat into a Str is done by calling the Str method of the Rat class. So, by modifying the behavior of Rat.Str, he figured he could make the output prettier.

The way he decided to do this was to wrap the Str method (aka using the decorator pattern), like so:

Rat.^find_method('Str').wrap:
  sub ($r) {
    my $whole = $r.Int || "";
    my $frac = $r - $whole;
    return "$whole" unless $frac > 0;
    return "$whole" ~ <⅛ ¼ ⅜ ½ ⅝ ¾ ⅞>[$frac * 8 - 1];
  }

In other words, when stringifying a Rat, return the whole portion unless there is a fractional portion. Then treat the fractional portion as the number of eighths, and use that as an index into an array to look up the right unicode fraction.

He combined that with his first program to get this sequence of heights:

0, ⅜, 3, 5⅝, 8¼, 10⅞, 13½, 16⅛, 18¾, 21⅜

“Hooray!” he thought. “Exactly what I need.”

Santa took a look at the program and said “Elvis, this is clever, but not quite enough. While most brick dimensions are multiples of ⅛, that might not be true of mortar levels. Can you make your program handle those cases, too?”

“Sure” said Elvis with a wry smile. And he added this line into his wrapper function:

return "$whole {$frac.numerator}⁄{$frac.denominator}"
   unless $frac %% ⅛;

using the “is divisible by” operator (%%) to check that the fraction was evenly divisible into eighths and, if not, to print the numerator and denominator explicitly. Then, for mortar that was ⅕” thick, the sequence became:

my @heights = 0, ⅕,
                 ⅕ + 2+¼ + ⅕,
                 ⅕ + 2+¼ + ⅕
                   + 2+¼ + ⅕ ... *;
say @heights[^10].join(', ');

which printed:

0,  1⁄5, 2 13⁄20, 5 1⁄10, 7 11⁄20, 10, 12 9⁄20, 14 9⁄10, 17 7⁄20, 19 4⁄5
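The pieces that Elvis relies on can be tried in isolation. The following sketch only uses the values from the story; the variable name $height is made up.

```raku
my $height = ⅕ + 2 + ¼ + ⅕;    # kept exactly as a rational number
say $height.WHAT;              # (Rat)
say $height.nude;              # (53 20)

say ⅜ %% ⅛;                    # True
say ⅕ %% ⅛;                    # False
```

Because the value is stored as 53/20 rather than as a floating-point approximation, the divisibility test is exact.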

“Actually”, Santa said, “now that I look at it, maybe this isn’t useful — the tape measure only has sixteenths of an inch, so it would be better to round to the nearest sixteenth of an inch.”


Elvis added a call to round to end up with:


Rat.^find_method('Str').wrap:
  sub ($r) {
        my $whole = $r.Int || '';
        my $frac = $r - $whole;
        return "$whole" unless $frac > 0;
        my $rounded = ($frac * 16).round/16;
        # Look up the glyph by the rounded value, counted in eighths.
        return "$whole" ~ <⅛ ¼ ⅜ ½ ⅝ ¾ ⅞>[$rounded * 8 - 1] if $rounded %% ⅛;
        return "$whole {$rounded.numerator}⁄{$rounded.denominator}";
  }

which gave him

0,  3⁄16, 2⅝, 5⅛, 7 9⁄16, 10, 12 7⁄16, 14⅞, 17⅜, 19 13⁄16

He showed his program to Elvira the Elf who said, “What a coincidence, I wrote a program that is almost exactly the same! Except, I also wanted to know where the bottoms of the layers of bricks are. I couldn’t use a sequence operator for this, since it isn’t an arithmetic progression, but I could use a lazy gather and an anonymous stateful variable! Like this:


my \brick = 2 + ¼;
my \mortar = ⅜;
my @heights = lazy gather {
    take 0;
    loop { take $ += $_ for mortar, brick }
}

Elvira’s program produced:

0, ⅜, 2⅝, 3, 5¼, 5⅝, 7⅞, 8¼, 10½, 10⅞

i.e. both the tops and the bottoms of the layers of bricks:

                     \ 🎅 /
                       ██
                       ||
 layer                                      total
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
   ⅜                                        ‾‾8¼
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░‾‾7⅞
  2¼   ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
   ⅜                                        ‾‾5⅝
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░‾‾5¼
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ 
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
   ⅜                                        ‾‾3
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░‾‾2⅝
  2¼   ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░ 
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
   ⅜   _____________________________________‾‾⅜
                                            ‾‾0
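The anonymous state variable in Elvira’s program may be easier to follow in a finite example. This is only a sketch with made-up numbers, not part of her program:

```perl6
# `$` is an anonymous state variable: it keeps its value between
# iterations, so `$ += $_` accumulates a running total.
my @running = gather { take $ += $_ for 1, 2, 3, 4 };
say @running.join(', ');    # 1, 3, 6, 10
```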

With their programs in hand, the Elves checked out the chimneys and Santa made it through another holiday season without any injuries.

Day 9 – HTTP and Web Sockets with Cro

It’s not only Christmas time when gifts are given. This summer at the Swiss Perl Workshop – beautifully situated up in the Alps – I had the pleasure of revealing Cro. Cro is a set of libraries for building services in Perl 6, together with a little development tooling to stub, run, and trace services. Cro is initially focused on building services with HTTP (including HTTP/2.0) and web sockets, but early support for ZeroMQ is available, and a range of other options are planned for the future.

Reactive pipelines

Cro follows the Perl design principle of making the easy things easy, and the hard things possible. Much like Git, Cro can be thought of as having porcelain (making the easy things easy) and plumbing (making the hard things possible). The plumbing level consists of components that are composed to form pipelines. The components come in different shapes, such as sources, transforms, and sinks. Here’s a transform that turns a HTTP request into a HTTP response:

use Cro;
use Cro::HTTP::Request;
use Cro::HTTP::Response;
class MuskoxApp does Cro::Transform {
    method consumes() { Cro::HTTP::Request }
    method produces() { Cro::HTTP::Response }
    method transformer(Supply $pipeline --> Supply) {
        supply whenever $pipeline -> $request {
            given Cro::HTTP::Response.new(:$request, :200status) {
                .append-header: "Content-type", "text/html";
                .set-body: "Muskox Rocks!\n".encode('ascii');
                .emit;
            }
        }
    }
}

Now, let’s compose it with a TCP listener, a HTTP request parser, and a HTTP response serializer:

use Cro::TCP;
use Cro::HTTP::RequestParser;
use Cro::HTTP::ResponseSerializer;
my $server = Cro.compose:
    Cro::TCP::Listener.new(:port(4242)),
    Cro::HTTP::RequestParser.new,
    MuskoxApp,
    Cro::HTTP::ResponseSerializer;

That gives back a Cro::Service, which we can now start, and stop upon Ctrl+C:

$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

Run it. Then curl it.

$ curl http://localhost:4242/
Muskox Rocks!

Not bad. But what if we wanted a HTTPS server? Provided we’ve got key and certificate files handy, that’s just a case of replacing the TCP listener with a TLS listener:

use Cro::TLS;
my $server = Cro.compose:
    Cro::TLS::Listener.new(
        :port(4242),
        :certificate-file('certs-and-keys/server-crt.pem'),
        :private-key-file('certs-and-keys/server-key.pem')
    ),
    Cro::HTTP::RequestParser.new,
    MuskoxApp,
    Cro::HTTP::ResponseSerializer;

Run it. Then curl -k it.

$ curl -k https://localhost:4242/
Muskox Rocks!

And middleware? That’s just another component to compose into the pipeline. Or, seen another way, with Cro everything is middleware. Even the request parser or response serializer can be easily replaced, should the need arise (which sounds like an odd thing to need, but that’s effectively what implementing FastCGI would involve).

So, that’s how Cro is plumbed. It also requires an amount of boilerplate to work at this level. Bring in the porcelain!

HTTP server, the easy way

The Cro::HTTP::Server class gets rid of the boilerplate of building the HTTP processing pipeline. The example from earlier becomes just:

use Cro;
use Cro::HTTP::Server;
class MuskoxApp does Cro::Transform {
    method consumes() { Cro::HTTP::Request }
    method produces() { Cro::HTTP::Response }
    method transformer(Supply $pipeline --> Supply) {
        supply whenever $pipeline -> $request {
            given Cro::HTTP::Response.new(:$request, :200status) {
                .append-header: "Content-type", "text/html";
                .set-body: "Muskox Rocks!\n".encode('ascii');
                .emit;
            }
        }
    }
}
my $server = Cro::HTTP::Server.new: :port(4242), :application(MuskoxApp);
$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

There’s no magic here; it really is just a more convenient way to compose a pipeline. And while that’s only so much of a saving for HTTP/1.*, a HTTP/2.0 pipeline involves some more components, and a pipeline that supports both is a bit more involved still. By comparison, it’s easy to configure Cro::HTTP::Server to do HTTPS with support for both HTTP/1.1 and HTTP/2.0:

my %tls =
    :certificate-file('certs-and-keys/server-crt.pem'),
    :private-key-file('certs-and-keys/server-key.pem');
my $server = Cro::HTTP::Server.new: :port(4242), :application(MuskoxApp),
    :%tls, :http<1.1 2>;

The route to happiness

A web application in Cro is ultimately always a transform that turns a HTTP request into a HTTP response. It’s very rare to want to process all requests in exactly the same way, however. Typically, different URLs should be routed to different handlers. Enter Cro::HTTP::Router:

use Cro::HTTP::Router;
use Cro::HTTP::Server;
my $application = route {
    get -> {
        content 'text/html', 'Do you like dugongs?';
    }
}
my $server = Cro::HTTP::Server.new: :port(4242), :$application;
$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

The object returned by a route block does the Cro::Transform role, meaning it would work just fine to use it with Cro.compose(...) plumbing too. It’s a good bit more convenient to write an application using the router, however! Let’s look at the get call a little more closely:

get -> {
    content 'text/html', 'Do you like dugongs?';
}

Here, get is saying that this handler will only deal with HTTP GET requests. The empty signature of the pointy block means no URL segments are expected, so this route only applies to /. Then, instead of having to make a response object instance, add a header, and encode a string, the content function does it all.

The router is built to take advantage of Perl 6 signatures, and also to behave in a way that will feel natural to Perl 6 programmers. Route segments are set up by declaring parameters, and literal string segments match literally:

get -> 'product', $id {
    content 'application/json', {
        id => $id,
        name => 'Arctic fox photo on canvas'
    }
}

A quick check with curl shows that it takes care of serializing the JSON for us also:

$ curl http://localhost:4242/product/42
{"name": "Arctic fox photo on canvas","id": "42"}

The JSON body serializer is activated by the content type. It’s possible, and pretty straightforward, to implement and plug in your own body serializers.

Want to capture multiple URL segments? Slurpy parameters work too, which is handy in combination with static for serving static assets, perhaps multiple levels of directories deep:

get -> 'css', *@path {
    static 'assets/css', @path;
}

Optional parameters work for segments that may or may not be provided. Using subset types to constrain the allowed values works too. And Int will only accept requests where the value in the URL segment parses as an integer:

get -> 'product', Int $id {
    content 'application/json', {
        id => $id,
        name => 'Arctic fox photo on canvas'
    }
}
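The optional-segment and subset cases mentioned above have no example of their own, so here is a minimal sketch of how they might look. The Sku subset, its pattern, and both routes are made up for illustration:

```perl6
use Cro::HTTP::Router;

# Hypothetical constraint: SKUs look like "ABC-123".
my subset Sku of Str where /^ <[A..Z]> ** 3 '-' \d+ $/;

my $application = route {
    # Only matches when the second segment satisfies the Sku subset.
    get -> 'stock', Sku $sku {
        content 'text/plain', "Stock level for $sku";
    }
    # $page is optional, so both /catalogue and /catalogue/2 match.
    get -> 'catalogue', $page? {
        content 'text/plain', "Catalogue page {$page // 1}";
    }
}
```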

Named parameters are used to receive query string arguments:

get -> 'search', :$query {
    content 'text/plain', "You searched for $query";
}

Which would be populated in a request like this:

$ curl http://localhost:4242/search?query=llama
You searched for llama

These too can be type constrained and/or made required (named parameters are optional by default in Perl 6). The Cro router tries to help you do HTTP well by giving a 404 error for failure to match the URL segments, 405 (method not allowed) when segments would match but the wrong method is used, and 400 when the method and segments are fine, but there’s a problem with the query string. Named parameters, through use of the is header and is cookie traits, can also be used to accept and/or constrain headers and cookies.
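A sketch of how a required query parameter and the two traits might be combined in one route. The parameter names and the response text are illustrative assumptions, not taken from a real application:

```perl6
use Cro::HTTP::Router;

my $application = route {
    # :$query! is required, so by the rules above a request to /search
    # without ?query=... should be answered with a 400.
    # :$accept is filled from the Accept header, :$session from a cookie
    # named "session"; both are optional here.
    get -> 'search', :$query!, :$accept is header, :$session is cookie {
        my $who = $session // 'anonymous';
        content 'text/plain', "$who searched for $query";
    }
}
```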

Rather than chugging through the routes one at a time, the router compiles all of the routes into a Perl 6 grammar. This means that routes will be matched using an NFA. Further, it means that the Perl 6 longest literal prefix rules apply, so:

get -> 'product', 'index' { ... }
get -> 'product', $what { ... }

Will always prefer the first of those two for a request to /product/index, even if you wrote them in the opposite order:

get -> 'product', $what { ... }
get -> 'product', 'index' { ... }

Middleware made easier

It’s fun to say that HTTP middleware is just a Cro::Transform, but it’d be less fun to write if that was all Cro had to offer. Happily, there are some easier options. A route block can contain before and after blocks, which will run before and after any of the routes in the block have been processed. So, one could add HSTS headers to all responses:

my $application = route {
    after {
        header 'Strict-transport-security', 'max-age=31536000; includeSubDomains';
    }
    # Routes here...
}

Or respond with a HTTP 403 Forbidden for all requests without an Authorization header:

my $application = route {
    before {
        unless .has-header('Authorization') {
            forbidden 'text/plain', 'Missing authorization';
        }
    }
    # Routes here...
}

Which behaves like this:

$ curl http://localhost:4242/
Missing authorization
$ curl -H"Authorization: Token 123" http://localhost:4242/
Do you like dugongs?

It’s all just a Supply chain

All of Cro is really just a way of building up a chain of Perl 6 Supply objects. While the before and after middleware blocks are convenient, writing middleware as a transform provides access to the full power of the Perl 6 supply/whenever syntax. Thus, should you ever need to take a request with a session token and make an asynchronous call to a session database, and only then either emit the request for further processing (or do a redirection to a login page), it’s possible to do it – in a way that doesn’t block other requests (including those on the same connection).
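As a sketch of middleware written as a transform, here is a hypothetical request logger. It consumes and produces requests, so it would sit between the request parser and the application in a Cro.compose pipeline like the one shown earlier. The class name and the log format are assumptions:

```perl6
use Cro;
use Cro::HTTP::Request;

# Hypothetical middleware: notes each request on STDERR, then passes
# the request on unchanged.
class RequestLogger does Cro::Transform {
    method consumes() { Cro::HTTP::Request }
    method produces() { Cro::HTTP::Request }
    method transformer(Supply $requests --> Supply) {
        supply whenever $requests -> $request {
            note "{$request.method} {$request.target}";
            emit $request;
        }
    }
}
```

Because the body is an ordinary supply block, an asynchronous lookup could be awaited in a nested whenever before the request is emitted.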

In fact, Cro is built entirely in terms of the higher level Perl 6 concurrency features. There’s no explicit threads, and no explicit locks. Instead, all concurrency is expressed in terms of Perl 6 Supply and Promise, and it is left to the Perl 6 runtime library to scale the application over multiple threads.

Oh, and WebSockets?

It turns out Perl 6 supplies map really nicely to web sockets. So nicely, in fact, that Cro was left with relatively little to add in terms of API. Here’s how an (overly) simple chat server backend might look:

my $chat = Supplier.new;
get -> 'chat' {
    # For each request for a web socket...
    web-socket -> $incoming {
        # We start this bit of reactive logic...
        supply {
            # Whenever we get a message on the socket, we emit it into the
            # $chat Supplier
            whenever $incoming -> $message {
                $chat.emit(await $message.body-text);
            }
            # Whatever is emitted on the $chat Supplier (shared between all
            # web sockets), we send on this web socket.
            whenever $chat -> $text {
                emit $text;
            }
        }
    }
}

Note that doing this needs a use Cro::HTTP::Router::WebSocket; to import the module providing the web-socket function.

In summary

This is just a glimpse at what Cro has to offer. There wasn’t space to talk about the HTTP and web socket clients, the cro command line tool for stubbing and running projects, the cro web tool that provides a web UI for doing the same, or that if you stick CRO_TRACE=1 into your environment you get lots of juicy debugging details about request and response processing.

To learn more, check out the Cro documentation, including a tutorial about building a Single Page Application. And if you’ve more questions, there’s also a recently-created #cro IRC channel on Freenode.

Day 8 – Adventures in NQP Land: Hacking the Rakudo Compiler

With apologies to the old Christmas classic, “The Twelve Days of Christmas,” I give you the first line of a Perl 6 version:

On the first day of Christmas, my true love gave to me, a Perl table in a pod tree…

But the table I got wasn’t very pretty!

Background

My first real contact with Perl 6 came in the spring of 2015 when I decided to check on its status and found it was ready for prime time. After getting some experience with the language I started contributing to the docs in places where I could help. One of my first contributions to the docs was to clean up one of the tables which was not rendering nicely. During my experiments with pod tables on my local host I tried the following table:

=begin table
-r0c0 r0c1
=end table

which caused Perl 6 to throw an ugly, LTA (less than awesome) exception message:

"===SORRY!=== Cannot iterate object with P6opaque representation"

I worked around the problem but it nagged at me so I started investigating the guts of pod and tables. That led me to the source of the problem in github.com/rakudo/src/Perl6/Pod.nqp.

In fact, the real problem for many pod table issues turned out to be in that file.

Not Quite Perl (NQP)

nqp is an intermediate language used to build the Rakudo Perl 6 compiler. Its repository is found here. The rest of this article is about modifying nqp code in the rakudo compiler found in its repository here. Rakudo also has a website here.

Before getting too far I first read the available information about Rakudo and NQP.

Then I started practicing nqp coding by writing and running small nqp files like this (file “hello.nqp”):

say("Hello, world!");

which, when executed, gave the expected results:

$ nqp hello.nqp
Hello, world!

Note that say() is one of the few nqp opcodes that doesn’t require the nqp:: prefix.

Into the trenches

The purpose of the Perl6::Pod class, contained in the rakudo/src/Perl6/Pod.nqp file, is to take pod grammar matches and transform them into Perl 6 pod class definitions, in rakudo/src/core/Pod.pm, for further handling by renderers in Perl 6 land. For tables that means anything represented in any legal pod form as described by the Perl 6 documentation design Synopsis S26, the Perl 6 test suite specs, and the Perl 6 docs has to be transformed into a Perl 6 Pod::Block::Table class object with this form as described in file rakudo/src/core/Pod.pm:

configuration information
a header line with N cells
M content lines, each with N cells

I wanted the nqp table pod handling to be robust and able to automatically fix some format issues (with a warning to the author) or throw an exception (gracefully) with detailed information of problems to enable the author to fix the pod input.

Workspace and tools

I needed two cloned repositories: rakudo and roast. I also needed forks of those same repositories on github so I could create pull requests (PRs) for my changes. I found a very handy Perl 5 tool in CPAN module App::GitGot. Using got allowed me to easily set up all four repos. (Note that got requires that its target repo not exist either in the desired local directory or the user’s github account.) After configuring got I went to a suitable directory to contain both repos and executed the following:

got fork https://github.com/rakudo/rakudo.git
got fork https://github.com/perl6/roast.git

which resulted in subdirectories rakudo and roast containing the cloned repos and new forks of rakudo and roast on my github account. In the rakudo directory one can see the default setup for easy creation of PRs:

$ git remote -v
origin git@github.com:tbrowder/rakudo.git (fetch)
origin git@github.com:tbrowder/rakudo.git (push)
upstream https://github.com/rakudo/rakudo.git (fetch)
upstream https://github.com/rakudo/rakudo.git (push)

There are similar results in the roast repo.

Next, I renamed the roast repo as a subdirectory of rakudo (“rakudo/t/spec”) so it functions as a subgit of the local rakudo.

Finally, I created several bash scripts to ease configuring rakudo for installation in the local repo directory, setting the environment, and running tests:

  • rakudo-local-config.sh
  • run-table-tests.sh
  • set-rakudo-envvars.sh

(See all scripts mentioned here at https://github.com/tbrowder/nqp-tools.)

To complete the local working environment you will need to install some local modules so you must change your path and install a local copy of the zef installer. Follow these steps in your rakudo directory (from advice from @Zoffix):

git clone https://github.com/ugexe/zef
export PATH=`pwd`/install/bin:$PATH
cd zef; perl6 -Ilib bin/zef install .
cd ..
export PATH=`pwd`/install/share/perl6/site/bin:$PATH
zef install Inline::Perl5

Then install any other module you need, e.g.,

zef install Debugger::UI::CommandLine
zef install Grammar::Debugger

Hacking

Now start hacking away. When ready for a build, execute:

make
make install

The make install step is critical because otherwise, with the local environment we set up, the new Perl 6 executables won’t be found.

Debugging for me was laborious, with rebuilds taking about three minutes each. The debugger (perl6-debug-m) would have been very useful but I could not install the required Debugger::UI::CommandLine module so it would be recognized by the locally-installed perl6-debug-m. The primary method I used was inserted print statements plus using the --ll-exception option of perl6. Of major note, though, is that this author is a Perl 6 newbie and made lots of mistakes, and did not always remember the fixes, hence this article. (Note I likely would have used the debugging tools but, at the time I started, I did not ask for help and did not have the advice provided shown above.)

Testing

It goes without saying that a good PR will include tests for the changes. I always create a roast branch with the same name as my rakudo branch. Then I submit both PRs and I refer to the roast PR in the rakudo PR and vice versa. I note for the roast PR that it requires the companion rakudo PR for it to pass all tests.

See Ref. 5 for much more detail on specialized test scripts for fudging and other esoteric testing matters.

Documentation

I try to keep the Perl 6 pod table documentation current with my fixes.

NQP lessons learned

  • LTA error messages are a fact of life, e.g., “Cannot invoke this object…”, which can be caused by many things, including a misspelled identifier (NQP issue filed, early report is it may be impossible to fix anytime soon).

  • Ensure all nqp opcodes have the nqp:: prefix (except the few built-ins)

  • Practice with new code in an nqp-only sand-box.

Success!

I have now had two major Perl 6 POD (and roast) PRs accepted and merged, and am working on one more “easy” one which I should finish this week. The PRs are:

  1. Rakudo PR #1240

The Rakudo PR provided fixes for RTs #124403, #128221, #132341, #13248, and #129862. It was accompanied by roast PR #353.

The PR allowed the problem table above to be rendered properly. It also added warnings for questionable tables, added the Rakudo environment variable RAKUDO_POD6_TABLE_DEBUG to aid users in debugging tables (see docs, User Debugging), and allowed short rows with empty columns to be rendered properly.

  2. Rakudo PR #1287

The Rakudo PR provided a fix for Rakudo repo issue #1282. It was accompanied by roast PR #361. (Note that roast PR #361 is not yet merged.)

The PR allows table visual column separators (‘|’) and (‘+’) to be used as cell data by escaping them in the pod source.

Summary

  • Perl 6 pod is a great improvement over Perl 5, but it is still not fully implemented.

  • Working in the bowels of Rakudo Perl is rewarding (and fun), but prepare to get your hands dirty!

  • The Perl 6 community is a great group to be associated with.

  • I love Rakudo Perl 6.

Merry Christmas and Happy Hacking!

Credits

Any successful inputs I have made are due to all the Perl 6 core developers and the friendly folks on IRC (#perl6, #perl6-dev).

References

  1. JWs Perl 6 debugger Advent article
  2. JWs Rakudo debugger module Debugger::UI::CommandLine
  3. JWs grammar debugger module Grammar::Debugger
  4. Testing Rakudo
  5. Contributing to roast
  6. Github guide to pull requests (PRs)
  7. Perl 6 documentation (docs)

Appendix

POD tools

  • perl6 --doc=MODULE # where ‘MODULE’ is ‘HTML’, ‘Text’, or other appropriate module
  • p6doc
  • perl6 --ll-exception

Major Perl 6 POD renderers

Day 7 – Test All The Things

Perl 6, like its big sister Perl 5, has a long tradition of testing. When you install any Perl module, the installer normally runs that module’s test suite. And of course, as a budding Perl 6 module author, you’ll want to create your own test suite. Or maybe you’ll be daring and create your test suite before creating your module. This actually has several benefits, chief among them that you become your module’s very first user, even before it’s written.

Before getting to actual code, though, I’d like to mention two shell aliases that I use very frequently –

alias 6='perl6 -Ilib'
alias 6p="prove -e'perl6 -Ilib'"

These aliases let me run a test file quickly without having to go to the trouble of installing my code. If I’m in a project directory, I can just run

$ 6 t/01-core.t
ok 1 - call with number
ok 2 - call with text
ok 3 - call with formatted string
1..3

and it’ll tell me what tests I’ve run and whether they all passed or not. Perl 6, just like its big sister Perl 5, uses the ‘t/’ directory for test files, and by convention the suffix ‘.t’ to distinguish test files from packages or scripts. It’s also got a built-in unit testing module, which we used above. If we were testing the sprintf() internal, it might look something like

use Test;

ok sprintf(1), 'call with number';
ok sprintf("text"), 'call with text';
ok sprintf("%d",1), 'call with formatted string';

done-testing;

The ok and done-testing functions are exported for us automatically. I’m using canonical Perl 6 style here, not relying too much on parentheses. In this case I do need to use parentheses to make sure sprintf() doesn’t “think” the test description is one of its arguments.

ok takes just two arguments, the truthiness of what you want to test, and an optional message. If the first argument is anything that evaluates to True, the test passes. Otherwise… you know. The message is just text that describes the test. It’s purely optional, but it can be handy when the test fails as you can search for that string in your test file and quickly track down the problem. If you’re like the author, though, the line number is more valuable, so when you see

not ok 1 - call with number
# Failed test 'call with number'
# at test.t line 4
ok 2 - call with text
ok 3 - call with formatted string
1..3

in your test, you can immediately jump to line 4 of the test file and start editing to find out where the problem is. This gets more useful as your test files grow larger and larger, such as my tests for the Common Lisp version of (format) that I’m writing, with 200+ tests per test file and growing.

Finally, done-testing simply tells the Test module that we’re done testing, there are no more tests to come. This is handy when you’re just starting out and you’re constantly experimenting with your API, adding and updating tests. There’s no test counter to update each time or any other mechanics to keep track of.

It’s optional, of course, but other tools may use the ‘1..3’ at the end to prove that your test actually ran to completion. The tool prove is one, Jenkins unit testing and other systems may need that as well.
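If you would rather declare the count up front, the Test module also provides plan. A minimal sketch, reusing the three checks from earlier:

```perl6
use Test;

plan 3;    # declare up front how many tests should run

ok sprintf(1),       'call with number';
ok sprintf("text"),  'call with text';
ok sprintf("%d", 1), 'call with formatted string';
```

With a plan in place there is no need for done-testing, but the number has to be updated whenever tests are added or removed.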

It depends…

on what your definition of ‘is’ is. While the ok test is fine if you’re only concerned with the truthiness of something, sometimes you need to dig a little deeper. Perl 6, just like its big sister, can help you there.

is 1 + 1, 2, 'prop. 54.43, Principia Mathematica';

doesn’t just check the truthiness of your test, it checks its value. While you could easily write this as

ok 1 + 1 == 2, 'prop. 54.43, Principia Mathematica';

using is makes your intent clear that you’re focusing on whether the expression 1+1 is equal to 2; with the ok version of the same statement, it’s unclear whether you’re testing the ‘1 + 1’ portion or the ‘==’ operator.

These two tests alone probably cover a good 80% of your testing needs. is handles basic lists and hashes with relative aplomb, and if you really need complex testing, its big sister is-deeply is standing at the wayside, ready to handle complex hash-array combinations.
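A short sketch of the two side by side; the data is made up for illustration:

```perl6
use Test;

# Illustrative data; the keys and values are invented.
my %got = name => 'arctic fox', sizes => [1, 2, 3];

# is compares a single value...
is %got<name>, 'arctic fox', 'is compares simple values';

# ...while is-deeply compares the whole nested structure.
is-deeply %got, { name => 'arctic fox', sizes => [1, 2, 3] },
    'is-deeply walks nested structures';

done-testing;
```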

Laziness and Impatience

Sometimes you’ve got a huge string, and you only need to check just a little bit of it.

ok 'Lake Chargoggagoggmanchauggagoggchaubunagungamaugg' ~~ /manchau/, 'my side';

You can certainly use the ~~ operator here. Just like ‘1 + 1 == 2’, though, your intent might not be clear. You can use the like function to make your intentions clear.

like 'Lake Chargoggagoggmanchauggagoggchaubunagungamaugg',
     /manchau/, 'my side';

and not have the ~~ dangling over the side of your boat.

DRYing out

After spending some time on and in beautiful Lake Chargoggagoggmanchauggagoggchaubunagungamaugg, you probably want to wring your clothes out. Test files tend to grow, especially regression tests. You might find yourself writing

is sprintf( "%s", '1' ), '1', "%s formats numbers";
is sprintf( "%s", '⅑' ), '⅑', "%s formats fractions";
is sprintf( "%s", 'Ⅷ' ), 'Ⅷ', "%s formats graphemes";
is sprintf( "%s", '三' ), '三', "%s formats CJKV";

That’s fine, copying and pasting (especially from StackOverflow) is a time-honored tradition, nothing wrong with that. Consider though, what happens when you add more tests with “%d” instead of “%s”, and since all of those strings are numbers, you just copy and paste the block, change “%s” to “%d” and go on.

is sprintf( "%s", '1' ), '1', "%s formats numbers";
# ...

is sprintf( "%d", '1' ), '1', "%d formats numbers";
# ...

So now you’ve got two sets of tests with the same names. Rather than editing all of the new “%d” tests, wouldn’t it be nice if we didn’t have to repeat ourselves in the first place?

subtest '%s', {
    is sprintf( "%s", '1' ), '1', "formats numbers";
    is sprintf( "%s", '⅑' ), '⅑', "formats fractions";
    is sprintf( "%s", 'Ⅷ' ), 'Ⅷ', "formats graphemes";
    is sprintf( "%s", '三' ), '三', "formats CJKV";
};

Now you just need to edit things in two places rather than three. If this has whetted your appetite for testing, I’d encourage you as well to check out Test All The Things at my personal site for more advanced testing paradigms and more advanced Perl 6 code. Also don’t forget to stay tuned for tomorrow’s Perl 6 Advent posting!

Thank you, and happy hacking!

DrForr aka Jeff Goff, The Perl Fisher

Day 6 – Five golden rings, Four calling birds, Three French hens, Two kinds of programmers?, And a book review in a Perl tree! (or Rakudo Tree? ;-)

PREFACE

In the following, this novice programmer will try to make the case for two kinds of programmers, which I then use as a springboard to my review of Think Perl 6, the Perl 6 introduction to programming book.

A little about me: I learned my first coding language in the summer of 2017.  My interest in programming started around the year 2000, when I was in the 8th Grade.  (To spare you the math, I’m 31 years old, trying to enter a programming career without a degree.)  It was actually that book I got in 8th Grade that I read this summer, C– How To Program (I kid about the minuses.)  It was really neat to open that book and see the old Amazon bookmark of that time, and even where I had stopped reading.

I’ll touch real quick on the question, Is Perl 6 a good beginning code language?  Short answer: only if the imperative paradigm is taught first.  I think the beginner finds it comforting to know what every character in the program does.  So, the idea of introducing objects before the beginner knows what functions are, just doesn’t sound right to me.  I prefer the old-school method of learning first control structures, then functions, arrays, pointers, strings and characters, I/O, THEN classes.

I do think Perl 6’s lack of strict typing is a major disadvantage, because developing the mental muscles of keeping track of data types is a good idea, as well as enforcing the Principle of Least Privilege.  Perhaps if a Perl 6 author were to use code that adhered to this strict typing 100% of the time, that would be enough to train the novice in the way they should go.  I think it requires much repetition to drill into the untrained mind that values can be limited to types, and that this is a good thing.  In short, I don’t think a quick paragraph discussing types does the topic justice.  It must be encountered frequently.

The chief quality I nurtured in programming was to understand how programs work, rather than a desire to build something.  This difference summarizes the main argument of my article, which is that some people prefer to stay in the shallow-end of the pool and to simply understand, while others have a stronger desire to do more than just understand, but to actually build something.

I wonder if those more interested in building more-or-less don’t care what .say is; rather, they just care that it prints what they want to say.  So for a do-er type person, I wonder if they don’t care which paradigm they learn first.  For such a one, maybe Perl 6 is fine for a first programming language.

Finally, your calibration of the accuracy of this article will be skewed if I do not express to you what kind of thinker I am, and what sort of programmer I would be.  I say this because I perceive that I’m much different than the typical pursuer of programming knowledge, and this review may be vastly out-of-touch with the majority.  So we will take a detour on that subject real quick.

From my browsing of the internet, in those places where people that aspire to be programmers come to the watering holes of knowledge (Stackoverflow.com, Quora, Youtube, et cetera), I’ve discovered there seem to be two different kinds of programmers. The first is the stereotypical programmer, what I refer to in this article as the Computer Scientist (CS) — people that math hard, naturally talented, self-taught, can use a reference manual as a primary learning source, and so forth.

The other kind is the lowly Information Technology (IT) group – people that are better with computers than the average population but typically not fans of math, not able to effortlessly “pick up” command prompts like bash or even IRC or reference manuals, and frankly, they just aren’t as into computers as much as the CS guys. This IT group, I believe, is the majority of people interested in programming. One of the most popular questions associated with programming and beginners is, Do you have to be good at math to be a programmer? Notice I say majority of the people INTERESTED. I believe that the vast majority of ACTUAL programmers are the CS-type.

Whether you have a degree or not is not the point I’m getting at.  I chose to use those terms because a recent video I watched (https://www.youtube.com/watch?v=hT7RNAq-zvo) explained the difference between the two collegiate degrees, which was a recent revelation to me.

Anyway, I am the IT kind of programmer.  So it is from this background, Perl6er, that I present to you my thoughts on your programming language’s leading book for novices of programming, Think Perl 6.  So take my opinion for whatever it’s worth.

REVIEW

The very second paragraph of Think Perl 6 is interesting because it touches the heart of this article.  The author states, “The single most important skill for a computer scientist is problem solving.”  I may be completely, laughably naive because I have zilch experience with programming, but I’m hoping for this statement to be wrong.

Working through a first programming book, I believe most beginners wrestle with the question, Am I cut out for programming?  I’ve meditated on this, and I found that the ideal job description for me would be working on existing code that I debug or slightly modify. Certainly, this still requires problem solving, but I think that in computer programming, “problem solving” is associated with math word problems.  In short, I think most of these IT types are turned off by the phrase “problem solving”, regardless of how the author intended it.

THE BOOK IS DESIGNED FOR CS GUYS

Now back to the question, Is the current book for Perl 6 the right book for novices of programming?  For CS type guys, I would say, “Sure.”  The book has about 80 pages of solutions to practice problems, and I think the pace of the book is right up their alley.

So the rest of the article is geared to those IT programmers who need their hands held and need to be “walked through” the book.

Let me start off by saying that I have contacted the author to inform him of my pending review, and he was VERY courteous in his response to me.  He didn’t know this was going to be an Advent post, and neither did I until I chatted with some eager folks on IRC a few days ago.

I don’t wish to lambaste the author; I am merely recording the notes I took while reading the text, without regard to who the author is or what his status is.  I’m focusing on the negative aspects of the book because that’s what invariably gets noted in a book.  But I will say real quick that the author does an outstanding job of summarizing the totality of programming on Page 4, better than anyone else I’ve read (I’ve now read a total of three programming books and two networking books).  He uses comedy well, explained modulus well and taught me something new about it, and presents well some unique strengths of Perl 6: nested conditionals (p.62) and the for range-loop (p.64).  He did an amazing job with a very clean example of recursion on Page 66, and his tips on Incremental Development (p.79) and on debugging (p.106) were new to me and helpful.

TOPICS TAUGHT IN SOLUTIONS

I believe many people new to programming prefer to read the book through without doing any coding.  I know, I know, this is pretty much universally considered to be the wrong thing to do.  Regardless, I think the strongest reason people do so is the question, Am I really cut out to be a programmer?  The fastest way to answer that question is to read through the book quickly, which means skipping the practice problems.

One of the pitfalls of this book, for me, is something the author explains on Page iii: “the solution section of the Appendix also introduces examples of topics that will be covered in the next chapter – and sometimes even things that are not covered elsewhere in the book”.  In other words, some topics are ONLY covered in the practice exercise solutions.  It only recently occurred to me that I could have gone directly to the answer section in the appendix.  But anyway, the material being separated out at the back means many beginners will do what I did and just move on to the next chapter.

CODE SNIPPETS VS. COMPLETE PROGRAMS

I started my journey in programming in the summer of 2017, reading the book C++ How To Program by Deitel.  It was a blessing that I started on this book because, from what I can gather, it seems to be one of the few introduction-to-programming books that presents the reader with COMPLETE programs and their output to the screen and not just code snippets.  And he does this for ALL the chapters.

The output was very important for me to figure out what was going on.  It helps tremendously when you see the value the author entered, and you can use his value to verify that your understanding of the program is correct.

It was when I bought my second C++ book (Stroustrup’s Programming) that I realized that I preferred complete programs, because Stroustrup, like most other authors today, uses only short examples of code (snippets).

I was not able to “absorb” what Stroustrup was explaining even though he was teaching the basics, which I had already learned.  It was and still is discouraging (and fascinating) to me that a certain method seems to be a requirement for me to learn; that is, that the author “holds my hand” and “walks me through the book”.  However, this inability to read most of the computer books I encounter is the main contributor to the question, Am I cut out to be a programmer?

No doubt this crutch of needing complete programs to understand is associated with the route I took – of just reading the book rather than doing exercises.  But I feel there is more to the picture.

I believe some people learn better reading code than writing code.  Or, to put it another way, I believe some people are better equipped to take jobs that require just reading code and modifying it than jobs developing new software.  This leads into the mysterious subject of the various types of programming job descriptions, which seems to be an elusive, hidden topic.  But anyways, on to the review.

THE BOOK LOOKS LIKE IT WAS RUSHED

Page XV describes the font conventions that will be used in the book: italics, bold, c o n s t a n t width, and icons!  A lemur for tips, a crow for general comments, and a scorpion for warnings.  However, these icons were never used in the book, and I never saw a bold word in the entire book, aside from section titles.  I didn’t see constant width used, either.

On Page 32, the author cites proper examples as

> round 42.45, 1;

42

> round 42.45, .1;

42.5

> round(42.45, .1);

42.5

> round( 42.45, .1);

42.5

but forgets a format he himself uses often throughout the book:

round (42.45, .1);

As I worked through the book, I kept wondering if the space was an error or not.  This may seem like nit-picking, but when someone is new to programming, they shouldn’t have to worry about whether there are typos in what they are trying to learn.  This leads to thoughts of whether they bought the right book or not, or whether this programming language community is professional or not, et cetera.

I know what you’re thinking, “Just go type it into a computer, you lazy fool.”  All I can say in my defense is that I grab these books and start reading.  I hate interrupting reading to type in code.

This goes back to what we talked about earlier: a beginner is more likely to want to get through the book quickly and skip coding.  Besides, beginners inevitably will have problems getting editors running, and command prompt issues.  So it’s best to make it easy for them not to need to.  It happened with my first book, as well. C++ How To Program 2nd edition was written around 1998, I think.  It was tough finding an easy-to-install IDE; and then when I tried the code, it was outdated.  (No std::cout back then.)  So I just went back to reading the book, instead of hassling with getting things working so I could code.

The Chapter 7 glossary includes the terms “item” and “slice”, which are not introduced until Chapter 9.  This is frustrating, as a diligent reader will go back and re-read the chapter, assuming that they forgot the material.  And there are other details, which I won’t bore us with.  In summary, I felt the book was done in a hurry.

EXAMPLES WERE NEEDED OF:

the difference between ++variable and variable++ in a loop (p.20)

a function that returns void (p.43)

difference between a function and a function “mutator” (p.43)

what a caller to a function is (p.44). The author assumed the reader knew what a caller was. Page 40 was where the author introduces a code example involving a caller, but the author did not explain the function call.

to show where “if” or “when” needs to be used (p.89)

to show what implementing a dispatch table as an array means (p.198)

CODE NOT EXPLAINED

Author uses a comment in an example on Page 7, but doesn’t explain comments until p.25. This is discouraging because I remember, when I was learning C++, the author carefully explaining each line of code, which was a comfort.

No explanation of: the comma in say 'The answer is ', $value (p.22). I wondered if it was a typo.

necessity of ( ) for a variable representing a no-name function (p.48)

$*IN (p.69)

\n (p.99)

~$0 (p.117)

parentheses in regex /(<[\d.-]>+)/ (p.121)

m: (regex match operator) (p.126)

rx (p.128)

s/ / / operator (p.130)

handle (member-function of class IO) (p.141)

No explanation of the last line of code at the bottom of Page 159. I can’t understand my notes by just glancing at them, but there is some confusion between 1..3 and [1], [2], [3].

Author didn’t explain why $_ is now needed for this example (p.168)

\t (p.189)

colon (p.190).

(Does this colon operator have a name?)

isa (p.240)

atan2 (p.243)

reduction operator (p.207)

THIS IS NOT A GENERAL BOOK ABOUT PROGRAMMING

I disagree with the author’s opening paragraph where he says, “this book is less about Perl 6, and more about learning how to write programs for computers.”

The author doesn’t mention anything in the book about the most fundamental element of good software design, which is the Principle of Least Privilege.

The author doesn’t introduce the term “identifier” (p.15 would be the area to do it) which is very important in a world of abstraction and names.

Author doesn’t explain what typed variables are and how they, not scalar variables, are used in most coding languages (p.16).

The author does not differentiate between assigning and initializing a variable (p.19). Initializing first gets mentioned on Page 97.

“Script” is defined as a file, rather than a small program (p.21).

Author does not tell reader what I/O stands for (p.139). And carriage return is never explained (p.141).

ODDS AND ENDS

Confusing word choice for a novice: “non-alphanumerical” vs. “a symbol”, bijective (p.193).

Author talked about the necessity of declaring a variable before you can use it (p.16), but in the first example that has two variables (p.23), he doesn’t remind the reader by demonstrating a declaration. It is assumed the variables are declared, which is confusing to the beginner, as he never stated that this was assumed.

Author doesn’t tell reader how to pronounce ~ (p.24), and same thing with & (p.47).

Type conversion not explained (p.33).

Explanation of flow of execution was much too short (p.39).

Author doesn’t tell reader what things stand for: is rw

Author doesn’t explain first-class citizens (p.47).

Author gives the reader the code first, then explains later: p.57, p.64, p.109-110, p.121. I prefer the explanation first, then the code, because otherwise I stay at that code, focusing intensely and wondering whether I am forgetting something I read earlier that would explain my confusion.

Significant typo on Page 161. Correct output reads: [1 2 3 4 5 6 7 8 9].

Index and glossary were not immaculate: twigil not in Ch. 9 glossary; index does not include $/ (p.118), .. (p.64), => (p.184), <=> (p.173); “data structure” lists only 221 (correction: p.1, 145*, 157, 178, 221); “pointy block” needs pages 113 and 70 included; “state” needs p.198 included; topical variable needs pages 88 and 113 included.

CONCLUSION

I want to thank the author for writing this book. Without it I wouldn’t have met this nice Perl 6 community, and I also wouldn’t have been introduced to this Declarative Programming paradigm that looks so interesting. I read the book while also reading other computer science books and learning C, so my brain is a mess. However, the more I learn about computing, the more Perl 6 looks like a sleeper car. You know, in street racing, how people have cars that look slow but are actually very fast.

And finally, if there’s going to be a throwing-in of possible Perl 6 aliases, I’m submitting 6lerp (pronounced slurp)!  Stay warm out there folks, both inside and out.

-COMBORICO1611