One of the great things about Perl 6 is how accessible the compiler is for regular users to hack on. Easy bugs require nothing but knowledge of Perl 6 itself, since a lot of the Rakudo compiler is written in Perl 6. Slightly tougher bugs require knowledge of NQP, and tougher still are bugs involving Grammar and Actions. Things progress further in difficulty from there, going as far as assembly hacking on the VM level, but today, we’ll stick around in Rakudo land. We have a task at hand!
Santa is having some difficulties generating his Naughty-or-Nice list due to a bug in Perl 6. He traced it down to the use of the S/// substitution operator with the :g modifier, which for some reason returns an empty list instead of the original string if no matches were made:
say S:g/naughty// with 'only nice list';
# OUTPUT: ()
Time to dig in and fix this, if anyone is to get any presents this year!
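For contrast, here is a small sketch of how the non-global S/// is expected to behave. It returns the resulting string and leaves the topic untouched, and when nothing matches it hands back the original string. The variable names are made up for illustration.

```perl6
# S/// returns the new string; the topic variable is left alone.
my $list = 'naughty list';
my $new  = S/naughty/nice/ given $list;
say $new;    # nice list
say $list;   # naughty list

# With no match, the non-global form gives back the original string.
say S/naughty// with 'only nice list';   # only nice list
```

That last line is the behaviour Santa wants from the :g variant as well.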
The Bots To The Rescue
The first thing to do when fixing a bug is to find out when it first appeared. AlexDaniel++ and MasterDuke++ implemented several IRC bots that make this task extremely simple. They are available in the #perl6 and #perl6-dev IRC channels, and you can play with them in the #zofbot IRC channel without annoying anyone. We’ll be using the bisectable6 bot to find out when the S/// operator got broken:
<Zoffix> bisectable6, help
<bisectable6> Zoffix, Like this: bisectable6: old=2015.12 new=HEAD exit
1 if (^∞).grep({ last })[5] // 0 == 4 # RT128181
Just give the bot a piece of code, optionally specifying the starting and ending commits, and it’ll bisect by either the exit code, or failing at that, by output.
<Zoffix> bisectable6, S:g/d// given 'abc'
<bisectable6> Zoffix, Bisecting by output (old=2016.10 new=524368c)
because on both starting points the exit code is 0
<bisectable6> Zoffix, bisect log:
https://gist.github.com/c2cf9c3a7b6d13a43c34f64b96090e31
<bisectable6> Zoffix, (2016-10-23)
https://github.com/rakudo/rakudo/commit/b7201a8f22338a906f2d8027a21387e8f5c77f41
The last link is the interesting bit: it tells us S:g/// was working fine until that commit. The commit does seem related (it’s the refactor lizmat++ did to make .match 150%–1400% faster), but it’s quite big and it’s not obvious how it’s linked to the workings of the S/// operator. Let’s find out, shall we?
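If you’re curious what bisecting boils down to, here is a minimal sketch of the idea. It is not the bot’s actual implementation; the sub name, the commit labels, and the "is it bad?" check are all hypothetical. It assumes the list is ordered oldest to newest, the first entry is known-good, and the last is known-bad.

```perl6
# Hypothetical sketch of bisection; not how bisectable6 is really written.
sub first-bad-commit(@commits, &is-bad) {
    my $good = 0;              # index of a commit known to work
    my $bad  = @commits.end;   # index of a commit known to be broken
    while $bad - $good > 1 {
        my $mid = ($good + $bad) div 2;
        if is-bad(@commits[$mid]) { $bad = $mid } else { $good = $mid }
    }
    @commits[$bad]
}

# Made-up history: pretend the regression landed in c5.
my @commits = <c0 c1 c2 c3 c4 c5 c6 c7>;
say first-bad-commit(@commits, -> $c { $c ge 'c5' });   # c5
```

Each round halves the range, so even thousands of commits need only a dozen or so builds to check.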
How Do You Spell That?
We can specify the --target command line argument to the perl6 executable to ask it for the output of a particular stage of the program (run perl6 --stagestats -e '' to see names of all stages). Let’s output the parse stage, to find out which tokens in the Grammar we should look into:
zoffix@VirtualBox:~/CPANPRC/rakudo$ ./perl6 --target=parse -e 'S:g/d//'
- statementlist: S:g/d//
- statement: 1 matches
- EXPR: S:g/d//
- value: S:g/d//
- quote: S:g/d//
- sym: S
- rx_adverbs: :g
- quotepair: 1 matches
- identifier: g
- sibble: /d//
- right:
- babble:
- B:
- left: d
- termseq: d
- termaltseq: d
- termconjseq: 1 matches
- termalt: 1 matches
- termconj: 1 matches
- termish: 1 matches
- noun: 1 matches
- atom: d
There are some general tokens in the output, such as statementlist, statement, EXPR, and value, that we can just gloss over. They are about statements and we want stuff for the operator itself, so the interesting bit starts with this:
- quote: S:g/d//
- sym: S
- rx_adverbs: :g
- quotepair: 1 matches
- identifier: g
Let’s pop open the Grammar in our text editor and locate a token called quote. It can also be a rule, regex, or method, but tokens are most common. The first thing we can locate is this:
proto token quote { <...> }
token quote:sym<apos> {
:dba('single quotes') "'" ~ "'"
<nibble(self.quote_lang(%*LANG<Quote>, "'", "'", ['q']))>
}
token quote:sym<sapos> {
:dba('curly single quotes') "‘" ~ "’"
<nibble(self.quote_lang(%*LANG<Quote>, "‘", "’", ['q']))>
}
The Grammar that parses Perl 6 isn’t much different from the grammar you’d use as a user of Perl 6, so most of it probably looks familiar to you. The quote token is a proto regex, so looking further at the output --target=parse gave us, we see we need the :sym<S> variant of it.
Scrolling a bit through the candidates of quote, we finally come across :sym<s>, which sets the <sym> capture to either s or S:
token quote:sym<s> {
<sym=[Ss]> (s)**0..1
:my %*RX;
:my $*INTERPOLATE := 1;
:my $*SUBST_LHS_BLOCK;
:my $*SUBST_RHS_BLOCK;
{
%*RX<s> := 1 if $/[0]
}
<.qok($/)>
<rx_adverbs>
<sibble(%*RX<P5> ?? %*LANG<P5Regex> !! %*LANG<Regex>, %*LANG<Quote>, ['qq'])>
[ <?{ $<sibble><infixish> }> || <.old_rx_mods>? ]
}
So this token handles both the s/// and S/// operators and its body is unimpressive: it seems to be no more than some setup work. With the name of the token in hand, we now know what to look for in the Actions: method quote:sym<s>.
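The same pairing of a proto token’s :sym<...> candidates with identically named action methods works in user-land grammars too. Here is a tiny sketch; the Calc grammar and CalcActions class are invented for illustration and have nothing to do with Rakudo’s own Grammar.

```perl6
grammar Calc {
    token TOP { <num> <op> <num> }
    token num { \d+ }
    proto token op {*}
    token op:sym<+> { <sym> }   # <sym> matches the text inside :sym<...>
    token op:sym<*> { <sym> }
}

class CalcActions {
    # Action methods share their names with the tokens they handle.
    method op:sym<+>($/) { make &infix:<+> }
    method op:sym<*>($/) { make &infix:<*> }
    method TOP($/) { make $<op>.made.(+$<num>[0], +$<num>[1]) }
}

say Calc.parse('7*3', :actions(CalcActions)).made;   # 21
```

Whichever candidate of op matches decides which action method runs, which is why knowing the token name is enough to find the right method in the Actions.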
While finding it in actions is easy… it’s quite a biggie, with 177 lines of code to its name. However, someone nice left us a comment that fits into our puzzle:
method quote:sym<s>($/) {
# We are emulating Str.subst/subst-mutate here, by calling match,
# assigning the result to a temporary variable etc.
...
Recall the results from bisectable6? The commit it pointed out was work on the .match method and, according to the comment, the S/// operator uses .match to do its stuff. Let’s execute that method on builds before and after the commit bisectable6 found for us. There’s another handy bot to do that for us: committable6.
Give it a commit SHA or one of the release tags along with code to run and it’ll give you output for that code on that commit:
<Zoffix> committable6, 2016.11 say 'abc'.match: :g, /d/
<committable6> Zoffix, ¦«2016.11»: ()
<Zoffix> committable6, 2016.10 say 'abc'.match: :g, /d/
<committable6> Zoffix, ¦«2016.10»: ()
We ran the code on the 2016.11 and 2016.10 releases and the output indicates there’s no difference… or is there? The problem with using say as a debugging tool is that it often omits things we may find useful. A better alternative is the dd routine, a Rakudo-specific utility dumper sub that’s not part of the standard Perl 6 language. Give it some args and it’ll dump them out. Let’s give it a spin:
<Zoffix> committable6, 2016.11 dd 'abc'.match: :g, /d/
<committable6> Zoffix, ¦«2016.11»: slip()
<Zoffix> committable6, 2016.10 dd 'abc'.match: :g, /d/
<committable6> Zoffix, ¦«2016.10»: ()
Aha! Another puzzle piece! When the :g adverb is in use, on failed matches .match used to return an empty list, but after lizmat++’s .match improvements, it started to return Empty, which is an empty Slip. Slips tend to flatten themselves out into the outer container, so perhaps that’s causing an issue in the action method of S///? Let’s take a closer look at it.
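To see why say hid the difference, and what "flatten themselves out" means in practice, here is a short sketch comparing an empty List with Empty. It only uses built-ins, so it should run on any reasonably recent Rakudo.

```perl6
# Both print the same thing, which is why `say` could not tell them apart.
say ().gist;       # ()
say Empty.gist;    # ()

# The types differ, though.
say ().^name;      # List
say Empty.^name;   # Slip

# A Slip dissolves into the list around it; an empty List stays an element.
say (1, (),    2).elems;   # 3
say (1, Empty, 2).elems;   # 2
```

An empty Slip placed inside another list simply vanishes, which is a good hint about what might go wrong when code expects a list with something in it.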
Slippety Slip
A bird’s-eye view of the method quote:sym<s> action shows it does some setup work and then codegens a QAST (“Q” Abstract Syntax Tree). It’d be helpful to take a look at what it generates.
One method of doing so is using the same --target feature we’ve used to get the parse stage, except we’d use the ast stage. So the command would be this:
perl6 --target=ast -e 'S:g/d//'
If you actually run that, you’ll get a text wall of QAST, and it may be tough to spot which bits are actually generated by the S/// operator. Luckily, there’s a better way! The QAST node objects have a .dump method that dumps them in the same style as what you see in the --target=ast output. So check out the compiler’s repo if you haven’t already done so, pop open the src/Perl6/Actions.nqp file, go to the end of method quote:sym<s>, and stick note($past.dump) in there to print the dump of the QAST generated for the S/// operator:
...
);
$past.annotate('is_S', $<sym> eq 'S');
note($past.dump); # <----------------------- like that
make WANTED($past, 's///'); # never carp about s/// in sink context
}
(Why is it called $past and not $qast? Historical reasons: QAST used to be PAST, for Parrot Abstract Syntax Tree.)
Now, compile Rakudo:
perl Configure.pl --gen-moar --gen-nqp --backends=moar
make
make test
make install
And execute our buggy S/// match to make the line we added print out the QAST for S///:
zoffix@VirtualBox:~/CPANPRC/rakudo$ ./perl6 -e 'S:g/d// given "abc"'
- QAST::Op(locallifetime) :is_S<?> S:g/d//
- QAST::Stmt
- QAST::Var(local subst_result_1 :decl(var))
- QAST::Op(bind)
- QAST::Var(local subst_result_1)
- QAST::Op(callmethod match) S:g/d//
- QAST::Var(lexical $_) <wanted>
- QAST::WVal(Regex) :code_object<?> :past_block<?>
- QAST::IVal+{QAST::SpecialArg}(1 :named<g>)
- QAST::Op(p6store)
- QAST::Op(call &infix:<,>)
- QAST::Var(lexical $/)
- QAST::Var(local subst_result_1)
- QAST::Op(if)
- QAST::Op(unless)
- QAST::Op(istype)
- QAST::Var(local subst_result_1)
- QAST::WVal(Match)
- QAST::Op(if)
- QAST::Op(istype)
- QAST::Var(local subst_result_1)
- QAST::WVal(Positional)
- QAST::Op(callmethod elems)
- QAST::Var(local subst_result_1)
- QAST::Op(call &infix:<=>)
- QAST::Var(lexical $/) <wanted>
- QAST::Op(callmethod dispatch:<!>)
- QAST::Op(callmethod Str)
- QAST::Var(lexical $_) <wanted>
- QAST::SVal(APPLY-MATCHES)
- QAST::WVal(Str)
- QAST::Var(local subst_result_1)
- QAST::Op(p6capturelex) :code_object<?> :past_block<?>
- QAST::Op(callmethod clone)
- QAST::WVal(Code) :code_object<?> :past_block<?>
- QAST::Var(lexical $/)
- QAST::IVal(1)
- QAST::IVal(0)
- QAST::IVal(0)
- QAST::IVal(0)
- QAST::IVal(0)
- QAST::Op(p6store)
- QAST::Op(call &infix:<,>)
- QAST::Var(lexical $/)
- QAST::Var(lexical $_) <wanted>
- QAST::Stmt
- QAST::Var(lexical $/)
There are docs for types of QAST you see here, or we can just wing it.
We callmethod match and bind the result to subst_result_1:
- QAST::Var(local subst_result_1 :decl(var))
- QAST::Op(bind)
- QAST::Var(local subst_result_1)
- QAST::Op(callmethod match) S:g/d//
- QAST::Var(lexical $_) <wanted>
- QAST::WVal(Regex) :code_object<?> :past_block<?>
- QAST::IVal+{QAST::SpecialArg}(1 :named<g>)
We call nqp::p6store (p6* ops are documented in Rakudo’s repo), giving it the result of infix:<,>($/) as the container and the return value of the .match call as the value:
- QAST::Op(p6store)
- QAST::Op(call &infix:<,>)
- QAST::Var(lexical $/)
- QAST::Var(local subst_result_1)
We check if anything matched (for :g matches, we check for a Positional that has any .elems in it):
- QAST::Op(if)
- QAST::Op(unless)
- QAST::Op(istype)
- QAST::Var(local subst_result_1)
- QAST::WVal(Match)
- QAST::Op(if)
- QAST::Op(istype)
- QAST::Var(local subst_result_1)
- QAST::WVal(Positional)
- QAST::Op(callmethod elems)
- QAST::Var(local subst_result_1)
If we did have matches, call Str!APPLY-MATCHES:
- QAST::Op(call &infix:<=>)
- QAST::Var(lexical $/) <wanted>
- QAST::Op(callmethod dispatch:<!>)
- QAST::Op(callmethod Str)
- QAST::Var(lexical $_) <wanted>
- QAST::SVal(APPLY-MATCHES)
- QAST::WVal(Str)
- QAST::Var(local subst_result_1)
- QAST::Op(p6capturelex) :code_object<?> :past_block<?>
- QAST::Op(callmethod clone)
- QAST::WVal(Code) :code_object<?> :past_block<?>
- QAST::Var(lexical $/)
- QAST::IVal(1)
- QAST::IVal(0)
- QAST::IVal(0)
- QAST::IVal(0)
- QAST::IVal(0)
If we didn’t have matches, call nqp::p6store, storing $_ (the original string S/// works on) in $/:
- QAST::Op(p6store)
- QAST::Op(call &infix:<,>)
- QAST::Var(lexical $/)
- QAST::Var(lexical $_) <wanted>
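Put together, the control flow of that QAST can be paraphrased in plain Perl 6 roughly like this. It is only a sketch: the sub name is made up, .subst stands in for the private Str!APPLY-MATCHES, and ordinary assignment stands in for nqp::p6store, so this version does not reproduce the bug.

```perl6
# Paraphrase of the generated QAST's control flow; not Rakudo's actual code.
sub emulate-S-g(Str $topic, Regex $rx, Str $replacement) {
    # callmethod match, bound to a temporary (subst_result_1 in the dump)
    my $result := $topic.match($rx, :g);
    my $slash = $result;                       # the first p6store

    # unless(istype Match, if(istype Positional, elems))
    if $result ~~ Match || ($result ~~ Positional && $result.elems) {
        # stand-in for the Str!APPLY-MATCHES call
        $slash = $topic.subst($rx, $replacement, :g);
    }
    else {
        $slash = $topic;                       # the second p6store
    }
    $slash                                     # final QAST::Var(lexical $/)
}

say emulate-S-g('abc', /b/, 'X');   # aXc
say emulate-S-g('abc', /d/, '');    # abc
```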
Since we know the commit bisectable6 found makes .match return an empty slip for failed matches, it’s that last bit of QAST that should look suspicious, since slips flatten themselves out. We’ll return to why we’re storing into &infix:<,>($/) rather than into $/ directly, but first, let’s write the NQP equivalent of such a setup.
We have two variables: $/ with Empty and $_ with our original string. The QAST::Op node maps out to an nqp op with the same name, so our suspicious bit looks something like this:
use nqp;
$_ = 'abc';
$/ = Empty;
nqp::p6store( &infix:<,>($/), $_);
Yet another helpful bot, camelia, lets us run a piece of code straight from IRC. Just use the m: trigger with some code. Let’s try it out:
<Zoffix> m: use nqp; $_ = 'abc'; $/ = Empty;
nqp::p6store( &infix:<,>($/), $_); dd $/;
<camelia> rakudo-moar ea2884: OUTPUT«Slip $/ = slip$()»
<Zoffix> m: use nqp; $_ = 'abc'; $/ = List.new;
nqp::p6store( &infix:<,>($/), $_); dd $/;
<camelia> rakudo-moar ea2884: OUTPUT«Str $/ = "abc"»
The results show that when $/ is Empty, it still is after the p6store, while if $/ is an empty List, it happily takes a string. We finally connected the S/// operator with the commit that introduced the bug and found why it occurs (although slips behaving like that may be a bug of its own). Let’s trace where that Empty in Str.match comes from and why it’s there.
What Sourcery Is This?
There’s another bot (it’s the future! people have lots of bots!), SourceBaby, that can give you a link to source code for a routine. It uses the CoreHackers::Sourcery module under the hood and takes arguments to give to its sourcery routine. Trigger it with the s: trigger:
<Zoffix> s: 'abc', 'match', \(/d/, :g)
<SourceBaby> Zoffix, Sauce is at
https://github.com/rakudo/rakudo/blob/164eb42/src/core/Str.pm#L946
We gave it an object to call a method on (a Str), a string with the method name (match), and a Capture with the arguments the method is to be called with. In return, it gave a URL to the multi that handles those args:
multi method match(Regex:D $pattern, :global(:$g)!, *%_) {
nqp::if(
nqp::elems(nqp::getattr(%_,Map,'$!storage')),
self!match-cursor(nqp::getlexcaller('$/'),
$pattern($cursor-init(Cursor,self,:0c)), 'g', $g, %_),
nqp::if(
$g,
self!match-list(nqp::getlexcaller('$/'),
$pattern($cursor-init(Cursor,self,:0c)),
CURSOR-GLOBAL, POST-MATCH),
self!match-one(nqp::getlexcaller('$/'),
$pattern($cursor-init(Cursor,self,:0c)))
)
)
}
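If no bot is around, plain introspection gets you similar information. The sketch below is not how SourceBaby or CoreHackers::Sourcery are implemented internally (treat any resemblance as an assumption); it just asks the method which candidates can handle a given Capture and where they were defined. The exact file and line values depend on your Rakudo build, so no output is shown.

```perl6
# The Capture includes the invocant, followed by the method's arguments.
my @candidates = 'abc'.^find_method('match').cando: \('abc', /d/, :g);

for @candidates -> $candidate {
    say $candidate.signature.gist;                 # which multi this is
    say "{$candidate.file}:{$candidate.line}";     # where it was defined
}
```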
No Empty here, but we can see that when $g is true, we call self!match-list. It’s a private method, so SourceBaby would not be able to help with it. Let’s find it by searching the same source file:
# Create list from the appropriate Sequence given the move
method !match-list(\slash, \cursor, \move, \post) {
nqp::decont(slash = nqp::if(
nqp::isge_i(nqp::getattr_i(cursor,Cursor,'$!pos'),0),
Seq.new(POST-ITERATOR.new(cursor, move, post)).list,
Empty,
))
}
And there’s our Empty! The commit message doesn’t mention why we changed from an empty List to an Empty, and there are no comments in the source code explaining it, so we’ll have to resort to the most technologically non-advanced debugging tool in our arsenal… asking people.
The Dev IRC Channel
If you have questions about core development, join the #perl6-dev IRC channel on Freenode. In this case, we can ask lizmat++ if she remembers whether there was a reason for that Empty.
If the person you’re trying to reach isn’t currently online, you can use the messaging bot, using the .tell or .ask trigger, followed by the person’s nick, followed by the message. When the bot sees the person talk, it will deliver the message.
<babydrop> .ask stmuk_ so is `zef` now the installer being
shipped with R*? I notice our REPL message still
references panda; wondering if that should read zef now
<yoleaux2> babydrop: I'll pass your message to stmuk_.
After the discussion about the Empty, there doesn’t appear to be any specific reason to return it in this case, so we’ll change it to return an empty List instead, just as it used to, and that will also fix our bug. The new !match-list then looks like this:
method !match-list(\slash, \cursor, \move, \post) {
nqp::decont(slash = nqp::if(
nqp::isge_i(nqp::getattr_i(cursor,Cursor,'$!pos'),0),
Seq.new(POST-ITERATOR.new(cursor, move, post)).list,
List.new,
))
}
Compile the compiler; this time we can just run make install, since everything is already configured and pre-built from the last time we compiled:
make install
Check the change did fix the bug:
zoffix@VirtualBox:~/CPANPRC/rakudo$ ./perl6 -e 'say S:g/naughty// with "only nice list"'
only nice list
And run the test suite:
TEST_JOBS=6 make spectest
The TEST_JOBS env var lets you run multiple test files at once, and the optimal value to set it to is around 1.3 times the number of cores in your computer. If you have a very meaty box (or endless patience), you can run make stresstest instead, for a more thorough test run.
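If you’d rather not do the arithmetic in your head, here is a tiny sketch of the 1.3-times-cores rule of thumb. The sub name is made up, and it assumes your Rakudo is recent enough to provide $*KERNEL.cpu-cores.

```perl6
# Hypothetical helper: roughly 1.3 jobs per core, and never fewer than one.
sub test-jobs(Int $cores) { max 1, ($cores * 1.3).round }

say test-jobs(4);   # 5
say test-jobs(8);   # 10

# Prints a ready-to-paste command line for the machine it runs on.
say "TEST_JOBS={test-jobs($*KERNEL.cpu-cores)} make spectest";
```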
With the spectest passing all of its tests, we are ready to finish off our work.
Test It!
The test suite is located in t/spec and is automatically checked out from its repo when you run make spectest. You can also simply delete that directory and clone your own fork as t/spec instead.
Usually, it’s easy to locate the file the test can go into by running tree -f | grep 'some search term'. We fixed an issue with substitutions, so let’s go for this:
zoffix@VirtualBox:~/CPANPRC/rakudo/t/spec$ tree -f | grep subst
│ ├── ./integration/substr-after-match-in-gather-in-for.t
├── ./S05-substitution
│ ├── ./S05-substitution/67222.t
│ ├── ./S05-substitution/match.t
│ ├── ./S05-substitution/subst.rakudo.moar
│ └── ./S05-substitution/subst.t
│ ├── ./S32-str/substr-eq.t
│ ├── ./S32-str/substr-rw.rakudo.moar
│ ├── ./S32-str/substr-rw.t
│ ├── ./S32-str/substr.t
The ./S05-substitution/subst.t file looks like a decent candidate, so pop it open. Bump the plan at the top of the file by the number of tests you’re adding, then add the test at the end of the file (or in a more appropriate spot):
plan 185;
...
is-deeply (S:g/FAIL// with 'foo'), 'foo',
'S:g/// returns original string on failure to match';
Run the test file, to ensure everything passes:
make t/spec/S05-substitution/subst.t
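If you want to try the new test in isolation before touching the real test file, a throwaway script along these lines works. The two extra tests and their descriptions are my own additions for illustration, not part of the actual commit; on a build that still has the bug, the second test is the one that fails.

```perl6
use Test;
plan 3;

is-deeply (S/FAIL//   with 'foo'), 'foo',
    'S/// returns original string on failure to match';
is-deeply (S:g/FAIL// with 'foo'), 'foo',
    'S:g/// returns original string on failure to match';
is-deeply (S:g/o/0/   with 'foo'), 'f00',
    'S:g/// still substitutes every match';
```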
And commit! We’re done! Santa’s Naughty-or-Nice list shall work fine from now on.
The Final Mystery
Recall that &infix:<,>($/) thing that was causing the bug when $/ contained an Empty? So what is that all about?
If you don’t know something about Perl 6, just come to our #perl6 IRC channel and ask. This is what I did when I couldn’t understand the purpose of that infix thing and after a long discussion, finding old bug tickets, and testing old bugs… we came to the conclusion these are no longer needed here!
So along with a bug fix, we also cleaned up codegen. At least that’s the theory; perhaps by doing so we created another bug that will send us on yet another great hunting journey.
Conclusion
It’s easy to give a helping hand to the core developers of Perl 6 by fixing some of the bugs. Starting from easy things that require nothing more than knowledge of Perl 6, you can progressively learn more about the internals and fix tougher problems.
The perl6 compiler comes with a number of useful output options like --target=ast and --target=parse that can aid in debugging. An army of IRC bots makes it easy to navigate source code both in space and time, by either giving you a link to an implementation or producing output of some particular commit.
Lastly, a very valuable resource we have available is the people of Perl 6, who can help you out, whether you’re digging deep into the guts of the compiler or just starting out with computer programming.
Join us. We have… bugs to fix!