
Day 21 – Community smoke testing

December 21, 2014

After all, we are good programmers who know their stuff, right? I know that I should test the code I write, and I tend to do that for every change I make. At work I often just run a particular script that tests the new feature or the absence of a bug, but these test scripts eventually get added to a test suite at the end of the day. But even if you are a better programmer than me and write your tests beforehand, there is one problem left:

Assumptions

We assume quite a lot when we write or test our programs. We know how the program is meant to work, and that knowledge is precisely our problem, one that is hard to escape from. We assume that someone will enter a valid date in a date field; we expect positive numbers when it comes to quantities. And we usually test our programs on a given set of platforms, often just the one PC or Mac we’re sitting at.

One important rule about test suites is that you should try to forget about what Certainly Should Work™. Test built-ins, because these can change over time. Also query the infrastructure if other tests rely on specific things. Test the environment in as much detail as you can, because if something fails early, you save time debugging the problem. This is worth even more when you get reports from systems you don’t have access to: debugging a very high-level, late failure in a test suite on a strange or simply unfamiliar platform can be frustrating.

So we want to minimize the assumptions we talked about. What’s the opposite of assumption? Knowledge, built from experience, which itself is built from input. That’s the approach we will take; I’ll come back to it a little later.

Back to what we can do for you. Sadly, we cannot do much directly about your expectations and assumptions when it comes to your test suite. We don’t have a framework that pushes garbage input data at your scripts, but we can test your program in different ways. And we can hopefully provide feedback on recent changes to your code base very quickly, though that will only work out when there are testers who test your distribution often.

Personally I code using the most bleeding-edge Rakudo compiler with the bleeding edge of the MoarVM backend, on an Ubuntu Linux box. That is quite a specific scenario. There are a lot of different setups out there, and even if your code is not platform specific, some library, perhaps a dependency of a dependency, surely is.

Generate a lot of input, to be turned into experience and knowledge

How are we going to achieve that? Of course we have to intercept the build and test stages that happen when a random user installs our distributions. That user can help us gather reports by doing:

# on unixes:
PANDA_SUBMIT_TESTREPORTS=1 panda install Foo

# on Windows:
set PANDA_SUBMIT_TESTREPORTS=1
panda install Foo

One can also set and export this environment variable permanently, so that every attempt to install a distribution will be reported and the dist author can take care of upcoming problems.

The great benefit of gathering information this way, in contrast to smoke testing the ecosystem on a central box, is of course that we get reports from a wide variety of operating systems, compiler versions, locales and dependencies, be they other Perl 6 distributions or C libraries.

A test report made by panda will look somewhat like this:

[Screenshot: sample test report]

The last section in fact contains the entire report. You’ll see information about the operating system, the kernel, the backend used by Rakudo, flags showing how that backend was built, and also some information about the tested distribution. I think it would be worthwhile to extend that information so that certain meaningful environment variables are included as well; I am thinking of vars like MVM_JIT_DISABLE that have an impact on how the test programs are executed.

Once we receive reports like this, we build stats that highlight how well the distribution is doing. One of them is a matrix that shows the pass rate across compiler version, operating system and backend.

[Screenshot: platform/version matrix]

If you sent a report for a platform that is not listed there yet (like GNU/Hurd), the list would automatically expand when the matrix gets regenerated; as of today that is done every five minutes. The colored bars represent the three backends: MoarVM is the topmost, JVM is in the middle and Parrot is at the bottom. The color coding is:

  • green – The tests passed, which also means that the test stage exited with code 0.
  • orange – This usually means that we did not run any tests, but besides that everything ran cleanly.
  • red – Something went wrong, either in the build stage or the test stage.

You can see that the list of compiler versions can grow pretty quickly, and I think the best solution is to collapse the development releases into a single line and make it expand on click. On this matrix all the shown releases are dev releases, recognizable by the trailing partial SHA-1.

So now imagine these stats are about your distribution. You see red flashing lights that should spur you into action, and you quickly take a look at all the negative test results. Some of the reports might reveal pretty quickly what is wrong. Maybe you’ve forgotten to add a dependency to the META file shipped with the distribution. But there might be other failing tests where you don’t have a clue what actually went wrong, or where to start poking.

The good thing is that testers.perl6.org is not a tool that only works in one direction. Sure, the user runs your tests and the results travel in your direction. But remember that you are the master of the tests in question. If you are in doubt about the cause of a bug, extend the test suite and wait for more reports to arrive carrying the output of your recently added diagnostics.

That’s where the loop closes on the things I said earlier. Wipe everything from your mind that you *think* applies to the box where the tests failed. Test everything that is involved in the failing functionality. And if you have to test the built-ins of the Perl 6 compiler, do it. If you are pedantic and suspicious enough, you’ll get a great test suite that offers a lot of information for upcoming test failures, too.
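To make that concrete, here is a minimal sketch of what such a pedantic, environment-reporting test file could look like. It is my own illustration, not code from testers.perl6.org; the checks are just examples of testing things that Certainly Should Work™:

use v6;
use Test;

plan 2;

# Ship the environment details with every report.
diag "compiler: " ~ $*PERL.compiler.name ~ ' ' ~ $*PERL.compiler.version;
diag "VM:       " ~ $*VM.name;
diag "distro:   " ~ $*DISTRO.name;

# Don't trust the built-ins you rely on; test them.
ok Date.new('2014-12-21').defined, 'ISO date parsing works';
is 2 ** 62, 4611686018427387904, 'big integer arithmetic behaves';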

Sidenote: In the Perl 5 community it is quite usual that there are volunteers who run tests for distributions every day, around the clock. I hope that we’ll have such awesome volunteers too in the near future. So if you have boxes that run 24/7, a weird architecture or a rare operating system, please run tests to point the other developers in the right direction.

Forecast

I have plenty of ideas for how this tool can be improved. A connection to issue trackers like GitHub Issues is one of my favourites, and it is also listed here: TODO of testers.perl6.org
If you are interested in helping out, the repository of testers.perl6.org is here, and I’d be pleased to accept pull requests or hand out commit bits.

Day 16 – Slangs

December 16, 2013
use v6;
my $thing = "123abc";
say try $thing + 1; # this will fail

{
    use v5;
    say $thing + 1 # will print 124
}

Slangs are pretty interesting things in natural languages, so naturally they will be pretty awesome in computer languages as well. Without them, cross-language communication is like talking through a thin pipe, as it is when calling C functions. It does work, but calling functions is neither the only nor the most comfortable thing out there.

The example above shows that we create a variable in Perl 6 land and use it in a nested block, which derives from another language. (This only works if the nested language is capable of handling the syntax, a dollar-sigilled variable in this case.)
We use this other language on purpose: it provides a feature that we need to solve our task.
I hope that slangs will pop up not just to provide functionality to solve a given problem, but also to help write the code in a way that fits the nature of said problem.

How does that even work?

The key is that the module that lets you switch to the slang provides a grammar and appropriate action methods. That is no different from how Perl 6 itself is implemented, or how JSON::Tiny works internally.
The grammar will parse all statements in our nested block, and the actions are there to translate the parsed source code (text) into something a compiler can handle better: usually abstract operations in the form of a tree, called an AST.

The v5 slang compiles to QAST, which is the name of the AST that Rakudo uses. The benefit of that approach is that this data structure is already known to the guts of the Rakudo compiler. So our slang just needs to care about translating the foreign source code text into something known. The compiler then takes this AST and maps it to something the underlying backend understands.
So it does not matter whether we’re running on Parrot, on the JVM or something else; the slang’s job is done once it has produced the AST.
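As a rough illustration of the pattern, here is a toy, self-contained grammar with action methods that turns source text into a small tree. This is not v5’s actual code, just the same mechanism in miniature:

grammar TinyAdd {
    rule  TOP { <num> '+' <num> }
    token num { \d+ }
}

class TinyAddActions {
    # Each action method attaches a "made" value to its match object.
    method TOP($/) { make ['call &infix:<+>', $<num>[0].made, $<num>[1].made] }
    method num($/) { make +$/ }
}

say TinyAdd.parse('3 + 4', :actions(TinyAddActions.new)).made;
# prints: [call &infix:<+> 3 4]

A real slang does the same job at a much larger scale; its action methods emit QAST nodes instead of little arrays.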

A slang was born.

In March this year at the GPW2013, I felt the need for something that glues Perl 6 and Perl 5 together. There were many nice people who shared this urge, so I investigated how to do it.
Then I found a Perl 5 parser in the std repository. Larry Wall took the Perl 6 parser years ago and modified it to conform to Perl 5. The Perl 6 parser it is based on is the very same one that Rakudo is built upon. So the first step was to take this Perl 5 grammar, take the action methods from Rakudo, and try to build something that compiles.
(In theory this is all we needed: grammar + actions = slang.)

I can’t quite remember whether it took one week or two, but then there was a hacked Rakudo that could say “Hallo World”. And it already insisted on putting parens around conditions, for example, which might be the first eye-catcher for anyone looking at both languages.
Since then there has been a lot of progress in merging in Perl 5’s test suite, implementing and fixing things, and making it a module rather than a hacked standalone Rakudo-ish thing.

Today I can proudly say that it passes more than 4600 of roughly 45000 tests. These 4600 passing tests are enough that you can play with it and feed it simple Perl 5 code. But the main work for the next weeks and months is to provide the core modules, so that you can actually use a module from CPAN. Which, after all, was the main reason to create v5.

What is supported at the moment?

  • all control structures like loops and conditions
  • functions like shift, pop, chop, ord, sleep, require, …
  • mathematical operations
  • subroutine signatures that affect parsing
  • pragmas like vars, warnings, strict
  • core modules like Config, Cwd and English

The main missing pieces that hurt are loop labels and a few other constructs. Loop labels for next LABEL, redo LABEL and last LABEL will land soon in Rakudo and v5; the other missing parts will take their time but will happen :o).

The set goals of v5:

  • write Perl 5 code directly in Perl 6 code, usually as a closure
  • allow Perl 6 lexical blocks inside Perl 5 ones
  • make it easy to use variables declared in an outer block (outer means the other language here)
  • provide the behaviour of Perl 5 operators and built-ins for v5 blocks only, nested Perl 6 blocks should not be affected
  • and of course: make subs, packages, regexes, etc. available to the other language

All of the statements above are already true today. If you do a numeric operation, it behaves differently in a v5 block than in a Perl 6 block, as the example at the top shows. That is simply because in Perl 6 the + operator dispatches to a subroutine called &infix:<+>, but in a v5 block it translates to &infix:<P5+>.
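To illustrate the mechanism, here is a toy re-implementation of that idea (my own sketch, not v5’s actual code): an ordinary Perl 6 infix with its own name that numifies its arguments roughly the way Perl 5 would:

# A made-up operator; v5 maps + to its own &infix:<P5+> in a similar fashion.
sub infix:<p5+>($a, $b) {
    # Take the leading (integer) number of a string, or 0 if there is none.
    my sub numify($x) { $x ~~ /^ \s* '-'? \d+ / ?? +$/ !! 0 }
    numify($a) + numify($b)
}

say "123abc" p5+ 1;   # 124, where plain Perl 6 + would fail

Under the hood, the difference between the two blocks shows up in the AST the compiler produces.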

Oversimplified it looks a bit like this:

Perl 6/5 code:

1 + 2;
{
    use v5;
    3 + 4
}

Produced AST:

- QAST::CompUnit
    - QAST::Block 1 + 2; { use v5; 3 + 4 }
        - QAST::Stmts 1 + 2; { use v5; 3 + 4 }
            - QAST::Stmt
                - QAST::Op(call &infix:<+>) +
                    - QAST::IVal(1)
                    - QAST::IVal(2)
            - QAST::Block { use v5; 3 + 4 }
                - QAST::Stmts  use v5; 3 + 4 
                    - QAST::Stmt
                        - QAST::Op(call &infix:<P5+>)
                            - QAST::IVal(3)
                            - QAST::IVal(4)

The nice thing about this is that you can use foreign operators (of the slang in use) in your Perl 6 code. For example, &prefix:<P5+>("123hurz") would be valid Perl 6 code that turns a string into a number even when there are trailing word characters.

To get v5 you should follow its README, but be warned, at the moment this involves recompiling Rakudo.

Conclusion: When was the last time you saw a language you could extend that easily? Right. I was simply astonished at how easy it is to get started. Next on your TODO list: the COBOL slang. :o)

Day 11 – Installing Modules

December 11, 2013

“Honey, I can’t find my keys!”
– “Hmmm, have you already looked at home or site?”

Preface: This post is about a new feature which currently resides in the branches rakudo/eleven and panda/eleven.

So this post is about installing “modules” and finding them again later. I put the word “modules” in quotes because we are not really talking about modules. Even if by that term we mean classes, roles, grammars and every other packagy type, we are in fact talking about distributions.

That is what we see when we look at modules.perl6.org. These things that have a name, an author or authority, and hopefully a version are the things that provide compilation units, which can then be loaded later using statements like use Foo, need Bar or require Baz.

But these distributions can ship other information as well: executable scripts, or music, graphics or fonts that are used by another application.
And this bunch of information is put in a paper bag called a distribution, labeled with name/auth/ver, meant to be downloaded by an installer (panda), placed safely on your hard disk, your stick or a webspace, and it should be easy to locate later when we need it.

But we are devs, right? We want to use our in-development modules without the need to install them. So there should be a way of telling the compiler that we have a directory structure where our GitHub clones live. These directories should be considered when searching for candidates for a use statement. And, since we lack the paper bag in such a situation, these should be preferred, whatever name/auth/version trait a use statement may have attached.

This could be one of our rules of thumb: modules not yet installed but found in a known path are preferred over installed ones.

Our first crux, or: Use it.

use Foo:ver<1.2.3> does not mean you are loading a module Foo with version v1.2.3. You are in fact loading a package Foo that is part of a distribution which has the required version and provides such a namespace.

All right, we are all good hackers, we can handle that. We would just need a (sort of) database where we can put all installed distributions, which we would query later, say, when using a module.

After a few days and the first prototype, we would come to a point where we play with panda, our installer toolchain.
We would be ready insofar as panda would install dists into our database. Our tests would show that we could load these installed modules by name, auth and version, even when several distributions supply modules that differ only by version number.
Wasn’t that hard… All fine now?

The second crux, or: The installer installs the installer.

Even panda itself must be installed in our new environment. And that becomes interesting in two ways. We take the path-related way first:
When we execute panda’s bootstrap.pl script, it loads the not-yet-installed File::Find, for example, compiles it, and installs it to the destination path, just to pick it up again to compile Shell::Command. That breaks our rule of thumb badly: now an installed module should be preferred.
It seems like we need some sort of ordering there.¹

The third crux, or: I thought it is all about modules?

Panda (or perhaps pandora) offers another box for us: it is our first distribution that has executable files.
Okay, we have a problem here. Our task is to install several versions of the same distribution, but all of them are going to provide executables with the same name, likely with different functionality.
Clearly we need a way of invoking the correct executable. Our shell would just pick whichever executable is found first in PATH. We need something better.
What if we created only one `bin` folder per installation repository? We could have a script that delegates to the correct version of the wanted executable. Querying our wrapper would then look like this:

panda --ver=1.2 install Foo::Bar

Our wrapper would only need to know about parameters named `--auth`, `--name` and `--ver`, and would just pass everything else on to the original executable, panda in this case.
Luckily this helps us in another respect. We could also install wrappers like panda-p and panda-j, which would explicitly invoke the Parrot and JVM backends.
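A very rough sketch of such a delegating wrapper, entirely hypothetical and not the code from the eleven branch, could look like this:

# Hypothetical wrapper: pick an installed 'panda' by version, pass the rest on.
# In reality the candidates would be queried from the installation repository;
# all paths and version numbers below are made up.
my %candidates =
    '2014.12' => '/opt/perl6/site/dist/1A2B/bin/panda',
    '2015.01' => '/opt/perl6/site/dist/3C4D/bin/panda';

sub MAIN(*@args, :$ver) {
    my @matching = $ver ?? %candidates.keys.grep(*.starts-with($ver.subst('*', '')))
                        !! %candidates.keys;
    # Toy version pick: the highest matching version wins.
    run $*EXECUTABLE, %candidates{@matching.max}, |@args;
}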

The final chapter.

Let us forget about the subjunctive for a moment: what can we do *now*?

There are two interesting branches: rakudo/eleven and panda/eleven, named after today’s date and the fact that the corresponding spec is S11.
With these two branches you are able to:

  1. configure your directories for vendor, perl, site and home, as well as your development paths, using libraries.cfg.
  2. bootstrap panda, which gives you the panda, panda-p and panda-j executables
  3. install modules the “new” way, and also locate them in the following way:
    use Foo:ver(*);
    use Foo:ver(1.*);
    use Foo:ver(/alpha$/);
    use Foo:auth<FROGGS>
    use Foo:auth({ .substr(0,3) eq 'Bar' });
    ...
  4. you can invoke executables like:
    myscript --auth=Peter rec0001.wav
    yourscript --ver="2.*" index.html
    ...
    

I hope this will land in the master/nom branch soon, but I think there are a few glitches that need to be discovered and fixed before doing so. (One glitch might simply be too little Windows® testing on my side.)

Another glitch, now that I think about it: when you load a specific version of a module or execute a script, the magic must make sure that it prefers its own distribution when loading modules, without the need to specify this in the use statements. Otherwise you might execute version v1 of the panda script while it loads modules of version v2.
This will require additional thought in the S11 specification.

A note for module authors:

You probably know about the META.info; in most cases you need to add a “provides” section as shown here.
Without that section the packages can’t be used. This “provides” section will not break current code, so please add it.
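For reference, a minimal “provides” section simply maps every namespace the distribution offers to the file that implements it. The module names and paths below are made up:

{
    "name"     : "Foo::Bar",
    "version"  : "0.1.0",
    "provides" : {
        "Foo::Bar"        : "lib/Foo/Bar.pm6",
        "Foo::Bar::Utils" : "lib/Foo/Bar/Utils.pm6"
    }
}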

¹) You can set the ordering of the repositories in your libraries.cfg and in -I compiler switches like:

perl6 -ICompUnitRepo::Local::File:prio[10]=/home/peter/project-a:/home/peter/project-b
