Day 21 – Community smoke testing

So, after all, we are good programmers who know their stuff, right? I know that I should test the code I write, and I tend to do that for every change I make. At work I often just run a particular script that tests the new feature or the absence of a bug, but eventually these test scripts get added to a test suite at the end of the day. Yet even if you are a better programmer than me and write your tests beforehand, there is one problem left:

Assumptions

We assume quite a lot when we write or test our programs. We know how the program is meant to work, and that knowledge is precisely our problem; it is very hard to escape from. We assume that someone will enter a valid date in a date field, and we expect positive numbers when it comes to quantities. And we usually test our programs on a given set of platforms, often just the one PC or Mac we’re sitting at.

One important rule about test suites is that you should try to forget about what Certainly Should Work™. Test built-ins, because these can change over time. Also query the infrastructure if other tests rely on specific things. Test the environment in as much detail as you can, because if something fails early, you save time debugging the problem. This is worth even more when you get reports from systems you don’t have access to: debugging a very high-level, late-occurring problem in a test suite on a strange or simply foreign platform can be frustrating.
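To make that concrete, here is a minimal sketch of such an early environment check in a Perl 6 test file. The file name and the specific checks are my own assumptions; the point is simply to fail early and loudly on the basics:

# t/00-sanity.t – a hypothetical early sanity check
use Test;

plan 3;

# Built-ins can change between compiler versions, so test them too.
is 2 + 2, 4, 'integer addition behaves as expected';

# Test the infrastructure that later, higher-level tests rely on.
ok $*TMPDIR.d, 'a temporary directory is available';
ok try {
    my $probe = $*TMPDIR.child('smoke-probe');
    $probe.spurt('x');
    $probe.unlink;
    True
}, 'can create and delete files in the temporary directory';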

So, if we want to minimize the assumptions we talked about… what’s the opposite? It is knowledge, built from experience, which itself is built from input. That’s the approach we will take. I’ll come back to that a little later.

Back to what we can do for you. Sadly, we cannot directly do much about your expectations and assumptions when it comes to your test suite. We don’t have a framework that pushes garbage input data to your scripts, but we can test your program in different ways. And we can hopefully provide feedback on recent changes to your code base very quickly, though that will only work out if there are testers who test your distribution often.

Personally, I code using the most bleeding-edge Rakudo compiler with the bleeding edge of the MoarVM backend, on an Ubuntu Linux box. That is quite a specific scenario. There are a lot of different setups out there, and even if your code is not platform specific, some library, perhaps a dependency of a dependency, surely is.

Generate a lot of input, to be turned into experience and knowledge

How are we going to achieve that? Of course we have to intercept the build and test stages that happen when a random user installs our distributions. That user can help us gather reports by doing:

# on unixes:
PANDA_SUBMIT_TESTREPORTS=1 panda install Foo

# on Windows:
set PANDA_SUBMIT_TESTREPORTS=1
panda install Foo

One can also set and export this environment variable permanently, so that every attempt to install a distribution will be reported and the dist author is able to take care of upcoming problems.
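On a Unix-like system with a bash-like shell (an assumption; adapt this to whatever startup file your shell reads), that could look like:

# e.g. in ~/.bashrc
export PANDA_SUBMIT_TESTREPORTS=1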

The great benefit of gathering information this way, in contrast to smoke testing the ecosystem on a central box, is of course that we get reports from a wide variety of operating systems, compiler versions, locales and dependencies: other Perl 6 distributions, but also C libraries.

A test report made by panda will look something like this:

Sample test report

Though, in fact, the last section contains the entire report. You’ll see information about the operating system, the kernel, the backend used by Rakudo, flags showing how that backend was built, and also some information about the tested distribution. I think it would be worthwhile to extend that information so that certain meaningful environment variables are included as well; I am thinking of vars like MVM_JIT_DISABLE that have an impact on how the test programs are executed.
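Until reports carry such variables, a distribution author can emit them from the test suite itself as TAP diagnostics, so they travel back inside every report. A small sketch; the list of variables is my own guess at what might be relevant:

use Test;

# Print selected environment variables as diagnostics so they show up
# in submitted reports (the variable list here is illustrative).
for <MVM_JIT_DISABLE MVM_SPESH_DISABLE PERL6LIB> -> $var {
    diag "$var = " ~ (%*ENV{$var} // '(unset)');
}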

Once we receive reports like this, we build stats to highlight how well the distribution is doing. One of them is a matrix that shows the pass rate across compiler version, operating system and backend.

Platform/version matrix

If you sent a report for a platform that is not listed there yet (like GNU/Hurd), the list would automatically expand when the matrix is regenerated; as of today that happens every five minutes. The colored bars represent the three backends: MoarVM topmost, JVM in the middle and Parrot at the bottom. The color coding is:

  • green – The tests passed, which also means that the test stage exited with code 0.
  • orange – Usually means that no tests were run, but everything else ran cleanly.
  • red – Something went wrong, either in the build stage or the test stage.

You can see that the list of compiler versions can grow pretty quickly, and I think the best solution is to collapse the development releases into a single line and make them expand on click. On this matrix all shown releases are dev releases, recognizable by the trailing partial SHA-1.

So now imagine these stats are about your distribution. You see red flashing lights that spur you into action, and you quickly take a look at all the negative test results. Some of the reports might reveal pretty quickly what is wrong. Maybe you’ve forgotten to add a dependency to the META file shipped with the distribution. But there might be other failing tests where you don’t have a clue what actually went wrong, or where to start poking.
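The missing-dependency case is the classic one: installs pass on your box, where the module happens to be present, and fail everywhere else. A hedged sketch of the relevant part of a META file, with the module names purely illustrative:

{
    "name"    : "Foo",
    "version" : "0.1.0",
    "depends" : [ "JSON::Tiny" ],
    "provides": { "Foo": "lib/Foo.pm6" }
}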

The good thing is that testers.perl6.org is not a tool that works in only one direction. Sure, the user runs your tests and the result travels in your direction. But remember that you are the master of the tests in question. If you are in doubt about the cause of a bug, extend the test suite, and wait for more reports to arrive that carry your recently added diagnostics.
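As a sketch of what that could look like, one might wrap a suspicious test with extra diagnostics and wait for the next wave of reports; the Foo module and its parse-date routine here are hypothetical stand-ins:

use Test;
use Foo;

# parse-date stands in for whatever routine fails on machines you
# cannot reach.
my $result = try parse-date('2014-12-21');

my $passed = ok $result.defined, 'parse-date returns a defined value';
unless $passed {
    # These diagnostics come back to you inside the next reports.
    diag "error: $!" if $!;
    diag "LANG = {%*ENV<LANG> // '(unset)'}";
}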

That’s where the loop closes on the things I said earlier. Wipe from your mind everything you *think* you know about the box the tests failed on. Test everything that is involved in the failing functionality. And if you have to test the built-ins of the Perl 6 compiler, do it. If you are pedantic and suspicious enough, you’ll end up with a great test suite that offers plenty of information for future test failures as well.

Sidenote: In the Perl 5 community it is quite usual for volunteers to run tests for distributions every day, around the clock. I hope we’ll have such awesome volunteers too in the near future. So if you have boxes that run 24/7, a weird architecture or a rare operating system, please run tests to point the other developers in the right direction.

Forecast

I have plenty of ideas for how this tool can be improved. A connection to issue trackers like GitHub Issues is one of my favourites, which is also listed here: TODO of testers.perl6.org
If you are interested in helping out, the repository of testers.perl6.org is here, and I’d be pleased to accept pull requests or hand out commit bits.
