Day 1 – Consuming GitHub Webhooks

Welcome Back

Welcome to the first Advent post of 2016; this marks almost the first anniversary of the release of the Perl 6 specification. The Rakudo Perl 6 project has been putting out compiler releases for over 100 months (starting well before the specification itself was released). The releases this year each come with bug fixes, speed improvements, and occasionally new features that will probably become part of the next version of the specification.

For the first advent article this year, I wanted to share an experience where I was able to use Perl 6 for one of the strengths that it inherited from its older sibling: whipuptitude (coined by Larry – “the aptitude for whipping things up”). When I was given the task of getting something working quickly at $DAYJOB recently, it was the tool I reached for.

Consuming GitHub Webhooks

The task was to set up a tool that would, for a variety of projects in github enterprise, create a docker image every time someone pushed a commit to the mainline development branch.

Start Listening

If you have a github project, you may have seen in the settings that you can set up a webhook – an HTTP call that is triggered after certain events occur in the repository. Let’s create a test repository in github (or github enterprise if you have access to one at work). For the public github, you can go to https://github.com/new, pick a new repository name, select “Initialize this repository with a README”, and click “Create Repository”.

Now let’s turn on a webhook. (For this to work, you’ll need to be able to make an http connection from the github/enterprise to your local machine.) Go to the repository’s settings, pick Webhooks, then “Add webhook”. Note that the webhook says that it will be a POST request, and can send JSON: we’ll need that later.

Pick a port, e.g. 7890, so the payload URL will be something like: http://somemachine.your.co:7890/build

For this proof of concept, we just need git push events (the default). Click “Add webhook” to finalize this hook. Now commit something to your test repository. You can see the error the webhook reports – it couldn’t connect. So, our first step is to be able to pick up when the webhook calls.

Looking through the Perl 6 ecosystem (http://modules.perl6.org/), we find Bailador, a Perl 6 play on 5’s Dancer, that will let us run a local HTTP server. After installing that module with zef or panda, we can write a barebones program that will let us listen to all POST requests on a particular port. We use the .* regex to indicate all paths, and give it an anonymous sub as a callback. The callback outputs a line when called, then tells github that everything is “OK”. The last line tells Bailador to listen on that port.

use Bailador;
post /.*/ => sub {
    say "Hello?";
    "OK";
}
baile(7890);

Now if we run this script, it blocks, waiting for connections. Push another commit to your git repo, and you can see that every time you push, our script wakes up: Bailador notes that it received a POST to the “/build” URL, our callback emits “Hello?”, and then the script waits again.

What did you say?

Our next step is to see what data github is actually sending us: easy enough, let’s dump out the HTTP request that was shipped, using a sigilless variable that is made available to us in a Bailador routing callback. Let’s also replace our previous regex with the path “/build” to match the URL we gave the webhook.

use Bailador;
post '/build' => sub {
    say request.body;
    "OK";
}
baile(7890);

After another push, we can verify that github is sending us a body consisting of JSON, just like it said it would. Digging into the ecosystem again, we pick one of the JSON modules, and use that to convert the JSON string we get into an actual Perl 6 data structure:

use Bailador;
use JSON::Fast;
post '/build' => sub {
    say (from-json request.body).keys;
    "OK";
}
baile(7890);

Pushing now gives us a list of the top level keys of the JSON structure. The JSON from the request body is converted to a hash with the module’s from-json routine, and then we print just the keys from that structure. Now we can actually start processing some data.
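If you would rather try the JSON step without waiting for a push, here is a minimal sketch that runs from-json on a hand-written string. The payload below is a made-up, heavily trimmed stand-in for what github sends; the field values are illustrative assumptions, not a captured request.

```perl6
use JSON::Fast;

# A made-up, trimmed-down stand-in for a push payload (not real data).
my $json = q:to/END/;
{
    "ref": "refs/heads/master",
    "repository": {
        "full_name": "coke/demo",
        "ssh_url": "git@github.com:coke/demo.git"
    }
}
END

# from-json turns a JSON object into a Perl 6 Hash.
my $data = from-json $json;

say $data.WHAT;        # (Hash)
say $data.keys.sort;   # sorted, because hash key order is not guaranteed
```

Nested JSON objects come back as nested hashes, which is what makes the chained lookups later in the article possible.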

What do I want?

We’re going to need to know what repository we’re working with, where to checkout a copy, and which branch it was on. We’ll ignore the revision and just assume we need to grab the latest version each time we get a push. Looking at the output from the previous dumps above, we can pull out the specific information we need from the data.

use Bailador;
use JSON::Fast;
post '/build' => sub {
    my $data = from-json request.body;

    my $repo      = $data<repository><full_name>;
    my $clone-url = $data<repository><ssh_url>;
    my $branch    = $data<ref>.split('/')[*-1];

    dd $repo, $clone-url, $branch; 
    "OK";
}
baile(7890);

The $data<key> syntax lets us index into $data with a literal key. For the branch, we split up the refs/heads/master into chunks by / and then pick the last entry. Note that we use Rakudo’s un-specced helper routine dd to pretty print the three variables to make sure we’re pulling the right data.
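Here is a small sketch of those lookups on a hand-built hash, so it runs without Bailador or a webhook. The data is hypothetical, and branch-of is a helper name invented for this example. It also shows a caveat of keeping only the last chunk of the ref: a branch named my/master and a tag named master both come out as "master". One possible alternative, shown below, is to strip the refs/heads/ prefix instead.

```perl6
# Hypothetical data mirroring the three fields the callback reads.
my %data =
    ref        => 'refs/heads/master',
    repository => {
        full_name => 'coke/demo',
        ssh_url   => 'git@github.com:coke/demo.git',
    };

say %data<repository><full_name>;        # coke/demo
say %data<ref>.split('/')[*-1];          # master

# Caveat: only the last chunk survives, so both of these print "master" too.
say 'refs/heads/my/master'.split('/')[*-1];
say 'refs/tags/master'.split('/')[*-1];

# Hypothetical stricter helper: return the full branch name for branch refs,
# and Nil for anything else (tags, for instance).
sub branch-of(Str $ref) {
    my $prefix = 'refs/heads/';
    $ref.starts-with($prefix) ?? $ref.substr($prefix.chars) !! Nil
}

say branch-of('refs/heads/my/master');   # my/master
say branch-of('refs/tags/master').defined;   # False
```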

Now we have enough data to pull a copy of the repo and do some work! If the branch in question is the right one, we’re going to want to get a checkout and do a build on the HEAD of that branch. While for $DAYJOB, we did a docker build and pushed that up to a registry, for this sample, we’ll just checkout a copy.

use Bailador;
use JSON::Fast;
post '/build' => sub {
    my $data = from-json request.body;

    my $repo      = $data<repository><full_name>;
    my $clone-url = $data<repository><ssh_url>;
    my $branch    = $data<ref>.split('/')[*-1];

    return "OK" if $branch ne "master";

    qqx/rm -rf "$repo"/;
    qqx/git clone --branch $branch --depth 1 "$clone-url" "$repo"/;

    "OK";
}
baile(7890);

The callback now bails out early if the branch we were called on isn’t master. We shell out to remove the repo directory (I’m coke on github, so my tests while writing this article used coke/demo), and then do a minimal git clone of the appropriate branch. Since we’re shelling out, you can easily add your actual build/upload step here as well.
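One thing to keep in mind: qqx hands the interpolated string to a shell, and the values we interpolate come straight from the request body. If that worries you, a possible alternative is Perl 6's built-in run, which takes the command as a list of separate arguments and does not go through a shell. The sketch below only builds and prints the argument list; clone-command is a helper name made up for this example, and the URL is illustrative.

```perl6
# Hypothetical helper: build the argument list for the shallow clone.
# The '--' tells git that everything after it is a positional argument.
sub clone-command(Str $branch, Str $clone-url, Str $dir) {
    ('git', 'clone', '--branch', $branch, '--depth', '1', '--', $clone-url, $dir)
}

my @cmd = clone-command('master', 'git@github.com:coke/demo.git', 'coke/demo');
say @cmd.join(' ');

# To actually run it, pass the list to run. Each element is exactly one
# argument, so spaces or quotes in the payload are not re-read by a shell:
#   my $proc = run |@cmd;
#   note "clone failed" unless $proc.exitcode == 0;
```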

Too Slow?

Once we added the checkout step, you may have noticed that github complains of a timeout. Because we don’t care to report back to github if a particular revision successfully built or not (we’re just creating a build), there’s no need to wait for it to complete. Perl 6 makes this very easy. Our final version has just one more word and a block:

use Bailador;
use JSON::Fast;
post '/build' => sub {
    my $data = from-json request.body;

    my $repo      = $data<repository><full_name>;
    my $clone-url = $data<repository><ssh_url>;
    my $branch    = $data<ref>.split('/')[*-1];

    return "OK" if $branch ne "master";

    start {
        qqx/rm -rf "$repo"/;
        qqx/git clone --branch $branch --depth 1 "$clone-url" "$repo"/;
    }

    "OK";
}
baile(7890);

The start block here schedules its contents to run asynchronously. The POST callback does not wait for the block to finish and immediately returns “OK” to the github webhook, while Perl 6 queues our clone (and eventually our build) to run in the background.
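To see that behavior in isolation, here is a minimal sketch with a sleep standing in for the slow clone. Unlike the webhook script, this one keeps the Promise that start returns and waits on it at the end, purely so the example can show the result.

```perl6
# start schedules the block to run in the background and returns a Promise.
my $promise = start {
    sleep 1;              # stand-in for a slow clone or build
    'build finished';
}

say 'OK';                 # printed right away, before the block is done
say await $promise;       # blocks here until the background work completes
```

In the webhook handler we never call await, which is exactly why the reply to github no longer has to wait for the clone.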

What Next?

We were able to implement this solution without knowing much about webhooks up front (just that they are HTTP POST requests), iterating our Perl 6 script as we were able to see more information at each step.

Granted, this was the bare minimum to get things working for a project at my $DAYJOB – you might have more requirements, or have more time to play. Some experiments you could do to extend this script:

  • Add HTTPS support
  • Change the git checkout to keep a cached version that uses ‘clone --mirror’, and then fetches instead of cloning each time.
  • Create a local YAML config file that allows you to change the branch that triggers the build for each repository. (Or a DB connection, or…)
  • Make the handler smart enough to process each commit rather than just the latest commit when we are called.
  • Customize the script to actually do a build for your project.
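As a starting point for the mirror idea in the list above, here is a rough sketch, not a tested build step. It picks between an initial mirror clone and a later fetch depending on whether the cache directory exists. The helper name mirror-command and the cache path are assumptions made up for this example, and the sketch only prints the command rather than running it.

```perl6
# Hypothetical helper: choose the git command for a cached mirror.
# First call (no cache directory yet): make a mirror clone.
# Later calls: fetch into the existing mirror, pruning deleted refs.
sub mirror-command(Str $clone-url, Str $cache-dir) {
    $cache-dir.IO.d
        ?? ('git', "--git-dir=$cache-dir", 'fetch', '--prune')
        !! ('git', 'clone', '--mirror', '--', $clone-url, $cache-dir)
}

say mirror-command('git@github.com:coke/demo.git', 'cache/coke/demo.git').join(' ');
```

The returned list can be handed to run in the same way as a regular clone command.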

Cheers!

9 thoughts on “Day 1 – Consuming GitHub Webhooks”

      1. RE-fixed. Looks like I fixed them, and then wordpress did an auto-save that screwed it up again. Repushed the fixed version. Thanks for the report!

  1. return "OK" if $branch ne "master";

    With the current implementation, it also would not return early for ‘refs/heads/my/master’ (the ‘my/master’ branch) or ‘refs/tags/master’ (the ‘master’ tag… if anybody is insane enough to have it).
