I am pleased to announce that the recent TLS certificate problems and outages to jml.io have been fully resolved.

Here are some notes on what happened and what I did about it.

Background

jml.io is a statically generated blog that’s hosted on AWS. The HTML pages are generated locally and uploaded to S3 buckets. These buckets are then served by CloudFront, which acts as a CDN. The domain names are managed on Route53.

Before now, these were managed by clicking around in the AWS console. There might have been a time when they were generated by Ansible playbooks, but I can’t find the playbooks anymore.

A couple of weeks ago, the wildcard TLS certificate for jml.io expired. This meant that anyone who browsed to the site with a modern browser got a scary warning saying the site wasn’t trustworthy. And fair enough too!

To get the site working, I needed to get a new certificate and distribute it from AWS. Recalling the steps to do this properly was too hard, and besides I knew of a better way.

Enter Terraform

We use Terraform to manage our AWS infrastructure at my employer, and really quite like it. I personally have some qualms about HCL, its configuration language, that I might write about later, but I like both it and Terraform more than any alternatives I’m aware of.

Because my site was down and because I really don’t have time to do considered maintenance, I decided to migrate the whole thing to Terraform. This would mean that all of my thinking and decisions could be stored in a Git repo, rather than my memory.

I would also use the AWS Certificate Manager to generate and manage the certificates, sparing me the difficulty of purchasing, storing, configuring, and later renewing them myself.

How it happened

I spent the first week snatching the occasional hour here and there figuring out how Terraform worked. While we use it at Weaveworks, I wasn’t the one to set it all up, and editing something built by someone else is very different from being able to build to from scratch.

In particular, I needed to get a feel for the workflow and for how Terraform’s means of abstraction, modules, actually worked.

The next week I started migrating all the “redirect” buckets to Terraform. These are S3 buckets for my old domain names (code.mumak.net, mumak.net, etc.) that now redirect here.

Doing this involved figuring out how to import things from Terraform, how to use modules, how to edit Terraform state when you’ve refactored something.

It’s quite slow going. The terraform plan step takes quite a while, which means the edit/test loop is bit of a grind. This really hurts when you are snatching a half-hour before bed here and there to get things done.

During this process, I got a bunch of excellent advice and working, reusable Terraform code from David Reid.

Once I got the redirect buckets incorporated, I moved on to their DNS records. That went fairly smoothly if slowly.

Then I decided to set up CloudFront, ACM, and a Lambda for HSTS all at once. It would have worked great, except that all my stuff was on us-west-2, and all the cool features for integrating with CloudFront are in us-east-1.

So today I had the joy of migrating buckets from one AWS region to another. AWS has no built-in support for this that I know of, so the way you do it is create a new bucket in the new region, copy all the content over, delete the old bucket, then wait a while for eventually consistent data stores and/or batch jobs to do their thing, then create a bucket in the new region with the old name, then copy all the data over again, then delete the temporary bucket.

It’s a real hassle, and AWS’s silly global bucket namespace thing adds an edge of frission: what if someone steals my name while I’m waiting to retry?

The new setup

Everything’s on AWS, managed by Terraform. Even the Terraform state and lock are kept there. I haven’t set up anything like terradiff yet, but it’s only a matter of time.

The module set up means I’ve got a pretty clear list of what’s a genuine static website and what’s a redirect site. There’s some duplication, but its mostly of boilerplate rather than of magic strings.

Going forward, I’m going to use the extra automation provided by Terraform to make publishing to this blog a bit easier for me. I think I can also take some of the stuff I’ve learned and incorporate it into our work infrastructure.

Conclusions

Terraform is great for managing AWS stuff. AWS is a pretty cool way of hosting a static site if you care about TLS certificates (which you should). dreid is awesome for giving me so much useful help at the right time.