No-one is paying you for DevOps
Software only gets difficult when you put it in front of customers. Unfortunately that’s pretty much a necessity if you want to get paid, so what’s the best way to do it? For many years the solution involved essentially the same mechanism: take your code, bundle it up somehow, copy it across to a server somewhere, unpack it, and hope for the best.
Writing the scripts that do this safely, consistently and without breaking your precious live deployment is a difficult and time-consuming undertaking, so a number of offerings appeared that took this pain away (Heroku, AWS Elastic Beanstalk, App Engine, Firebase etc.), making app deployment as simple as a single “git push”.
For simple applications with limited dependencies this is still the way to go. Customers aren’t paying you for your lovely DevOps: they’re paying you to reliably deliver a service. As such any time spent developing and maintaining DevOps systems is time you’re not spending improving your product and gaining market traction. So if you have a simple application with standard dependencies, then use any of the above solutions: they’ll all work, and frankly it doesn’t matter which one you pick.
But I’m just too complex
Some systems just don’t fit easily into these simplified frameworks:
Your applications may have complicated or “non-standard” system dependencies
You’ve designed your system as a set of self-contained microservices, communicating over APIs
You need fine-grained control of resource allocation and scaling
So if I’m not using one of the GitOps frameworks, I’m back to writing scripts. The tools here have changed over the years, and the servers have evolved from a large noisy thing in your office, to a large noisy thing in a big room somewhere in the world, to a fully virtual thing inside a large noisy thing somewhere in the world, but essentially it’s the same recipe: write scripts.
There’s a number of problems with this approach - I’ll outline the main ones:
Complexity

DevOps scripts are hard to write, hard to maintain, and nearly impossible to test. Sure, there are lots of frameworks that try to take some of the pain away (e.g. Chef, Puppet, Fabric), but you’ll still end up writing reams of code.
Broken builds and downtime
So you’ve packaged up your code and pushed it to a server. How do you ensure that when you unpack it and switch it on that you’re not going to bring your app down in flames? Deployments really shouldn’t cause downtime, so to do this properly you’ll need to take each server offline, update the code, test it’s working somehow, take it back online, and move onto the next server.
Repeatability and consistency
With a scripting approach it’s much harder to reproduce the same environment locally. It’s also much harder to guarantee that you know exactly what code is running on all your servers. If you need to bring a server offline for any reason, when you bring it back you’ll need to ensure that it’s running exactly the same code as all the other servers.
Rolling back

So you’ve carefully deployed your code one server at a time and nothing has gone wrong, but 5 minutes later you notice a nasty bug that’s going to cause untold pain and misery. You either need to bake rollback into your “deploy and unpack” scripts, or repeat the whole deploy process with a previous version of your code.
Server failure

It happens - servers can break for a multitude of reasons, although to be fair it’s far more likely that it will be down to your code. Running out of memory or disk on a Linux box is pretty catastrophic: your server will either become completely unresponsive, slow to a crawl (Java Garbage Collection Sawtooth of Doom - I’m looking at you), or explode in flames.
A clean death is certainly preferable here: your server will just disappear from the load balancer, and life will carry on. The worst possible case is where your app is extremely unwell, but not sufficiently poorly that the load balancer will notice. This is your “super-spreader at the rave” situation, and you need to avoid it at all costs.
Resource utilisation

It’s very hard with this approach to properly pack code onto servers, as there’s no way to limit the resources needed by each service (e.g. RAM/CPU). The naive approach is just to fire up a new machine for each service: this tends to massively underutilise resources.
Containers to the rescue
I’ve written multiple variants of deploy scripts over the years, for various languages. They all tried to cope with the above problems, some more successfully than others. They were fiddly to write, difficult to maintain, and still left most of the above issues unsolved.
Luckily I don’t have to do it ever again: enter the Whale and the Wheel:
A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings. The whole works. Once you’ve created your image, you can run it anywhere. But isn’t this the same as a Virtual Machine? Well, sort of, but a Docker container is much more lightweight: it shares the host machine’s kernel and runs directly on the silicon, rather than emulating an entire guest operating system, so it’s much more efficient. For a more detailed explanation head on over to the Docker website.
A Docker image is built using a Dockerfile. This is a set of instructions that tells Docker how to construct the image: here’s one of the simplest possible examples:
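Something along these lines - a minimal sketch for a small Python app, where the filenames app.py and requirements.txt are placeholders for illustration:

```dockerfile
# Inherit from a minimal Alpine Linux image with Python 3.7 pre-installed
FROM python:3.7-alpine

WORKDIR /app

# Install dependencies first, so this layer is cached between builds
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy in the rest of our code
COPY . .

# Tell Docker how to run the app
CMD ["python", "app.py"]
```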
The important bit is the first line: Docker images inherit from other images. In this case some helpful person has built a minimal Alpine Linux containing Python 3.7: we just inherit from that, copy in our code, and tell it how to run the app. You can of course just build from another image (e.g. Ubuntu), although your final image will be quite a bit larger.
Docker images are stored in repositories such as DockerHub which is essentially source control for images: that’s where Docker will find the “python:3.7-alpine” image above. As you’d expect there are a bunch of pre-built images for every conceivable application, and in true open source style the Dockerfile is always accessible if you want to see how they are built.
Tags are used to organise images: the above image is only tagged with the Python version, so the owner could deploy a later version on the same tag (e.g. with security fixes). For your own code you would tag with something unique to your code (e.g. the Git SHA), as that will uniquely define that iteration of the container.
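As a sketch, tagging with the Git SHA might look like this (the image name “myapp” and the registry address are placeholders, and a Git checkout is assumed):

```shell
# Derive an immutable tag from the current commit (assumes a Git checkout)
SHA=$(git rev-parse --short HEAD)

# Build, tag and push; registry.example.com/myapp is a placeholder
docker build -t "myapp:${SHA}" .
docker tag "myapp:${SHA}" "registry.example.com/myapp:${SHA}"
docker push "registry.example.com/myapp:${SHA}"
```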
Nice - but how do I deploy this?
So I’ve built my Docker images for my applications, and I can run them locally: how do I deploy them? Well this is where Kubernetes comes into the picture. Kubernetes is a system for automating deployment, scaling, and management of containerized applications, with built-in support for load balancing, progressive rollouts, resource management (“bin packing”) and self-healing.
With Kubernetes you define what your ecosystem should look like in config files, and then Kubernetes makes it so. This is declarative rather than imperative: Kubernetes holds a complete picture of your desired deployment, so it’s entirely reproducible. If any one component breaks, Kubernetes can self-heal, because it knows what should be running.
A set of Kubernetes config files acts as a nice layer of abstraction over the actual silicon, defining:

Deployments: which images to run, how many instances, and what resources they need
Services and load balancing
Horizontal scaling rules
The deployment config is the interesting bit: this is where we define the Docker image we want to deploy, the number of instances we need, and any required environment settings. We also specify how much CPU and memory we’ll need: not only does this neatly solve our “out of memory” issue above, it will let Kubernetes automatically pack services onto the cluster to make optimal use of the resources. This can save you a lot of money: we shaved over 60% off our bill in our migration, and that’s before we’ve done any serious optimisation.
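Resource allocation is just a stanza on the container spec; a sketch, with illustrative values (Kubernetes schedules onto nodes based on the requests, and restarts containers that exceed their memory limit):

```yaml
resources:
  requests:          # guaranteed minimum, used for scheduling / bin packing
    memory: "128Mi"
    cpu: "250m"      # a quarter of a CPU core
  limits:            # hard ceiling: exceed the memory limit and the container is restarted
    memory: "256Mi"
    cpu: "500m"
```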
No downtime deploys
Deployment definitions let you specify “liveness” checks that Kubernetes can use to determine whether the service is actually alive and can therefore be added to its load balancer.
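A liveness check is just a probe defined on the container, for example an HTTP request (the /health path here is an assumption; your app would expose whatever endpoint makes sense):

```yaml
livenessProbe:
  httpGet:
    path: /health          # assumed health endpoint in the app
    port: 8001
  initialDelaySeconds: 5   # give the app time to boot before checking
  periodSeconds: 10        # re-check every 10 seconds
```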
It actually goes a bit deeper than that: during a deployment, Kubernetes will create new container instances and let them run for a bit to see if they’re actually working. Only when it’s happy that all is well will it connect the container to the load balancer, and eventually tear down the old container.
This means that it’s very hard to deploy a broken build, so long as the check you’ve defined is a good indication of liveness. If the service is broken, the deployment just won’t take, and the existing system will be untouched. You’ve also always got the option of just rolling back to the previous version: deployment rollback is a built-in feature, and very quick.
A simple Kubernetes config file
Here’s a very simple example of a config file for an app. We’re assuming that the Docker image contains an app running on port 8001:
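A minimal sketch, with a Deployment of three replicas plus a Service to load-balance across them; the name “myapp”, the registry path and the health endpoint are all placeholders for illustration:

```yaml
# Deployment: run 3 replicas of our image, with resource limits
# and a liveness check on port 8001
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:abc1234   # tag = Git SHA
          ports:
            - containerPort: 8001
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8001
            initialDelaySeconds: 5
            periodSeconds: 10
---
# Service: load-balances traffic across the healthy replicas
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8001
```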