With Twitter being a mess at the moment, I decided to try out Mastodon as an alternative. Mastodon is a federated social media platform, built on top of a protocol called ActivityPub. It can be self-hosted, letting you own your data, and I wanted to do so using Nomad as a cluster orchestrator. This post shows how I did it, and will hopefully inspire you to give Mastodon a try if you haven’t already! You can find me on the fediverse at @[email protected] 🐘.
The first thing to note is that I’m writing this up as inspiration, not as a step-by-step guide. It’s here to prove that this approach works, and to offer an alternative to more complex setups that use Kubernetes (which Mastodon have a first-party Helm Chart for). Your mileage may vary - use your best judgement! If you know of improvements, feel free to send me a message at the link above.
This post is meant to be read with the repository that I’ve open-sourced, containing the actual Nomad task definitions and scripts discussed below: https://github.com/CarrotCodes/nomad-mastodon
With that out of the way, let’s start with some goals and assumptions. I wanted to:
It’s worth noting that my budget balances my financial situation, the tooling I’m already familiar with, and performance of the system. You can definitely do things cheaper if budget is a primary concern, for example by using cheaper cloud providers like DigitalOcean. If you’d rather have someone else host Mastodon for you, cheaper services like masto.host exist too.
Mastodon itself is made up of a number of systems:
On top of this, there are some other necessities to run it, and keep the system healthy after the initial setup:
tootctl - the administrative CLI for Mastodon
As mentioned, I’m using Terraform and Ansible to manage my infrastructure. These tools have a lot of benefits, including making the act of setting up networking and servers repeatable, and documenting what exists in your cloud account. I strongly encourage you to use these tools (or alternatives) to manage your infrastructure, if you’re not already doing so.
There were some obvious choices on AWS for the main components - EC2 to run a server, RDS for a persistent database, and ElastiCache for a Redis-compatible temporary data store. While researching S3 I also discovered that Cloudflare offer an S3-compatible service imaginatively named “R2”. It integrates nicely with their edge-caching service, and offers competitive pricing and performance, so I decided to give that a try as well.
There’s not too much to say about the Nomad jobs themselves - each primary component maps to a Nomad job definition file (suffixed .nomad - for example, mastodon-web for the main web server). Mastodon have a first-party docker-compose.yml file for starting up Docker containers automatically, and you can use that to understand how to start each of the components. Secrets are passed in as Nomad variables, mapping to environment variables in the specific task. A future improvement noted below is to figure out a way to share these environment variables across job definitions, but for now, they’re copy/pasted between jobs via a template file.
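To make the shape of such a job file concrete, here is a minimal sketch. It is not the file from the repository. It assumes that “Nomad variables” means HCL2 input variables, and the variable names, datacenter name, image tag and port are all illustrative. The start command is modelled on the one in the upstream docker-compose.yml for the 4.0 series.

```hcl
# Hypothetical input variables - names are illustrative, not taken from the repository.
variable "secret_key_base" {
  type = string
}

variable "db_pass" {
  type = string
}

job "mastodon-web" {
  datacenters = ["dc1"] # assumed datacenter name
  type        = "service"

  group "web" {
    network {
      port "http" {
        to = 3000 # port the web process listens on inside the container
      }
    }

    task "web" {
      driver = "docker"

      config {
        image   = "tootsuite/mastodon:v4.0.2" # assumed image and tag
        command = "bash"
        args    = ["-c", "rm -f /mastodon/tmp/pids/server.pid; bundle exec rails s -p 3000"]
        ports   = ["http"]
      }

      # Each input variable becomes an environment variable inside the task.
      # A real deployment needs many more of these (database host, Redis, S3, ...).
      env {
        SECRET_KEY_BASE = var.secret_key_base
        DB_PASS         = var.db_pass
      }
    }
  }
}
```

With a file like this, values can be supplied at submit time, for example with `nomad job run -var-file=secrets.hcl mastodon-web.nomad`, where `secrets.hcl` is a hypothetical file kept out of version control.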
Nomad offers “periodic” tasks with a cron-like interface, to let you run tasks on a regular schedule, which is perfect for the “frequent cleanup” requirement. The cleanup job runs tootctl with commands that remove media older than a set number of days, which I set to 30 to start with. Media storage is something you should be aware of with Mastodon - it stores a “local” copy of media relevant for the users of your server. This can add up to quite a lot of storage, so a good way to reduce your bill is to keep discovered media for less time. The only downside is that this media has to be refetched when it’s needed again. Below is a graph of the storage usage for bunny.cloud - you can see media being cleaned up every day at 4am as it expires.
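A periodic cleanup job along these lines might look like the sketch below. The job and task names, datacenter and image tag are assumptions rather than the contents of the repository. The schedule and the 30-day retention come from the description above.

```hcl
job "mastodon-media-cleanup" {
  datacenters = ["dc1"] # assumed datacenter name
  type        = "batch" # periodic jobs must be batch jobs

  periodic {
    cron             = "0 4 * * *" # every day at 4am (UTC unless time_zone is set)
    prohibit_overlap = true        # don't start a new run while the previous one is still going
  }

  group "cleanup" {
    task "tootctl" {
      driver = "docker"

      config {
        image   = "tootsuite/mastodon:v4.0.2" # assumed image and tag
        command = "tootctl"
        args    = ["media", "remove", "--days", "30"]
      }

      # The same database, Redis and object storage environment variables
      # as the other Mastodon jobs are needed here; omitted for brevity.
    }
  }
}
```

Raising or lowering the `--days` value is the knob that trades storage cost against how often remote media has to be refetched.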
For running SQL migrations, I could see two choices:
I chose to use a batch job, because I wanted to control when I ran migrations, and decouple that process from running other critical components. This is because there are multiple jobs that require the migrations - at the very least, the main web server, and the message processor. Having a separate job means migrations can run before either of those jobs start. In practice I think the message processor would also be a decent choice.
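As a sketch of what a standalone migration job could look like (names, datacenter and image tag are again assumptions, and the command is the standard Rails migration task rather than anything confirmed from the repository):

```hcl
job "mastodon-migrate" {
  datacenters = ["dc1"] # assumed datacenter name
  type        = "batch" # runs to completion, then stops

  group "migrate" {
    # Fail outright instead of restarting, so a broken migration is noticed.
    restart {
      attempts = 0
      mode     = "fail"
    }

    task "db-migrate" {
      driver = "docker"

      config {
        image   = "tootsuite/mastodon:v4.0.2" # should match the version about to be deployed
        command = "bundle"
        args    = ["exec", "rails", "db:migrate"]
      }

      # Database connection environment variables go here; omitted for brevity.
    }
  }
}
```

Because the job only runs when it is submitted with `nomad job run`, the migration step stays a deliberate action that happens before the web and message processing jobs are updated.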
Another aspect of administering a Mastodon server is using tootctl for manual tasks. The tricky bit with a cluster orchestration system is “where do you run the command”? I think it’s overkill to spin up an entire dedicated job to run a single administrative command every now and then, so we have to pick a running job to run the command inside.
Nomad offers a really handy CLI to inspect running jobs, and to execute commands inside their running allocations (called allocs). I’ve included a script, tootctl.sh, to let you run tootctl commands inside the message processing task on-demand. Usage is exactly the same as the original command, just through the script:
🥕 carrot 🗂 bunny-cloud-infra/infra/cluster 🐙 main $ ./tootctl.sh version
Discovering alloc of "mastodon-sidekiq" to run tootctl in...
Found alloc with ID: ef0fd901-d598-9fe6-b2c2-3c577c0aee2e
Executing: "tootctl version"
4.0.2
I’m a fan of shipping things and iterating, so there are a few obvious improvements that I plan on making:
Hopefully this inspired you to give self-hosting Mastodon a go, if you’re so inclined. I’ve also proven that you don’t need a complicated Kubernetes setup to get a lot of cluster orchestration goodness. In total this setup costs around $45 a month, which is happily within my budget. That’s the cost for a single person on my server, so if more people contributed to the running costs, the cost per person would drop quickly.
I also hope that more people give Mastodon a go - I’m really enjoying it so far, and feel that it’s a much healthier form of social media for me. I’ve been struggling with Twitter feeling toxic for a while, and having a space that I can share things to without being sold adverts, or served divisive content to push platform metrics, is something I value greatly.
Is there another aspect of this setup you’d like to know about in more detail? Feel free to ask over on the fediverse, at @[email protected]!
docker-compose.yml - https://github.com/mastodon/mastodon/blob/main/docker-compose.yml