Migrating Mastodon S3 Providers

2024-03-08

As part of some recent work Kakious and I have been doing on the Mastodon instance we run, Floofy.tech, we were looking to cut costs while also improving performance. Part of that was migrating to new hardware using Proxmox rather than ESXi, and that migration was fairly unremarkable - we're still running everything in Kubernetes across multiple VMs, so there isn't much to note (except that we also moved from k3s to Talos - I'll have a longer post about that at some point).

Another part of this was moving storage providers for the instance. We were using DigitalOcean Spaces, which served us well, but the pricing left quite a bit on the table. For $5/month you get 250GiB of storage and 1TB of egress, with $0.02/GiB for storage and $0.01/GiB for egress beyond that. Our instance fell comfortably within those limits - so comfortably, in fact, that we would certainly save money going elsewhere. Being employed at Cloudflare, and already having a decent setup for Floofy there, we turned to the R2 offering. With no egress costs and less than 100GiB stored (on average - it depends on how busy the fediverse is!), we shouldn't be paying anywhere near $5/month, since we're only paying for storage and egress is free and heavily cached.

So! With that decided, it was time to figure out our migration path. The plan was simple - using rclone, set up two remotes on a temporary virtual machine (or LXC container, in this case) and run a bulk transfer overnight. Once that completed, we'd run one more sync, then quickly swap the configuration on the Mastodon instance. The window between the last sync and the Mastodon instances picking up the new configuration should be small enough that we wouldn't miss any new media. Finally, we'd swap the DNS over to point at our R2 bucket, which should update quickly as the DNS was already proxied through Cloudflare.
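Sketched as commands, the plan looks roughly like this - the remote names match our setup described below, but the bucket name (mastodon-media) is a placeholder:

```shell
# Dry run first to sanity-check what would transfer.
rclone sync digitalocean-space:mastodon-media cloudflare-r2:mastodon-media --dry-run

# Overnight bulk transfer, with progress output.
rclone sync digitalocean-space:mastodon-media cloudflare-r2:mastodon-media --progress

# Final catch-up sync immediately before swapping the Mastodon configuration.
rclone sync digitalocean-space:mastodon-media cloudflare-r2:mastodon-media
```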

Setup of the container was straightforward - we grabbed a Debian 12 image, installed rclone, and set up two remotes: one pointed at our existing DigitalOcean Spaces bucket (digitalocean-space) and the other at our new Cloudflare R2 bucket (cloudflare-r2). A quick rclone lsd on both remotes confirmed they were reachable, and after a dry-run sync or two to verify, we were ready to go. I loaded up tmux, hit go, and waited.
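For reference, the two remotes in rclone.conf would look something like this - the keys, region, and R2 account ID are placeholders, and rclone talks to both providers through its S3-compatible backend:

```ini
[digitalocean-space]
type = s3
provider = DigitalOcean
access_key_id = <spaces-access-key>
secret_access_key = <spaces-secret-key>
endpoint = nyc3.digitaloceanspaces.com

[cloudflare-r2]
type = s3
provider = Cloudflare
access_key_id = <r2-access-key>
secret_access_key = <r2-secret-key>
endpoint = https://<account-id>.r2.cloudflarestorage.com
```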

It was going smoothly until the container decided to drop its internet connection. I'm still not sure what caused this, but after running dhclient it was fine again. Otherwise, the sync went off without a hitch.
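If you're worried about flaky connectivity mid-transfer, rclone can also retry on its own. A sketch of what we could have run (the retry counts are illustrative, and the remote/bucket names are from our setup):

```shell
# --retries re-runs the whole sync on failure; --low-level-retries
# retries individual operations (e.g. a single object upload).
rclone sync digitalocean-space:mastodon-media cloudflare-r2:mastodon-media \
  --retries 5 --low-level-retries 20 --progress
```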

When I woke up, it was time to run another sync then make the configuration change. I'd already changed the values in our Kubernetes state repository, so it was just a case of pushing it to GitHub and letting ArgoCD sync everything up. First, I reran the rclone sync to ensure that anything new was synced over, then quickly pushed the configuration up. It took about a minute to cycle the pods to the new configuration, at which point I removed the DNS record pointing to DigitalOcean and swapped it over to the Cloudflare R2 bucket. Done!
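The configuration change itself is just swapping Mastodon's object storage environment variables over to R2. A rough sketch of the relevant values, based on Mastodon's S3 settings - the bucket name, alias host, and credentials here are placeholders, and your variable set may differ slightly by Mastodon version:

```ini
S3_ENABLED=true
S3_BUCKET=mastodon-media
S3_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
S3_ALIAS_HOST=files.example.com
AWS_ACCESS_KEY_ID=<r2-access-key>
AWS_SECRET_ACCESS_KEY=<r2-secret-key>
```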

I genuinely expected this to be more difficult, but it really was that easy. This process would work for any rclone-compatible storage provider, of which there are many, so I'd feel pretty comfortable recommending this to others. Depending on how busy your instance is, it may be worth doing a final rclone copy (which copies new files but doesn't delete from the target) to catch any stragglers after the configuration change, and depending on your DNS setup you may need to modify the time-to-live values ahead of the migration, but we didn't really hit those caveats.
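That final catch-up would look like this - copy, not sync, because by this point Mastodon is already writing new media to the destination bucket and a sync would delete anything the source doesn't have:

```shell
# Safe to run after the cutover: adds anything still missing from the
# new bucket, never deletes from it.
rclone copy digitalocean-space:mastodon-media cloudflare-r2:mastodon-media --progress
```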

Hopefully this is helpful to others - if you have any questions, feel free to poke me on the federated universe @arch@floofy.tech.