It’s easy to think that servers in the cloud are a magical service that just works. In reality, they’re just someone else’s computer, and all computers break down at some point. If you’re using EBS-backed EC2 instances, you should perform regular backups to AWS S3.
Why Back Up a Server Already In The Cloud?
Even cloud servers aren’t safe from failure. EBS volumes, which most AWS EC2 instances run on, are only replicated within a single Availability Zone. This means that if the hardware behind your volume fails badly enough, you could lose your data.
There’s no need to panic though, as EBS volumes are actually fairly safe, all things considered. They are replicated within their Availability Zone, and are about 20 times more reliable than normal disk drives.
However, they can and do still break down from time to time, so you should prepare for this case, and keep backups. EBS has a failure rate of 0.1% to 0.2% annually, compared to a normal disk’s rate of around 4%. The more volumes you run, the more likely it is that at least one of them fails in a given year.
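To put a number on how the risk grows with fleet size, here is a minimal sketch. It assumes volume failures are independent of each other (a simplification) and uses an annual failure rate of 0.2%, which falls within the range quoted above. The function name is made up for illustration.

```python
def prob_any_failure(num_volumes: int, annual_failure_rate: float) -> float:
    """Chance that at least one of num_volumes fails within a year.

    Assumes each volume fails independently with the same annual rate.
    """
    return 1.0 - (1.0 - annual_failure_rate) ** num_volumes


# Assumed rate for illustration: 0.2% per volume per year.
for n in (1, 10, 100, 1000):
    print(f"{n:>5} volumes: {prob_any_failure(n, 0.002):.1%}")
```

Under these assumptions, a fleet of 100 volumes has roughly an 18% chance of seeing at least one failure per year, even though each individual volume is very reliable.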
io2 volumes are the most durable, at 99.999%, but all other types have only 99.8% – 99.9% durability, which pales in comparison to S3’s 99.999999999% durability (which is basically as good as it gets before it loses all its meaning). S3 objects are replicated across multiple datacenters. EBS is stored in a single Availability Zone.
Even AWS has had massive datacenter problems. In 2019, a power failure took down EBS servers at one of their facilities. Because EBS volumes are only stored within the same Availability Zone as the EC2 instance, there was no copy elsewhere to fall back on, and customers without backups permanently lost the data on the volumes that couldn’t be recovered.
Luckily, this is fixed very easily, as AWS provides an easy-to-configure service for automating backups to S3, which is as safe as it gets.
Configuring EBS Snapshots
You may be worried about backing up a large drive, but EBS snapshots are incremental backups. This means that each successive snapshot only stores the data that has changed since the previous one.
Doing regular backups won’t use up much extra storage for most servers. However, for use cases where the drive is constantly being written to, your snapshot data may add up to quite a bit more than the size of the drive itself. Luckily, S3 storage is very cheap in comparison to EBS, and snapshot data can be expired over time.
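The incremental idea can be illustrated with a toy model. This is only a sketch of the concept, not how AWS implements snapshots internally: the volume is a hypothetical dictionary of block numbers to contents, and each snapshot keeps only the blocks that differ from the previous state.

```python
from typing import Dict, Optional


def take_snapshot(
    volume: Dict[int, bytes], previous: Optional[Dict[int, bytes]]
) -> Dict[int, bytes]:
    """Return the blocks a snapshot would need to store.

    The first snapshot (previous is None) stores every block; later
    snapshots store only blocks that changed since the previous state.
    """
    if previous is None:
        return dict(volume)
    return {i: data for i, data in volume.items() if previous.get(i) != data}


# A toy 1000-block volume.
volume = {i: b"original" for i in range(1000)}
first = take_snapshot(volume, None)
state_at_first = dict(volume)

# Overwrite 50 blocks, then snapshot again.
for i in range(50):
    volume[i] = b"changed"
second = take_snapshot(volume, state_at_first)

print(len(first), len(second))  # 1000 50
```

In this model, a write-heavy volume that rewrites most of its blocks between snapshots would store nearly a full copy each time, which is why total snapshot data can exceed the size of the drive.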
Turning EBS snapshots on is fairly simple. Head over to the EC2 Management Console, and click on “Lifecycle Manager” under Elastic Block Store. Create a new lifecycle policy.
Then, you’ll need to specify a tag for this policy to apply to. The policy can target either EC2 instances or EBS volumes directly, matching them by tag.
If you want it to apply to all your servers, you’ll need to make a new tag, set it here, and apply it to all your volumes. If you just want to turn it on for one volume, select “Name” and find the name of the volume.
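If you have many volumes, tagging them by hand gets tedious. Here is a minimal sketch using boto3’s EC2 `create_tags` call. The tag key and value (`Backup` / `true`) and the volume IDs are placeholders chosen for illustration, and the client is passed in so you can create it with whatever region and credentials you use.

```python
def tag_volumes_for_backup(ec2_client, volume_ids, key="Backup", value="true"):
    """Apply one tag to every listed EBS volume so a lifecycle policy can match them.

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The default tag key/value are illustrative, not an AWS convention.
    """
    ec2_client.create_tags(
        Resources=list(volume_ids),
        Tags=[{"Key": key, "Value": value}],
    )


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   tag_volumes_for_backup(boto3.client("ec2"), ["vol-0123456789abcdef0"])
```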
Then, you’ll need to set the schedule. The default is every 12 hours, which is probably fine for most people. For quickly changing data, backing up more often can incur more costs, but for servers that aren’t experiencing write-heavy loads, backing up more often won’t hurt.
The other thing you’ll need to configure is snapshot retention. This will delete older data after a certain number of days or backups. You’ll want to set this high enough so that you won’t experience data loss after a server failure if you don’t restore quickly enough. Deleting data older than a few days to a week is fine.
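The same policy can be created from code instead of the console. The sketch below uses boto3’s Data Lifecycle Manager client and its `create_lifecycle_policy` call. Everything specific in it is an assumption for illustration: the `Backup` / `true` tag, the schedule name, the 03:00 UTC start time, the retention count of 14 (about a week at two snapshots per day), and the role ARN you pass in, which must be an IAM role that Lifecycle Manager is allowed to assume.

```python
def build_policy_details(tag_key, tag_value, interval_hours=12, retain_count=14):
    """Build the PolicyDetails structure for a volume snapshot policy.

    Snapshots are taken every interval_hours and the newest retain_count
    snapshots are kept; older ones are deleted.
    """
    return {
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [
            {
                "Name": "regular-snapshots",  # hypothetical schedule name
                "CreateRule": {
                    "Interval": interval_hours,
                    "IntervalUnit": "HOURS",
                    "Times": ["03:00"],  # assumed start time, in UTC
                },
                "RetainRule": {"Count": retain_count},
                "CopyTags": True,
            }
        ],
    }


def create_snapshot_policy(dlm_client, role_arn, policy_details):
    """Create the policy and return its ID. dlm_client is a boto3 "dlm" client."""
    response = dlm_client.create_lifecycle_policy(
        ExecutionRoleArn=role_arn,
        Description="Regular EBS snapshots",
        State="ENABLED",
        PolicyDetails=policy_details,
    )
    return response["PolicyId"]


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   details = build_policy_details("Backup", "true")
#   create_snapshot_policy(boto3.client("dlm"), "arn:aws:iam::<account-id>:role/<role>", details)
```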
Relaunching a Server From An EBS Snapshot
Restoring is incredibly simple. You’ll find the list of snapshots under EBS in the sidebar. Right-click on any one of them, and hit “Create Volume.”
A new EBS volume will be created with the snapshot data. You’ll then need to attach it to your EC2 instance by turning the server off, detaching the broken volume, attaching the new one from the console, and starting the instance again.
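The restore steps above can also be scripted. This is a rough sketch using boto3 EC2 client calls and waiters, not a production-ready tool: it has no error handling, and all IDs, the Availability Zone, and the device name are values you would need to supply. The root device name in particular varies by AMI (for example `/dev/xvda` or `/dev/sda1`), so check your instance before relying on it.

```python
def restore_root_volume(ec2, instance_id, old_volume_id, snapshot_id,
                        availability_zone, device_name):
    """Swap an instance's volume for a new one created from a snapshot.

    ec2 is a boto3 EC2 client. The new volume must be created in the same
    Availability Zone as the instance. Returns the new volume's ID.
    """
    # 1. Create a volume from the snapshot and wait until it is usable.
    new_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    # 2. Stop the instance.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 3. Detach the broken volume.
    ec2.detach_volume(VolumeId=old_volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_volume_id])

    # 4. Attach the restored volume under the same device name.
    ec2.attach_volume(
        Device=device_name, InstanceId=instance_id, VolumeId=new_volume_id
    )
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])

    # 5. Start the instance again.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return new_volume_id
```

The old volume is left detached rather than deleted, so you can inspect it or remove it yourself once the restored server checks out.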
If you’re using a non-root volume, you can attach it without any downtime. You’ll just need to manually mount the drive in the OS with mount on Linux or Disk Management on Windows.
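On Linux, the mount step can be wrapped in a small script. This is a sketch only: the device path and mount point are placeholders (the device often shows up as something like `/dev/xvdf` or `/dev/nvme1n1` depending on the instance type, so check with `lsblk` first), and the command needs root privileges.

```python
import subprocess


def build_mount_command(device, mount_point):
    """Return the argument list for mounting device at mount_point."""
    return ["mount", device, mount_point]


def mount_volume(device, mount_point):
    """Run mount and raise CalledProcessError if it fails. Needs root."""
    subprocess.run(build_mount_command(device, mount_point), check=True)


# Example usage (placeholder paths; the mount point must already exist):
#   mount_volume("/dev/xvdf", "/mnt/restored")
```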
Restoring can take a little while though, because a volume created from a snapshot loads its data from S3 in the background and is slow until that finishes. AWS has an extra feature, called Fast Snapshot Restore, which makes restored volumes fully initialized from the moment they’re created, so this process is nearly instant. It costs money though, and we don’t really recommend using it for most workloads. EBS failures are rare enough that it’s likely not going to be a problem, and even with Fast Snapshot Restore, you’re still going to have downtime.