TL;DR: I encountered a non-trivial bug in AWS EC2/EBS behavior.
I recently built an email server on an AWS EC2 instance. The experience produced a number of discoveries I want to share as tips (or warnings). I will share one specific experience here, and possibly others later. I refer to this one as the Booting-With-Additional-Volume “Feature”.
As you may know, more than one EBS volume may be attached to an EC2 instance. Though you can specify more than one volume at launch, very often you will launch an EC2 instance with only its initial EBS sysroot volume (the Root Device). If the need arises later, you can attach an “additional” EBS volume, either while the EC2 instance is stopped or while it is running.
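For reference, attaching a volume via the AWS CLI looks roughly like this; the volume ID, instance ID, and device name below are placeholders, so substitute your own:

```shell
# Placeholder identifiers -- substitute your own.
VOLUME_ID="vol-0123456789abcdef0"
INSTANCE_ID="i-0123456789abcdef0"
DEVICE="/dev/sdf"

# Attach the volume; this works whether the instance is stopped or running.
aws ec2 attach-volume \
    --volume-id "$VOLUME_ID" \
    --instance-id "$INSTANCE_ID" \
    --device "$DEVICE"
```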
Let’s say you wish to attach one or more “additional” volumes to an existing EC2 instance so you can copy or move files between volumes. The EC2 Dashboard and the AWS CLI both make it easy to attach volumes to EC2 instances. Keep in mind that AWS doesn’t restrict the nature of the volume. You may attach:
* a raw volume, i.e., containing no file system
* a purely data volume, e.g., containing an ext4 or NTFS file system
* a bootable-sysroot volume containing an operating system
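You can check which of these you are dealing with from inside the instance. A sketch, assuming the additional volume shows up as /dev/xvdf (device names vary; on newer instance types it may appear as an NVMe device such as /dev/nvme1n1):

```shell
# Device name is an example -- check what your instance actually sees.
DEVICE="/dev/xvdf"

# List block devices and any file systems detected on them.
lsblk -f

# Probe the device directly: "data" means a raw volume with no file
# system; otherwise the file system (or boot record) type is reported.
sudo file -s "$DEVICE"
```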
Let’s say you attach an “additional” volume to a running EC2 instance. You mount the volume, use it as necessary (copy, edit, and move some files), and all is well up to this point.
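The mount step, for completeness; the device name and mount point here are placeholders:

```shell
# Placeholders -- substitute the device name your instance reports
# and whatever mount point you prefer.
DEVICE="/dev/xvdf"
MOUNT_POINT="/mnt/extra"

# Create the mount point and mount the volume's file system.
sudo mkdir -p "$MOUNT_POINT"
sudo mount "$DEVICE" "$MOUNT_POINT"

# ...copy, edit, and move files as needed...

# Unmount when finished.
sudo umount "$MOUNT_POINT"
```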
Now, because it’s necessary in your workflow, you stop-start or restart the EC2 instance. Here’s the surprise you may get: If the “additional” attached volume happens to be a bootable-sysroot volume, then when the EC2 instance restarts, it may boot using the “additional” volume as the sysroot device, instead of using the EC2 instance Root Device.
Yes, that happened several times in one session during construction of my email server. But it was not consistent; it was intermittent. The EC2 instance did not boot from the wrong sysroot volume every time, just sometimes.
An obvious workaround is to simply detach the “additional” volume before restarting the EC2 instance.
I’m not sayin’ anything… I’m just sayin’… ya know? All software has bugs. There are hidden hazards in these waters.
[UPDATE 2016.0419.1504]: In the interest of rigor, I wanted to reproduce the misbehavior described above. I set up a test specifically for this case and restarted the EC2 test instance. Note in the image below that /dev/sdf is an additional block device. During the first restart after attaching the additional volume, the EC2 instance booted using the wrong sysroot volume, /dev/sdf, instead of the correct Root Device, /dev/sda1. Likewise the second restart booted from the wrong volume. Likewise the third.
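If you want to run the same check yourself, you can compare the device actually backing the root file system against the instance's configured Root Device; the instance ID below is a placeholder:

```shell
# Inside the instance: show which block device is mounted at /.
findmnt -n -o SOURCE /

# From anywhere with the AWS CLI configured: show the configured
# Root Device for the instance (placeholder instance ID).
INSTANCE_ID="i-0123456789abcdef0"
aws ec2 describe-instances \
    --instance-ids "$INSTANCE_ID" \
    --query 'Reservations[0].Instances[0].RootDeviceName'
```

If the two disagree, the instance booted from a volume other than its Root Device.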
I suspect this “feature” will be “deprecated” in the future, so don’t rely on it.