Amazon has this leadership principle of "Learn and Be Curious" which is all about wanting to know things and enjoying learning new things. I have my own version of this called "Learn and Be Furious." Every once in a while I have to learn how something works, and once I get in there and figure it out, I'm shaking my fist at the screen asking "why did they DO it this way!?"
In #AWS, EBS volumes are the virtual hard disks on EC2 instances, and EBS volumes can have snapshots. Snapshots are often used for backup/recovery and lots of other important things, so there is a way to "lock" a snapshot. This prevents it from being deleted accidentally. Yesterday I had to learn how to work with locked snapshots.
Here's what I learned.
The API
How do you lock a snapshot? There's an #EC2 modify-snapshot-attribute API, but "locked" is not a snapshot attribute. You can't lock it that way. Snapshot attributes are actually mainly permissions. They allow some folks to see, and thereby launch instances from, the snapshot. This is how, say, the Debian team or the FreeBSD team make an AMI that you can launch in EC2. They make an EC2 instance, make a snapshot of its EBS volume, make the snapshot public, and do some other things that make it available. So attributes aren't really "attributes" in some general sense: they're permissions.
If you want to lock a snapshot, there's a lock-snapshot API. That's all it's good for: locking snapshots. If you want to unlock one, you guessed it, there's a different API: unlock-snapshot.
This isn't exactly bad. Generally speaking, AWS APIs are service:verb-noun. So ec2:lock-snapshot fits the idiom and the common pattern. But by that logic, you'd expect ec2:share-snapshot and ec2:unshare-snapshot instead of ec2:modify-snapshot-attribute with the group set to all.
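To make the contrast concrete, here's a rough sketch of the three calls side by side as I understand them in boto3. The function names are mine, and the 7-day governance lock is just an illustrative choice, not a recommendation. The ec2 argument is assumed to be a boto3 EC2 client.

```python
def share_publicly(ec2, snapshot_id):
    """Sharing goes through the generic attribute API (a permission change)."""
    return ec2.modify_snapshot_attribute(
        SnapshotId=snapshot_id,
        Attribute="createVolumePermission",
        OperationType="add",
        GroupNames=["all"],  # the "all" group means public
    )


def lock(ec2, snapshot_id, days=7):
    """Locking has its own dedicated API. The duration here is an assumption."""
    return ec2.lock_snapshot(
        SnapshotId=snapshot_id,
        LockMode="governance",
        LockDuration=days,  # measured in days
    )


def unlock(ec2, snapshot_id):
    """Unlocking is yet another dedicated API."""
    return ec2.unlock_snapshot(SnapshotId=snapshot_id)
```

You'd call these with something like ec2 = boto3.client("ec2"). Note that governance mode is the one you can unlock later with the right permissions; compliance mode is much stricter.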
Why so furious?
I'm writing a janitor job that finds orphaned snapshots and deletes them. But if the snapshot is locked, trying to delete it throws an exception.
There are obviously 2 ways to do this. I can try it anyway, catch the exception when the snapshot is locked, and deal with it. Or I can figure out which snapshots are locked and not try to delete them in the first place.
I'm doing the latter, because I guess I want exceptions to be thrown only on failures. I don't want the janitor to run into something I did on purpose (locking a snapshot), and then figure it out down in the exception handler. I guess this is just what I think is the right way to do it, and maybe I'm wrong.
How do I find locked snapshots?
You'd think that you could call describe-snapshots, which takes certain Filters. There are a lot of possible things to filter on. I can get it to filter down to a certain set of snapshots based on a few criteria. Locked state is not one of them. In fact, the status of the lock is not returned in the information you get from describe-snapshots. If you wanted to know about locked snapshots, you should have called describe-locked-snapshots, which will return just those.
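Here's roughly what collecting the locked snapshot IDs looks like in boto3. This is a sketch, not my actual janitor code: locked_snapshot_ids is a name I made up, ec2 is assumed to be a boto3 EC2 client, and I'm following NextToken by hand rather than assuming a paginator exists for this call.

```python
def locked_snapshot_ids(ec2):
    """Return the set of snapshot IDs reported by describe_locked_snapshots."""
    locked = set()
    kwargs = {}
    while True:
        page = ec2.describe_locked_snapshots(**kwargs)
        for snap in page.get("Snapshots", []):
            locked.add(snap["SnapshotId"])
        token = page.get("NextToken")
        if not token:
            return locked
        # Ask for the next page of results.
        kwargs["NextToken"] = token
```

Each entry in the response also carries a LockState, so you could get pickier than "is it in the list at all" if you needed to.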
What about the list of unlocked snapshots?
If I have a list of snapshots (say, a list of orphans that should be deleted), but I want to figure out which ones are not locked, how do I do that?
First I get the list of all snapshots (or in my case, all orphaned snapshots). Then I get the list of all locked snapshots. Then I do the diff to remove locked snapshots from the list of all snapshots.
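The diff itself is the easy part. A minimal sketch, with made-up snapshot IDs and a function name of my own choosing:

```python
def deletable(candidates, locked_ids):
    """Drop locked snapshots from the candidate list, keeping the original order."""
    locked = set(locked_ids)
    return [snap_id for snap_id in candidates if snap_id not in locked]


orphans = ["snap-a", "snap-b", "snap-c"]  # hypothetical orphaned snapshots
locked = {"snap-b", "snap-z"}             # hypothetical locked snapshots
print(deletable(orphans, locked))         # ['snap-a', 'snap-c']
```

It works, it's just two API calls and a set subtraction for something that feels like it should be one filter.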
This feels like what my niece would call wonker bonkers. I dunno. Maybe my expectations are all wrong.