I learned something last week: when you start an RDS database instance from a snapshot, the instance's volume isn't copied all at once. Instead, it's lazily paged in from the snapshot, which is stored in S3. I suppose that shouldn't surprise me; after all, it's documented behavior (albeit for EC2, not RDS).
What did surprise me was just how much of a performance hit it causes. For example, a table scan of one of our large, rarely accessed (so not in cache) tables took about 10 minutes on an instance that had been running for several months. The same query, on a new but otherwise identical replica, took nearly an hour.
Read replicas are one place where this behavior is particularly painful: one of the main reasons to use a replica is to move long-running queries off your primary machine. But with lazy loading, for the first few hours (or days!) of a replica's life, those queries run significantly slower than they would against the master. Worse, if you ever need to restore from a snapshot, your whole system will run slower while the blocks are loaded.
So what can you do?
You need to “warm up” the replica by reading data from it before you put it into active use. There are several ways to do this. One is to write queries that touch every table, preferably with a table scan. An alternative is to take a backup of the database and immediately throw it away:
mysqldump --all-databases --host HOST --user USER --password > /dev/null
I'm not convinced this is sufficient: it will touch all of the data blocks, but may leave the index blocks behind. I welcome any better approaches.
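One possible way to reach the index blocks as well is to walk every index explicitly. What follows is a rough sketch, not something I've measured: it assumes MySQL with InnoDB tables, and `warmup_statements` is a name I made up for illustration. It turns a list of schema/table/index names into `SELECT COUNT(*) ... FORCE INDEX` statements, relying on the hint to make the server traverse the named index rather than the smallest one. Scanning `PRIMARY` should read the row data itself, although large columns stored off-page may still be missed.

```shell
#!/bin/sh
# Hypothetical helper: reads "schema<TAB>table<TAB>index" lines on stdin and
# prints one statement per index that forces a full traversal of that index.
warmup_statements() {
    while IFS="$(printf '\t')" read -r schema table index; do
        # Skip blank or incomplete lines.
        [ -n "$index" ] || continue
        printf 'SELECT COUNT(*) FROM `%s`.`%s` FORCE INDEX (`%s`);\n' \
            "$schema" "$table" "$index"
    done
}

# Assumed usage (HOST and USER are placeholders, as in the mysqldump command):
#
#   mysql --host HOST --user USER --password --batch --skip-column-names \
#         -e "SELECT DISTINCT table_schema, table_name, index_name
#             FROM information_schema.statistics
#             WHERE index_type = 'BTREE'
#               AND table_schema NOT IN
#                   ('mysql', 'sys', 'information_schema', 'performance_schema')" \
#         > indexes.tsv
#
#   warmup_statements < indexes.tsv > warmup.sql
#   mysql --host HOST --user USER --password < warmup.sql > /dev/null
```

Tables with no indexes at all won't appear in `information_schema.statistics`, so they would still need a plain table scan or the `mysqldump` pass above.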