EC2 SQL Server backup strategies and tactics
The many backup modes offered by Microsoft SQL Server, combined with the prodigious hardware options available on Amazon EC2, can make choosing a backup strategy for your setup a little confusing. In this post, I’ll sketch some options and end with a simple PowerShell script, usable on both Express and Standard editions, that will back up your database to S3.
To start with, you should probably be running your database off an EBS (Elastic Block Store) volume. EBS volumes can sustain many more random IOPS than instance disks (good for typical database workloads), and they live independently of your instances. While I haven’t had an instance die from under me, if one should cop it, all data on the local disks will be gone-gone.
EBS volumes can fail too, however, and will do so at an annualised rate of 0.1% to 0.5%, according to Amazon. You may decide this is good enough for your purposes and leave it at that. Bear in mind, though, that this failure rate is compounded by other risks: Windows or SQL Server malfunctioning and corrupting the volume, you pressing the wrong button in the AWS Console or Management Studio, a disgruntled employee doing it on purpose, or something else entirely. In other words, you should take backups.
A simple approach is to use the snapshotting feature of EBS. This saves the (diff of the) contents of your volume to S3, from whence it can be restored to life if something happens to the volume. I’ve used this to muck around with test environments and such. It works fine and could conceivably be automated using the AWS API. It’s a rather low-level approach, though, and you could easily find yourself restoring from a snapshot taken with SQL Server’s pants around its ankles, in the middle of a transaction. While SQL Server is obviously capable of recovering from such an indiscretion and rolling back to a safe state, this can be something of a hassle.
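If you wanted to go that route, a nightly snapshot could be as simple as the sketch below, assuming the EC2 API tools are installed and configured on the machine doing the snapshotting (the volume id is a placeholder):

```powershell
# A minimal sketch: snapshot an EBS volume with the EC2 API tools
# (assumes EC2_HOME, EC2_CERT and EC2_PRIVATE_KEY are already set up;
# vol-12345678 is a placeholder for your actual volume id)
$volumeid = "vol-12345678"
ec2-create-snapshot $volumeid
```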
Another option is to do normal backups to another EBS volume mounted on the same instance. While I have no knowledge of Amazon datacenter topologies, one could fear that different EBS volumes attached to the same instance end up being hosted on the same EBS-SAN-thingamabob, the death of which would then also be your undoing.
You could also copy backup files to another instance mounting its own EBS volume, or set up replication, allowing you to recover very quickly. Note that SQL Server Express can subscribe to a Standard instance in a replication setup, although it cannot publish. Your replicated instance could even live in a different availability zone, although you would then incur bandwidth costs for the exchanged data, on top of the cost of running an extra instance.
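Getting the backup files across can be as simple as copying to a share exposed by the other instance. A hedged sketch, where both paths are placeholders:

```powershell
# Copy a backup file to a share on the other instance
# (\\other-instance\backups is hypothetical -- substitute your own share)
Copy-Item "C:\backups\backup_2009_02_01.bak" "\\other-instance\backups\"
```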
The approach we ended up taking, however, uses S3. Amazon promises that S3 is very safe (“no single point of failure”), and it has the added benefit of being available independently of EC2 instances. To do a backup, we do a full database backup to one of the local disks and then move the file to S3. This is handled by a PowerShell script invoked as a scheduled task, making it usable on SQL Server Express instances (where the native SQL Server backup scheduling is not otherwise available). To handle the S3 interaction, we use the free CloudBerry snap-in. A few gotchas:
- If you’re running on an x64 system, install the snap-in against the 64-bit version of .NET
- You probably have to modify the PowerShell script execution policy on your instance (see the example after this list)
- You need the DotNetZip library for zipping
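For the execution policy, something like this (run from an elevated PowerShell prompt) is typically enough:

```powershell
# Allow locally created scripts to run while still requiring
# downloaded scripts to be signed
Set-ExecutionPolicy RemoteSigned
```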
Some possible improvements are zipping the backup file and shrinking the log file before upload (*both added February 1, 2009*), and perhaps an incremental backup scheme.
The script is included below.
```powershell
# This Powershell script is used to backup a SQL Server database
# and move the backup file to S3
# It can be run as a scheduled task like this:
# C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe &'C:\Path\dbbackup.ps1'
# Written by Michael Friis (http://friism.com)

$key = "yourkey"
$secret = "yoursecret"
$localfolder = "C:\path\tobackupfolder"
$s3folder = "somebucket/backup/"
$name = Get-Date -uformat "backup_%Y_%m_%d"
$filename = $name + ".bak"
$zipfilename = $name + ".zip"
$dbname = "yourdb"
$dblogname = "yourdb_log"
$ziplibloc = "C:\pathto\ziplib\Ionic.Utils.Zip.dll"

# Remove existing db backup file
if(Test-Path -path ($localfolder + "\" + $filename)) { Remove-Item ($localfolder + "\" + $filename) }

# Shrink the log file, back up the database and verify the backup
$query = "
USE {2}
GO
DBCC SHRINKFILE({3})
GO
BACKUP DATABASE [{2}] TO DISK = N'{0}\{1}'
    WITH NOFORMAT, NOINIT, NAME = N'backup', SKIP, REWIND, NOUNLOAD, STATS = 10
GO
declare @backupSetId as int
select @backupSetId = position from msdb..backupset
where database_name=N'{2}' and backup_set_id =
    (select max(backup_set_id) from msdb..backupset where database_name=N'{2}')
if @backupSetId is null
begin
    raiserror(N'Verify failed. Backup information for database ''{2}'' not found.', 16, 1)
end
RESTORE VERIFYONLY FROM DISK = N'{0}\{1}'
    WITH FILE = @backupSetId, NOUNLOAD, NOREWIND" -f $localfolder, $filename, $dbname, $dblogname

sqlcmd -Q $query -S "."

# Remove existing zip file
if(Test-Path -path ($localfolder + "\" + $zipfilename)) { Remove-Item ($localfolder + "\" + $zipfilename) }

# Zip the backup file using DotNetZip
[System.Reflection.Assembly]::LoadFrom($ziplibloc)
$zipfile = new-object Ionic.Utils.Zip.ZipFile($localfolder + "\" + $zipfilename)
$e = $zipfile.AddFile($localfolder + "\" + $filename)
$zipfile.Save()

# Upload to S3 using the CloudBerry snap-in
Add-PSSnapin CloudBerryLab.Explorer.PSSnapIn
$s3 = Get-CloudS3Connection -Key $key -Secret $secret
$destination = $s3 | Select-CloudFolder -path $s3folder
$src = Get-CloudFilesystemConnection | Select-CloudFolder $localfolder
$src | Copy-CloudItem $destination -filter $zipfilename
```
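To schedule it, something along these lines should work (the task name and start time are just examples, and the exact schtasks syntax varies a bit between Windows versions, with older ones expecting HH:MM:SS for the start time):

```powershell
# Hypothetical example: create a daily task that runs the backup script at 3 AM
schtasks /create /tn "SqlBackupToS3" /sc daily /st 03:00 `
    /tr "C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe &'C:\Path\dbbackup.ps1'"
```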