Randoom a Michael Friis production

Posted
27 January 2009 @ 1am

Categories
Cloud, SQL

Tagged
, , , ,

You're reading Randoom, a Michael Friis production

EC2 SQL Server backup strategies and tactics

The many backup modes offered by Microsoft SQL server, combined with the prodigious hardware options offered on Amazon EC2 can make choosing a backup strategy for your setup a little confusing. In this post, I’ll sketch some options and end with a simple PowerShell script usable on both Express and Standard versions, that’ll backup your database to S3.

To start with, you should probably be running your database off an EBS (Elastic Block Storage) volume. They can sustain many more random IOPS than instance disks (good for typical workloads) and they live independently of your instances. While i haven’t had an instance die from under me, if one should cop it, all data on the local disks will be gone-gone.

EBS volumes can fail too however, and will do se at an annualised rate of 0.1% to 0.5% according to Amazon. You may decide this is good enough for your purposes and leave it at that. Bear in mind, however, that this failure rate is compounded by other factors such as Windows or SQL Server malfunctioning and corrupting the volume, you pressing the wrong button in AWS console/Management Studio, a disgruntled employee doing it on purpose or something else entirely. In other words, you should take backups.

A simple approach is to use the snapshotting feature of EBS. This basically saves the (diff of the) contents of your volume to S3, from whence it can be restored to life if something happens to the volume. I’ve used this to muck around with test-environments and such. It works fine and could conceivably be automated using the AWS API. It’s a rather low-level approach though, and you could easily find yourself restoring from a snapshot taken with SQL Server’s pants around its ankles, in the middle of a transaction. While obviously capable of recovering from such an indescretion and rolling back to a safe state, this can be something of a hassle.

Another option is to do normal backups to another EBS volume mounted on the same instance. While I have no knowledge of Amazon datacenter topologies, one could fear that different EBS volumes attached to the same instance end up being hosted on the same EBS-SAN-thingamebob, the death of which would then also be your undoing.

You could also copy backup-files to another instance mounting its own EBS volume, or set up replication — allowing you to recover very quickly. Note that SQL Server Express can subscribe to a Standard instance in a replication setup, although it cannot publish. Your replicated instance could even live in a different availability zone, although you would then incur bandwidth cost on exchanged data on top of the cost of running an extra instance.

The approach we ended up taking uses S3 however. Amazon promises S3 to be very safe (“no single point of failure”) and has the added benefit of being available independently of EC2 instances. To do a backup, we basically do a full database backup to one of the local disks and then move the file to S3. This is handled by a PowerShell script invoked as a scheduled task, making it usable on SQL Server Express instances (where the native SQL Server backup scheduling is not otherwise available). To handle the S3 interaction, we use the free CloudBerry snap-in. A few gotchas:

  1. If you’re running on a X64 system, install the snap-in with that .Net version
  2. You probably have to modify the PowerShell script execution policy on your instance
  3. You need the DotNetZip lib for zipping

Some possible improvements are zipping of files and shrinking of logfile before upload (*both added February 1. 2009*) and perhaps an incremental backup scheme.

Script is included below.

# This Powershell script is used to backup a SQL Server database and move the backup file to S3
# It can be run as a scheduled task like this:
# C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe &'C:\Path\dbbackup.ps1'
# Written by Michael Friis (http://friism.com)

$key = "yourkey"
$secret = "yoursecret"
$localfolder = "C:\path\tobackupfolder"
$s3folder = "somebucket/backup/"
$name = Get-Date -uformat "backup_%Y_%m_%d"
$filename = $name + ".bak"
$zipfilename = $name + ".zip"
$dbname = "yourdb"
$dblogname = "yourdb_log"
$ziplibloc = "C:\pathto\ziplib\Ionic.Utils.Zip.dll"

# Remove existing db backup file
if(Test-Path -path ($localfolder + "\" + $filename)) { Remove-Item ($localfolder + "\" + $filename) }

$query =
"
USE {2}
GO

DBCC SHRINKFILE({3})

GO

BACKUP DATABASE [{2}] TO  DISK = N'{0}\{1}'
        WITH NOFORMAT, NOINIT,  NAME = N'backup', SKIP, REWIND, NOUNLOAD,  STATS = 10
GO
declare @backupSetId as int
select @backupSetId = position from msdb..backupset
where database_name=N'{2}' and backup_set_id=(select max(backup_set_id)
from msdb..backupset where database_name=N'{2}' )

if @backupSetId is null
begin
        raiserror(N'Verify failed. Backup information for database ''{2}'' not found.', 16, 1)
end
RESTORE VERIFYONLY FROM  DISK = N'{0}\{1}'
        WITH  FILE = @backupSetId,  NOUNLOAD,  NOREWIND" -f $localfolder, $filename, $dbname, $dblogname

sqlcmd -Q $query -S "."

# Remove existing zip file
if(Test-Path -path ($localfolder + "\" + $zipfilename)) { Remove-Item ($localfolder + "\" + $zipfilename) }

#Zip the backup file
[System.Reflection.Assembly]::LoadFrom($ziplibloc);
$zipfile =  new-object Ionic.Utils.Zip.ZipFile($localfolder + "\" + $zipfilename);
$e= $zipfile.AddFile($localfolder + "\" + $filename)
$zipfile.Save()

#Upload to S3
Add-PSSnapin CloudBerryLab.Explorer.PSSnapIn
$s3 = Get-CloudS3Connection -Key $key -Secret $secret
$destination = $s3 | Select-CloudFolder -path $s3folder
$src = Get-CloudFilesystemConnection | Select-CloudFolder $localfolder
$src | Copy-CloudItem $destination –filter $zipfilename

14 Comments

Posted by
Jim
27 January 2009 @ 6am

wow – i’m surprised no one has commented on your great blog entry. Thx for taking the time to write this up!


Posted by
Edward M. Goldberg
27 January 2009 @ 8am

Michael,

First off, great BLOG post. lots of great information here.

The idea of a second EBS mounted as a Backup disk is a very good idea.

The process:

mount,
format,
backup to,
dis-mount, <– important keeps the Snap boot clean.
Snapshot,
Delete Volume!!!!

This may sound strange, but, hear me out.

The “real” information is in S3 in the S3-Snapshot of the backup. The Volume is just needed to create and attach a file system to the server.

When you need to see that backup data just create a Volume from Snap!!

The Backup is safe in S3 and the cost is very low.

No need to keep Volumes around, or mounted that are not in active use.

NEVER, add to the Volume more backups. You are just in danger of mucking up all of the backups on that volume.

The S3 Snap is a very compressed version of just that one backup session. You only pay for what you used.
When you re-use a Volume you have to pay for the whole Volume used or not. The S3-Backups will just be larger and more costly.

In short:

1) The real data is the Snapshot
2) Only keep Volumes that are in use and mounted.
3) Use them once and start a new one. Never remount and add.

Edward M. Goldberg
http://Blog.EdwardMGoldberg.com


[...] we would merely create a new volume, attach it to our database instance and restore from backup (conveniently located in nearby S3). Reaction time and data loss would be similar, but performance will not be degraded for any [...]


Posted by
Neil
30 April 2009 @ 5pm

Great script – I just ran into one issue that may help others attempting to implement -

In the most recent version of Ionic Zip Utils they have changed the name space from Ionic.Utils.Zip to Ionic.Zip – so the dll name and the ZipFile command both have to be updated.


Posted by
George
9 July 2009 @ 9am

I am new to EC2 and figuring things our how to setup my servers.I am using SQL Server 2005 on windows server 2003 .

I am a developer and not a DBA but for back instead of doing log shipping and storing transaction logs I am thinking to have a differnt solution which can work like Full backup.I think to have 5 snapshots of my EBS database through out the day.Snapshots older than 2 days will be deleted.In this case all snapshot will have Fullbackup for last 2 days as those are going to be exact same copy of EBS volume.

Is this going to work or its too much allocation of S3?

Waitig for your suggestions.


Posted by
George
9 July 2009 @ 9am

After lot of struggle to find a backup strategy finally i found this great article!!

Thanks a lot!

You said that : “While i haven’t had an instance die from under me, if one should cop it, all data on the local disks will be gone-gone.”

Can you give me idea of having any of your instances died so far.If they did after how long life they died?I am curious and a bit tensed to know how reliable an instance is?


[...] EC2 SQL Server backup strategies and tactics [...]


[...] EC2 SQL Server backup strategies and tactics [...]


Posted by
Dominic Watson
8 August 2009 @ 1pm

Brilliant stuff thanks. I had to do a couple of things:

1) Change Ionic.Utils.Zip -> Ionic.Zip (as Neil pointed out)
2) Change sqlcmd -Q $query -S “.” > sqlcmd -Q $query -S localhost\SQLEXPRESS (presumably because of using SQL Express?)

Thanks again,

Dominic


Posted by
Manouchka
14 August 2009 @ 2pm

Edward M. Goldberg said:
“NEVER, add to the Volume more backups. You are just in danger of mucking up all of the backups on that volume.”
I don’t undertand why…
Snapshot EBS are “magic incremental”, smaller than a snapshot from and backups remain independent from each other

Any ideas ?

Emmanuel


Posted by
Manouchka
14 August 2009 @ 2pm

Edward M. Goldberg said:
“NEVER, add to the Volume more backups. You are just in danger of mucking up all of the backups on that volume.”
I don’t undertand why…
Snapshot EBS are “magic incremental”, smaller than a snapshot from a new volume and backups remain independent from each other

Any ideas ?

Emmanuel


Posted by
Justin
19 August 2009 @ 3am

The downside of this is that the S3 supports a maximum of 5GB files (at this moment) and snapshots. Note that if you are using the Express version, you can’t have more than like a 4GB database anyways (+ limitations on processors, memory allowed to be allocated, etc). Also, since the EBS is not guaranteed to not have a single point of failure, in the unlikely event that this does fail, you may not have access to the backup EBS either. If anyone has ideas about circumventing these limitations, feel free to let me know.


Posted by
Eric
1 October 2009 @ 3am

“a snapshot taken with SQL Server’s pants around its ankles”

Thanks – I finally have a way to explain how snapshots work to my boss!



Leave a Comment

Querying relationships with LinqtoCRM Rent vs. Buy (or EC2 vs. building your own iron)