Michael Friis' Blog

About


Google Container Engine for Dummies

Last week, Google launched an alpha version of a new product called Google Container Engine (GKE). It’s a service that runs pre-packaged Docker images for you: You tell GKE about images you want to run (typically ones you’ve put in the Docker Registry, although there’s a also a hack to run private images) and how many instances you need. GKE will spin them up and make sure the right number is running at any given time.

The GKE Getting Started guide is long and complicated and has more JSON than you shake a stick at. I suspect that’s because the product is still alpha, and I hope the Google guys will improve both the CLI and web UIs. Anyway, below is a simpler guide showing how to stand up a stateless web site with just one Docker image type. I’m also including some analysis at the end of this post.

I’m using a Mono/ASP.NET vNext Docker image, but all you need to know is that it’s an image that exposes port 5004 and serves HTTP requests on that port. There’s nothing significant about port 5004 – if you want to try with an image that uses a different port, simply substitute as appropriate.

In the interest of brevity, the description below skips over many details. If you want more depth, then remember that GKE is Kubernetes-as-a-Service and check out the Kubernetes documentation and design docs.

Setup

  1. Go to the Google Developer Console and create a new project
  2. For that project, head into the “APIs” panel and make sure you have the “Google Container Engine API” enabled
  3. In the “Compute” menu section, select “Container Engine” and create yourself a new “Cluster”. A cluster size of 1 and a small instance is fine for testing. This guide assumes cluster name “test-cluster” and region “us-central1-a”.
  4. Install the CLI  and run gcloud config set project PROJECT_ID (PROJECT_ID is from step 1)

Running raw Pod

The simplest (and not recommended) way to get something up and running is to start a Pod and connect to it directly with HTTP. This is roughly equivalent to starting an AWS EC2 instance and connecting to its external IP.

First step is to create a JSON-file somewhere on your system, let’s call it pod.json:

{
  "id": "web",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta2",
      "containers": [
        {
          "name": "web",
          "image": "friism/aspnet-web-sample-web",
          "ports": [
            { "containerPort": 5004, "hostPort": 80 }
          ]
        }
      ]
    }
  },
  "labels": {
    "name": "web"
  }
}

What you should care about is the Docker image/repository getting run (friism/aspnet-web-sample-web) and the port mapping (the equivalent of docker run -p 80:5004). With that, we can tell GKE to start a pod for us:

$ gcloud preview container pods --cluster-name test-cluster --zone us-central1-a \
    create web --config-file=/path/to/pod.json
...
ID                  Image(s)                       Host                Labels              Status
----------          ----------                     ----------          ----------          ----------
web                 friism/aspnet-web-sample-web   <unassigned>        name=web            Waiting

All the stuff before “create” is boilerplate and the rest is saying that we’re requesting a pod named “web” as specified in the JSON file.

Pods take a while to get going, probably because the Docker image has to be downloaded from Docker Hub. While it’s starting (and after), you can SSH into the instance that’s running your pod to see how it’s doing, eg. by running sudo docker ps. This is the SSH incantation:

$ gcloud compute ssh --zone us-central1-a k8s-test-cluster-node-1

The instances are named k8s-<cluster-name>-node-1 and you can see them listed in the Web UI or with gcloud compute instances list. Wait for the pod to change status to “Running”:

$ gcloud preview container pods --cluster-name test-cluster --zone us-central1-a list
ID                  Image(s)                       Host                              Labels              Status
----------          ----------                     ----------                        ----------          ----------
web                 friism/aspnet-web-sample-web   k8s-<..>.internal/146.148.66.67   name=web            Running

The final step is to open up for HTTP traffic to the Pod. This setting is available in the Web UI for the instance (eg. k8s-test-cluster-node-1). Also check that the network settings for the instance allow for TCP traffic on port 80.


And with that, your site should be responding on the external ephemeral IP address of the host running the pod.

As mentioned in the introduction, this is not a production setup. The Kubernetes service running the pod will do process management and restart Docker containers that die for any reason (to test this, try ssh’ing into your instance and docker-kill the container that’s running your site – a new one will quickly pop up). But your site will go down in case there’s a problem with the pod, for example. Read on for details on how to extend the setup to cover that failure mode.

Adding Replication Controller and Service

In this section, we’re going to get rid of the pod-only setup above and replace with a replication controller and a service fronted by a loadbalancer. If you’ve been following along, delete the pod created above to start with a clean slate (you can also start with a fresh cluster).

First step is to create a replication controller. You tell a replication controller what and how many pods you want running, and the controller then tries to make sure the correct formation is running at any given time. Here’s controller.json for our simple use case:

{
  "id": "web",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "desiredState": {
    "replicas": 1,
    "replicaSelector": {"name": "web"},
    "podTemplate": {
      "desiredState": {
         "manifest": {
           "version": "v1beta1",
           "id": "frontendController",
           "containers": [{
             "name": "web",
             "image": "friism/aspnet-web-sample-mvc",
             "ports": [{"containerPort": 5004, "hostPort": 80 }]
           }]
         }
       },
      "labels": { "name": "web" }
      }},
  "labels": {"name": "web"}
}

Notice how it’s similar to the pod configuration, except we’re specifying how many pod replicas the controller should try to have running. Create the controller:

$ gcloud preview container replicationcontrollers --cluster-name test-cluster \
    create --zone us-central1-a --config-file /path/to/controller.json
...
ID                  Image(s)                       Selector            Replicas
----------          ----------                     ----------          ----------
web                 friism/aspnet-web-sample-mvc   name=web            1

You can now query and see the controller spinning up the pods you requested. As above, this might take a while.

Now, let’s get a GKE service going. While individual pods come and go, services are permanent and define how pods of a specific kind can be accessed. Here’s service.json that’ll define how to access the pods that our controller is running:

{
  "id": "myapp",
  "selector": {
    "app": "web"
  },
  "containerPort": 80,
  "protocol": "TCP",
  "port": 80,
  "createExternalLoadBalancer": true
}

The important parts are selector which specifies that this service is about the pods labelled web above, and createExternalLoadBalancer which gets us a loadbalancer that we can use to access our site (instead of accessing the raw ephemeral node IP). Create the service:

$ gcloud preview container services --cluster-name test-cluster--zone us-central1-a create --config-file=/path/to/service.json
...
ID                  Labels              Selector            Port
----------          ----------          ----------          ----------
myapp                                   app=web             80

At this point, you can go find your loadbalancer IP in the Web UI, it’s under Compute Engine -> Network load balancing. To actually see my site, I still had to tick the “Enable HTTP traffic” boxes for the Compute Engine node running the pod – I’m unsure whether that’s a bug or me being impatient. The loadbalancer IP is permanent and you can safely create DNS records and such pointing to it.

That’s it! Our stateless web app is now running on Google Container Engine. I don’t think the default Bootstrap ASP.NET MVC template has ever been such a welcome sight.

Analysis

Google Container Engine is still in alpha, so one shouldn’t draw any conclusions about the end-product yet (also note that I work for Heroku and we’re in the same space). Below are a few notes though.

Google Container Engine is “Kubernetes-as-a-Service”, and Kubernetes is currently exposed without any filter. Kubernetes is designed based on Google’s experience running containers at scale, and it may be that Kubernetes is (or is going to be) the best way to do that. It also has a huge mental model however – just look at all the stuff we had to do to launch and run a simple stateless web app. And while the abstractions (pods, replication controllers, services) may make sense for the operator of a fleet of containers, I don’t think they map well to the mental model of a developer just wanting to run code or Docker containers.

Also, even with all the work we did above, we’re not actually left with a managed and resilient capital-S Service. What Google did for us when the cluster was created, was simply to spin up a set of machines running Kubernetes. It’s still on you to make sure Kubernetes is running smoothly on those machines. As an example, a GKE cluster currently only has one Master node. This is the Kubernetes control plane node that accepts API input and schedules pods on the GCE instances that are Kubernetes minions. As far as I can determine, if that node dies, then pods will no longer get scheduled and re-started on your cluster. I suspect Google will add options for more fault-tolerant setups in the future, but it’s going to be interesting to see what operator-responsibility the consumer of GKE will have to take on vs. what Google will operate for you as a Service.

Transatlantic Facebook application performance woes

Someone I follow on Twitter reported having problems getting a Facebook application to perform. I don’t know what they are doing so this post is just guessing at their problem, but the fact is that — if you’re not paying attention — you can easily shoot yourself in the foot when building and deploying Facebook apps. The diagram below depicts a random fbml Facebook app deployed to a server located in Denmark being used by a user also situated in Denmark. Note that Facebook doesn’t yet have a datacenter in Europe (they have one on each coast in the US).

fbservers

The following exchange takes place:

  1. User requests some page related to the application from Facebook
  2. Facebook realizes that serving this request requires querying the application and sends a request for fbml to the app
  3. The app gets the request and decides that in order to respond, it has to query the Facebook API for further info
  4. The Facebook API responds to the query
  5. The application uses the query results and the original request to create a fbml response that is sent to Facebook
  6. Facebook gets the fbml, validates it and macroexpand various fbml tags
  7. Facebook sends the complete page to the user

… so that adds up 6 transatlantic requests pr. page requested by the user. Assuming a 250ms ping time from the Danish app-server to the Facebook datacenter this is a whopping 1.5s latency on top of whatever processing time your server needs AND the time taken by Facebook to process your API request and validate your fbml.

So what do you do? Usually steps 3 and 4 can be eliminated through careful use of fbml and taking advantage of the fact that Facebook includes the ids of all the requesting users friends. Going for an iframe app is also helpful because it eliminates one transatlantic roundtrip and spares Facebook from having to validate any fbml. A very effective measure if you insist on fbml, is simply getting a server stateside — preferably someplace with low ping times to Facebook datacenters. There are plenty of cheap hosting options around, Joyent will even do it for free (I’m not affiliated in any way).

Rent vs. Buy (or EC2 vs. building your own iron)

Over the past months Jeff Atwood (of Coding Horror fame) has been chronicling Stack Overflows quest for new hardware, starting with “Server Hosting – Rent vs. Buy?” and ending with some glamour shots. I’ve recently (along with others) built a setup for a .Net website in the same “to big for shared or low-end VPS hosting and (much) too small to have dedicated sysadmin staff” segment. We ended up going for Amazon EC2 so I thought I’d share our reasoning by comparing with the Stack Overflow setup.

UPDATE1: Atwood just gave another reason as to why EC2 may be  attractive.

UDATE2: Some of the gloomy projections in this post actually came through (for Stackoverflow, not for us): Tuesday Outage: It’s RAID-tastic!

First some notes on pricing: Mr. Atwood’s three servers costs him a total $6,000, on top of which comes rack space rent, bandwidth and licenses (where he gets off very cheaply by taking advantage of Microsoft’s BizSpark program). We rent two large EC2 instances, one of them with a SQL Server Standard license, for $1.6 pr. hour giving a total of $14,000 pr. year (on top of which comes bandwidth and Elastic Block Store usage). Mr. Atwood could buy all his gear (minus rack space) more than two times over for that money. And except for one important parameter, which I shall expand on later, his machines are much faster: The Database server has eight cores and 24GB of memory, while the Web servers have four cores and 8GB of memory. Our EC2 instances have to get by with just two cores and 7.5GB. An interesting aside is that exactly half the $1.6 goes to licenses (compared with getting non-windows large instances), most of it for SQL Server Standard.

Several commenters had some beefs with the disks in the new Stack Overflow database server and I agree they look rather dinky. The server has six 7200 RPM SATA drives in RAID1 and RAID10 arrays for OS/logs and data files respectively. While the drives are “Enterprise” branded, I hazard the guess that they are pretty much the same as desktop ones, except for a slightly higher MTBF promise and better warranty from the manufacturer. At any rate, 7200 RPM drives can only sustain about 125 random IOs pr. second, and because of the RAIDing, the IO-rate of the array is not six times that. On EC2 we have access to formidable Elastic Block Storage volumes, which are capable of sustaining upwards of 1000 IOPS. Should we need more oomph, the volumes can be soft-raided together until the 1GBps link from the EC2 instance to the EBS volume runs out of steam. (For completeness, I should note that the sequential IO performance of EBS volumes is not very good. That is irrelevant for most database workloads however, since users generally don’t have the good manners to request data in the order it is placed on disk). Mr. Atwood mentions that query execution time decreases nicely with CPU speed. This is obviously an important parameter when building a responsive web site, but I’d venture that query throughput volume is mostly related to disk performance and that we would have an edge here.

Another potential problem is the reliability of the drives, the longer warranty-period not withstanding. Let’s assume for a second (and I admit this is a pretty improbable scenario), that one of the drives in the Stack Overflow RAID10 array (holding SQL Server data files) copped it and went to the great disk-array in the sky. Mr. Atwood would probably get a notification of this, and immediately initiate a backup-operation to the good array (also holding OS and logs). Let’s also assume that at that very moment, the God of the datacenter decides to invoke Murphy’s law on the other disk in the mirror-set, killing it and taking the array and the database with it. Stack Overflow stops flowing, blog posts are written (I shall magnaminously refrain), F# buffs recurse indefinitely trying to post a question about why Stack Overflow is down but finding that Stack Overflow is down. Reddit and Slashdot are notified, further swamping the exasperated web servers – unable as they are to get anything out of the database. Mr. Atwood, in the meantime, is cheering on SQL Server Management studio to restore the latest backup as quickly as possible to the still-good array. He manages to bring the site up within a quarter of an hour, minus all activity since the last backup and running at a somewhat slower clip than usual. Having wiped the sweat from his brow, he still has to drive to the datacenter and swap the two bad drives (unless he trusts the datacenter dudes to do so), getting the usual datacenter tinnitus and a sniffle in the process.

If, on the other hand, the EBS volume holding our database were to die (an even more unlikely event), we would merely create a new volume, attach it to our database instance and restore from backup (conveniently located in nearby S3). Reaction time and data loss would be similar, but performance will not be degraded for any period. Also, I don’t have to plod out to some datacenter and fuss around with a server. Instead I can concentrate on adding new features to the site.

Some people stress the “Elastic” part of EC2, claiming that it is mostly relevant if your hardware requirements are extremely variable or you expect them to increase very rapidly. I think the flexibility it affords is relevant in more modest scenarios too though. Some examples: Need more IPs? Click of a button. Need a test server to try out a new version of your site? Click of a button. Need to increase the size of your database drive? Grab a snapshot and use it to create a bigger volume. Plus all the other features such as a CDN, secure backup in S3 and redundant datacenters that Amazon offers without large upfront costs.

EC2 is no panacea for sure and I agree with Mr. Atwood that poring over specs and reviews and putting together your own gear on the cheap is extremely rewarding. If you value your time and need flexibility though, it might be worth it to limit yourself to building desktop systems and use something like EC2 for hosting.

EC2 SQL Server backup strategies and tactics

The many backup modes offered by Microsoft SQL server, combined with the prodigious hardware options offered on Amazon EC2 can make choosing a backup strategy for your setup a little confusing. In this post, I’ll sketch some options and end with a simple PowerShell script usable on both Express and Standard versions, that’ll backup your database to S3.

To start with, you should probably be running your database off an EBS (Elastic Block Storage) volume. They can sustain many more random IOPS than instance disks (good for typical workloads) and they live independently of your instances. While i haven’t had an instance die from under me, if one should cop it, all data on the local disks will be gone-gone.

EBS volumes can fail too however, and will do se at an annualised rate of 0.1% to 0.5% according to Amazon. You may decide this is good enough for your purposes and leave it at that. Bear in mind, however, that this failure rate is compounded by other factors such as Windows or SQL Server malfunctioning and corrupting the volume, you pressing the wrong button in AWS console/Management Studio, a disgruntled employee doing it on purpose or something else entirely. In other words, you should take backups.

A simple approach is to use the snapshotting feature of EBS. This basically saves the (diff of the) contents of your volume to S3, from whence it can be restored to life if something happens to the volume. I’ve used this to muck around with test-environments and such. It works fine and could conceivably be automated using the AWS API. It’s a rather low-level approach though, and you could easily find yourself restoring from a snapshot taken with SQL Server’s pants around its ankles, in the middle of a transaction. While obviously capable of recovering from such an indescretion and rolling back to a safe state, this can be something of a hassle.

Another option is to do normal backups to another EBS volume mounted on the same instance. While I have no knowledge of Amazon datacenter topologies, one could fear that different EBS volumes attached to the same instance end up being hosted on the same EBS-SAN-thingamebob, the death of which would then also be your undoing.

You could also copy backup-files to another instance mounting its own EBS volume, or set up replication — allowing you to recover very quickly. Note that SQL Server Express can subscribe to a Standard instance in a replication setup, although it cannot publish. Your replicated instance could even live in a different availability zone, although you would then incur bandwidth cost on exchanged data on top of the cost of running an extra instance.

The approach we ended up taking uses S3 however. Amazon promises S3 to be very safe (“no single point of failure”) and has the added benefit of being available independently of EC2 instances. To do a backup, we basically do a full database backup to one of the local disks and then move the file to S3. This is handled by a PowerShell script invoked as a scheduled task, making it usable on SQL Server Express instances (where the native SQL Server backup scheduling is not otherwise available). To handle the S3 interaction, we use the free CloudBerry snap-in. A few gotchas:

  1. If you’re running on a X64 system, install the snap-in with that .Net version
  2. You probably have to modify the PowerShell script execution policy on your instance
  3. You need the DotNetZip lib for zipping

Some possible improvements are zipping of files and shrinking of logfile before upload (*both added February 1. 2009*) and perhaps an incremental backup scheme.

Script is included below.

# This Powershell script is used to backup a SQL Server database and move the backup file to S3
# It can be run as a scheduled task like this:
# C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe &'C:\Path\dbbackup.ps1'
# Written by Michael Friis (http://friism.com)

$key = "yourkey"
$secret = "yoursecret"
$localfolder = "C:\path\tobackupfolder"
$s3folder = "somebucket/backup/"
$name = Get-Date -uformat "backup_%Y_%m_%d"
$filename = $name + ".bak"
$zipfilename = $name + ".zip"
$dbname = "yourdb"
$dblogname = "yourdb_log"
$ziplibloc = "C:\pathto\ziplib\Ionic.Utils.Zip.dll"

# Remove existing db backup file
if(Test-Path -path ($localfolder + "\" + $filename)) { Remove-Item ($localfolder + "\" + $filename) }

$query =
"
USE {2}
GO

DBCC SHRINKFILE({3})

GO

BACKUP DATABASE [{2}] TO  DISK = N'{0}\{1}'
        WITH NOFORMAT, NOINIT,  NAME = N'backup', SKIP, REWIND, NOUNLOAD,  STATS = 10
GO
declare @backupSetId as int
select @backupSetId = position from msdb..backupset
where database_name=N'{2}' and backup_set_id=(select max(backup_set_id)
from msdb..backupset where database_name=N'{2}' )

if @backupSetId is null
begin
        raiserror(N'Verify failed. Backup information for database ''{2}'' not found.', 16, 1)
end
RESTORE VERIFYONLY FROM  DISK = N'{0}\{1}'
        WITH  FILE = @backupSetId,  NOUNLOAD,  NOREWIND" -f $localfolder, $filename, $dbname, $dblogname

sqlcmd -Q $query -S "."

# Remove existing zip file
if(Test-Path -path ($localfolder + "\" + $zipfilename)) { Remove-Item ($localfolder + "\" + $zipfilename) }

#Zip the backup file
[System.Reflection.Assembly]::LoadFrom($ziplibloc);
$zipfile =  new-object Ionic.Utils.Zip.ZipFile($localfolder + "\" + $zipfilename);
$e= $zipfile.AddFile($localfolder + "\" + $filename)
$zipfile.Save()

#Upload to S3
Add-PSSnapin CloudBerryLab.Explorer.PSSnapIn
$s3 = Get-CloudS3Connection -Key $key -Secret $secret
$destination = $s3 | Select-CloudFolder -path $s3folder
$src = Get-CloudFilesystemConnection | Select-CloudFolder $localfolder
$src | Copy-CloudItem $destination –filter $zipfilename

Installing IIS 6 SMTP service on an Amazon EC2 instance

The standard Amazon Windows AMIs don’t come with the IIS 6 SMTP component installed. It can be added through the “Add or Remove Programs”->”Add/Remove Windows Components” util on Windows Server 2003 (full guide), but you need the installation media. Amazon has an article describing how to do just that here. You basically create an EBS volume from a snapshot they provide (2GB minimum size) and then attach it to your instance. It will show up as a drive in Windows, holding the contents of the two installation disks. Just point the installer at those and you should be good. Afterwards you can detach and delete the EBS volume.