Michael Friis' Blog

About


Linq-to-SQL, group-by, subqueries and performance

If you’re using Linq-to-SQL, doing group-by and selecting other columns than those in the grouping-key, performance might suffer. This is because there is no good translation of such queries to SQL and Linq-to-SQL has to resort to doing multiple subqueries. Matt Warren explains here. I experienced this firsthand when grouping a lot of geocoded events by latitude and longitude and selecting a few more columns (EventId and CategoryId in the example below):

from e in db.Events
group e by new { e.Lat, e.Lng } into g
select new
{
    g.Key.Lat,
    g.Key.Lng,
    es = g.Select(_ => new { _.EventId, _.CategoryId })
};

One possible solution is to fetch all events, to a ToList() and do the grouping in-memory.

var foo =
    from e in db.Events
    select new { e.Lat, e.Lng, e.EventId, e.CategoryId };

var bar = from e in foo.ToList()
            group e by new { e.Lat, e.Lng } into g
            select new
            {
                g.Key.Lat,
                g.Key.Lng,
                es = g.Select(_ => new { _.EventId, _.CategoryId })
            };

Transatlantic Facebook application performance woes

Someone I follow on Twitter reported having problems getting a Facebook application to perform. I don’t know what they are doing so this post is just guessing at their problem, but the fact is that — if you’re not paying attention — you can easily shoot yourself in the foot when building and deploying Facebook apps. The diagram below depicts a random fbml Facebook app deployed to a server located in Denmark being used by a user also situated in Denmark. Note that Facebook doesn’t yet have a datacenter in Europe (they have one on each coast in the US).

fbservers

The following exchange takes place:

  1. User requests some page related to the application from Facebook
  2. Facebook realizes that serving this request requires querying the application and sends a request for fbml to the app
  3. The app gets the request and decides that in order to respond, it has to query the Facebook API for further info
  4. The Facebook API responds to the query
  5. The application uses the query results and the original request to create a fbml response that is sent to Facebook
  6. Facebook gets the fbml, validates it and macroexpand various fbml tags
  7. Facebook sends the complete page to the user

… so that adds up 6 transatlantic requests pr. page requested by the user. Assuming a 250ms ping time from the Danish app-server to the Facebook datacenter this is a whopping 1.5s latency on top of whatever processing time your server needs AND the time taken by Facebook to process your API request and validate your fbml.

So what do you do? Usually steps 3 and 4 can be eliminated through careful use of fbml and taking advantage of the fact that Facebook includes the ids of all the requesting users friends. Going for an iframe app is also helpful because it eliminates one transatlantic roundtrip and spares Facebook from having to validate any fbml. A very effective measure if you insist on fbml, is simply getting a server stateside — preferably someplace with low ping times to Facebook datacenters. There are plenty of cheap hosting options around, Joyent will even do it for free (I’m not affiliated in any way).