Monitoring Your Site’s Performance Using Boomerang
This is part one in a series on boomerang. This article is an introduction to boomerang and covers the basics. Future articles will cover writing modules and analyzing the results.
Website monitoring tools have historically focused on uptime. The service provider’s server checks your website periodically to make sure it still works. If it doesn’t, the service sends an alert to let you know something is wrong. Most monitoring services also record your server’s response time, which can be useful in watching for traffic surges and DDoS attacks. With client-side performance often being the biggest bottleneck on modern websites, you need additional tools to monitor from your users’ perspective. Enter boomerang, an open source client-side performance monitoring tool from Yahoo.
boomerang can measure some very interesting performance metrics, including page load time, your visitors’ bandwidth, HTTP and DNS latency, and anything else that can be timed. It provides a core framework for building performance monitoring modules. By using an image beacon, similar to Google Analytics and other tracking tools, boomerang can report the statistics back to your server for analysis.
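To make the image beacon idea concrete, here is a minimal sketch of how a set of measurements could be encoded into a beacon URL. The helper name buildBeaconUrl, the /beacon path and the t_done parameter are illustrative assumptions, not a description of boomerang's internals.

```javascript
// Hypothetical helper: encode measurements as a query string on a beacon URL.
function buildBeaconUrl(beaconUrl, metrics) {
  const query = Object.keys(metrics)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(metrics[key])))
    .join("&");
  // With nothing to report, return the bare URL.
  return query ? beaconUrl + "?" + query : beaconUrl;
}

// In a browser, requesting the URL as an image delivers the data to the server:
//   new Image().src = buildBeaconUrl("/beacon", { t_done: 1240 });
```

The server never needs to return a real image. It only has to log the query string of each request.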
Page Load Time
The main metric you’ll be interested in is the page load time. Most tools that monitor load time work within a single page: they record the time at the top of the page and compare it to the time when the page has finished loading. This provides incomplete data since it doesn’t include DNS lookup and connection latency. Boomerang takes a very different approach. It monitors links on your site and records the time as the visitor leaves a page. By using this as the start time for the next page, it provides a more accurate measurement.
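The arithmetic behind this approach is simple, and a rough sketch may help. The function and parameter names below are hypothetical, and how the start time is carried from one page to the next (a cookie, for instance) is left out as an assumption.

```javascript
// Hypothetical sketch: compute load time from a start timestamp recorded
// when the visitor left the previous page. Both values are in milliseconds.
function pageLoadTimeMs(leftPreviousPageAt, pageLoadedAt) {
  // No usable start time, for example when the visitor arrived from another site.
  if (!Number.isFinite(leftPreviousPageAt) || !Number.isFinite(pageLoadedAt)) {
    return null;
  }
  const elapsed = pageLoadedAt - leftPreviousPageAt;
  // A negative value means the timestamps cannot be trusted.
  return elapsed >= 0 ? elapsed : null;
}
```

Returning null instead of a guess keeps visits without a recorded start time out of the averages.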
Bandwidth and connection quality can play a big part in client-side performance. A user’s experience on dialup will be very different from one on FiOS. By factoring a user’s bandwidth into your page load time monitoring, you get a much more accurate view of performance. Boomerang will roughly check your visitor’s bandwidth by measuring the time it takes to load a series of progressively larger images. I say roughly because the accuracy of this number is often poor. The numbers it shows for my 15 Mbit cable connection vary by 50%, with a standard deviation of 25-40%. Still, this can be useful for dividing visitors into bandwidth-based buckets, as mentioned in Howto #0.
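As a rough illustration of turning timed image downloads into a bandwidth bucket, consider the sketch below. Taking the median to blunt noisy samples is my own choice here rather than a description of boomerang's calculation, and the function names and bucket thresholds are hypothetical.

```javascript
// Each sample is { bytes, ms }: the image size and how long it took to load.
// Returns the median rate in kilobits per second, or null with no usable samples.
function estimateBandwidthKbps(samples) {
  const rates = samples
    .filter((s) => s.bytes > 0 && s.ms > 0)
    .map((s) => (s.bytes * 8) / s.ms) // bits per millisecond equals kilobits per second
    .sort((a, b) => a - b);
  if (rates.length === 0) return null;
  const mid = Math.floor(rates.length / 2);
  return rates.length % 2 ? rates[mid] : (rates[mid - 1] + rates[mid]) / 2;
}

// Hypothetical thresholds; choose ones that match your own audience.
function bandwidthBucket(kbps) {
  if (kbps === null) return "unknown";
  if (kbps < 128) return "dialup";
  if (kbps < 2000) return "slow broadband";
  return "fast broadband";
}
```

Because the individual estimates are noisy, coarse buckets like these are likely to be more trustworthy than the raw number.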
One of the more advanced (and untested) features of boomerang is determining DNS latency. This requires adding a wildcard subdomain to your domain and pointing it to an empty file on your server. The DNS module will use this to make a handful of requests and time the results. This can be useful for monitoring DNS issues but does require more work to set up.
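One way to read the wildcard requirement: a request to a never-before-seen subdomain has to pay for a DNS lookup, while a repeat request to the same hostname does not, so the difference approximates the lookup time. The sketch below works on that assumption. Both function names are hypothetical and this is not the DNS module's actual code.

```javascript
// Hypothetical: build a hostname that is unlikely to be in any DNS cache.
// The random source can be swapped out, which makes the function easy to test.
function randomSubdomain(baseDomain, random = Math.random) {
  const label = Math.floor(random() * 0x7fffffff).toString(36);
  return label + "." + baseDomain;
}

// Assumption: the first request includes a DNS lookup and the repeat does not.
function dnsLatencyMs(firstRequestMs, repeatRequestMs) {
  // Network jitter can make the repeat slower; clamp so the result is never negative.
  return Math.max(0, firstRequestMs - repeatRequestMs);
}
```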
Some Work Required
Unfortunately, the work of gathering and analyzing the data is left in the hands of the user. The project currently does not include any server components. In an upcoming post in this series I will demonstrate how to collect and analyze the data using Node.js and MongoDB.
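To give a sense of how small the collection side can start out, here is a minimal sketch that turns a beacon request URL into a record. The parseBeacon name, the parameter names in the example and the port are assumptions for illustration. Storage is left out entirely.

```javascript
// Hypothetical: extract measurements from a beacon request URL.
// Values that look numeric become numbers; everything else stays a string.
function parseBeacon(requestUrl) {
  // The base URL only serves to resolve relative paths such as "/beacon?...".
  const params = new URL(requestUrl, "http://localhost").searchParams;
  const record = {};
  for (const [key, value] of params) {
    const number = Number(value);
    record[key] = value.trim() !== "" && Number.isFinite(number) ? number : value;
  }
  return record;
}

// Wiring this into Node's built-in http module could look like:
//   const http = require("http");
//   http.createServer((req, res) => {
//     const record = parseBeacon(req.url); // store the record somewhere
//     res.writeHead(204);                  // no content needed for a beacon
//     res.end();
//   }).listen(8080);
```

For example, parseBeacon("/beacon?t_done=1240") yields an object whose t_done property is the number 1240.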
To get started with boomerang, download the source or clone it from GitHub. The docs are included but can also be viewed online at http://yahoo.github.com/boomerang/doc/. YDN is providing a support forum, monitored by the author Philip Tellis, at http://developer.yahoo.net/forum/?showforum=127.
I couldn’t find the follow-up posts in the series. Did you post them somewhere else, maybe?
Do you have a working boomerang instance? We’ve just deployed the first phase of our boomerang experiments. We’re measuring page load times and so far have a reasonable amount of data flowing in. The big question is how to analyse it without re-inventing the wheel and writing something entirely from scratch. If you have any pointers or tips you’d like to share, that would be awesome. Happy to collaborate privately if you have anything you’d be willing to share which is not quite ready for full open source release yet. I’m planning to release whatever we develop under an open source license, once we can iron out any bugs.
Unfortunately I never used it in production and never built a backend. There are some really cool tools now for presentation of data such as Cube.
I’m actually working on a web performance dashboard to use with my clients so the topic is still of interest to me. Feel free to contact me, dan at driverdan dot com.
Very interesting, thanks for the link to cube. I’ll definitely look at how that might work. I’m not so focused on real time metrics as after the fact comparison analysis, but it sounds like cube could provide a powerful platform.
I’ll shoot you an email now to discuss.