Monitoring Your Site’s Performance Using Boomerang
This is part one in a series on boomerang. This article is an introduction to boomerang and covers the basics. Future articles will cover writing modules and analyzing the results.
Website monitoring tools have historically focused on uptime. The service provider’s server checks your website periodically to make sure it still works and, if it does not, sends an alert to let you know something is wrong. Most monitoring services also record your server’s response time, which can be useful for spotting traffic surges and DDoS attacks. With client-side performance often being the biggest bottleneck on modern websites, you need additional tools to monitor from your users’ perspective. Enter boomerang, an open source client-side performance monitoring tool from Yahoo.
boomerang can measure some very interesting performance metrics, including page load time, your visitors’ bandwidth, HTTP and DNS latency, and anything else that can be expressed as a duration. It provides a core framework for building performance monitoring modules. By using an image beacon, similar to Google Analytics and other tracking tools, boomerang reports the statistics back to your server for analysis.
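To make the image beacon idea concrete, here is a minimal sketch of the general technique rather than boomerang's own code. The function names, the beacon path and the metric names are all assumptions chosen for illustration.

```javascript
// Build the URL for an image beacon: the metrics travel as query-string
// parameters on a request for a tiny image.
// NOTE: the metric names are illustrative, not boomerang's real parameters.
function buildBeaconUrl(beaconUrl, metrics) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(metrics)) {
    // Leave out metrics that could not be measured.
    if (value !== undefined && value !== null) {
      params.append(name, String(value));
    }
  }
  const separator = beaconUrl.includes("?") ? "&" : "?";
  return beaconUrl + separator + params.toString();
}

// Browser only: assigning src triggers the GET request that delivers the data.
// The response body is irrelevant, so the server can return an empty image.
function fireImageBeacon(beaconUrl, metrics) {
  const img = new Image();
  img.src = buildBeaconUrl(beaconUrl, metrics);
  return img;
}
```

Calling `fireImageBeacon("/beacon.gif", { loadTime: 1234 })` in a browser would request `/beacon.gif?loadTime=1234`. The server only needs to log the query string, which is why this approach works across domains without any special setup.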
Page Load Time
The main metric you’ll be interested in is the page load time. Most tools that monitor load time work within a single page: they record the time at the top of the page and compare it to the time when the page has finished loading. This provides incomplete data since it doesn’t include DNS lookup and connection latency. Boomerang takes a very different approach. It monitors links on your site and records the time as the visitor leaves a page. By using this as the start time, it covers the whole wait the visitor experiences and provides a more accurate measurement.
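The cross-page idea can be sketched in a few lines. This is an illustration under stated assumptions, not boomerang's implementation: it uses `sessionStorage` as a stand-in for whatever storage carries the start time between pages, and the key name and function names are made up for the example.

```javascript
const NAV_START_KEY = "nav_start"; // hypothetical storage key

// Record the moment the visitor leaves the current page.
// `storage` is anything with getItem/setItem/removeItem, e.g. sessionStorage.
function markNavigationStart(storage, now) {
  storage.setItem(NAV_START_KEY, String(now));
}

// On the next page, compute the time elapsed since the visitor left the
// previous one. Returns null when no start time was recorded, for example
// when the visitor arrived from another site.
function measureLoadTime(storage, now) {
  const raw = storage.getItem(NAV_START_KEY);
  if (raw === null) return null;
  storage.removeItem(NAV_START_KEY); // use each start time only once
  const elapsed = now - Number(raw);
  return Number.isFinite(elapsed) && elapsed >= 0 ? elapsed : null;
}

// Browser wiring (skipped when run outside a browser).
if (typeof window !== "undefined") {
  window.addEventListener("beforeunload", () => {
    markNavigationStart(window.sessionStorage, Date.now());
  });
  window.addEventListener("load", () => {
    const loadTime = measureLoadTime(window.sessionStorage, Date.now());
    if (loadTime !== null) console.log("page load time (ms):", loadTime);
  });
}
```

Because the clock starts on the previous page, the result includes DNS lookup and connection time for the new page. The trade-off is that nothing can be measured for a visitor's first page view on the site.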
Bandwidth and connection quality can play a big part in client-side performance. A user’s experience on dialup will be very different from one on FiOS. By factoring a user’s bandwidth into your page load time monitoring, you get a much more accurate view of performance. Boomerang will roughly check your visitor’s bandwidth by measuring the time it takes to load a series of progressively larger images. I say roughly because the accuracy of this number is often poor. The numbers it shows for my 15Mbit cable connection vary by 50%, with a standard deviation of 25-40%. Still, this can be useful for dividing visitors into bandwidth-based buckets, as mentioned in Howto #0.
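As a rough illustration of the bucketing idea, the sketch below turns a set of image download timings into a bandwidth estimate and then into a coarse bucket. It is not boomerang's algorithm: the use of the median, the function names, and the bucket thresholds are all assumptions picked for the example.

```javascript
// samples: [{ bytes, ms }], one entry per downloaded test image.
// Returns the median throughput in kilobits per second, or null if no
// usable sample exists. The median is used to dampen outliers.
function estimateBandwidthKbps(samples) {
  const rates = samples
    .filter((s) => s.ms > 0)
    .map((s) => (s.bytes * 8) / s.ms) // bits per millisecond equals kbps
    .sort((a, b) => a - b);
  if (rates.length === 0) return null;
  const mid = Math.floor(rates.length / 2);
  return rates.length % 2 === 1
    ? rates[mid]
    : (rates[mid - 1] + rates[mid]) / 2;
}

// Map an estimate to a coarse bucket. The thresholds are arbitrary
// examples; choose ones that match your own audience.
function bandwidthBucket(kbps) {
  if (kbps === null) return "unknown";
  if (kbps < 128) return "dialup";
  if (kbps < 1500) return "slow broadband";
  if (kbps < 10000) return "broadband";
  return "fast";
}
```

Since individual estimates are noisy, comparing load times within a bucket tends to be more meaningful than correlating them against the raw bandwidth figure.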
One of the more advanced (and untested) features of boomerang is determining DNS latency. This requires adding a wildcard subdomain to your domain and pointing it to an empty file on your server. The DNS module will use this to make a handful of requests and time the results. This can be useful for monitoring DNS issues but does require more work to set up.
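One common way to use a wildcard subdomain for this, sketched here as an assumption rather than a description of what the DNS module actually does, is to request a never-before-seen hostname (which forces a fresh DNS lookup), request the same hostname again (now cached), and treat the difference as the lookup cost. The function names and the example domain are hypothetical.

```javascript
// Replace the "*" in a wildcard pattern with a random label so the
// hostname cannot already be in any DNS cache.
// `random` is injectable to make the function testable.
function randomHostname(wildcardPattern, random = Math.random) {
  const label = Math.floor(random() * 36 ** 8)
    .toString(36)
    .padStart(8, "0");
  return wildcardPattern.replace("*", label);
}

// coldMs: time for the first request to the fresh hostname (includes DNS).
// warmMs: time for a second request to the same hostname (DNS cached).
// Returns the estimated lookup time, or null if the timings are unusable.
function estimateDnsLatency(coldMs, warmMs) {
  const diff = coldMs - warmMs;
  return diff >= 0 ? diff : null;
}
```

For example, `randomHostname("*.dns.example.com")` produces a host such as `k3f09xq1.dns.example.com`. A single pair of requests is noisy, so averaging several pairs is likely to give a steadier number.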
Some Work Required
Unfortunately, the work of gathering and analyzing the data is left to the user. The project currently does not include any server components. In an upcoming post in this series I will demonstrate how to collect and analyze the data using Node.js and MongoDB.
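Until then, the core of the server side is small: read the beacon request's query string and turn it into a record you can store. The sketch below shows that step only. The parameter names (`page`, `load_time`, `bandwidth`) are placeholders invented for this example, so check the boomerang docs for the names it really sends.

```javascript
// Convert a query-string value to a finite number, or null if it is
// missing or not numeric.
function toNumber(value) {
  if (value === null || value === "") return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

// Parse the URL of a beacon request into a plain record.
// NOTE: "page", "load_time" and "bandwidth" are hypothetical names.
function parseBeacon(requestUrl) {
  // The base is only needed so relative request paths can be parsed.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  return {
    page: searchParams.get("page"),
    loadTimeMs: toNumber(searchParams.get("load_time")),
    bandwidthKbps: toNumber(searchParams.get("bandwidth")),
  };
}
```

In a Node.js HTTP handler you could pass `req.url` to `parseBeacon`, save the resulting record, and reply with an empty response.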
To get started with boomerang, download the source or clone it from GitHub. The docs are included but can also be viewed online at http://yahoo.github.com/boomerang/doc/. YDN is providing a support forum, monitored by the author Philip Tellis, at http://developer.yahoo.net/forum/?showforum=127.