Post provided by Catchpoint Systems.
Counting discrepancies in impressions, page views, and other visitor data affect every online tracking company, publisher, and marketer. The cost of such problems varies, but it directly affects everyone involved. The sooner a problem is found and solved, the better for all.
Ad servers, ad networks, web analytics, and behavioral targeting companies all have products that rely on small image requests to track ad impressions or user visits to a page. Every online company has been affected at least once by discrepancies between the numbers it provides and the numbers of another company or product. Over the last decade the issue has become so common in the online advertising world that discrepancies of 1-10% are considered normal and are not investigated. Such discrepancies affect both marketers and publishers, and raise the cost of doing business. Imagine if 10% of the items shipped by a store were unaccounted for: the cost of those items would come out of the store's profits.
The causes of these discrepancies range from the availability of the tracking company's application, to tagging mistakes by web developers, to the viewer's browsing behavior. The availability and performance of the tracking server are key to tracking the data: if the server is unavailable or too slow, the request from the browser will not reach the server, and nothing is tracked. Tagging mistakes are plentiful. People forget to add a random number to the URL, so the tracking image is served from cache, or they introduce typos or line breaks, all of which cause the image not to be called correctly. Viewer behavior can also affect tracking: if the user moves from one page to another too quickly, the tracking pixels might never get called. This is especially a problem on pages with a lot of content and hundreds of requests, although in this case nobody loses money. Other known causes of discrepancies are detailed in this blog post from AdMonsters.
In this blog post we discuss another cause of counting discrepancies in tracking, which we discovered during a study of the web performance of the Top 500 Internet Retailers. On several webpages we observed that requests to tracking pixels are aborted or canceled by Internet Explorer 7, which has roughly 10-30% of browser market share depending on the website. We were unable to replicate the problem on Internet Explorer 8, Firefox, or Chrome. We observed the problem using our performance testing agent, Wireshark, and HTTPWatch.
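One of the tagging mistakes mentioned above, a missing random number that lets the tracking image be served from cache, can be illustrated with a minimal sketch. The helper name `addCacheBuster` and the default parameter name `rnd` are assumptions made for this example, not part of any vendor's tag. The sketch also assumes the URL has no fragment.

```javascript
// Hypothetical helper: append a cache-busting parameter to a tracking URL.
// The default parameter name "rnd" is an assumption; real tags define their own.
function addCacheBuster(url, paramName) {
    var name = paramName || "rnd";
    // Use "&" if the URL already has a query string, "?" otherwise.
    var separator = url.indexOf("?") === -1 ? "?" : "&";
    // Combine a timestamp with a random integer so a repeated value is unlikely.
    var value = new Date().getTime() + "" + Math.floor(Math.random() * 1000000);
    return url + separator + encodeURIComponent(name) + "=" + value;
}
```

Calling `addCacheBuster("http://tracking.site.com/?key=value")` returns the same URL with an extra `&rnd=` parameter followed by digits, so each page view requests a URL the browser is unlikely to have cached.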
The Abort Problem
The abort problem happens quite often on Internet Explorer, and even on other browsers. The main causes for aborts are dynamic content modification (mouse-over on a tab menu) or simply the viewer canceling the navigation while content is still loading (stop or back button). However, during the performance testing of the top 500 Retailers we noticed that the problem often occurred on tracking pixels – without any viewer intervention. This resulted in the viewer activity not being tracked at all by the companies involved.
We dug deeper into the problem and discovered that it was caused by the JavaScript code used by several tracking companies, in conjunction with the webpage content. While it is unclear why the aborts occur only on certain pages and not others, we were able to identify which JavaScript code was liable to fail when triggering the request. In every case where the request was aborted, the code relied on “new Image()” or “createElement()” (without attaching the image to an HTML object) to make the request to the server.
Our research showed that in some cases no request was sent, in others the request was sent but interrupted abruptly before a response was received, and in still others the response from the server was partially or fully received. Hence, in some cases the data might have been logged, but in the majority of cases it was probably lost.
We reproduced the problem consistently on several websites including Brooks Brothers, Crate and Barrel, Fox News, JCPenney, Levi’s, Mashable.com, News.com, and others. The abort problem affected several tracking companies directly (it was their code) or indirectly (another company was responsible for the code), including: Adify, Atlas (Microsoft), Chartbeat, Cnet (CBS Interactive), DoubleClick (Google) Floodlight product, Efficient Frontier, Federated Media, InviteMedia (Google), Right Media (Yahoo), Rubicon Project, and ValueClick.
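The two request patterns named above look roughly like the following. This is an illustrative sketch, not the actual code of any tracking company. The function names are made up for the example, and `win` and `doc` stand in for the browser's `window` and `document` objects.

```javascript
// Illustrative only: the two request patterns associated with the aborts.

// Pattern 1: a detached Image object created with "new Image()".
function requestWithNewImage(win, url) {
    var img = new win.Image();
    img.src = url; // assigning src is what triggers the request
    return img;
}

// Pattern 2: an <img> created with createElement() but never attached to the page.
function requestWithDetachedElement(doc, url) {
    var img = doc.createElement("img");
    img.src = url;
    return img; // note: there is no appendChild() call
}
```

In both patterns the image object exists only in script and is never attached to an element of the page.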
Solutions to the Problem
We tested several JavaScript methods for requesting the tracking URL on the same webpages where we had previously seen aborts, and the following three methods worked every time:
- “document.write()”: this method works, but document.write has an impact on performance and is not appropriate for webpages relying on AJAX.
- Use the existing JavaScript method, and trigger the request when the page finishes loading (onload): while this technique avoided the aborts in IE 7, it introduces a new problem, in that the request is not issued until every other request on the webpage has finished loading. Any slow request on the page would delay the call to the tracking company.
- “createElement()” and attach the image to the body object: this method worked consistently with no side effects.
We feel that the best way to guarantee that no aborts occur is to rely on createElement() and attach the image to the body of the webpage. Here is an example of the code:
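(function() {
    // Create an <img> element for the tracking pixel and attach it to the body.
    function attachImg(imgSrc) {
        var pixel_obj = document.createElement("img");
        pixel_obj.setAttribute("src", imgSrc);
        // document.body must already exist when this line runs.
        document.body.appendChild(pixel_obj);
    }
    attachImg("http://tracking.site.com/?key=value");
})();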
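For comparison, the second option in the list above (keeping the existing request method but deferring it until the page has loaded) might be sketched as follows. The function name `trackOnLoad` is hypothetical, and `win` stands in for the browser's `window` object. The `attachEvent` branch is included because Internet Explorer 8 and earlier do not support `addEventListener`.

```javascript
// A minimal sketch of deferring the tracking request to the load event.
function trackOnLoad(win, url) {
    // The request itself still uses the original "new Image()" approach.
    function send() {
        var img = new win.Image();
        img.src = url;
    }
    if (win.addEventListener) {
        win.addEventListener("load", send, false);
    } else if (win.attachEvent) {
        win.attachEvent("onload", send); // older Internet Explorer versions
    }
    // If the load event has already fired, send() is never called.
}
```

In a page this would be called as `trackOnLoad(window, "http://tracking.site.com/?key=value")`. As noted in the list, the trade-off is that nothing is sent until the load event fires, so one slow resource delays the tracking call.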
Technical Anatomy of an Abort
When IE 7 aborts a request, in the majority of cases the request never makes it onto the wire. In some cases, however, the request is issued and then aborted, and the browser sends a TCP RST packet, signaling to the server that something went wrong with the connection and that it needs to be reset. In theory the browser is supposed to re-open the connection, but in reality it never does.
In our tests with Wireshark, we almost always observed that the browser never made the request. However, when using HTTPWatch and Wireshark on the same webpages, we did see cases where the request was issued to the server and followed by a TCP RST packet interrupting the connection. In these cases, although the request was aborted, it could have reached the server. We are not sure why there is such a difference between the two observation methods. We firmly believe that the Wireshark method is more accurate, and that the requests almost never make it to the server.
Conclusion
Counting discrepancies result in a loss of money for everyone involved. Discovering the causes of discrepancies and solving them in time is imperative for any business. Ongoing testing and monitoring of all the parties involved, in both lab and real-world environments, helps uncover such issues before damage occurs. Clients and vendors need to pay attention to the reported data and to their monitoring systems, and analyze any abnormalities, to avoid finding such problems 4 years after a browser is released.
Editor’s note: This post originally appeared on the Catchpoint Systems blog.
About Catchpoint Systems Inc: Catchpoint was founded in September 2008 by four experienced former Google/DoubleClick technology innovators to offer an application performance monitoring platform designed for today’s complex, dynamic, and distributed IT structure. The founder-run, self-funded company delivers a monitoring service that combines synthetic, end-user, and internal monitoring functionality into a single solution to provide a complete, real-time view into the health of online websites, services, and applications. With benefits for Business and IT teams, Catchpoint believes that speed, availability, and reliability are key pillars of a company’s existence, long-term viability, and overall success.