
EWB Analytics Blog

Browser-Specific Custom Link Tracking Challenges

Elizabeth Brady

If you find custom links aren’t tracking for certain browsers, read on.  Certain browsers (Safari in particular, sometimes Chrome, and the newest versions of Firefox) prioritize a new page load over the analytics tracking code for links leading to another page or site.  When this happens, the custom link tracking code is ignored.  To compensate, analytics tools recommend delaying the link by a fraction of a second to give the tracking code more time to execute.  This post summarizes standard solutions for SiteCatalyst and Google Analytics.  While we did not see the expected improvement with the Google Analytics solution when we tested it recently (results included here), I wanted to write up my own experience with the issue, as it has not been easy to find documentation or discussion.  I would appreciate feedback from anyone who does get results with the proposed solutions.

A client with a very large number of custom links consistently recorded click rates for Safari much lower than other browsers since tracking was implemented in 2011.  With the adoption of larger numbers of iOS products in 2012, the Safari challenge alone impacted key metrics.  Beginning April 2013 (with the release of Firefox 20.0) the tracking challenge spread to Firefox.  Notice that the percentage of Firefox visits registering a click event drops from almost 50 percent to under 20 percent within a few days.

 A summary of click rates by browser now demonstrates a clear tracking issue for both Firefox and Safari compared to Chrome and Internet Explorer.

Google Analytics’ standard custom link tracking code guide  fails to mention the challenge or provide a workaround.  A Google Analytics help article on outbound links does mention the possible browser challenge for outbound links with advice to introduce a function call with a 1/10 second delay on the link.

setTimeout(function() {
document.location.href = link.href;
}, 100);

In our tests the 100 millisecond delay provided no observable improvement in link tracking.  Increasing the delay to 500 milliseconds resulted only in very sporadic tracking for Firefox, and tracking for other browsers actually decreased with the function in place.
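
For readers who want to experiment with different delays, the pattern can be sketched as below. This is a minimal illustration, not Google's code: `trackThenNavigate` and its `track`, `navigate`, and `schedule` parameters are hypothetical names introduced here. On a real page, `track` would wrap the `_gaq.push` call and `navigate` would set `document.location.href`.

```javascript
// Hypothetical helper: fire the tracking call first, then navigate after a delay.
// All names and parameters here are assumptions made for illustration.
function trackThenNavigate(href, track, navigate, delayMs, schedule) {
  // Default to the browser timer; a test can pass in its own scheduler.
  var run = schedule || setTimeout;
  track(href); // e.g. wraps _gaq.push(['_trackEvent', ...])
  run(function () {
    navigate(href); // e.g. sets document.location.href
  }, delayMs);
}
```

Called from an onclick handler that then returns false, this keeps the delay in one place, so testing 100 versus 500 milliseconds means changing a single argument.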

SiteCatalyst has been more upfront about the browser tracking issue in its implementation documentation.  The release of code version H.25 in 2012 included a standard 250 millisecond delay (if needed, to give the tracking code extra time to execute) for standard exit links for Safari and Chrome.  The SiteCatalyst blog covers the issue in the code announcement. The newer H.26 includes the delay for Firefox 20.0+ as well.  

In order to implement the delay for custom links, a ‘done action’ parameter must be added to the custom link tracking code.

<a href="http://anothersite.com" onclick="s.tl(this,'e','AnotherSite',null,'navigate');return false">

This should be added to any custom links to another page, not just exit links to other sites. (Document downloads and AJAX in-page events don’t need the extra delay and therefore don’t need the done action code).
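
As a rough sketch of that rule, a small helper can decide whether the done action belongs on a given link. `buildCustomLinkOnclick` and its `leavesPage` flag are hypothetical names, not part of SiteCatalyst. The helper only assembles the onclick string around the `s.tl` call shown above, and it assumes the link name contains no quote characters.

```javascript
// Hypothetical helper: build the onclick attribute for a SiteCatalyst custom link.
// Assumes linkName contains no quote characters.
function buildCustomLinkOnclick(linkType, linkName, leavesPage) {
  if (leavesPage) {
    // Links to another page or site get the 'navigate' done action, and
    // return false so the tracking code, not the browser, follows the link.
    return "s.tl(this,'" + linkType + "','" + linkName + "',null,'navigate');return false";
  }
  // Document downloads and in-page events do not need the done action.
  return "s.tl(this,'" + linkType + "','" + linkName + "')";
}
```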

To disable the forced link tracking (which is actually just a forced delay of up to 250 milliseconds if needed) for non-custom exit links use:

s.useForcedLinkTracking=false

To increase the delay time (for example, to half a second, the highest recommendation I have seen), use this:

s.forcedLinkTrackingTimeout=500

I’m curious whether the recommended maximum delay (half a second) is actually sufficient to consistently compensate for the browsers’ design, but I have yet to find a client willing to consider a longer delay, even just to test it.

National Gun Death Tally Since Newtown

Elizabeth Brady

Slate.com recently launched an initiative with @GunDeaths  to publish an estimate of the number of gun deaths across America since the December 14, 2012 Newtown mass shooting.  Official figures on deaths by firearms often take years to be published.  Partnering with the Twitter handle @GunDeaths, who collects and tweets as many media stories about gun deaths as possible, Slate created a database of all reported deaths since December 14th.  While openly acknowledging that the data will be incomplete, the project makes an admirable attempt to gather anecdotal reports into a comprehensive database including age, gender and location of each victim.  Furthermore, Slate makes the data publicly available.  As a data analyst, I could not pass up the opportunity to massage the data into a few additional visualizations to tell the story since December 14. 

The full interactive dashboard is available on Tableau Public.

We quickly see that December 14th was not the most violent day reported over the past seven weeks.  Both January 1st (58 deaths) and January 11th (51 deaths) recorded more gun deaths than December 14th.  The Newtown victims represent just 1.7% of total reported gun deaths during this time period.

However, the trend lines clearly point to why the event was so emotionally traumatic for so many of us.  Excluding December 14th, fewer than 2 percent of victims were children.  Ninety-two percent of gun death victims were adults while five percent were teens (for the remainder, age is not known).  A classroom of children killed on a single day cuts through the ongoing reports of daily gun violence.  According to the Slate data, however, more children have died from gun violence after December 14th (27 reported deaths) than died that day.

Gun death victims are also much more likely to be male than female.  For the entire time period, 84 percent were male.  All of the adult victims at Sandy Hook Elementary (except the shooter) were female.  The shooting pushed Connecticut’s percentage of female victims for the time period to 55 percent, the highest of any state.  Many of us have perhaps grown numb to reports of gun deaths of young men.  Shootings of women shock us more.

Thank you @GunDeaths and Slate for gathering and publicizing the data.  The numbers themselves (with a bit of visualization on top) will continue to tell the story.  Each story is complex and the aggregate data cannot cover those details.  The sheer volume of gun deaths across the country, however, should be acknowledged and discussed.

 

Mystery PDF Tracking

Elizabeth Brady

Tracking clicks to pdf files requires custom code. Without the code, you should not see any views to pdf files in web analytics reports, but this is not always the case.  One additional simple code customization explains the mystery.

Since it is not possible to include the basic javascript page tag on a pdf file as on the html pages of a website, pdf tracking requires embedding an onclick (or onmousedown) event in the link to the pdf, to pass either a virtual pageview or an event to track the click to the pdf file.  These can be added manually, with a fully customizable hierarchy of event labels, as described for Google Analytics, for example:

<a onclick="_gaq.push(['_trackEvent', 'Document', 'PDF', '/directory/file.pdf']);"
   href="/directory/file.pdf">
Click here to view PDF</a>

For sites with a large number of pdf files, it’s worth the investment to set up custom code in a jQuery file to automatically track all clicks to pdfs (as well as other documents, as specified).  Many varieties of the code exist, such as this solution provided by Adam Buchanon.
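
The core of such a script is deciding which links count as documents and what event to send for each. The sketch below is my own minimal illustration, not the solution linked above. `buildDocumentEvent` and the `DOCUMENT_TYPES` list are hypothetical, and the category and action values simply mirror the manual example.

```javascript
// Assumed list of file extensions to treat as documents; adjust per site.
var DOCUMENT_TYPES = ['pdf', 'doc', 'xls', 'ppt', 'zip'];

// Hypothetical helper: return the _trackEvent command for a document link,
// or null when the link does not point to a tracked document type.
function buildDocumentEvent(href) {
  // Ignore any query string or fragment when reading the extension.
  var path = href.split(/[?#]/)[0];
  var match = /\.([a-z0-9]+)$/i.exec(path);
  if (!match) return null;
  var ext = match[1].toLowerCase();
  if (DOCUMENT_TYPES.indexOf(ext) === -1) return null;
  return ['_trackEvent', 'Document', ext.toUpperCase(), path];
}
```

A click handler attached to every link would call `buildDocumentEvent` with the link's href and pass any non-null result to `_gaq.push`.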

While walking one client through the various requirements and options for setting up tracking, and in an attempt to demonstrate the lack of any pdf tracking at all on their site, I filtered on ‘pdf’ in the standard content reports.  The filter returned, much to my surprise, a long list of pdf documents.  I went to the site to view a random selection of pdf documents.  I could see that there were no tags firing to track the click to the pdf.  How was it possible to see page views to pdfs in the reports without any code on the site, and without data passing to Google Analytics?

After some brainstorming with a very talented colleague, we noticed that all of the visits to the pdf documents were from search and referrals (suspicious).  We finally decided to go to the site to view the specific pdf files listed in the content report, and discovered that all of the pdf addresses listed produced 404 errors.  Without code adjustments to the 404 error template, web analytics reports record pageviews to the URL requested, instead of tracking the 404 error.  Sometimes the error status may show up in the page title, but sometimes it does not (the page title may be blank).

 For Google Analytics, modify the trackPageview line of code to record the 404 error along with the page requested:

 _gaq.push(['_trackPageview','/error 404/'+document.location.pathname]);

 Therefore, the Google Analytics Tracking code on a 404 error page should be:

<script>
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXXX-X']);
_gaq.push(['_trackPageview','/error 404/'+document.location.pathname]);

(function() {
  var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
  ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
  var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>

In addition to explaining ‘mystery’ pdf tracking, error template customization produces extremely actionable reports for managing broken links, revealing where 301 redirects need to be set up, and improving customer experience.

 

Organic Traffic From Search Engine 'Search' Explained

Elizabeth Brady

Some sites show a surprisingly high volume of search from the organic search engine ‘Search’ in Google Analytics. 

Is this traffic really all from the search engine Search.com?  Search.com does exist, and is reported as the source ‘Search’; however, two Google Analytics code particularities explain most of the volume.  Google states that Search.com is included on its list of default search engines.  However, when I tested a flow from search.com, the Google Analytics cookie recorded the visit as a ‘referral’ visit rather than an organic visit, so I am not convinced that any of the ‘Search’ traffic is actually from Search.com.

When analyzing one client’s organic search from January 2012 I was surprised to see ‘Search’ volume equal to 1 percent of Google’s (especially considering that I could not reproduce a visit recording as ‘organic’).  This seemed suspiciously high.

With a code release in February 2012, Google explained that it had been rolling up several smaller organic search engines as ‘Search’.  'Conduit.com', 'babylon.com', 'search-results.com', 'avg.com', 'comcast.net' and 'incredimail.com', all had previously been rolled into 'search'.  Since February 2012, they show up as unique search engines.  As expected, the ‘Search’ traffic for the client in question plummeted after the February code release:

 

 

For sites that continue to show a significant volume from ‘Search’ after February, check for another obscure Google Analytics particularity.  Cross-subdomain implementations (subdomain1.example.com, subdomain2.example.com) with a search results page of search.example.com, combined with the query string ‘q’, will record the click to the search results as a new visit from the organic search engine ‘Search’, as documented by Google.

To correct this, Google recommends adding a line of code (_addIgnoredRef()) to record the traffic as ‘direct’ instead of organic. 

This will still create 2 visits (the second as ‘direct’ instead of ‘Search’) and will not cohesively link pre- and post-search behavior.  Better yet, reconfigure the search query string parameter to use something other than ‘q’.  Another option would be to rewrite the search query parameter using ‘_trackPageview’.  To rewrite /searchresults?q=test, the trackPageview line of code on the search results page would need to be customized:

  _gaq.push(['_trackPageview','/searchresults?term=test']);
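
Since the search term changes with every query, the rewritten path has to be built on the page rather than hard-coded. A minimal sketch of that rewrite follows. `renameSearchParam` is a hypothetical helper name, and ‘term’ is just the replacement parameter used in the example above.

```javascript
// Hypothetical helper: rename the search parameter 'q' before the pageview is recorded.
function renameSearchParam(pathAndQuery, newName) {
  var queryStart = pathAndQuery.indexOf('?');
  if (queryStart === -1) return pathAndQuery; // no query string to rewrite
  var path = pathAndQuery.slice(0, queryStart);
  var pairs = pathAndQuery.slice(queryStart + 1).split('&').map(function (pair) {
    // Only rename a parameter whose name is exactly 'q'.
    return pair.indexOf('q=') === 0 ? newName + '=' + pair.slice(2) : pair;
  });
  return path + '?' + pairs.join('&');
}
```

On the search results page, the result of `renameSearchParam(location.pathname + location.search, 'term')` would be passed as the second element of the `_trackPageview` command.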

Organic search from the source ‘Search’ before February 2012 represents a roll-up of several smaller search engines. For high volume 'Search' traffic after February 2012, modify cross-subdomain sites with a search.example.com subdomain to use (either at the search engine level or in the data passed to Google Analytics) a query string parameter other than ‘q’.

Flexibility of Event Goals in Google Analytics

Elizabeth Brady

One of the biggest advantages of the newest version (v5) of Google Analytics is the ability to configure a goal based on a tracked site event.  We no longer need to configure a ‘virtual page view’ specifically in order to track the event as a goal.  The goal can be defined by the category, label, action, or value of the event, or on a combination of more than one event characteristic.  However, the language around ‘match type’ for this goal is confusing and may lead some analysts to assume the configuration is less flexible than it actually is. 

I was perplexed when I first saw the three match types for event goals.  What’s the difference between ‘that is equal to’ and ‘that matches’?

Comparing the language to another customization in Google Analytics, custom segments include a long list of straightforward match types.

 

 

For segments, ‘exactly matching’ is equivalent to ‘that is equal to’ for event goals.  Note there is no choice for ‘matches the regular expression’ for event goals, but upon testing it turns out that ‘that matches’ does, in fact, support regular expressions.

For a client with several different links to a donation site I wanted to track any click event on qualifying links as a conversion.  I configured the goal to match either of the exact labels tracked for two donation buttons (‘donate’ and ‘give’) or any content link to the donation site (donate.example.com).  Testing the configuration against a custom report using regular expressions produces the same conversion results.
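
A pattern along the following lines would cover those three cases. It is a sketch only: the labels and domain come from the example above, but this is not the client's exact expression, and `isDonationClick` is a hypothetical name. It uses only anchors, grouping, alternation, and an escaped dot, which I would expect to behave the same way in the ‘that matches’ field, though I have only checked this form in JavaScript.

```javascript
// Hypothetical pattern: the exact labels 'donate' or 'give', or any label
// that contains the donation site's hostname.
var donationLabel = /^(donate|give)$|donate\.example\.com/;

function isDonationClick(eventLabel) {
  return donationLabel.test(eventLabel);
}
```

Anchoring the first alternative keeps labels such as ‘giveaway’ from counting as conversions.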

Match types for event values differ slightly to permit setting a numeric threshold, which is extremely useful for triggering a conversion based on a visit completing a high-value event.

It is unfortunately not possible to trigger a conversion based on the number of times a specific event is completed in a visit, but I would hope that is on the list for new features. 

Event goals represent one of the most useful features in the newest version of Google Analytics, and despite some confusing language around match type, have proved to be fully flexible and configurable.

How to Tell If Your Data Stinks

Elizabeth Brady

Earlier this summer, as a conference panelist I was asked, “What is the most common mistake companies make in interpreting their website analytics?”  I responded, without hesitation, “Not understanding how configuration issues might be impacting the data collected.”  Missing or inaccurate web analytics tags can cause, among other problems, inflated visitor counts, an exaggerated percentage of ‘direct’ traffic, self referrals, mislabeled entry pages, and inaccurate bounce rate, time-on-site, and pages-per-visit metrics.  Inaccurate data leads to incorrect analysis, and may drive the business to take the wrong – or at least unhelpful – actions.

There are some straightforward non-technical indicators of a potential data configuration issue.  A simple ‘sniff test’ can reveal something that seems a bit ‘off’.

  • A sudden drop (or surge) in any metric without a corresponding valid business explanation
  • ‘Illogical’ landing pages
  • Any instance of zero in the data – from personal experience a zero indicates a data issue 99 percent of the time
  • Suspicious success event attribution – for example, if only a very small fraction of visits from a key source of traffic are converting, this needs to be investigated
  • A large percentage of ‘direct’ traffic for a site with very little brand recognition

Any of the above issues warrants investigation.  Using an http debugger tool (Fiddler is my tool of choice), I walk through a visit to the site from the leading traffic sources (search, e-mail, leading referral sites) through to the major conversion events (registration, click events, placing an order) to watch page by page, click by click, what the web analytics tool records as the domain, page, source of traffic and other details to diagnose the issue and recommend code adjustments.
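
Two of the indicators in the list, zeros and sudden drops or surges, are simple enough to check automatically on exported daily data before starting a manual walkthrough. The sketch below is one possible way to do that. `sniffTest` is a hypothetical function, and the change threshold passed to it is an assumption to tune per site, not a rule.

```javascript
// Hypothetical sniff test over a series of daily values for one metric.
// maxChange is the largest day-over-day change (as a fraction) to accept.
function sniffTest(dailyValues, maxChange) {
  var flags = [];
  dailyValues.forEach(function (value, day) {
    if (value === 0) {
      flags.push({ day: day, reason: 'zero' });
      return;
    }
    var previous = dailyValues[day - 1];
    // Skip the comparison on the first day and after a zero day.
    if (day > 0 && previous > 0) {
      var change = Math.abs(value - previous) / previous;
      if (change > maxChange) flags.push({ day: day, reason: 'sudden change' });
    }
  });
  return flags;
}
```

A flag is only a prompt to look for a business explanation first; the debugging walkthrough is still what diagnoses the cause.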

I have several clients quite skilled at interpreting and monitoring their web analytics reporting.  When they reach out periodically to point out an anomaly and ask whether there might be a data issue, I reply, without hesitation, “Let’s check,” and get to work.

Thoughts on the WAA Certification Exam

Elizabeth Brady

This spring I joined the growing list of WAA Certified Web Analysts ™.  While we’ve understandably been sworn to secrecy not to share the contents of the exam itself, some details about the experience might be useful for anyone considering taking the exam.  I’ve heard several comments similar to ‘why bother’ or ‘is it really worth it?’  While it’s still too soon to weigh in on the latter, the simple answer regarding the former is that there is an extremely broad range of skill and knowledge in the web analytics field.  Particularly as an independent consultant, I appreciate the opportunity to prove my depth and breadth of experience.  I am thankful for the many hours the Web Analytics Association, along with scores of volunteers in the field, invested to launch this program.  I understand that it’s extremely difficult to design a great test.  While it’s a solid start, it’s not a great test.

I opted to take the exam at a remote testing center run by AMP (Applied Measurement Professionals).  When I logged into the scheduling calendar, I was surprised but relieved to find out that the nearest testing center (just seven miles from my house) could accommodate me any day of the week, morning or afternoon, for the foreseeable future.  The abundance of choice lulled me into complacency, and it took several months to force myself to log back in to lock in a date.

The morning of my scheduled exam, I pulled into the parking lot of the nondescript H&R Block office whose second identity turns out to be an AMP testing center.  The dying local shopping center (photo below) across the street from a remote testing center certifying a profession that did not even exist when I went to college provided a stark convergence of the new and old economy.

A professional and friendly receptionist greeted me and another test-taker at the door.  I gathered he was taking a different exam since he was permitted a calculator.  We were instructed to bring our car keys, his calculator, and nothing else (they even provided pencils), into a back room with four computer workstations in cubicles.

The 90-minute exam consists of a series of multiple choice questions, followed by additional multiple choice questions about several case studies.  The testing software is very straightforward, and permits marking questions for review.  I initially thought this would be helpful, until I realized I was marking at least a third of the questions.  Many of the questions have two answers that can be easily eliminated and then two where I struggled to figure out the intent of the question to help narrow down the answer.  I wished I could write an essay response about why I was stuck between the two remaining answers, and the reasons why I might pick each one.  But of course, in a multiple choice exam there is only one right answer, and no partial credit for getting close.  Several questions seemed unnecessarily picky.  For example, one listed several important tasks to do when starting a new job, and asked which should be done first.  If two answers are both things to do during the first week, does it really matter which one you do ‘first’?  I think a solid web analyst could still fail this exam.

While those who pass definitely demonstrate a certain level of competency, is the exam able to test the qualities that make a great web analyst?  Absolutely not.  It can’t figure out who can communicate well to developers, the marketing team, the product team as well as all levels of executives.  It can’t figure out who has the patience to run QA on web analytics tags, page by page, click by click, to diagnose a data collection issue.  It can’t tell you who really enjoys peeling the onion to find the underlying segment or source that’s driving a change in a KPI.  It doesn’t tell you who truly understands the business needs as well as the technical challenges to collect the data.

The exam does nevertheless screen for a basic body of knowledge and general analytic capability.  It provides an indication of experience and quality for employers, and for companies considering hiring a web analytics consultant.  For that, I am grateful.

Filtering Ecommerce Transactions Across Multiple Subdomains in Google Analytics

Elizabeth Brady

There are two options for installing roll-up reporting for an ecommerce site that spans multiple subdomains with a requirement to report at both the subdomain and the aggregated level.  The first option involves setting up a distinct Google Analytics account (and tracking code) for each subdomain, and then installing a secondary tracking code (a rollup tracker) across all subdomains to capture the rollup data.  However, the complexities of this ‘non-standard’ implementation are well documented (see Brian Clifton and Justin Cutroni, among others).  First, the installation requires extra tagging to pass ecommerce data to both accounts.  Additionally, the fact that the Google Analytics cookie is shared for any one page introduces many complexities for implementation and analysis.

For a site with multiple subdomains, a single currency, and no plans for independent AdWords accounts, a single Google Analytics account with filtered profiles is a less complex option.  However, a basic ‘hostname’ filter that filters all page data by subdomain in the Google Analytics profile will not filter ecommerce transactions.  By default, all ecommerce transactions for the Google Analytics account will show up in every profile associated with the account.  In order to limit the ecommerce transactions to those that occurred on a specific subdomain, each profile requires its own filter for transactions as well as a filter for transaction items.  Before opting for this installation, be sure to understand the filters required, in order to design the site tags correctly.

The first filter is a custom filter on the field “E-commerce Store or Order Location”.

The data used for this filter is the “affiliation or store name” field in the _addTrans data.  Google Analytics’ own sample code:

_gaq.push(['_addTrans',
    '1234',           // order ID - required
    'Acme Clothing',  // affiliation or store name (use this to differentiate Partner sites or subdomains)
    '11.99',          // total - required
    '1.29',           // tax
    '5',              // shipping
    'San Jose',       // city
    'California',     // state or province
    'USA'             // country
  ]);

Interestingly, this filter alone will not thoroughly filter all ecommerce transactions from another site or subdomain.  Without an additional order item filter, product revenue and quantity data will still show up in all profiles associated with the account (even without associated transaction and order revenue data, which are filtered by the 'affiliation or store name' filter).

A second filter at the item level can be applied, but requires a site/subdomain identifier in the data collected. 

By appending the subdomain/site/store name to the item variation during the tagging and data collection process, this can be used to filter order items by subdomain:

 _gaq.push(['_addItem',
    '1234',           // order ID
    'DD44',           // SKU code
    'T-Shirt',        // product name
    'Green Medium',   // category or variation, append the subdomain here ‘Subdomain_category’
    '11.99',          // unit price - required
    '1'               // quantity - required
  ]);
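
To keep the prefix consistent across every item on every subdomain, the tagging can be wrapped in a small helper. This is a sketch under my own naming: `buildAddItem` is hypothetical, and it simply applies the ‘Subdomain_category’ convention noted in the comment above.

```javascript
// Hypothetical helper: prefix the item variation with the subdomain so a
// profile filter can match on it, then return the _addItem command.
function buildAddItem(subdomain, orderId, sku, name, variation, price, quantity) {
  return ['_addItem', orderId, sku, name, subdomain + '_' + variation, price, quantity];
}
```

The returned array can be passed straight to `_gaq.push`, and each profile's item filter then matches on its own subdomain prefix.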