The Problem with Embedding Tracking Codes in your URLs
The problem with embedding a tracking code into URLs to track referrals from particular marketing campaigns or from particular partners is that inevitably those URLs end up in other places, such as in the search engines. Every visitor who arrives through one of those stray copies gets counted as a referral from the original campaign or partner, so your referral numbers become inflated.
Case in point: Google's "Inside AdSense" Blog. A couple of days ago I searched Google for [inside adsense] and was surprised to find that the #1 result was not http://adsense.blogspot.com. It was the URL with a utm_source and some other stuff appended at the end of the URL (i.e. the URL was something like http://adsense.blogspot.com/?utm_source=aso&utm_campaign=ww-en_US-et-asfe&medium=et). Unfortunately I didn't record the exact URL at the time, and today Google is back to returning what it should be returning for the top result: http://adsense.blogspot.com (without any utm_source or query string). I bet the Analytics folks at Google will be scratching their heads at the spike in popularity of the "ASO" (or whatever it was) referral source when they look back at the month of May (unless of course they've read this blog post!).
Example #2: CBS News. Check this out... Run the query [site:www.cbsnews.com inurl:source=rss] on Google. Google returns 27,900 pages. You'll see that all of those pages have a source=RSS in the URL. Even though I don't believe Google's result counts to be even remotely accurate, there are still a heck of a lot of pages there, and those pages are bringing in some amount of traffic from Google searchers. When those searchers click through, the referral is wrongly attributed to the site's RSS feed. I wonder if CBS News realizes this? Probably not.
So, if you must use the URL's query string to track your referral sources, then at least make sure that you aren't ever serving those links to search engine spiders. Drop the referral source from all links when spiders come to visit. Don't worry; the search engines say this sort of "cloaking" is totally okay.
That will ensure your own site isn't providing source coded links for the spiders to explore. But what to do about other sites that are linking to you? I suggest that you 301 redirect all traffic to URLs with tracking codes to the corresponding URL without the tracking code. Over time, your source coded pages should drop out of the search engines' indices (or at least get relegated to "supplemental hell").
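Here is a rough sketch of the redirect step in Python, just to make the idea concrete. It is not tied to any particular web server or framework. The parameter names in TRACKING_PARAMS and the function name redirect_target are my own placeholders, so swap in whatever codes your campaigns actually use. It also assumes you have already logged the referral source on the server side before you send the 301, otherwise the redirect throws the tracking data away.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameter names; replace with your own.
TRACKING_PARAMS = {"utm_source", "utm_campaign", "utm_medium", "source"}


def redirect_target(url):
    """Return the URL to 301 redirect to, or None if no redirect is needed.

    A redirect is needed only when the query string carries at least one
    tracking parameter. All other parameters are kept in their original order.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs
            if key.lower() not in TRACKING_PARAMS]
    if len(kept) == len(pairs):
        # Nothing was stripped, so serve the page as usual.
        return None
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(redirect_target("http://www.example.com/story.html?id=42&source=RSS"))
    print(redirect_target("http://www.example.com/story.html?id=42"))
```

When the function returns a URL, respond with a 301 and that URL in the Location header. When it returns None, serve the page normally. The same stripping logic could be reused when you generate links for spiders on your own pages.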
3 comments
-
Stephan, it might be good to mention that removal of session tracking parameters from URLs for bots is okay as benign cloaking, but folx need to realize that changing the link label text may not be okay.
Non-technical people often confuse the parts of the link, not understanding that the visible text of the link, or "link label", is a separate piece from the URL it is linked to.
Search engines are okay with removing session variables for their spiders as you mention, but it's not clear that changing the label text would necessarily be okay. There was a debate about that on SearchEngineWatch.com this year, in relation to something similar to this on the Colgate-Palmolive website. Most search engines would likely consider this practice to be hostile cloaking, if they detect it.
Comment by Chris Smith [Visitor]
· http://www.superpages.com —
05/08/06 @ 13:48
-
Thanks Chris. Good point!
Comment by Stephan Spencer [Member]
—
05/08/06 @ 16:02
-
Almost every time I run my Norton System Works full system scan, it encounters a tracking code. How do they get in and am I right to have Norton remove it? Please respond to this via email as I will probably forget to come back here and check for an answer.
Comment by Brenda Cartwright [Visitor]
—
08/24/07 @ 12:07
