Web Analytics | Stephan Spencer

Append Tracking Information Without Creating Duplicate Content

By | Search Engines, Web Analytics | One Comment

I mentioned towards the end of my Search Engine Land article about redirects how you can use the hash or pound symbol (#) in a URL to append tracking information.

Why do this? Because it prevents duplicate content (i.e., the same page at multiple URLs that look unique to the engines), and it aggregates all link juice to the one canonical URL.

The # in a URL is usually used to send visitors to an anchor within the page they are on (e.g., “Jump to top of page” or “Jump to Table of Contents”).

Appending tracking information to URLs with a # works from an SEO perspective because search engines ignore the # and everything after it. This effectively collapses the tracked URLs together.

Let’s take a look at a concrete example to see how this plays out. Imagine you linked to your “About Us” page from your blog and that link pointed to:


and from your site-wide footer on your ecommerce site you linked to:


Both URLs would be interpreted by Google and the other engines as:

Yet the full URL (www.mythicalcompany.com/aboutus.php#footer) is available to any client-side JavaScript. So you could write a script that pulls what’s after the # and inserts it into a cookie or otherwise sends it to your server and/or web analytics.
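To make that concrete, here is a minimal sketch in TypeScript of splitting a tracked URL into the part the engines see and the tracking code after the #. The helper name `splitTrackedUrl` and the `https://` scheme are my assumptions for illustration; in a browser you would pass in `window.location.href`.

```typescript
// Hypothetical helper (not from any library): separates the URL a search
// engine would see from the tracking code carried after the #.
function splitTrackedUrl(fullUrl: string): { canonical: string; trackingCode: string } {
  const url = new URL(fullUrl);
  // url.hash includes the leading "#", or is "" when there is no fragment.
  const trackingCode = url.hash.replace(/^#/, "");
  // Clearing the hash leaves the URL that is actually requested from the server.
  url.hash = "";
  return { canonical: url.toString(), trackingCode };
}

const footerLink = splitTrackedUrl("https://www.mythicalcompany.com/aboutus.php#footer");
console.log(footerLink.canonical);    // https://www.mythicalcompany.com/aboutus.php
console.log(footerLink.trackingCode); // footer
```

From there the tracking code could be written to a cookie or passed along to your analytics, as described above.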

Note that the full URL will NOT show up in your log files, because web browsers only use what’s after the # to jump to the anchor within the page, and that’s done locally within the browser. In other words, the browser doesn’t send the full URL, so the anchor information (i.e., any text after the #) is not stored within environment variables like REQUEST_URI. Thus you cannot use a hash to pass parameters in your URL to your PHP (or ASP or whatever) scripts (at least not directly).

If you have a stats package that uses log file analysis, hash-containing URLs won’t pass the anchor to your server logs. A workaround is to include a client-side script that sends a ping to a second URL with the necessary tracking info appended as a query string. Your script ignores whatever content that ping URL returns. That way the stats package can pick up the tracking info from query string parameters as normal, but through the second URL requested by your script, not the first one originally requested by the web browser. Make sense?
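Here is a rough sketch of that ping workaround in TypeScript. Everything named here is an assumption for illustration: the `/ping.gif` endpoint, the `source` and `page` parameter names, and the `buildPingUrl` helper. Your stats package may expect something different.

```typescript
// Hypothetical endpoint: any URL on your own server whose requests get logged.
const PING_PATH = "/ping.gif";

// Builds the second URL the script would request. The tracking code moves
// from the fragment (never sent to the server) into a query string (logged).
function buildPingUrl(pageUrl: string): string | null {
  const page = new URL(pageUrl);
  const code = page.hash.replace(/^#/, "");
  if (code === "") {
    return null; // no tracking code after the #, so nothing to report
  }
  const ping = new URL(PING_PATH, page.origin);
  ping.searchParams.set("source", code);        // assumed parameter name
  ping.searchParams.set("page", page.pathname); // assumed parameter name
  return ping.toString();
}

// In a browser, requesting the URL and ignoring the response could look like:
//   const pingUrl = buildPingUrl(window.location.href);
//   if (pingUrl !== null) { new Image().src = pingUrl; }
console.log(buildPingUrl("https://www.mythicalcompany.com/aboutus.php#footer"));
```

The image request is just one common way to fire a request whose response you don't care about; the point is only that the query string lands in the server log.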

My interview with Mike Moran, IBM’s search marketing pioneer

By | Search Engines, Web Analytics | No Comments

Ever wonder how easy it is to implement SEO in a giant company like IBM? Mike Moran was kind enough to sit down with me for an interview, to talk about the positive SEO changes he implemented at IBM, how to allocate costs to guarantee a better ROI, and where the future of search marketing is headed.

Prior to Mike implementing his strategies and SEO changes, organic search accounted for less than 1% of overall traffic to IBM.com. Just a few years after Mike was able to work his magic, IBM’s organic search traffic grew to over 25% of total traffic. In order to fully appreciate what an impressive feat this was, first consider what a behemoth IBM is. The corporation spans over 90 countries and 30 languages, according to Mike. So not only did he have to try to implement a strategy that would affect every division and business unit, he needed to distill it down for the executives, and provide actionable items that were easy-to-understand across different roles and professional positions. At a big corporation, SEO is as much about political maneuvers and getting “buy in” as the technical implementation.

Here are some of the key concepts from the interview:

  • Stop competing against yourself: Divisions within large companies can end up getting into bidding wars with each other for keywords that are attractive to multiple divisions. For example, “linux” was being bid on by many different departments within IBM, since the term is relevant to software, networking, etc. By centralizing the management of paid search, IBM was able to minimize the intramural competition and reduce cost.
  • Automated SEO has benefits: One of the highest impact things you can do is to optimize your dynamic website templates. By changing a few things within the template, you can affect a huge number of pages and make them more palatable to search engines. Those few changes cause a ripple effect that can dramatically improve your rankings and organic search traffic.
  • Test your search marketing: Search marketers should use the immediacy of feedback to their advantage, and study how people and search engines respond to changes in the content. Make iterative improvements to your paid search campaigns and your SEO based on actual statistics, not just a gut feel.

Listen to the 45-minute interview with Mike Moran, “Distinguished Engineer” at IBM and author of two books, and learn strategies to work around corporate processes, budgets, and differing roles and priorities in order to grow your organic and paid search channels.

Mike and I are both speakers at the American Marketing Association’s conference “Hot Topic: Search Engine Marketing” this Friday, September 28th in Boston, and again November 2nd in Chicago, Illinois. There’s still time to sign up for the conference! Hope to see you there!

Get Closer to the Search Engine Spiders

By | Web Analytics, Web Design | No Comments

One of the things I covered in my CNet: Searchlight blog was the set of tools that let website owners, webmasters/mistresses, and other professionals get a little closer to the search engine spiders.

Along with providing more detailed information and answering more and more questions publicly, the greatest advancement the search engines have made has been creating tools that give site owners (who have validated their sites) more information about their sites than they’ve ever had before.

Whether you use Google Webmaster Central or Yahoo!’s Site Explorer, this article goes into the pros and cons of each tool and how it relates to SEO.

Coolest new web analytics tool you’ve never heard of

By | Web Analytics | One Comment

One of the 5-minute “lightning round” sessions at Web 2.0 Expo on Sunday night was the folks from Nitobi demoing their brand new Robot Replay technology. All I have to say is… WOW!! Every one of you readers HAS to go check this tool out. It is a free beta, so there is no excuse for not trying it out! Robot Replay allows you to track the mouse gestures of your visitors and play them back. Imagine tracking individual prospects and seeing how they are interacting with your site! Not just where they click, but whether they scroll, how they interact with your forms, etc.

Hitwise biased, but in which direction?

By | Search Engines, Web Analytics | One Comment

Hitwise is a competitive intelligence service that gives you insight into where your competitors get their traffic from and which keywords drive the bulk of their traffic from search engines, among other things. It comes at a price of course — to the tune of tens of thousands of dollars.

I noticed that in a post today Matt Cutts, Google engineer extraordinaire, made a small swipe at Hitwise in his review of the Compete search engine:

ISP relationships [buying user data from ISPs] can be a huge source of metrics bias. For example, some ISPs partner with Yahoo, and users on those ISPs are probably more likely to visit Yahoo. Other ISPs partner with Google. And savvy users that use smaller providers such as Covad or Speakeasy are likely not counted at all.

Because you don’t know which ISPs are selling user data to companies such as Compete or Hitwise, you don’t know what biases are baked into those companies’ metrics–and the metrics companies won’t tell you.

Touché, Inigo! (Matt Cutts’ regular blog readers will get that)

So my question to Hitwise is… If you won’t tell us who your ISP relationships are, will you at least reveal some of your biases? Like which search engine your biggest ISPs are partnered with?

Top 10 nastiest web analytics problems – EXCLUSIVE Emetrics Summit podcast

By | Web Analytics | No Comments

Here’s something special just for you, dear readers: a “Scatterings” exclusive, courtesy of web analytics guru/author/speaker/WAA co-founder Jim Sterne.

Jim has kindly made available a 35-minute podcast titled the “Top Ten Nastiest Web Analytics Problems,” recorded at the Emetrics Summit Santa Barbara, 2004.

The session was a one-hour discussion followed by a half-hour recount of that discussion to determine which web metrics problems people were facing.

They’ll be doing it again this year at the Emetrics Summit 2005, taking place next month in Santa Barbara, California (June 1-3) and London, England (June 8-10). It’ll be interesting to learn how the pain points have changed over the past 12 months.

If you are AT ALL concerned with making your website more effective and measuring the really important stuff in the best possible ways, you NEED to be at this summit. Santa Barbara is nearly full, but there are still spaces at the London venue. Hey, if you’re on the East Coast of the U.S., London isn’t all that far away, right?

Register for Emetrics Summit London
Register for Emetrics Summit Santa Barbara

What Google’s acquisition of Urchin means for marketers

By | Search Engines, Web Analytics | One Comment

The Washington Post and others covered Google’s recent acquisition of web analytics software company Urchin, but I really haven’t seen much from the media (or from the bloggers, for that matter — with the exception of Traffick, which makes some excellent points, particularly point #4) on what this means for marketers (and certainly for Urchin customers and Google customers). So I’ll take a stab at it. Of course, this is all conjecture because I don’t really have the inside scoop on what Google plans to do with Urchin. (Urchin, by the way, is one of our favorite web analytics tools. It is a rich and robust website log analysis package… it’s extremely good value-for-money.)

If I were Google, I would incorporate Urchin technology into the AdWords advertiser suite of tools and into the AdSense publisher tools. I’d extend Google’s conversion tracking functionality to offer a thorough ROI analysis of natural search as well as paid search. I would also continue to offer an unbundled version, hosted, for free and cover it with Google ads just as they do with GMail. And I would probably phase out the server-installed version, focusing future development solely on the hosted version. I’d also stop supporting the unbundled version (both hosted and server-installed software) and let the community of users support themselves through online forums on Google Groups. That’s my guess. It would be cool if they then took the logical next step, like what AOL/Netscape did with the Mozilla Project, releasing the server-installed software as open-source, but that would probably be giving too much of the hosted product away.

If my predictions come to pass, then all the big hosting companies who are offering Urchin would be up a creek. The hosted version of Urchin is quite new. I wonder if Urchin’s move to a hosted version was with a view to being bought by the Goog or someone similar. Hmm…

I think Urchin will benefit Google most if Google assimilates the technology into its existing technologies and services, just like “The Borg” does on Star Trek. Extending and continuing the existing product wouldn’t give Google nearly as much leverage, and thus wouldn’t be nearly as attractive. At the end of it, Urchin as a company will be unrecognizable. If this comes to pass, your days as a supported Urchin customer would be numbered, but the opportunities presented to you as an AdWords customer or an AdSense publisher would be huge.

Undoubtedly Daniel Brandt at Google Watch is busily constructing some conspiracy theory about Google planning on secretly using the data to violate everyone’s privacy. I can see it now: “I know what you bought last summer…”

The Problem with Embedding Tracking Codes in your URLs

By | Search Engines, Web Analytics | No Comments

The problem with embedding a tracking code into URLs to track referrals from particular marketing campaigns or from particular partners is that inevitably those URLs end up in other places, such as in the search engines. Thus your referral numbers become overinflated.

Case in point: Google’s “Inside AdSense” Blog. A couple days ago I searched Google for [inside adsense] and was surprised to find that the #1 result was not http://adsense.blogspot.com. It was the URL with a utm_source and some other stuff appended at the end of the URL (i.e. the URL was something like http://adsense.blogspot.com/?utm_source=aso&utm_campaign=ww-en_US-et-asfe&medium=et). Unfortunately I didn’t record the exact URL at the time, and today Google is back to returning what it should be returning for the top result: http://adsense.blogspot.com (without any utm_source or query string). I bet the Analytics folks at Google will be scratching their heads at the spike in popularity of the “ASO” (or whatever it was) referral source when they look back at the month of May (unless of course they’ve read this blog post!).

Example #2: CBS News. Check this out… Run the query [site:www.cbsnews.com inurl:source=rss] on Google. Google returns 27,900 pages. You’ll see that all of those pages have a source=RSS in the URL. Even though I don’t believe Google’s numbers of results to be even remotely accurate, still there are a heck of a lot of pages there, and those pages are bringing in some amount of traffic from Google searchers. When they do, the referral source is being wrongly attributed to the site’s RSS feed. I wonder if CBS News realizes this? Probably not.

So, if you must use the URL’s query string to track your referral sources, then at least make sure that you aren’t ever serving those links to search engine spiders. Drop the referral source from all links when spiders come to visit. Don’t worry; the search engines say this sort of “cloaking” is totally okay.

That will ensure your own site isn’t providing source coded links for the spiders to explore. But what to do about other sites that are linking to you? I suggest that you 301 redirect all traffic to URLs with tracking codes to the corresponding URL without the tracking code. Over time, your source coded pages should drop out of the search engines’ indices (or at least get relegated to “supplemental hell”).
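As a sketch of the redirect logic in TypeScript: the function below works out where a 301 should point by stripping tracking parameters from the requested URL. The parameter list and the name `redirectTarget` are assumptions for illustration, and wiring it into your web server is left out, since that depends on your platform.

```typescript
// Assumed list of tracking parameters; adjust to whatever your campaigns use.
const TRACKING_PARAMS = ["utm_source", "utm_campaign", "utm_medium", "source"];

// Returns the URL to 301 redirect to, or null when the requested URL carries
// no tracking parameters and can be served as-is.
function redirectTarget(requestUrl: string): string | null {
  const url = new URL(requestUrl);
  let stripped = false;
  for (const name of TRACKING_PARAMS) {
    if (url.searchParams.has(name)) {
      url.searchParams.delete(name);
      stripped = true;
    }
  }
  // Parameters that are not in the list (e.g. a story id) are preserved.
  return stripped ? url.toString() : null;
}

console.log(redirectTarget("http://adsense.blogspot.com/?utm_source=aso&utm_campaign=ww-en_US-et-asfe"));
// http://adsense.blogspot.com/
console.log(redirectTarget("http://adsense.blogspot.com/"));
// null
```

Presumably you would record the tracking values in your analytics before issuing the redirect, otherwise the referral information is lost along with the query string.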

Competitive analysis critical to SEO success

By | Search Engines, Web Analytics | 2 Comments

Understanding your competitors (their strategy, their tactics, their level of success, etc.) is crucial to the success of your SEO initiatives. I’m not just talking about your traditional competitors; I’m referring to the other sites occupying spots in the SERPs (search engine results pages) for keywords that you are targeting.

Many free competitive analysis tools are out there, but you have to know where to look for them. One of my favorite SEO blogs (Stuntdubl) offers a veritable Home Depot of such tools, at Mr. Ploppy’s Monday Tool List.

It’s a bit like walking into a DIY store and being faced with an overwhelming array of options. What is the right tool for the job?

Here’s a sampling of some of the SEO tools that I use for competitive analysis and what I specifically use them for:

Fascinating problem with email clicktracking

By | Email, Online Retail, RSS Marketing, Web Analytics | No Comments

I thought this was pretty cool: realsimpleshopping.com allows you to get your shopping offers through RSS. What this means is that personalized cabelas.com email campaigns get posted to realsimpleshopping and then untold numbers of people will start opening and clicking on that email, like this one for example.

I wonder if Cabela’s and other online retailers know that this sort of thing is happening? I imagine realsimpleshopping and other similar services will really take off in the not-too-distant future, once RSS newsreaders go mainstream, which will really throw off email campaign tracking and analysis.