You may have already seen my article on Search Engine Land, “Making Sense of Google’s New Dynamic URL Recommendations”. If you haven’t, I’ll recap some key points here about Google’s new recommendations on dynamic URLs and URL rewriting, and explain why I advise against following them.
As much as I’d love to believe that Google no longer needs webmasters to clean up their URLs for Googlebot, the hard truth of the matter is that Googlebot STILL stumbles across the same content at varying URLs and mistakenly indexes all copies — even returning one version of these URLs with some queries and other versions with other queries. In fact that’s the whole premise of my colleague Brian Klais’ Search Engine Land article from earlier this month, that guided navigation systems create numerous URL pathways to the same content, and Googlebot isn’t very good at detecting this and compensating for the duplication and PageRank dilution effects. What it all boils down to is this: what’s confusing for Googlebot ultimately becomes confusing for searchers, thus leading to a lose-lose-lose — for Google, for its users, and for you the site owner.
Given this, I dispute the assertion in the aforementioned post from the Google Webmaster Central Blog, that webmasters should “feel free to serve us [Google] your standard dynamic URL and we will automatically find the parameters which are unnecessary.” That’s gambling with your rankings, and personally I don’t like the odds.
Let’s have a look at a concrete example to prove my point. Just last month I spoke at the Shop.org Annual Summit, in a site clinic session where I gave impromptu critiques of sites volunteered by audience members. One such site was MEC.ca: a great site for users, not so great for Googlebot. It didn’t take long for me to spot the duplicate content and PageRank dilution issues. Digging through site:www.mec.ca results revealed pages with jsessionid and bmUID parameters. Indeed, Google estimated 102,000 results for the bmUID pages and 96,400 results for site:www.mec.ca inurl:jsessionid!
Let’s focus in on a specific page of MEC.ca: the “Biodegradable Shopping Bag” page, of which there are 15 copies in Google’s index. Clearly Googlebot is confused.
This confusion is further evidenced by the fact that a search on “biodegradable shopping bag” returns a different mec.ca URL (on page 1) than a search on “biodegradable shopping bags” (page 4 of the SERPs) — yet they are both the same (duplicate) page of content.
I would counsel MEC.ca that maintaining the status quo and leaving things in the hands of Googlebot to eventually (maybe) sort out is not a viable solution.
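To make the duplication problem tangible, here is a minimal Python sketch of how you might audit a list of crawled URLs yourself. It strips the jsessionid and bmUID parameters seen in the example above and groups the URLs that collapse to the same address. The example.com URLs, the function names, and the assumption that these two parameters never change the page content are mine, for illustration only; this is not how Googlebot works internally.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters never change what the page displays.
SUPERFLUOUS_PARAMS = {"jsessionid", "bmuid"}


def canonicalize(url: str) -> str:
    """Return the URL with session/user parameters removed."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Servlet containers often append the session to the path as
    # ";jsessionid=..."; keep the path itself and drop such suffixes.
    pieces = path.split(";")
    path = ";".join(
        piece
        for i, piece in enumerate(pieces)
        if i == 0 or piece.split("=", 1)[0].lower() not in SUPERFLUOUS_PARAMS
    )
    # Drop the same parameters when they appear in the query string.
    kept = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key.lower() not in SUPERFLUOUS_PARAMS
    ]
    return urlunsplit((scheme.lower(), netloc.lower(), path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each canonical URL to the crawled URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonicalize(url)].append(url)
    # Only report canonical URLs that have more than one variant.
    return {canon: dupes for canon, dupes in groups.items() if len(dupes) > 1}


if __name__ == "__main__":
    crawled = [  # hypothetical URLs, not taken from any real site
        "http://www.example.com/product.jsp?id=42&bmUID=abc",
        "http://www.example.com/product.jsp;jsessionid=XYZ?id=42",
        "http://www.example.com/product.jsp?id=42",
        "http://www.example.com/other.jsp?id=7",
    ]
    for canon, dupes in group_duplicates(crawled).items():
        print(canon, "has", len(dupes), "variants")
```

Running a report like this against your own crawl or log data gives you a rough count of how many copies of each page a spider could be finding.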
Let’s review some pertinent facts about dynamic URLs, along with my evidence:
FACT: URLs with session IDs or user IDs don’t always get properly identified by Google, resulting in duplicate content and PageRank dilution.
EVIDENCE: The above-mentioned example from mec.ca.
FACT: URLs with keywords in them rank better in the SERPs than those with product IDs. So a rewritten URL like www.domain.com/blue-widgets will outperform www.domain.com/product.asp?productID=123 for a search on “blue widgets” — all else being equal. This is true not just in Google, but in other engines as well.
EVIDENCE: We’ve conducted numerous experiments for clients to prove the rankings benefit to ourselves, but we can’t publish these tests unfortunately (we are restricted due to client confidentiality). I encourage you to conduct your own tests. A Microsoft engineer just last month confirmed that keyword URLs provide a boost in Live Search.
FACT: Short URLs have a better clickthrough rate in Google SERPs than long URLs.
EVIDENCE: This effect was found through user testing that was commissioned by MarketingSherpa. MarketingSherpa found that short URLs get clicked on twice as often as long URLs (given that the position rank is equal).
FACT: Keyword URLs are more user-friendly, and thus probably better at enticing clicks in the SERPs by searchers.
EVIDENCE: Keywords within a URL that match the search query are bolded, providing additional emphasis to the search listing.
So, given the above facts, would you rewrite your complex dynamic URLs to look static and keyword-rich? I sure would!
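As a rough illustration of what “keyword-rich” means in practice, the following Python sketch turns a product name into a short, static-looking URL like the www.domain.com/blue-widgets example above. The product catalog, the function names, and the slug rules are hypothetical; a real rewrite would also need server-side routing, which is not shown here.

```python
import re
import unicodedata

# Hypothetical catalog: product ID -> product name.
PRODUCTS = {123: "Blue Widgets"}


def slugify(name: str) -> str:
    """Lowercase the name and join its words with hyphens."""
    # Fold accented characters to plain ASCII where possible.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def keyword_url(product_id: int, host: str = "www.domain.com") -> str:
    """Build the rewritten URL for a product in the catalog."""
    return f"http://{host}/{slugify(PRODUCTS[product_id])}"


if __name__ == "__main__":
    print(keyword_url(123))  # http://www.domain.com/blue-widgets
```

Note that the slug contains only the keywords. No session IDs, user IDs, or tracking flags belong in it.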
Then what are Googlers Juliane Stiller and Kaspar Szymanski trying to accomplish with the aforementioned blog post? My hunch is that Google is finding an alarmingly large number of improperly implemented URL rewrites that confuse Googlebot even more and exacerbate the duplicate content situation. If superfluous parameters (e.g. session IDs, user IDs, tracking parameters, flags that don’t substantially affect the content displayed) get mistakenly embedded into the filename or filepath, then Googlebot will have an even harder time identifying those superfluous parameters and aggregating the duplicates. And what if parameters are embedded in the filepath in inconsistent order (e.g. www.example.com/c-clothing/shirts-mens/ and www.example.com/shirts-mens/c-clothing/)? That’s another nightmare scenario for Googlebot. On top of all that, if Googlebot still finds links to the old (non-rewritten) URLs, your well-intentioned URL rewriting actually presents Google with yet another duplicate to deal with. It can be a real mess. The lesson here is to hire a professional when embarking on a URL rewriting project, NOT to leave your URLs dynamic and your website in the hands of fate.
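Two of those pitfalls, inconsistent segment order and old URLs that stay live, can be sketched in a few lines of Python. The ordering rule (segments prefixed with “c-” first, then the rest alphabetically), the legacy product.asp mapping, and the function names are assumptions made up for this example; the point is simply that every variant should answer with a 301 redirect to one canonical address.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping from old product IDs to their rewritten paths.
LEGACY_TO_PATH = {"123": "/blue-widgets/"}


def canonical_path(path: str) -> str:
    """Put path segments in one fixed order (assumed rule: 'c-' first)."""
    segments = [segment for segment in path.split("/") if segment]
    ordered = sorted(segments, key=lambda s: (not s.startswith("c-"), s))
    return "/" + "/".join(ordered) + "/"


def resolve(url: str):
    """Return (status, path): 301 to the canonical path, 200 if already there."""
    parts = urlsplit(url)
    if parts.path == "/product.asp":
        # Old dynamic URL: redirect it rather than serving a second copy.
        product_id = parse_qs(parts.query).get("productID", [""])[0]
        target = LEGACY_TO_PATH.get(product_id)
        return (301, target) if target else (404, None)
    canonical = canonical_path(parts.path)
    if parts.path != canonical:
        return 301, canonical
    return 200, canonical


if __name__ == "__main__":
    for url in (
        "http://www.example.com/shirts-mens/c-clothing/",
        "http://www.example.com/c-clothing/shirts-mens/",
        "http://www.example.com/product.asp?productID=123",
    ):
        print(url, "->", resolve(url))
```

With rules like these in place, both orderings of the clothing URL and the old product.asp address all lead a spider to a single URL instead of three.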