- Every page of content has an “Email this page to a friend” link that leads to a page with a form (and optionally some duplicate content that was also present on the canonical page). The link to the form page is followable (i.e. no use of rel=nofollow) and the form page itself has a unique URL that is accessible to Googlebot (i.e. neither disallowed by robots.txt directives nor noindexed by a meta robots tag). This is the case on Biolcell.org, but it occurs on millions of other sites too.
- Pages of internal search results (i.e. generated by the site’s search engine) are accessible via links, and then from there, links lead into the pagination structure (i.e. Next and Previous as well as page number links such as page 9 of 51), as well as to results pages for searches on numerous “related” keywords. This creates rampant duplication of content as can be seen in this example.
- Lists of “Related Tags” are shown on the site and these lead to tag conjunction pages where some or all of the tags are OR’ed together. For example, here. This example is taken from my own company’s website Netconcepts.com — and I’m pleased to say that our tag drill down feature has the “OR” links nofollowed to avoid sending spiders down an infinite loop of duplicate content.
- Faceted navigation (attribute-based navigation) creates seemingly limitless permutations by offering filtering and sorting options through crawlable links. Torrey Hoffman of Google’s Webmaster Central team presents the following ecommerce example in his recent Google Webmaster Central blog post:
Another common scenario is websites which provide for filtering a set of search results in many ways. A shopping site might allow for finding clothing items by filtering on category, price, color, brand, style, etc. The number of possible combinations of filters can grow exponentially. This can produce thousands of URLs, all finding some subset of the items sold.

Faceted navigation, when implemented without expert SEO guidance, can sabotage your site’s SEO. We’ve seen search-delivered traffic tank on more than one occasion when a retailer implements faceted navigation “out of the box” without retooling it for SEO. You retool it with rel=nofollow on links pointing to low SEO value facets (like price range), with a meta robots noindex tag on the low SEO value permutations, and/or with alternative taxonomic navigation structures (with good anchor text). My colleague Brian delves into this thorny issue of search engine optimizing “guided navigation” in his article today on Search Engine Land. Check it out.
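To make the scale of the problem and the retooling idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the facet names and values, the choice of `price` and `sort` as the low-value facets, the `shop.example` URLs, and the helper names `count_filter_urls`, `should_noindex` and `link_rel_for`. It is not how any particular ecommerce platform does it, just one way the rules could be expressed.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical facets for a small clothing catalog (illustrative values only).
FACETS = {
    "category": ["shirts", "pants", "shoes"],
    "brand": ["acme", "globex"],
    "color": ["red", "blue", "black"],
    "price": ["0-25", "25-50", "50-100"],
    "sort": ["price_asc", "price_desc", "newest"],
}

# Assumption: these facets have low SEO value, so we keep spiders away from them.
LOW_VALUE_FACETS = {"price", "sort"}


def count_filter_urls(facets):
    """Count filter combinations when each facet is either unset or set to one value."""
    total = 1
    for values in facets.values():
        total *= len(values) + 1  # +1 for "facet not applied"
    return total - 1  # leave out the unfiltered listing page itself


def should_noindex(url):
    """True if the URL's query string uses any low-value facet."""
    params = parse_qs(urlsplit(url).query)
    return bool(LOW_VALUE_FACETS.intersection(params))


def link_rel_for(facet):
    """rel attribute value for a navigation link that applies the given facet."""
    return "nofollow" if facet in LOW_VALUE_FACETS else ""


if __name__ == "__main__":
    print(count_filter_urls(FACETS))
    print(should_noindex("https://shop.example/items?category=shirts&price=0-25"))
    print(link_rel_for("price"))
```

Even this toy catalog, with five facets of two or three values each, yields several hundred distinct filter URLs, and every added facet multiplies the total. A template could call `should_noindex` to decide whether to emit a meta robots noindex tag on a given permutation, and `link_rel_for` to decide whether a facet link gets rel=nofollow.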
A couple of weeks ago I spoke at Search Engine Strategies San Jose on Long Tail SEO Tactics (here’s my PowerPoint, btw, if you’re interested) and at SEOMoz Expert Training on the topic of Site Architecture and Internal Linking (download PowerPoint), and I found that I was addressing the same key issue with both audiences. That issue is… How do I best spread my PageRank (link juice) across my (very large) site? This question could be restated as: How do I avoid squandering my crawl equity?

What do I mean by “squandering crawl equity”? Well, how deeply Googlebot goes into your site will depend in part on the PageRank, trust and authority you have earned (in the eyes of Google). It will also depend on how spider-friendly your URL structure and internal linking structure are. You can squander some of that crawl equity by presenting a plethora of low-value pages or duplicate content to the spider. Oftentimes this is done inadvertently. Consider the following “spider trap” scenarios, all of which are undesirable from Googlebot’s standpoint: