Stephan Spencer's Scatterings

The Scattered Wisdom of a scientist turned web marketing virtuoso


Get Inside Google's Head with Maile Ohye

Earlier this summer I had the chance to interview (by phone) Maile Ohye, Senior Support Engineer at Google. Maile's role is to support users and webmasters, and also to implement changes in Google's code based on feedback from users like you. The podcast of my interview went live this week.

Download it now (6 MB, 25 min)

Maile was forthcoming about Google's "new" (i.e. expanded) Webmaster Guidelines. She explained that, in a lot of cases, webmasters don't have a good understanding of what SEO is, and don't know when they are doing something that violates Google's guidelines. Gray/black hat topics like "cloaking" and "doorway pages" are covered in the new guidelines, which is great for SEO shops big and small that may interpret those terms differently.

Maile and I had a geeky discussion about cloaking, session IDs, Flash, progressive enhancement, noscript, sitemaps, rel=nofollow, and paid links. In the end, Maile reaffirmed Google's desire to put users first.

More highlights of my interview here.

Posted by Stephan Spencer on 08/10/2007 | Permalink

Comments (0)| Comments RSS | Filed under: Search Engines cloaking, doorway pages, flash, google, google guidelines, maile ohye, podcasts, sitemaps, webmaster central            

Tricks for viewing cloaked content

There are two types of cloaking: user-agent based and IP based (also known by the euphemism "IP delivery"). Cloakers try to cover their tracks by making it difficult to examine the version meant only for spiders. They do this with a "noarchive" directive embedded in the page's robots meta tag. Googlebot will obey that directive and not archive the page, which then causes the "Cached" link in that page's search listing to disappear.
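As a rough illustration of spotting that directive, here is a minimal Python sketch that scans a page's HTML for "noarchive" in a robots (or googlebot) meta tag. The function and class names are made up for this example, and a "noarchive" tag on its own does not prove cloaking, since plenty of legitimate sites use it too.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noarchive(html):
    """Return True if the HTML carries a 'noarchive' robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives
```

You would feed it the HTML source of the page you are suspicious of, however you obtained it.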

So getting a view behind the curtain to see what is being served to the spider can be a bit tricky. If the type of cloaking is solely user-agent based, you can use the User Agent Switcher extension for Firefox. Just create the following user-agent under Tools > User Agent Switcher > Options > Options > User Agents:

Description: Googlebot
User Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Then switch to that user agent by selecting Googlebot under Tools > User Agent Switcher.
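If you would rather script the same check, the sketch below requests a page twice with Python's standard library, once with the Googlebot user-agent string above and once with a browser-like string, and reports whether the two responses differ. The browser string and the function names are assumptions for illustration. A difference is only a hint: dynamic pages can vary between requests for innocent reasons, and this does nothing against IP delivery.

```python
import urllib.request

GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
# Assumed browser-like string, purely for illustration.
BROWSER_UA = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/2.0"


def build_request(url, user_agent):
    """Build a request that announces itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, timeout=10):
    """Fetch the raw response body for url using the given user agent."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read()


def bodies_differ(body_a, body_b):
    """Crude comparison: True if the two response bodies are not byte-identical."""
    return body_a != body_b


def check_user_agent_cloaking(url):
    """Fetch url as a browser and as Googlebot; True if the responses differ."""
    return bodies_differ(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA))
```

Calling check_user_agent_cloaking on a URL makes two live HTTP requests, so only point it at pages you are entitled to fetch.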

But that won't work if the cloaker is doing IP delivery. If there's no "Cached" link in the SERPs, you might think you're out of luck. But you may not be!

A lot of times, Google's "Translate This Page" functionality can be used to view the cloaked content, because many cloakers don't bother to differentiate between Google fetching a page in order to translate it and Google fetching it in order to crawl it. Either way, the request comes from the same range of Google IP addresses. Thus, a cloaker doing IP delivery will tend to serve up the Googlebot-only version of the page to the Translate tool. This loophole can be plugged, but many cloakers miss it.

And I bet you didn't know that you can actually set the Translation language to English even if the source document is in English! You simply set it in the URL, like so:

http://translate.google.com/translate?hl=en&sl=en&u=URL&sa=X&oi=translate&resnum=9&ct=result

(Above, replace URL with the actual URL of the page you want to view)
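Pasting a raw URL into that template can go wrong if the target URL has its own query string, so here is a small Python sketch that builds the link and percent-encodes the target for you. The function name is made up for this example, and the parameters are simply copied from the URL above; there is no guarantee Google will keep honoring them.

```python
from urllib.parse import urlencode


def translate_view_url(page_url, lang="en"):
    """Build an English-to-English "Translate This Page" link for page_url."""
    params = [
        ("hl", lang),   # same value as in the URL template above
        ("sl", lang),   # source language, set to English as well
        ("u", page_url),
        # The remaining parameters are copied verbatim from the template.
        ("sa", "X"),
        ("oi", "translate"),
        ("resnum", "9"),
        ("ct", "result"),
    ]
    return "http://translate.google.com/translate?" + urlencode(params)
```

For example, translate_view_url("http://example.com/page?id=1") encodes the "?" and "=" in the target so they are not mistaken for parameters of the Translate URL itself.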

That way, when you are reviewing someone's cloaked page, you can see the page in English instead of having to see the page in a foreign language. 

You can also sometimes use this trick to view paid content, i.e. if you're too cheap to pay for content from sites like WebmasterWorld, where that content has been placed behind a registration wall and removed from Google's cache.

Example

Do pay for WebmasterWorld, though. Do right by Brett.

Posted by Stephan Spencer on 02/07/2007 | Permalink

Comments (4)| Comments RSS | Filed under: Search Engines cloaking, ip delivery            

Good cloaking: straight from the search engines' mouths

I'm here at Search Engine Strategies Chicago, and today at the "Meet the Crawlers" session I asked the distinguished panel of representatives from the four major search engines the question:

What is your current official position on simplifying the URLs selectively for bots like Googlebot, Yahoo Slurp, etc. by user-agent detection in order to drop session IDs and other superfluous parameters from the URL? Do you consider it cloaking? And if so, is it good cloaking or bad cloaking?

The panel, which included Ramez Naam from MSN Search, Tim Mayer from Yahoo!, Charles Martin from Google, and Kaushal Kurapati from Ask Jeeves, gave me and the audience their definitive answer. But before they did, Ramez from MSN Search asked for clarification:

Will the same page content display to the user if that user types into their browser the URL that was given to the bot?

I responded with a "Yes," and then all four search engines confirmed individually:

No problem.

Then Charles Martin from Google jumped in again with:

Please do that!

So there you have it. Whether or not you call this technique cloaking, the search engines don't mind it, and in fact encourage it!
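For the curious, here is a minimal Python sketch of the technique I asked about: detect a spider by user agent and hand it a URL with the session ID and other superfluous parameters dropped. The parameter names, the bot tokens, and the function names are all assumptions for illustration; your site's will differ. Per Ramez's clarifying question, this is only safe if the simplified URL shows the same page content to a human who types it in.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; substitute whatever your site actually uses.
SUPERFLUOUS_PARAMS = {"phpsessid", "sessionid", "sid"}
# Illustrative substrings found in some spider user-agent strings.
BOT_TOKENS = ("googlebot", "slurp", "msnbot")


def is_bot(user_agent):
    """Rough user-agent detection: True if a known spider token appears."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def simplify_url(url):
    """Return url with session IDs and other superfluous parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SUPERFLUOUS_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def url_for_visitor(url, user_agent):
    """Spiders get the simplified URL; everyone else gets the URL unchanged."""
    return simplify_url(url) if is_bot(user_agent) else url
```

In practice you would call url_for_visitor wherever your templates write out internal links.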

Posted by Stephan Spencer on 12/08/2005 | Permalink

Comments (7)| Comments RSS | Filed under: Search Engines cloaking, ip delivery