Interview with Google’s Matt Cutts at Pubcon
I had the pleasure of sitting down with Matt Cutts, head of Google’s webspam team, for over a half hour last week at Pubcon. I invite you to download the audio recording (31 minutes, MP3) and peruse the transcript, which follows below…
Stephan Spencer: I am with Matt Cutts here. I am Stephan Spencer, Founder and President of Netconcepts. Matt is Google engineer extraordinaire, head of the Webspam team at Google.
Matt Cutts: [laughing] Having a good time at Google, absolutely.
Stephan Spencer: Yeah. I have some questions here that I would like to ask you, Matt. Let us start with the first one: When one’s articles or product info is syndicated, is it better to have the syndicated copies linked to the original article on the author’s site, or is it just as good if it links to the home page of the author?
Matt Cutts: I would recommend linking to the original article on the author’s site. The reason is: imagine if you have written a good article and it is so nice that you have decided to syndicate it out. Well, there is a slight chance that the syndicated article could get a few links as well, and could get some PageRank. And so, whenever Googlebot or Google’s crawl and indexing system sees two copies of that article, a lot of the time it helps to know which one came first; which one has higher PageRank.
So if the syndicated article has a link to the original source of that article, then it is pretty much guaranteed the original home of that article will always have the higher PageRank, compared to all the syndicated copies. And that just makes it that much easier for us to do duplicate content detection and say: “You know what, this is the original article; this is the good one, so go with that.”
Stephan Spencer: OK great. Thank you.
The way of detecting supplemental pages through site:abc.com and the three asterisks minus some gobbledygook no longer works – that was a loophole which was closed shortly after SMX Advanced and after I mentioned it in my session. Now that it no longer works, is there another way to identify supplemental pages? Is there some sort of way to gauge the health of your site in terms of: “this is main index worthy” versus “nah, this is supplemental”?
Matt Cutts: I think there are one or two sort of undocumented ways, but we do not really talk about them. We are not on a quest to close down every single one that we know of. It is more like: whenever that happens, it is a bug to have our supplemental index treated very differently from the main index.
So we took away the “Supplemental Result” label, because we did not consider it as useful for regular users – and regular users were the ones who were using it. Any feature on Google search result page has to justify itself in terms of click-through or the number of pixels that are used versus the bang for the buck.
And the feedback we were getting from users was that they did not know what it was and did not really care. The supplemental results, which started out as sometimes being a little out of date, have gotten fresher and fresher and fresher. And at least at one data center – hopefully at more in the future – we are already doing those queries on the supplemental result or the supplemental index, for every single query, 100 percent of the time.
So it used to be the case that some small percentage of the time, we would say: oh, this is an arcane query – let’s go and we will do this query even on the supplemental index. And now we are moving to a world where we are basically doing that 100 percent of the time.
As the supplemental results became more and more like the main index, we said: this tag or label is not as useful as it used to be. So, even though there are probably a few ways to do it and we are not actively working to shut those down, we are not actively encouraging people and giving them tips on how to monitor that.
Stephan Spencer: OK.
Next question: what is the status on Google reading textual content within Flash .swf files? Are there improvements to come?
Matt Cutts: It is a good question. I think that we do a pretty good job of reading textual content. Now, stuff within Flash is binary and you can define it in terms of characters and strokes – so you can have things that look like normal text – but that are completely weird and are not really normal text. So it can be difficult to pull the text out of a Flash file. I think we do pretty well.
It used to be the case that we had our own, home-brew code to pull the text out of Flash, but I think that we have moved to the search engine SDK tool that Adobe/Macromedia offers. So, my hunch is that most of the search engines will standardize on using that search engine SDK tool to pull out the text. The easiest way to know whether you have textual content that can be read in a Flash file, is that you could always use that tool yourself and verify as well.
Stephan Spencer: Great tip.
All right, next question: Adobe/Macromedia has the search engine SDK tool, which we have talked about now, but it has not been updated in some time, so is there still usefulness in this tool, as it continues to get older and older, in predicting what .SWF textual content can be read by the Googlebot spider? You guys evolve quite quickly and if the SDK is not keeping up, it kind of loses its utility.
Matt Cutts: Yeah. It is interesting to see Adobe have, in some cases, a renewed emphasis on Flash recently. They recently cut their prices on some multimedia Flash-type servers.
My general answer is that, probably, we will continue to rely on the search engine SDK tool. If you, as a webmaster, feel strongly that Adobe should do more and better, then I would say you could contact those guys and say: “Hey, Adobe, I wish you would continue to update that.” or “I hope you will continue to do iterations.”
My hunch is that we will essentially standardize on that SDK tool and hopefully that will create some incentives for Adobe to keep updating it, and make sure that it is as fresh as possible.
Stephan Spencer: Great.
Next question: will Google utilize the acquired “Riya” technology that determines similarity of image content through analysis of things like color, shape and texture, to assist in identifying Black Hat optimization? (Just to be clear, I don’t think Google bought Riya.) An example of where this would be useful is: if there is a background image behind links that match the color of the image, and make the links appear hidden.
Matt Cutts: It is kind of funny; I am not sure. I do not think we have “Riya” – I think we have “Neven Vision”, but your question still stands, and it is a good one: whether we will use that sort of technology to help with things like black hat text hiding.
The short answer is: we think that relatively simple heuristics, as far as color-matching, work pretty well. Of course, you cannot go with the exact color, because people will monkey around in the RGB space a little bit and try to look a little different in the RGB space – but in perceptual space there is not much difference. However, in practice, the vast majority of hidden text colors are pretty similar.
I certainly have seen some spam where it was blue and noisy with blue text which did not stand out, so users did not notice it very much – but that sort of thing is relatively rare. If somebody is willing to put in the effort to effectively hide text in a very busy or interesting image, then they could almost take that same amount of effort and just make good content.
I think we are, certainly, open to employing those advanced techniques to things like: what is the dominant color of an image, or things like that – but, in practice, it seems like most people have not tried to exploit those particular holes that much.
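To make the color-matching idea concrete, here is a minimal sketch of how a near-match test between text color and background color might look. It is not Google's method, which Matt does not describe in any detail. The function names and the threshold of 30 are assumptions chosen for illustration, and plain Euclidean distance in RGB is only a crude stand-in for a real perceptual color difference.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(first, second):
    """Euclidean distance between two colors in RGB space.

    This is a rough proxy only; a perceptual color space would be a
    closer match to what a human actually sees.
    """
    r1, g1, b1 = hex_to_rgb(first)
    r2, g2, b2 = hex_to_rgb(second)
    return ((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2) ** 0.5


def looks_hidden(text_color, background_color, threshold=30.0):
    """Flag text whose color is nearly the same as its background.

    The threshold is a hypothetical value, not a known cut-off.
    """
    return color_distance(text_color, background_color) <= threshold
```

An exact-match test would miss `#fefefe` text on a `#ffffff` background, whereas a distance threshold catches it, which is the point Matt makes about people nudging values around in RGB space.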
Stephan Spencer: OK.
Next question: are social bookmark links given less weight than other back links – given how easy these services are to manipulate?
Matt Cutts: Typically, our policy is: a link is a link, is a link; wherever that link’s worth is, that is the worth that we give it. Some people ask about links from DMOZ, links from .edu or links from .gov, and they say: “Isn’t there some sort of boost? Isn’t a link better if it comes from a .edu?” The short answer is: no, it is not. It is just that .edu links tend to have higher PageRank, because more people link to .edu’s or .gov’s.
To the best of my knowledge, I do not think we have anything that says social bookmark links are given less weight. Certainly, some sites like del.icio.us and other people, may choose to put individual “nofollows” in and they may choose to take actions to try to prevent spam, but we do not typically say anything like: social bookmarking by itself – give less weight.
Stephan Spencer: OK. So, I guess, a follow on to that would be: a .edu and .gov link, and so forth, has, typically, a more pristine link neighborhood, so it is not just about the PageRank, right? The link neighborhood comes into play.
Matt Cutts: That is a little bit of a “secret sauce” question, so I am not going to go into how much we do trust that sort of stuff.
Stephan Spencer: OK. I am going to slap my wrist now. Ouch, ouch!
Matt Cutts: [laughing]
But, certainly, a link from a .edu or a .gov site tends to have all of those good qualities on its own merits, as opposed to us hard-coding something that says .edu or .gov links are good. When there are good links, .edu links tend to be a little better on average; they tend to have a little higher PageRank, and they do have this sort of characteristic that we would trust a little more. There is nothing in the algorithm itself, though, that says: oh, .edu – give that link more weight.
Stephan Spencer: Yes. Which is what I would expect that SEOs would have already realized.
Matt Cutts: Well, you would be surprised how many are like: “Oh, I have to get .edu links because they are better.” You can have a useless .edu link just like you can have a great .com link.
Stephan Spencer: Yeah. And for those of you who do not believe that, just do a search for “buy viagra” and look at all the .edus that come up, or “viagra site:edu”.
Matt Cutts: [laughing]
Stephan Spencer: Pretty sad.
Next question: given the ever-broadening definition of doorway pages in Google’s Guidelines, would a poorly done site map page now be considered to be a doorway page? A page that is just a list of links with no real hierarchy, very keyword-rich because there are full product names and category names and so forth.
Matt Cutts: Typically, we try to be relatively aware and relatively careful about that, because it is very natural to say: take a list of all of my pages and export that, then turn them into clickable links and now I have a site map.
In fact, if you made a sitemap file, or sitemap file ‘proper’, you would end up with something that you could submit directly to Google. At first glance, that might look keyword-rich or that might look like a doorway page, but we try to be relatively savvy.
A good example is About.com. They have had site maps for a long time. They had even named it “SpiderBites”, which, at first glance looked like: “Hello! You are going for the Google Spider or something” – but whenever you dug into it, it was radically clear that what they were doing, was just normal site map behavior. It was not that they were trying to do any malicious work.
I think our own page algorithms for scoring content do a pretty good job of looking past keyword stuff, and things like that, anyway. It is also the case, that we try to be pretty savvy about that. That said, I think you have got a question you will ask later about how many links exactly you can get on a page? So, we may go into it in more depth then.
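For readers who have not built a Sitemap file before, here is a minimal sketch of the "export a list of all my pages" step Matt describes, written out in the XML format defined by the sitemaps.org protocol. The function name `build_sitemap` and the example URLs are assumptions for illustration; a real file would usually carry optional fields such as last-modified dates as well.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a Sitemap XML string listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/products/",
    ]))
```

The same list of URLs can also feed a human-readable site map page, which is the case the question was really about.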
Stephan Spencer: OK.
Next question: what is excessive in the length of a keyword-rich URL? We have seen clients use keyword URLs that have 10 to 15 words strung together with hyphens; or blogs – we have seen them even longer there. A typical WordPress blog will use the title of the post as the post slug, unless you defined something different and you can just go on and on and on. Can you give any guidelines or recommendations in that regard?
Matt Cutts: Certainly. If you can make your title four or five words long, that is pretty natural. If you have got three, four or five words in your URL, that can be perfectly normal. As it gets a little longer, then it starts to look a little worse. Now, our algorithms typically will just weight those words less and just not give you as much credit.
The thing to be aware of is, ask yourself: “How does this look to a regular user?” – because if, at any time, somebody comes to your page or, maybe, a competitor does a search and finds 15 words all strung together like variants of the same word, then that does look like spam, and they often will send a spam report. Then somebody will go and check that out.
So, I would not make it a big habit of having tons and tons of words stuffed in there, because there are plenty of places on a page, where you can have relevant words and have them be helpful to users – and not have it come across as keyword stuffing.
Stephan Spencer: So, would something like 10 words be a bit much then?
Matt Cutts: It is a little abnormal. I know that when I hit something like that – even a blog post – with 10 words, I raise my eyebrows a little bit and, maybe, read with a little more skepticism. So, if just a regular savvy user has that sort of reaction, then you can imagine how that might look to some competitors and others.
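If you want to audit your own URLs against this rule of thumb, a quick script can count the words in each slug. The sketch below is only an illustration: the function names are made up, and the default of five words comes from Matt's "three, four or five words" remark, not from any documented Google threshold.

```python
import re
from urllib.parse import urlparse


def slug_words(url):
    """Split the last path segment of a URL into its hyphenated words."""
    path = urlparse(url).path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    # Drop a trailing file extension such as ".html" if there is one.
    last_segment = re.sub(r"\.[A-Za-z0-9]+$", "", last_segment)
    return [word for word in re.split(r"[-_]+", last_segment) if word]


def slug_report(url, comfortable_max=5):
    """Report the slug word count and whether it exceeds a chosen limit.

    comfortable_max is a hypothetical limit picked for this example.
    """
    words = slug_words(url)
    return {
        "word_count": len(words),
        "over_comfortable_max": len(words) > comfortable_max,
    }
```

Running `slug_report` over an exported list of post URLs is an easy way to spot the default WordPress slugs that run on and on.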
Stephan Spencer: Yes.
Do you think we are moving towards algorithmic search results having substantially more human validation and/or intervention? There is the project, such as Search Wikia – they seem to be going down that path. What do you think? What does Google think about this?
Matt Cutts: It is a really interesting topic, because when Google started, we had just a few hundred people and the Web was so very large. We had to process tons and tons of pages and tons and tons of languages. We had to have the most capable, robust approach we could.
The only thing that would really work well at that time was algorithms, because computers do not get tired, they can work 24/7, they do not exhibit any bias by themselves. Of course, an algorithm could somehow have some bias baked in when the human wrote it, but the computer itself is perfectly logical when it executes that algorithm.
So, for the longest time, Google pursued that as its first and foremost strategy – to the point where some people think that Google is nothing but algorithms and there is no room for any humans at all. In fact, we tried to be relatively clear that, if someone reports an off-topic spam that is redirecting to porn – everybody wants that gone except for the porn spammer. So, we are ready to take manual action on that.
Going forward, I think it is really interesting to think about the role of humans in search. I have done a post on my blog about that. I think that, if you can use humans but in a scalable and robust way – that is really the key. If you had to have a person construct all the search results for one search, there are so many search results and the long tail is so long, there is no scalable way you could do it.
But, for example, let us suppose you could have some humans figure out a scalable way to find spam, or a scalable way to say whether individual sites are good or bad, then those are the sort of things where it could be on the order where humans could genuinely help you.
I am glad that Wikia exists and that they are going to try this approach that puts a little more emphasis on people, because I think we need to let 1,000 flowers bloom and let lots of different search engines with lots of different philosophies try those ideas. And I think Google is willing to be pragmatic and embrace any approach that might work.
Stephan Spencer: OK.
Initially, it was stated that “nofollowed” links would be followed and crawled, but PageRank would not be passed. But you have recently stated that “nofollow” links are not even used for discovery. First of all, let us confirm: is that the case that they are not even used for discovery, and, if so, why the change?
Matt Cutts: It is interesting. Whenever we talked about it originally, we said PageRank would not be passed, and the messaging that I tried to do was that it would not even be followed and it would not even be crawled. It turned out there was a really weird situation, where, if you had totally unique anchor text that nobody else had, we would not follow that link – but if we had found the page from some other source, we still had this anchor text lying around and we were willing to associate it with that page.
Personally, I think that is almost a bug, because if you ever sign a blog post with a comment and you have some really weird anchor text, then when you search for that text and you find the blog post, your natural conclusion is that these “nofollowed” links do contribute something – whether it is PageRank, anchor text or some sort of vote. Then you immediately get back to people trying to spam blogs and trying to spam all those places that have “nofollowed” links.
I almost view it as, for a short time it was almost like a bug – that some anchor text, in some very strange situations, could flow. We have fixed that.
There was an example, where someone had done “dallas auto repair warranties” and another query, where they thought that “nofollow” had actually passed either anchor text or PageRank. My suggestion would be that people should repeat those experiments, because I do not think that those experiments will hold true now.
In fact, if you look at the Wikipedia pages for “Nofollow” (at http://en.wikipedia.org/wiki/Nofollow), they say – in “reference number eight”, if I remember correctly – something about how these links may still be used in some limited circumstances for this or for that. At least for Google, we have taken a very clear stance that those links are not even used for discovery; they are not used for PageRank; they are not used for anchor text in any way. Anybody can go and do various experiments to verify that.
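Anyone repeating those experiments first needs to know which links on a page actually carry the attribute. Here is a minimal sketch, using only Python's standard library, that separates followed links from nofollowed ones. The class and function names are assumptions for illustration, and this says nothing about how any search engine processes the links afterwards.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hrefs, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. rel="external nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) lists of hrefs found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed
```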
Stephan Spencer: Great.
How concerned are you over the tactic of aggressive link buying to competitors’ sites in order to take down competitors? How long, do you think, it will be until competitors start taking each others’ sites out in Google with aggressive link buying?
Matt Cutts: I do not think a smart competitor will even try that second one because they would be more likely to help. The thing is, we are very aware that site ‘A’ could buy links to site ‘B’, and then spam-report site ‘B’ and try to frame site ‘B’. So we try very hard, in all of our spam techniques, to make it so that one site cannot sabotage another site.
If you will notice, we do not say that it is impossible. The reason we do not say that it is impossible is, if you remember sex.com a few years ago, somebody – if I remember correctly – sent a fax and claimed to be the site owner and grabbed the ownership of sex.com and kept it for a few years, until they were forced by a court to relinquish it.
There is always the ‘far out’, possible case where somebody could do identity theft and grab your domain and hurt your domain that way. So we do not say it is impossible for a competitor to hurt another competitor, but we do try very hard. In fact, you have noticed that, with link buying in particular, we have been concentrating in the last couple of months more on the link selling aspect of that.
The odds that someone can come to us and say: “Oh! Someone hacked my site and sold links on my site for four months, and I had no idea! And, oh, yeah, I did bill it in Google Checkout, but they hacked my Google Checkout account, too! And I am being framed! It is a conspiracy!” – the odds of someone plausibly being able to make that argument are a lot lower. We do try very hard to prevent someone from hurting somebody else, and we are very mindful of that.
Stephan Spencer: Great.
Earlier in the year, Yahoo introduced the <div class="robots-nocontent"> as a way to isolate parts of a page. I have not heard much since then, or if that is still viable, but, in any event, it was received with mixed emotions. Is this something, though, that you guys have given any thought to?
Matt Cutts: We definitely have. Personally, I think it is kind of interesting, because it gives more flexibility to site owners to sculpt how they want to flow PageRank or to change how the page should be indexed. I am always a fan of giving people more flexibility and more tools.
The downside of that – which immediately becomes apparent when I talk to other Googlers and whenever you think about it for a while – is that it is another feature that has to be supported. And I like to joke that the half-life of code at Google is about six months. You could write some code, come back six months later, and about half of it would be on some new infrastructure or be stale and so on.
We are constantly working on improving our infrastructure and our architecture. To have another feature to support, it has to be something that is compelling, that a lot of people use. So, what we did is we said: “OK. Let’s wait four or six weeks, and see how many people on the web are really using this particular feature.”
I made a deal with another Googler and said: “OK. If a lot of people use it, then maybe we will be more likely to support it.” If I remember correctly, fewer than 500 domains had used this tag at all. And in the grand scheme of things, where there are literally hundreds of millions of domains and tens of millions of very active domains, it is not the case that 500 sites is a very large amount.
My guess is, we would be more likely to spend our resources on other stuff, at least right now. We are open to the idea, but we have not heard a lot of people really, really asking for it.
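To show what supporting such an attribute would involve, here is a rough sketch of a text extractor that skips anything nested inside an element marked with class="robots-nocontent". It is an illustration of the idea only, not Yahoo's or Google's implementation. The class and function names are assumptions, and the depth counting assumes reasonably well-formed markup with matching close tags.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class IndexableText(HTMLParser):
    """Collect text, ignoring regions marked class="robots-nocontent"."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside an excluded region
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or "robots-nocontent" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and self.skip_depth:
            self.skip_depth -= 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> neither open nor close a region.
        pass

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text chunks that sit outside any excluded region."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Even this toy version has to track nesting and void elements, which gives a feel for why every extra feature carries a maintenance cost.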
Stephan Spencer: OK.
Google recommends having “no more than 100 links per page, for good usability” – and it is good usability. Pages with a much larger number of links may be considered to be edging into doorway page status. So the guideline, for our listeners, is: “Create pages with good usability, intended for end users and not for search bots.”
However, DHTML allows people to create really great, usable pages with far larger amounts of links on the page, and allows those links to be crawlable. Users could click the “category” link to expand menus for links to sub-pages, for instance. Could we assume that, if the page is nicely usable, it might be OK to do far more, perhaps, than the 100 links per page guideline? What is the new cut-off number, or a new guideline, in this age of DHTML?
Matt Cutts: I would recommend that people run experiments, because, if you have 5,000 links on a page, the odds that we would flow PageRank through those is kind of low. We might say at some point: that is just way too many thousands of links. And at some point, your fan-out is so high that the individual PageRank going out on each one of those links is pretty tiny.
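The fan-out point is easy to see with a little arithmetic. The sketch below uses the simplified formula from the original published PageRank paper, in which a page divides its score evenly among its outgoing links and a damping factor scales the result. The function name and the damping value of 0.85 are assumptions for illustration, and nothing here reflects how Google's production systems weight links.

```python
def share_per_link(page_rank, outgoing_links, damping=0.85):
    """PageRank passed along each link in the simplified textbook model.

    Each outgoing link receives damping * page_rank / outgoing_links.
    """
    if outgoing_links <= 0:
        raise ValueError("a page needs at least one outgoing link")
    return damping * page_rank / outgoing_links


if __name__ == "__main__":
    for count in (10, 100, 5000):
        print(count, share_per_link(1.0, count))
```

With the same starting score, each link on a 5,000-link page passes one fiftieth of what each link on a 100-link page passes in this model.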
I will give you a little bit of background – and I encourage people to run experiments and find what works for them. The reason for the 100 links per page guideline is that we used to crawl only about the first 101 kilobytes of a page. If somebody had a lot more than a hundred links, then it was a little more likely that, after we truncated the page at about 100 kilobytes, some of the links would not be followed or would not be counted. Nowadays, I forget exactly how much we crawl and index and save, but I think we are willing to save at least half a megabyte from each page.
So, if you look at the guidelines, we have two sets of guidelines on one page. We have: quality guidelines which are essentially spam and how to avoid spam; and we have technical guidelines. The technical guidelines are more like best practices. So, the 100 links is more like a ‘best practice’ suggestion, because if you keep it under 100, you are guaranteed never to get truncated.
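As a way to run the kind of experiment Matt suggests, here is a minimal sketch that counts how many links fall inside the first chunk of a page versus the page as a whole. The names are made up for this example, and the default cut-off of 100 kilobytes simply mirrors the historical figure mentioned above; it is not a current crawl limit.

```python
from html.parser import HTMLParser


class AnchorCounter(HTMLParser):
    """Count <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.count += 1


def count_links(html):
    """Count complete link tags; a tag cut off at the end is not counted."""
    parser = AnchorCounter()
    parser.feed(html)
    return parser.count


def links_before_cutoff(html, cutoff_bytes=100 * 1024):
    """Return (links within the first cutoff_bytes, links on the whole page).

    cutoff_bytes is a hypothetical value taken from the interview text.
    """
    raw = html.encode("utf-8")
    truncated = raw[:cutoff_bytes].decode("utf-8", errors="ignore")
    return count_links(truncated), count_links(html)
```

If the two numbers differ for one of your pages, some of its links sit beyond the cut-off you chose to test.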
So, certainly, I do think it is possible to have more links, especially with DHTML – that was once an issue. But, people should always bear in mind to pull in a regular user off the street and have them take a look at it. If you have got so many links and they are so in a particular spammy nature or whatever, that it looks spammy to that regular person, then you want to think about breaking it down. There are a lot of ways you can break it down: you can go by category; you can go chronologically; you can have different topics. If it feels like you got too many, you can definitely break it into a lot of sub-categories.
Stephan Spencer: Right.
Next question, also regarding that 100 links per page recommendation: Is the higher number of links on say, category pages or sub-category pages, more permissible than on product or static content pages because the latter would appear more spammy?
Matt Cutts: I think you would want to apply the common sense approach. So, let us talk about a newspaper, for example. A newspaper might have written thousands of articles, and so, to have all of that linked to from one page would be probably a bit much – even for users.
Suppose the newspaper decides to break it down chronologically. They will have: “all the articles we wrote in 2007”. Then you click on that, and maybe that is still like 2,000 links, which is a little high.
So then they might break it down to: “all the stories we wrote in January, 2007; February, 2007; March, 2007.” You go through and, suppose there are 120 or 200 links – that is more than our 100 link guideline that we give on the technical side, but the user who has gotten there, really understands why. They would say: “Oh, well, I wanted this story from March 2000. I clicked on 2000 and I clicked on March. I knew the story was on March 14, and here is my story.”
There definitely can be situations like that, where you have a larger number of links on the sub-pages, the sub-categories or the categories, but because that is the most logical way to break it down, it can make perfect sense for users and, therefore, perfect sense for search engines.
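The newspaper example maps neatly onto a small grouping routine. Here is a minimal sketch that buckets articles by year and month so that each archive page lists only one month of stories. The function name and the (title, date) input format are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def group_by_month(articles):
    """Group (title, published_date) pairs into {(year, month): [titles]}.

    Keys are returned in chronological order; titles keep their input order.
    """
    archive = defaultdict(list)
    for title, published in articles:
        archive[(published.year, published.month)].append(title)
    return dict(sorted(archive.items()))


if __name__ == "__main__":
    stories = [
        ("City budget passes", date(2007, 3, 14)),
        ("New year, new mayor", date(2007, 1, 2)),
        ("Spring floods", date(2007, 3, 1)),
    ]
    for (year, month), titles in group_by_month(stories).items():
        print(year, month, len(titles), "links")
```

Printing the number of links per bucket shows at a glance which months would still produce an overly long page.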
Stephan Spencer: Good suggestion.
Will the reasons for a site’s PageRank reduction ever be disclosed to webmasters in Webmaster Central?
Matt Cutts: I am open to that. It is kind of funny because first and foremost, we have to care about fixing any problems we see, trying to make sure that we have the most clean index that we can. So, malware is a good example of that. First and foremost, we did not want to return malware to users. So, we started out just by removing sites that had malware even if they were hacked.
We tried to take the hacked sites out for a short period of time, but we did not have the resources to contact all of those people and to work one-on-one to help them get the malware removed from their site. Then, over time, we got better about messaging. We would show that a site was removed for malware, and then we had a process that was a partnership with StopBadware where it would take up to 10 days.
At the time, people were like: “Ten days for my site to get back! That causes me a lot of stress and a lot of pain!” – but compared to the stress and pain of a user who got malware from your site, we have to balance those.
We have continued to iterate. We have gotten better and better. So now, you can more or less get your site malware re-reviewed in about 24 hours, and we have just recently started to show messages to webmasters in our message center in Webmaster Central to say: “Yes, your site has some malware.” If I remember correctly, we even show a few example URLs to say: “Yes, here is where to look to find the malware.”
That shows this gradual progression where, first and foremost, we have to take care of spam, the viruses, the malware or the trojan. Then, over time, we polish off those rough edges and we try to provide better messaging and better alerts to help the webmasters as well. I could certainly imagine that over time, we could tell a webmaster: “Yes, we uncovered links that looked like they were certainly sold, so that played a factor in Google losing a little more trust in your website.” I am certainly open to doing that.
You also have to think about whether a site can be pulled toward white hat or not. Clearly, if somebody is a malicious spammer and they are just trying to do awful, awful things, you do not want to give them a heads-up that they have been caught. So, if we have seen someone that we think is deliberately abusive and really spammy and really savvy and they know what they are doing, then they might not expect to get a heads-up in our “Webmaster Console”.
But if someone is a relatively new webmaster, a small Mom-and-Pop business, maybe we think they did not know any better, then it is a little more likely that we might try to give them some message to say: “This is an issue. It is a violation of our guidelines. It is a violation of every search engine’s guidelines. Here is where you can read more about it. If you can correct this issue, then here is where to go and request a reconsideration.”
Stephan Spencer: Right, because if somebody is a white hat and has a history of being a white hat, certainly they deserve to be given a heads-up, whereas you do not want to define that line for a black hat spammer.
Matt Cutts: Yes. You do not want to clue in the bad guys but you want all the people who are on the fence or who are right towards the white hat edge, you want to keep pulling them into that white hat direction.
Stephan Spencer: Yes.
My last question here: what RSS feeds do you subscribe to?
Matt Cutts: [laughs] A better question is what RSS feeds I do not subscribe to.
Stephan Spencer: Perhaps you can just supply us with your OPML file?
Matt Cutts: [laughs] You know, I have thought about that. At various times, I have done screenshots so that people could get a sense of the sort of things I read. It is funny because I have it broken down into general search, white hat, and black hat. I try to keep the black hat folder closed so people do not feel bad. You know: “Oh, no! I am in Matt’s black hat folder!” Although there are a few people in there.
Stephan Spencer: Some would feel good. [laughter]
Matt Cutts: Yes, maybe they would be honored, who knows, but I do not want to give them the glory in that case. But, yes, certainly sites like Search Engine Land, Google Blogoscoped, Google Operating System, you know. Those are fantastic to just get first line news. Then, there are things like Search Engine Journal and all those sort of guys where you can get a lot more follow-on news or thoughtful commentary afterwards.
There are a lot of really good feeds that I read. I read about 70 or maybe even 100 in the search space. A few that I read that are not search – there are only five or 10, but XKCD is a Web comic that is really pretty funny, that is very Web savvy. I found a feed on Flickr for their “Photos of the Day” which is just a nice way to start your day.
There is a neat site called One Sentence and the idea is that you have to tell an entire story in one sentence. You know, it is very compelling stuff. So, it is about 10 sentences a day, about 10 posts, and that is really a fun site as well.
Stephan Spencer: Cool. All right. Well, thanks very much for your time, Matt.
Matt Cutts: Yes, good talking to you.
Vote for Your Favorite SEO Article!
I am very honored and humbled to see that two of my articles were nominated for a SEMMY (The Year’s Best Posts in Search Engine Marketing).
In the SEO category for the 2008 SEMMY Awards, my article on Search Engine Land, “Sculpting Your PageRank for Maximum SEO Impact,” was nominated.
Vote here for the year’s best in SEO articles.
Also, in the category of Web Analytics, my June article from Practical eCommerce, “SEO Metrics that Matter,” was nominated.
Vote here for the year’s best in Web Analytics articles.
My colleague, Chris Smith, was nominated in the Local Search category for his recent Search Engine Land article, “Anatomy and Optimization of a Local Business Profile.”
Vote here for the year’s best in Local Search.
Good luck to everyone who was nominated! Voting ends on January 30th, so be sure to cast your vote for your favorites today!
Deconstructing Google snippets
Most SEOs think the path to better snippets is writing a compelling, keyword-rich meta description tag. But that’s only part of it.
Meta descriptions aren’t going to help your rankings, but it’s worthwhile spending time on them because they can — and often do — make their way into the snippet.
But did you know that Google snippets can contain meta description copy and page content? Yep. Here’s an example. Do a Google search for “lord of the rings downloads” and you’ll get the following search listing of the lordoftherings.net home page at #2:
[Screenshot: Google search listing for the lordoftherings.net home page]
The first part of the snippet, the part preceding the first ellipsis (“The Lord of the Rings Movie: Return of the King tickets.”), comes from the meta description. The rest of the snippet (“shelob, and dark moria here. lord of the rings fellowship of the ring trailer movie downloads”) comes from content in the page’s HTML.
It was funny to see where this page content (which makes up the second part of the snippet) comes from… as it’s not visible to humans on the page; it’s actually keyword-stuffed hidden text embedded within a noscript tag. (Also note the mile-long meta keywords tag. Who woulda thunk… Peter Jackson is a black-hat!)
When I was in their HTML source, I also noticed that LOTR’s meta description contains a keyword list, which I’m not a proponent of. A meta description tag containing repeated keywords or long strings of keywords separated by commas does not make for a compelling snippet. This is LOTR’s home page meta description:
The Lord of the Rings Movie: Return of the King tickets. Official LOTR New Movies Site. Listings, Showtimes, Trailer, Pictures, Wallpaper, Swords, Pics, Film Exclusives, Characters, Screensaver, Desktop Theme, Art, Downloads and News.
Another tip… don’t use the same meta description across all your pages. That makes for a lot of similar-looking snippets, which could potentially trigger Google’s duplicate content filter.
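If you want to check your own site for this, here is a minimal sketch in Python of one way to go about it. It assumes you have already fetched each page’s HTML into a dictionary keyed by URL; the names `MetaDescriptionParser` and `find_duplicate_descriptions` are my own illustrative choices, not part of any Google tool, and the case and whitespace normalization is just one reasonable guess at what counts as “the same” description.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Only the first description tag on the page is recorded.
        if tag != "meta" or self.description is not None:
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "description":
            self.description = attr_map.get("content", "").strip()


def find_duplicate_descriptions(pages):
    """Map each meta description shared by 2+ pages to the URLs using it.

    `pages` is a dict of {url: html_source} (hypothetical input format).
    """
    urls_by_description = defaultdict(list)
    for url, html in pages.items():
        parser = MetaDescriptionParser()
        parser.feed(html)
        if parser.description:
            # Normalize case and whitespace so trivial variations still match.
            key = " ".join(parser.description.lower().split())
            urls_by_description[key].append(url)
    return {d: urls for d, urls in urls_by_description.items() if len(urls) > 1}


if __name__ == "__main__":
    # Made-up sample pages, purely for illustration.
    sample_pages = {
        "/": '<head><meta name="description" content="Official movie site."></head>',
        "/trailers": '<head><meta name="description" content="Official movie site."></head>',
        "/cast": '<head><meta name="description" content="Meet the cast."></head>',
    }
    for description, urls in find_duplicate_descriptions(sample_pages).items():
        print(f"{len(urls)} pages share: {description!r} -> {urls}")
```

Any description that shows up in the result is a candidate for a rewrite so that each page gets its own snippet-worthy copy.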
My Interview with Chris Alan, SEO Manager at Expedia.com
Late last year I had the pleasure of interviewing Chris Alan, a true SEO veteran and head of SEO over at Expedia.com. Chris was one of my fellow speakers at the AMA Hot Topic: Search Engine Marketing conferences I chaired in Seattle and Boston last fall.
Now this interview is finally available as a half-hour-long audio podcast for your listening enjoyment, as well as an abridged transcript for you speed readers who prefer the written word over the spoken.
Chris and I had a fascinating discussion about the unique challenges of search engine optimizing huge sites. One of the topics we covered was landing pages for SEO and how they differ from PPC landing pages. Chris explains that, unlike with paid search campaigns, in SEO you can’t just switch out landing pages very easily — at least not without some powerful technology. Furthermore, any changes to improve conversion on the SEO landing page can negatively impact the page’s organic rankings, thus making it harder to pinpoint what’s making the page less effective. What’s needed is a solution that allows you to make changes to the page yet still maintain your rankings.
Chris understands the value of empirical testing. SEO is an experimental science. You can’t just blindly implement SEO tactics prescribed by SEO experts on their blogs and in the forums. You need to test them for yourself. This is harder than it sounds when you’re dealing with millions of indexed pages. That’s why Expedia has a suite of very sophisticated tools at its disposal, which Chris could only briefly allude to.
Hope you enjoy this podcast!
Social media optimization tips from Neil Patel
I recently had the pleasure of interviewing Neil Patel, a leading practitioner in social media optimization, by phone and by email. Social media optimization is the new art of wielding tools, strategies and influence for the purpose of gaining visibility on social media networks and sites like Digg, del.icio.us, reddit, NewsVine, Netscape.com, MySpace and even Wikipedia.
There was a great Wall Street Journal article in February talking about social media and the top influencers. Neil was featured as one of the top influencers on Digg.com.
When Malcolm Gladwell, author of The Tipping Point (a book I highly recommend, btw!), wrote about “connectors” and the power that they wielded to influence large populations of people — to infect them with new ideas, fashions, fads and so forth — I really think of people like Neil as the online equivalent. When Neil submits something on Digg, it can yield 20,000-30,000 visitors and cause the featured story’s web server to crash!
Making it on to the Digg.com home page is a laudable goal for social media optimization but, as Neil points out, it is not always appropriate or feasible. Digg users are alpha geeks. They are not going to be terribly receptive to articles about home decor or feng shui.
StumbleUpon is another great social media network worth targeting. For those who are unfamiliar with StumbleUpon, it is like channel surfing — but on the Web. There is a plugin that you install on your web browser that provides a button that you can press to channel surf. As part of the installation process, you select which topics you are interested in. Then, when you hit the StumbleUpon button, you are taken to websites which are given a “thumbs up” by other StumbleUpon users and which are in your areas of interest. I’ve found some really neat websites just by “stumbling upon” them.
Each social media network has its own quirks and nuances and politics. Getting high visibility on reddit requires a very different submitter profile, story, topic and so forth than Digg. Getting visibility in Wikipedia is a real quagmire. StumbleUpon is certainly more straightforward than Wikipedia but it has its own quirks and tricks.
Have a listen to my 15-minute podcast interview with Neil, and also check out the text interview (conducted separately by email), which is in Netconcepts’ Cool Friends library of interviews.
Neil will be speaking at the American Marketing Association’s Hot Topic: Search Engine Marketing conferences in San Francisco on April 20, NYC on May 25, and Chicago on June 22. I highly recommend attending. I’m chairing the conferences, so I’ll be there too!
Interview with Google’s Vanessa Fox
I had the distinct pleasure of spending an hour on the phone with Vanessa Fox, Product Manager of Google Webmaster Central, interviewing her just over a week ago. Our discussions ran the gamut of SEO issues — redirects, duplicate content, AJAX, Flash, PageRank, and of course, the wealth of tools and reports that Google has made available in their Webmaster Central.
The interview has been edited down to 40 minutes and is now available for download:
Download / Listen to the interview » (MP3, 9 megs)
Vanessa will be speaking at two of the upcoming American Marketing Association one-day conferences, titled Hot Topic: Search Engine Marketing. Her colleague Amanda Camp will be speaking at the other one.
I will be conducting interviews of all the illustrious faculty of search marketers over the coming weeks, so be sure to subscribe to the RSS feed to get these podcasts delivered directly to you automatically as they are published.
Subscribe to the RSS podcast feed »
These conferences present a unique opportunity to hear (in a small, intimate environment with dozens of delegates instead of hundreds) the latest tips, tricks, tools, trends and best practices from Googlers Vanessa Fox or Amanda Camp, along with search marketing practitioners and gurus Eric Ward, Neil Patel, Alan Rimm-Kaufman, Chris Smith (SuperPages.com), and Paul O’Brien (HP), to name a few.
Oh, and I’ll be speaking too, as well as chairing the events.
Mark your calendars: April 20 in San Francisco, May 25 in NYC, and June 22 in Chicago.
Hope to see you there!
Spend 30 minutes with Google engineer Amanda Camp
Want to spend 30 minutes discussing SEO with Google software engineer Amanda Camp? Well now you can! Errr, to be more precise, you can spend 30 minutes listening to ME discuss SEO with Amanda.
I just posted my interview with Amanda (33 minute MP3, 8 megs), so be sure to have a listen!
See Amanda live this Friday in San Francisco at the Hot Topic: Search Engine Marketing conference being put on by the American Marketing Association.
Insight from search marketer Paul O’Brien of HP
Paul O’Brien is a brilliant search marketer from Hewlett-Packard. I had the pleasure of interviewing him a little while ago. (He has a new baby, btw, so be sure to congratulate him if you talk to him!)
Listen to the audio here. (MP3, 40 minutes, 10 megs)
Avoiding the landmines with Google quality scores and other paid search gotchas (podcast)
In this hour-long podcast, my conversation with Alan Rimm-Kaufman of paid search agency Rimm-Kaufman Group covers topics of paid search, natural search, books, economics and incentives. We dig deep into paid search, and Alan shares some real gems — from quality score gotchas to daily caps to metrics. It was a fascinating discussion.
This is the latest in a series of podcasts for the American Marketing Association’s “Hot Topic: Search Engine Marketing” conference. Both Alan and I will be speaking for the AMA this Friday in NYC and next month (June 22nd) in Chicago. (There’s still time to register for either one, btw.)
Download / Listen to the interview (MP3, 55 minutes, 13 megs)
Search marketing successes and learnings at Wards.com
First off, I must apologize for the radio silence for a number of weeks. Over half the month is gone already and I haven’t blogged once! Very naughty! I had a good excuse though: I was super-busy with my move from New Zealand back home to the States. (Yes, that’s right, back in beautiful Madison, Wisconsin, in time to enjoy a second summer this year!) Now that the move’s done, and all my conference hopping (SMX, Internet Retailer, SES Toronto, and various AMA speaking engagements, the last of which is this Friday, btw) is nearly at an end (at least for a few weeks), I can take a breath and get back into blogging. Back to the point of this post…
Last month I had the pleasure of interviewing Marylynne Tosyali, who is the Director of Online Marketing at Direct Marketing Services Inc., the company behind Wards.com, HomeVisions.com, and several other online stores. We talked about their foray into blogging, about GravityStream (Disclosure: yes, they happen to be a client of ours), about paid search successes, and a bunch of other search-related issues. It was a great interview, and I’m pleased to share with you the 45-minute audio recording…
Download / Listen to the interview (MP3, 44 minutes, 10 megs)
This is the latest in a series of podcasts for the American Marketing Association’s “Hot Topic: Search Engine Marketing” conference in Chicago.
Marylynne will be speaking at the conference this Friday (June 22nd). I’ll be speaking there too, btw!
There’s still time to register. It’s going to be a great turnout. Hope to see you there!




