SMX West 2011: The Spam Police
OMG, it’s packed in here. Matt Cutts, head of Google’s Webspam team, is on this panel. The SMX description says that in this session, some of the top “spam cops” discuss how they uncover new spam tactics, what to do if your site is inadvertently penalized and more.
Moderator: Danny Sullivan, Editor-in-Chief, Search Engine Land
Matt Cutts, Software Engineer, Google Inc.
Sasi Parthasarathy, Program Manager, Bing, Microsoft
Rich Skrenta, CEO, Blekko
Danny says there are a couple of goals for this session. First, understand the true meaning of spam. If you’re asking if what you are doing is spamming Google, chances are you’re not spamming Google. [Everyone is laughing.]
The other goal is to not make anything repetitive across the speakers. Google has a little more time than other panelists and Danny says it’s not because he loves Google more.
First up is Sasi Parthasarathy.
What is spam? Using one or more techniques to inflate a page’s ranking in search engines in a way that adds no value to the user:
- There is page-level and link-level spam.
- Hidden text.
- Hidden links.
- Machine-generated content.
- Redirect spam/cloaking.
- Hijacked sites. Most sites are hacked to sell adult content or products (hint, hint).
- Link farms. Pages with little useful content that all link to each other. Link exchanges are OK if the two sites are related. Unrelated links will be discounted if they are found.
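To make the hidden text and hidden links items above concrete, here is a minimal sketch of how a site owner might scan their own pages for content hidden with inline styles. It is an illustration only, not how Bing or any other engine detects spam; the names `HiddenTextFinder` and `find_hidden_text` and the list of style markers are assumptions made up for this example.

```python
from html.parser import HTMLParser

# Assumed, incomplete list of inline-style values that hide content from visitors.
HIDING_MARKERS = ("display:none", "visibility:hidden")

# Elements with no closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element hidden by an inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: is it hidden?
        self.hidden_text = []  # text fragments found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Normalize the style attribute so "display: none" matches "display:none".
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        hides = any(marker in style for marker in HIDING_MARKERS)
        # Anything nested inside a hidden element is hidden too.
        parent_hidden = bool(self.stack) and self.stack[-1]
        self.stack.append(hides or parent_hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1]:
            self.hidden_text.append(text)


def find_hidden_text(html):
    """Return the text fragments that appear inside style-hidden elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text


page = '<p>Visible</p><div style="display: none"><a href="/x">modern bathrooms</a></div>'
print(find_hidden_text(page))
```

A check like this only catches inline styles. Text hidden through external CSS, matching foreground and background colors, or off-screen positioning would need a rendered view of the page.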
[Wow, he is going really fast. I hope Lisa Barone is getting all this, ‘cause I sure as heck am not.]
Danny says Matt Cutts is very tired because he’s been working all night. He is up next. He is surprisingly chipper for being tired.
- He is showing a slide of a Website. There is hidden text with stuff that says “modern bathrooms” and such. There’s nothing wrong with just saying it on the page. If the bathrooms are modern, just say it. [Cracking up.]
- He is showing another slide with a site that is keyword stuffing with all the different ways to misspell things.
- Machine-generated content. High risk.
- Link-exchange requests. Matt says he still gets them and likens it to someone walking up to a cop and asking where to get good drugs. He is showing an e-mail from someone who has a site with a PageRank of 0; he says he imagines it’s going to stay that way. [Again, everyone laughing.]
- Paid links are typically disapproved of and have low-quality content associated with them. He is reading the really low-quality content on this one site. When you pay $1 a blog post, this is what you get, he says. The links that are approved of are freely given. He says someone recently put it this way: make something so awesome that people want to link to it.
- Hacked sites. Google really tried to work on this in 2010. It can happen to anyone; Al Gore had his site hacked. Be aware of how your link building is going.
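The keyword-stuffing slide mentioned above suggests a simple self-check: measure how much of a page's text is taken up by its single most repeated word. The sketch below is a rough heuristic for reviewing your own copy, not Google's method; the function name `top_term_share` and the `min_length` cutoff are assumptions for illustration.

```python
import re
from collections import Counter


def top_term_share(text, min_length=4):
    """Return (most common word, share of counted words it accounts for).

    Words shorter than min_length are skipped as a crude stand-in for a
    stop-word list (an assumption of this sketch).
    """
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if len(w) >= min_length]
    if not words:
        return None, 0.0
    term, count = Counter(words).most_common(1)[0]
    return term, count / len(words)


sample = "cheap watches cheap watches buy cheap watches online cheap"
print(top_term_share(sample))
```

A high share does not prove stuffing on its own, and what counts as "too high" is a judgment call; the number is only a prompt to reread the page the way a visitor would.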
Panda update: Feedback from users has been extremely positive. He is showing a slide that made him chuckle (below). At 11 a.m., Google announced it is beginning to give access to everyone to ban certain results straight from the Web.
- Engineers write algorithms.
- The manual updates are proactive and reactive (the reactive side takes spam reports; if a report has four times the weight, then it might be looked at).
- The most common request to Google is to tell users when they’ve been penalized. He says they are sending out “parked domain” messages now. Register in Google Webmaster Central to get messages.
- Reconsideration request: If your site is only affected by an algorithm, that’s closed out automatically. That said, the engineers can look through the queue and see which sites are affected by Panda, for example, and then look at ways they might be able to improve the algo. If it’s something else, it may take about a week to be processed.
Rich Skrenta is up.
Yesterday, Blekko blocked 1.1 million sites from its search engine. What is spam? Blekko has a broader definition:
- Quality of content. There are certain things that just shouldn’t be in the index. He is showing a post that uses keywords for a medication that could be very dangerous to readers. Blekko takes a close look at content.
- Disqualified: Payoffs, non-experts (writing medical content, for example), sweatshop labor, too slow, overly aggressive promotion, bad conduct.
- Blekko has the right to refuse service to anyone. They are trying to clean up the Web.
[Well, that was fast. But, good bits.]
Q: Rich, what are the challenges in saying you need to get rid of things not written by experts? Journalists are not experts, so it doesn’t seem like a good definition. Seems like the criteria should be just substantial content.
A: A handful of exceptions. Reporters study their craft and there’s a code they live by. You can make a list of all the top publications. There’s only a handful.
Q: Matt, why did you need the New York Times and Wall Street Journal to tell you about J.C. Penney and Overstock paid links?
A: With Overstock, it had shown up on our radar a few times. We asked them to take corrective action.
Q: Do you recommend big brands hire SEO firms?
A: Matt says he recommends you do awesome things that people want to link to.
A discussion starts.
Rich just asked if all your SEO efforts fell off tomorrow, would people still come to your site? But then Danny says if your search traffic went away, no one would see your site. Rich says it depends on the site.
Rich is now talking about how he’s talked to half the people that have been blocked on his search engine. He tells them that their site is blocked because people hate it, so stop making sites that suck.
Q: Matt, do you have a white list?
A: When you have an algorithm you do your best to make sure it works. There is not a “golden list” of sites that are always OK. There are many algorithms that don’t have manual exceptions, like the Panda update.
Q: Matt, Google sends penalty notices through Webmaster Tools. Why do you not tell some people?
A: You want to help the people who have stumbled off the path a little bit and not the hacker. You want to give the information to the good guys and are always trying to find the balance.