The 950 Penalty or a New Ranking Theory?
I should preface this by saying everything you read below is pure speculation, but the fact that it’s pure speculation is why this post is so fun and you should read every single word without getting up for more water, checking your email, refreshing your MyBlogLog profile or using the restroom.
If you haven’t noticed, the much-hyped debate over the maybe-real/maybe-not-real Google Minus-950 penalty is heating up and actually getting interesting.
We first started hearing talk about a supposed Minus-950 penalty back in January, when several webmasters complained that pages that had traditionally ranked very well were now being hidden on the last page of the search results. Even odder was that the only sites that seemed to be experiencing the problem were spam sites or high-quality niche sites. Sites that fell in the middle were left unaffected. (Or had just learned that openly complaining is annoying and not a good way to make friends.)
At the time, the discussion in the forum focused on whether this penalty actually existed and why it was only certain pages that were being affected, not the entire site. Three explanations presented were:
- The penalty was a result of over-optimization, most often the use of overly similar anchor text on a page.
- A sign that Google can’t differentiate between scraper sites and the original content producers. For example, Google runs across an instance of duplicate content in its index, doesn’t know which site to punish, and banishes both sites to the last page of the SERP.
- It’s somehow related to Google’s defusing of the George W. Bush Googlebomb, since both acts moved once-high-ranking pages out of a searcher’s view.
As you can see, the "explanations" offered read more like "outright guesses", and no possibility seemed too outrageous. Like most forum debates, the conversation went largely unresolved.
The thread was rekindled today after WebmasterWorld administrator Tedster read through a patent filed (on my birthday) last June by Googler Anna Lynn Patterson. The patent, titled "Detecting spam documents in a phrase based information retrieval system", describes a system in which word-phrase frequency is used to determine whether a page is spam. Some have called it a low-scale version of latent semantic indexing.
The patent’s abstract reads:
"Phrases are identified that predict the presence of other phrases in documents. Documents are the indexed according to their included phrases. A spam document is identified based on the number of related phrases included in a document."
The idea here (I think) is that too many similar or related phrases signal that a page is keyword stuffing rather than providing useful information to the reader. Based on that find, Tedster broke off the original thread and asked: is it a "950 Penalty", or is it phrase-based re-ranking? Basically, is the 950 Penalty real, or is Google re-ranking results due to phrase-based factors?
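To make the patent’s idea a bit more concrete, here’s a toy sketch of what a related-phrase spam check could look like. Everything in it is invented for illustration: the phrase list, the co-occurrence map, and the threshold. This is my reading of the abstract, not Google’s actual implementation.

```python
# Toy sketch of phrase-based spam detection, loosely modeled on the
# patent abstract: flag documents whose count of related phrases is
# far above what honest writing normally produces.
# All phrase data and thresholds below are made up for illustration.

# Hypothetical map: phrase -> phrases it statistically predicts.
RELATED_PHRASES = {
    "cheap flights": {"airline tickets", "last minute deals", "flight booking"},
    "airline tickets": {"cheap flights", "flight booking"},
    "flight booking": {"cheap flights", "airline tickets"},
}

# Invented cutoff: a normal page might trigger a handful of
# related-phrase hits; a stuffed page triggers many more.
SPAM_THRESHOLD = 4

def related_phrase_count(text: str) -> int:
    """Count related-phrase co-occurrences in the document text."""
    text = text.lower()
    present = [p for p in RELATED_PHRASES if p in text]
    hits = 0
    for phrase in present:
        # Each related phrase that also appears counts as one hit.
        hits += sum(1 for rel in RELATED_PHRASES[phrase] if rel in text)
    return hits

def looks_like_spam(text: str) -> bool:
    return related_phrase_count(text) > SPAM_THRESHOLD

page = ("Cheap flights and airline tickets! Book cheap flights now. "
        "Last minute deals on flight booking and airline tickets.")
print(related_phrase_count(page), looks_like_spam(page))  # 7 True
```

In a real system the related-phrase map would presumably be learned from the index itself and the threshold would be statistical, but the shape of the check, too many predicted phrases piling up in one document, is the part that matters here.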
Tedster believes the latter, that this patent is responsible for the "penalty" site owners have been experiencing:
"My gut is telling me that this isn’t really a penalty, it’s an interactive effect of the way the Google dials have been turned in their existing algo components. It’s like getting a poor health symptom in one area of your body from not having enough of some important nutrient — even though you’ve got plenty of others and plenty of good health in many ways."
It’s hard to tell what’s going on, or if anything is going on at all. Tedster backs up his assertion by reporting that he knows of a site where "one solid new inbound link from a very different domain" solved the site owner’s problem. But another member says it took "de-optimizing" his site and lowering the keyword density of things like page titles and body content before his site regained its rankings. So we’ve come full circle.
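As an aside, the "keyword density" that member was lowering is just a simple ratio: occurrences of the keyword divided by total words. Here’s a quick sketch; the example title is made up, and nobody outside Google knows what threshold, if any, it uses.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Fraction of words in `text` accounted for by `keyword` occurrences."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    kw_words = keyword.lower().split()
    n = len(kw_words)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == kw_words)
    # Multi-word keywords consume n words per hit.
    return hits * n / len(words)

# A page title an over-optimizer might use (hypothetical example):
title = "Cheap Flights | Cheap Flights Deals | Book Cheap Flights"
print(f"{keyword_density(title, 'cheap flights'):.0%}")  # 75%
```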
What do you think? Is there really a penalty, or is there a filter that re-ranks results based on a sort of latent semantic indexing? If there is a penalty, is it just the MSSA penalty in disguise or is it legitimate? Is one thing responsible for everything, or is it just easier for people to make up new Google penalties than to accept responsibility for a crappy site?
The conversation is still going on at WMW, so go check it out.