BACK TO BASICS: Duplicate Content
When determining the quality and relevancy of a page, most search engines compare the similarity of the content on that page to other pages on that website as well as other pages in their index. In general, search engines (especially Google) feel that duplicate content would probably not be very valuable for their users. As a result, they often penalize pages and/or sites with entirely or mostly duplicated content, making it difficult for these sites to rank well in their SERPs.
The search engines have tools to compare the similarity of content as well as the similarity of HTML code. Having very similar HTML code is commonly a byproduct of creating a consistent feel from page to page, so it is not usually a problem when comparing pages on one domain. Generally, the search engines compare HTML code to identify networks of duplicate or nearly duplicate sites that have been created to artificially increase link popularity and stack search engine results.
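The exact comparison algorithms search engines use are proprietary, but the basic idea of scoring how similar two pages' text is can be sketched with Python's standard-library difflib. This is an illustration only, not how Google actually does it, and the page text here is made up:

```python
# A minimal sketch of near-duplicate detection between two page bodies.
# Real search engines use far more scalable techniques (shingling,
# fingerprinting); difflib is used here purely for illustration.
from difflib import SequenceMatcher

def similarity(text_a: str, text_b: str) -> float:
    """Return a similarity ratio between 0.0 (no overlap) and 1.0 (identical)."""
    return SequenceMatcher(None, text_a, text_b).ratio()

# Hypothetical page text for the example:
page_a = "Our widgets are the finest widgets available anywhere."
page_b = "Our widgets are the finest widgets sold anywhere."
page_c = "Read our privacy policy before placing an order."

print(round(similarity(page_a, page_b), 2))  # high ratio: near-duplicate
print(round(similarity(page_a, page_c), 2))  # low ratio: unique content
```

A page pair scoring near 1.0 under a comparison like this is the kind of near-duplicate content the engines are trying to filter out.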
Having nearly identical text on multiple pages within one domain can hinder the rankings of the pages with the duplicate content, as well as the rankings of the site as a whole. After all, search engines are in the business of providing useful results to people searching for specific things. How useful can a website be if all of its pages say almost exactly the same thing? Content is king, true, but duplicated content is a sad little king of a sad little hill. Unique content, by contrast, is the absolute monarch, lord of all it surveys.
The most recent updates from Google have focused on eliminating duplicate content from its index. White papers released by Google indicate a very strong effort on its part to trace content back to its source and assign value only to the original document. This has been borne out in the rankings: sites in all categories are finding that amounts of duplicate content that were previously tolerated now cost them rankings. There has even been speculation that directories with overly similar content, such as those that seed their categories with DMOZ listings, have taken a hit. Penalties for duplicate content have gotten worse as well. Before, taking a hit meant that your rankings would suffer; now, the duplicate pages vanish from the index entirely.
So what should you do if you discover that your content is duplicate? There are five possible methods for reducing the similarity of content on your pages:
- Eliminate the duplicate/similar content entirely.
- Rewrite the content to make it more unique.
- IFrame the content so that the actual code resides in another file. The content will still be displayed to your visitors, but the search engines will not see it.
- Make the content an image so that search engines cannot read it.
- Create another page for the content and link to it from the original page.
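The iframe approach (the third method above) can be sketched as follows. The filename is hypothetical, and note that how search engines treat iframe content can change over time:

```html
<!-- Hypothetical example: the duplicated text has been moved into
     boilerplate-spec.html. Visitors still see it rendered in place,
     but it no longer lives in this page's own HTML. -->
<iframe src="boilerplate-spec.html" width="600" height="400">
  Your browser does not support iframes.
</iframe>
```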
Obviously the first and second solutions are the most desirable, as they allow you to avoid a penalty altogether. Better yet would be to write entirely new content to replace the duplicate content, adding value for your users as well as establishing you as an expert in the eyes of the search engines. However, there are times when you can't rewrite or eliminate the content: technical specs, help files, and legally mandated content have to stay the same. It is in these cases that the final three solutions may be implemented.