23 January 2009

Dealing with Duplicate Content: A Practical Guide

Thankfully, the Internet is still all about great content. Despite the many new marketing disciplines (SEO, SEM, SMO, SMM, ORM and IM) punting guaranteed search results, search engines continue to serve relevant, content-rich results, which keeps the system honest and ensures their own survival.

The issue of duplicate content mystifies, and even plagues, many Internet marketers. Every Internet marketer fears a drop in their website’s search engine ranking, and duplicate content could result in exactly that, or worse. Being blacklisted or having penalties instituted against you could deal a devastating blow to your search engine marketing efforts and prove very hard to recover from.

Before getting into ways of preventing search engines from perceiving your content as duplicated, it is necessary to understand that there are different types of duplicate content. Some occurrences are unintentional, whilst others are due to malicious and unimaginative “black-hat” marketing tactics.

Google defines duplicate content as: “substantive blocks of content within or across domains that either completely match other content or are appreciably similar.” They also acknowledge that “mostly, this is not deceptive in origin”.

Duplicate content can occur in the following ways:

Having two domains carry the same content. When an organisation decides, for geographical or other search reasons, to use the same website content on two domains without choosing a canonical domain, it can confuse search engine robots, which may list only one version of the site or, in the worst case, neither.

Using dynamic URLs. Database-driven websites, in which content is stored in a database and accessed on demand, may serve the same content under different URLs, depending on the search parameters being used. Search engines can view this as duplicate content.

Circular navigation, in which there are multiple paths to the same page, can also cause content to appear to be duplicated.

Providing printer-friendly pages. Users often require hard copies of the information on your website, so it is sometimes necessary to give them the option to print content. This can be done in one of two ways. You could host a printer-friendly copy (i.e. duplicate the content) or, if you are using a CMS, have the system supply the same content in both web and printer formats. Unfortunately, both ways cause your content to appear to be duplicated.

Providing a mobile-friendly version of your website. With the number of Internet-enabled mobile devices that have hit the market recently, it is becoming essential to offer a very basic version of your website specifically for users browsing your site from a mobile phone. As in the case above, however, you run the risk of duplicating content.

Syndication: When your content is syndicated to other websites, or when your articles and blog posts are submitted to content repositories, search engines are likely to pick up your content as a duplicate and possibly link to the more “popular” version of it.

E-commerce product descriptions: Common e-commerce products often have manufacturer descriptions which are reused when they are retailed online, particularly in the case of affiliates. This too could be viewed as duplicate content and affect a website’s ranking. The same applies in cases where products have multiple categorisations.

Plagiarism is presenting the written intellectual product of another person as your own without their permission. Copying online content is very easy to do and can be a quick fix for websites requiring content in a hurry. Although in most cases search engine crawlers can be relied on to select and serve a link to the originator of the content, this unethical duplication could still cost the originator traffic.

Content scraping refers to the use of a crawler to copy website content, which is then rephrased in an attempt to present it as unique content.
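To make the dynamic URL case above more concrete, here is a minimal Python sketch of how several addresses can collapse to the same page once parameters that don’t change the content are ignored. The example.com URLs and the list of ignored parameters (sessionid, sort, ref) are assumptions made up for illustration; your own site will have its own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change what the page shows.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def normalise_url(url: str) -> str:
    """Return a comparable form of a URL for spotting duplicate pages."""
    parts = urlsplit(url)
    # Keep only the parameters that affect content, in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


urls = [
    "http://www.example.com/products?id=42&sort=price",
    "http://www.example.com/products?sort=name&id=42&sessionid=abc123",
    "http://WWW.example.com/products?id=42",
]

# All three addresses reduce to a single normalised URL.
print({normalise_url(u) for u in urls})
```

If a crawl of your own site shows many addresses collapsing to one normalised URL, those are the pages most likely to be read as duplicates.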

Preventing penalties

Whilst more advanced search engine crawlers do a good job of distinguishing between deliberate duplication and inadvertent instances, you can avoid being wrongfully penalised for duplicate content by doing the following:

Redirects: To ensure that search engines don’t view multiple domains and dynamic URLs as duplicated content, select a canonical domain and use 301 (permanent) redirects to point every alternative URL at it. Reserve 302 (temporary) redirects for pages that have moved only for a short time, as a 302 tells the search engine to keep the original URL in its index.
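As a rough illustration of the canonical domain idea, the Python sketch below decides whether a requested URL should be answered with a 301 redirect. The host names are hypothetical, and in practice this logic usually lives in your web server or CMS configuration rather than in a script.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: www.example.com is the domain chosen as canonical.
CANONICAL_HOST = "www.example.com"
# Hypothetical alternative domains that carry the same content.
ALIAS_HOSTS = {"example.com", "example.co.za", "www.example.co.za"}


def redirect_for(url: str):
    """Return (status, location) for an alias host, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in ALIAS_HOSTS:
        # Same path and query, but on the canonical domain.
        target = urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path, parts.query, "")
        )
        return 301, target
    return None


print(redirect_for("http://example.co.za/about?lang=en"))
print(redirect_for("http://www.example.com/about"))
```

Keeping the path and query string intact means visitors and crawlers land on the matching page rather than on the home page.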

Robots.txt: In the case of printer-friendly or mobile versions of your website, you need to indicate to the search engine that the information contained therein is not for crawling. Use robots.txt to do so. You can also request that sites carrying syndicated copies of your content block them using this method.
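Before relying on a robots.txt file, it is worth checking that it blocks what you intend. The sketch below uses Python’s standard robots.txt parser for this. The /print/ and /mobile/ directories and the example.com address are assumptions for illustration only; substitute the paths your own site uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of the print and mobile copies.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /mobile/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

paths = (
    "/articles/duplicate-content",
    "/print/articles/duplicate-content",
    "/mobile/articles/duplicate-content",
)

for path in paths:
    url = "http://www.example.com" + path
    # True means a crawler obeying robots.txt may fetch the page.
    print(path, parser.can_fetch("*", url))
```

Only the main article should come back as fetchable; the print and mobile copies should not.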

Style sheets: As an alternative to creating a duplicate of your webpage for print purposes, you could create a separate style sheet for print. Provided that your website is well built, this may be a viable alternative to using robots.txt.

Link back to your site: If blocking syndicated content using robots.txt is not an option for you, ensure that your articles contain links to your website.

Categorise and enhance product descriptions: It is really important to categorise products carefully, taking users’ browsing behaviour into consideration, rather than duplicating a single page under various categories. Manufacturers’ product descriptions or reviews should be enhanced by providing additional unique content.

Use a content checker: You can detect duplicate content using online duplicate content software. A pro-active approach is required, so check regularly rather than waiting for your ranking to drop.
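To give a feel for what such a check involves, here is a small Python sketch that scores how similar two pieces of text are by comparing overlapping three-word sequences. This is only one simple technique, chosen for illustration; it is not a description of how any particular online checker or search engine works.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str) -> float:
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "Duplicate content could result in a drop in search engine ranking."
reworded = "Duplicate content could result in lower rankings."
unrelated = "Style sheets can format a page for print."

print(similarity(original, original))
print(similarity(original, reworded))
print(similarity(original, unrelated))
```

An identical copy scores 1.0 and unrelated text scores 0.0, with partial copies falling somewhere in between. The score at which a page counts as a duplicate is a judgment call you would need to make for your own content.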

Claim ownership of your content: In cases in which your content is continuously ripped off and you are convinced that it negatively impacts your search engine ranking, you can file a DMCA infringement request with search engines such as Google, MSN and Yahoo.

Knowledge of duplicate content issues and how to prevent them could be beneficial when optimising your website for better search engine performance. It is important to keep in mind that your website could contain duplicated content and to take steps to ensure that preventative measures are implemented.

Check your site for duplicate content

www.copyscape.com
