Since the early days of search engine optimization, content duplication has been a recognized issue with search engines like Google, Yahoo! and Bing. The task before every webmaster is to make sure all pages get indexed without being affected by duplication issues. A duplicate content penalty can be really damaging, as it often leads to the removal of your web pages, and in extreme cases search engines may ban your entire site from their index. This not only reduces your site’s traffic but also mars your site’s overall online campaign prospects. For any online business, this is the last thing you want to happen.
In this post I will discuss the two common types of duplication issues and how they can be fixed:
- Internal content duplication
- External content duplication
Internal Content Duplication
As the name implies, this type of duplication is concerned with repeated content across your own web pages. You should make sure not to repeat the same content within your site. If this does occur, search engines will usually choose the most relevant page and skip the other similar pages. Here are some ways your site may have internal content duplication without you knowing about it, and how to fix them.
WWW and Non WWW version
Look at these URLs:
As humans, we all know that the above three URLs point to the same page. But search engine spiders treat them as three different web pages.
Solution: To avoid such sticky situations, first decide which URL you want to rank in search engines, then set up ‘301 redirects’ from all the others to the chosen one. Nowadays, search engines are getting smart enough to recognize these as the same page. Still, the redirection is worth doing to keep the website consistent. It is simply good website practice!
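To make the redirect rule concrete, here is a minimal sketch in Python of the decision a server would make. It assumes www.example.com is the version you chose to rank; the names PREFERRED_HOST, host_variants and redirect_for are made up for illustration, and in practice this rule usually lives in your web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: this is the host you decided should rank in search engines.
PREFERRED_HOST = "www.example.com"


def host_variants(preferred_host):
    """Return the www and non-www spellings of the preferred host."""
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    return {bare, "www." + bare}


def redirect_for(url, preferred_host=PREFERRED_HOST):
    """Return (301, target_url) if the URL uses the non-preferred host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # note: hostname drops any port
    if host == preferred_host or host not in host_variants(preferred_host):
        return None  # already on the chosen host, or not our site at all
    target = urlunsplit(
        (parts.scheme, preferred_host, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target
```

With this rule, a request for http://example.com/about is answered with a 301 to http://www.example.com/about, while requests already on the preferred host are served as usual.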
Pages that contain session IDs
Websites such as retail sites can end up with several URLs for each page when a session ID is added to the address, making it harder for search engines to find the most relevant page. This can easily lead to duplication issues.
Solution: Google, Yahoo! and Bing came up with a solution for this issue by supporting the “canonical” value for the rel attribute of the link element [rel=”canonical”]. You can insert this tag in the head section of every duplicate page, pointing to your preferred location, so that search engines skip the duplicate URLs and index the one given as the preferred location.
Example: <link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
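If your pages are generated by a script, the canonical tag can be built automatically. The sketch below is one possible approach, not a standard recipe: it strips session parameters from the URL and prints the tag for the cleaned address. The parameter names in SESSION_PARAMS are assumptions, so replace them with whatever your platform actually uses.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters that only carry a session ID on your site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def canonical_url(url):
    """Return the URL without session parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the link tag to place in the head section of the page."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_url(url), quote=True)


print(canonical_link_tag(
    "http://www.example.com/product.php?item=swedish-fish&sid=abc123"
))
```

Every session-specific variant of the product page then carries the same canonical tag, so search engines are pointed to a single preferred URL.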
Printer Friendly Pages
You may have printer friendly versions of your article or blog pages, which can be a really nice feature for your visitors. But this nice feature can land you in trouble with search engines. The reason is that the printer page often has a different URL but the same content as the original page. Search engines see two different URLs with the same content and regard it as content duplication.
Solution: Block the printer friendly URLs in your robots.txt file to keep them from being listed in search engines.
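Before relying on a robots.txt rule, it helps to check that it blocks what you expect. The sketch below uses Python’s standard urllib.robotparser module for that. The /print/ folder is only an assumption about where printer friendly pages live, and is_crawlable is a made-up helper name, so adjust both to your own site.

```python
from urllib.robotparser import RobotFileParser

# Assumption: printer friendly pages live under /print/ on your site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules let user_agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The printer friendly copy is blocked, the original article is not.
print(is_crawlable(ROBOTS_TXT, "http://www.example.com/print/my-article"))
print(is_crawlable(ROBOTS_TXT, "http://www.example.com/articles/my-article"))
```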
Title & Description
It is good practice to write a unique title and description for each page that relate to the content on that page. Many webmasters make the mistake of copying and pasting the same meta tags across all pages. This counts as duplicated content, and it wastes the opportunity to get full weight for a unique summary of the page you are on.
Solution: Double check that the meta tag content on your website is not duplicated. If it is, write relevant titles and descriptions and make them unique for every page.
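On a larger site, this double check is easier with a small script. The following is a rough sketch that assumes you have already collected each page’s title and description into a dictionary (for example from a crawl or a CMS export); the find_duplicates name and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_duplicates(pages, field):
    """Group page URLs that share the same title or description.

    pages maps each URL to a dict such as {"title": ..., "description": ...}.
    Values are compared case-insensitively with whitespace collapsed.
    """
    groups = defaultdict(list)
    for url, meta in pages.items():
        value = " ".join(meta.get(field, "").split()).lower()
        groups[value].append(url)
    # Keep only non-empty values that appear on more than one page.
    return {value: urls for value, urls in groups.items() if value and len(urls) > 1}


# Hypothetical sample data.
pages = {
    "/": {"title": "Example Store", "description": "Sweets shipped worldwide."},
    "/candy": {"title": "Example Store", "description": "Browse our candy range."},
    "/contact": {"title": "Contact Us", "description": "Sweets shipped  worldwide."},
}

print(find_duplicates(pages, "title"))
print(find_duplicates(pages, "description"))
```

Any URL that shows up in the result shares its title or description with another page and is a candidate for a rewrite.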
External Content Duplication
Don’t forget to spice up your site with unique content. Unique content means content that is exclusive to your site and not copied from other websites. Otherwise, this will be a serious issue that can lead to the removal of your website from search engine indexes.
Duplication via ccTLDs
It’s a nice idea to use ccTLDs (Country Code Top Level Domains) to take your business global online. Example: http://www.example.com, http://www.example.ca, http://www.example.in, etc. If we have the same content on all the domains, it is definitely a case of content duplication.
Solution: IP redirection (geo redirect) has to be done in this case. Once this redirection is in place, a visitor from Canada who opens the .com domain is automatically forwarded to the .ca domain, and a visitor from India who opens the .com domain is automatically taken to the .in domain. This is the right way of handling ccTLD domains. Alternatively, a domain level forwarder can be set up from each local domain to the .com. This would be my recommendation for ease of use.
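As an illustration of the geo redirect logic, here is a minimal sketch. It assumes the visitor’s country code has already been looked up from the IP address by some other tool (that lookup is not shown), and the COUNTRY_DOMAINS table and geo_redirect name are hypothetical.

```python
# Hypothetical mapping from country code to the local domain.
COUNTRY_DOMAINS = {"CA": "www.example.ca", "IN": "www.example.in"}
GLOBAL_DOMAIN = "www.example.com"


def geo_redirect(country_code, requested_host, path="/"):
    """Return the local URL to forward to, or None to serve the page as is."""
    if requested_host.lower() != GLOBAL_DOMAIN:
        return None  # only visitors arriving on the .com are forwarded
    local_domain = COUNTRY_DOMAINS.get(country_code.upper())
    if local_domain is None:
        return None  # no local domain for this country, stay on the .com
    return "http://" + local_domain + path


print(geo_redirect("CA", "www.example.com", "/pricing"))
print(geo_redirect("US", "www.example.com", "/pricing"))
```

A visitor from Canada opening the .com is sent to the same path on the .ca domain, while visitors from countries without a local domain stay where they are.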
Copying Content from other websites
This carries the maximum penalty from search engines and may lead to the removal of your website from their indexes. DO NOT COPY content from any website, period; be sure to have unique content instead. If you notice that someone has copied content from your website, be sure to file a DMCA complaint against them with Google. Remember that you will be on the receiving end if you have indulged in copying content. Apart from acting on complaints from others, Google also carries out manual reviews, which increase the chances of being caught.
Solution: If you feel you are the victim of content plagiarism, make sure to report it. To ensure the content you use is unique, we also urge you to use a duplicate content checker. Copyscape is probably the best known in the industry. Alternatively, you can copy a sentence or two from your text, plug it into Google and see if it returns any identical results.
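For the manual Google check, a short script can pick the sentences to search for. This is only a sketch of one way to do it and has nothing to do with how Copyscape or Google work internally: it picks the longest sentences in your text and wraps them in double quotes so they can be pasted into a search box as exact phrases. The search_phrases name and its defaults are made up for illustration.

```python
import re


def search_phrases(text, count=2, min_words=6):
    """Return up to `count` quoted sentences suitable for an exact-phrase search."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    # Very short sentences match too many pages to be useful.
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    candidates.sort(key=lambda s: len(s.split()), reverse=True)
    # Drop embedded quotes and trailing punctuation, then wrap in double quotes.
    return ['"%s"' % s.replace('"', "").rstrip(".!?") for s in candidates[:count]]


sample = (
    "Short one. Duplicate content can hurt how your pages are indexed. "
    "Each page should carry a unique title and description that match it."
)
for phrase in search_phrases(sample):
    print(phrase)
```

Paste each printed phrase into the search engine; identical results on other domains are worth a closer look.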