A Comprehensive Guide to Protecting Your Site Against Content Theft

“Content theft” is a well-known form of online theft. How many times have you seen your website or blog content published on another website or blog without your consent? Generating fresh and unique content is really hard (and you know how hard it is), but republishing it without permission is quite easy, right?

Content theft is definitely a big problem on the World Wide Web. More and more websites and blogs are falling victim to it, and the problem looms large over almost every webmaster. But where there is a problem, there ought to be a solution, or at least a way to limit it. This blog post is about finding a way out of content theft.

Ways to Find Content Theft

“Prevention is always better than cure”. There are numerous ways to detect content theft before it gets out of hand.

Google Alerts: Google Alerts can be used not only for reputation management but also for finding copied content. Register your post title as an alert and you will receive a notification as and when new content matching it is published and included in Google’s index.

Trackback: Trackbacks from other articles can also show you who has used your content on their blog. If proper credit is given and references are cited, it cannot be considered content theft, because linking to and referencing one another is common practice among bloggers. If the whole article is copied, then action needs to be taken.

Online Tools: You can submit your website to online tools like Copyscape, which scan the web for content similar to yours and report back on a daily basis. Copyscape is a pretty good plagiarism protection service. If you want to use it an unlimited number of times, you have to choose the PRO version, as the free version offers only a limited number of searches. There are several other tools like Copyscape through which you can find out whether content has been stolen from your website or blog.
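Once a tool or an alert has pointed you to a suspicious page, it can help to measure how much of your text it actually reuses. The following is a minimal sketch in Python, assuming you have already saved both texts as plain strings. It compares overlapping five-word sequences ("shingles"); this is a generic technique chosen for illustration, not a description of how Copyscape works, and the function names are made up for this example.

```python
from __future__ import annotations

import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original: str, suspect: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    a, b = shingles(original, size), shingles(suspect, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A score close to 1.0 suggests a near-verbatim copy, while a score close to 0.0 suggests the two texts share little more than a topic. Where to draw the line is a judgment call.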

Types of Content Theft:

  • Image Theft
  • RSS Theft
  • Content Theft (Portion of your website content)

Image Theft

A picture is definitely worth a thousand words. Bloggers love adding images to their posts to attract viewers and readers. But the same rules apply to you: you can’t simply search on Google Image Search and use whichever image you like on your website or blog. If you want to use an image from the Internet, check its copyright or reprint permission first and then go ahead.

How to Find Image Theft: Use an image search engine such as Google or Yahoo to search for your images and see who else has used them. Also check your server reports for scrapers hotlinking to your images, which wastes your bandwidth.

Image Hotlinking (also known as leeching, or inline linking) is the practice of stealing bandwidth from other websites, by linking to an image stored on another server.
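If your host gives you raw access logs rather than a ready-made report, a short script can list the sites that hotlink your images. The sketch below is in Python and assumes the log is in the common "combined" format used by Apache and Nginx, where the referer appears in quotes after the status code and response size. The function name and the mysite.com domain are placeholders for illustration.

```python
from __future__ import annotations

import re
from collections import Counter
from urllib.parse import urlparse

# Picks the request path and the referer out of a "combined" format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)
# Image extensions to look for, optionally followed by a query string.
IMAGE_PATH = re.compile(r"\.(?:jpe?g|gif|bmp|png)(?:\?.*)?$", re.IGNORECASE)


def hotlinking_hosts(lines, own_domain: str) -> Counter:
    """Count image requests per external referring host."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or not IMAGE_PATH.search(match["path"]):
            continue
        host = (urlparse(match["referer"]).hostname or "").lower()
        # Skip empty referers ("-") and our own site, including subdomains.
        if not host or host == own_domain or host.endswith("." + own_domain):
            continue
        hits[host] += 1
    return hits
```

To use it, pass an open log file and your own domain, for example `hotlinking_hosts(open("access.log"), "mysite.com").most_common(10)`, where access.log stands for wherever your server writes its log.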

How to Stop Image Theft: Add the code below to the .htaccess file in your website’s root directory.

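# Block image hotlinking: serve a placeholder to requests from other sites
RewriteEngine On
# Allow empty referers (direct visits, some proxies) and your own domain
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mysite\.com/ [NC]
# Do not rewrite requests for the placeholder image itself
RewriteCond %{REQUEST_URI} !^/images/sampleimage\.jpg$
RewriteRule \.(jpe?g|gif|bmp|png)$ /images/sampleimage.jpg [NC,L]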

Make sure to replace mysite.com with your own domain and sampleimage.jpg with your own image or banner.
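Before deploying a referer rule like this, it is worth checking which referers it lets through. Here is a rough Python sketch that mimics the same checks with the `re` module. It assumes a pattern with the dots escaped and with https allowed, and mysite.com is again a placeholder. The function is only an approximation of what mod_rewrite does, meant for trying out referer strings by hand.

```python
import re

# Placeholder domain and the image extensions covered by the rule.
OWN_SITE = re.compile(r"^https?://(.+\.)?mysite\.com/", re.IGNORECASE)
IMAGE = re.compile(r"\.(jpe?g|gif|bmp|png)$", re.IGNORECASE)


def is_hotlink(path: str, referer: str) -> bool:
    """True when an image request would be answered with the placeholder."""
    if not IMAGE.search(path):
        return False  # the rule only applies to image files
    if referer == "":
        return False  # empty referers are allowed through
    return OWN_SITE.match(referer) is None  # anything but our own site


print(is_hotlink("/images/a.jpg", "http://scraper.example/post"))   # True
print(is_hotlink("/images/a.jpg", "https://blog.mysite.com/post"))  # False
print(is_hotlink("/images/a.jpg", "http://notmysite.com/"))         # True
```

The last call shows why the dots should be escaped: with an unescaped pattern such as `(.+.)?mysite.com/`, a referer from notmysite.com would be treated as your own site.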

RSS / Blog Content Theft:

Many spammers scrape original content from RSS feeds and republish it on their own blogs, surrounded by ads. They think that having more new content will help them rank better and attract more visitors, and thus generate more revenue through ads. Fortunately, the duplicate-content filters of modern search engines work pretty well at filtering out scraped sites.

How to Stop RSS Theft: We can include only excerpts in the RSS feed rather than the entire blog content. When readers find the snippet interesting, they will click through to the website hosting the article and read it in full.

As many scrapers use automated scraping tools to collect quality content, they will copy the entire content, including the post title, and will not manually change or edit it. If the post title links to the post URL, each scraped post will automatically link back to the original post.

To do so in WordPress, simply open your single.php file and locate where the title is displayed. Then replace that code with the following:

<h1>
<a href="<?php the_permalink(); ?>"><?php the_title(); ?></a>
</h1>

Content Theft (Portion of your website content):

Each and every business works hard to explain to its clients and customers what it offers through its website content. But content scrapers easily copy and paste the content (sometimes even the design) from business websites, making life miserable for the original businesses and easy for themselves (at least they think so).

We have covered this topic in an earlier blog post. There are ways to file a DMCA complaint against such content thieves. We can contact Google and file a Digital Millennium Copyright Act (DMCA) notice against the website that copied our content. After due analysis, Google will remove the copied pages from its index, and in addition we can file a spam report against the site. This works for blogs as well.

There are also other ways to report spam or copyright violations by scraper websites.