SEO Wars II: Attack of the Clones - Website Content Scraping And Duplicate Content

In SEO circles it is commonly believed that content is king. This is largely true. Relevant content attracts search engines, and well-written content maintains healthy site traffic. But the Dark Side seeps into every aspect of SEO, seeking to unsettle the balance.

Black Hat SEOs, ever searching for an easy way to optimise their sites, turn to a technique called content scraping. There are two conventional methods of content scraping:

  • Automated content scraping
  • Manual content scraping

Automated Website Scraping

This technique involves downloading an automated script from the Internet, and then uploading this script to your site. I won’t tell you where to find any of these bots or what they are called. Temptation leads to the dark side.

These bots are similar to the search engines’ web crawlers, in that they crawl the web, looking for sites relating to the keywords you’ve selected for them. When they find sites sporting the relevant keywords, in the right quantities, they literally steal the content from the site.

Using the ‘scraped’ content, these bots then randomly generate content for your site. The problem with this is that the generated content is obviously machine-generated. The flow of writing is almost non-existent. The subject of one paragraph will be almost unrelated to that of the next. Points will be hard to follow, and the structure will be erratic.

On top of the terrible reproduction of stolen content, automated website scraping scripts are easy for search engines like Google to detect, and their use is punished swiftly.

Manual Content Scraping

Plagiarism is unethical and, where it infringes copyright, illegal. Make no mistake, that’s exactly what this is.

The second Black Hat means of acquiring content without actually writing it is more laborious than using automated scripts, but slightly safer.

Black Hatters will actually go ‘content mining’, where they visit some of their favourite sites on a particular subject, copy and paste articles, or sections of articles, change a couple of words, and then repost the content on their own site.

The Dark Side Never Prospers: Balance of the SEO Force

Clones will never be as effective as their original counterparts. The process is somehow cannibalistic, incestuous. Each copy will be of lower quality than the original. Content scraping is an unethical abuse of the freedom offered by the Internet, and the search engines do not condone it.

The algorithms of Google, for example, are so highly evolved that the search engine can pick up such intricacies as spelling, grammar and sentence structure. In light of this, don’t you think duplicate content can be detected?
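To give a feel for how easily lightly edited copies can be spotted, here is a minimal sketch of one well-known textbook approach: break each text into overlapping three-word "shingles" and measure how many the two texts share (Jaccard similarity). This is not Google's actual method, which is not public; the function names, the shingle size and the sample sentences are all assumptions made for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts: an original, a copy with one word swapped,
# and an unrelated sentence.
original = ("Relevant content attracts search engines, and well-written "
            "content maintains healthy site traffic.")
scraped = ("Relevant content attracts search engines, and well-written "
           "content maintains strong site traffic.")
unrelated = "The weather in Cape Town is mild for most of the year."

print(jaccard_similarity(original, scraped))
print(jaccard_similarity(original, unrelated))
```

Changing a couple of words only disturbs the shingles that overlap those words, so the copy still scores far closer to the original than an unrelated text does. A real search engine works at a vastly larger scale and with far more signals, but the principle is the same.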

Duplicate content will not generally get its website banned; otherwise, news sites featuring syndicated articles would all have been dropped from Google’s lists years ago. However, your site will not see any ranking benefits from featuring duplicated content.

Google ranks the site with the original content according to the value of the content; all other sites receive no credit for possessing the copied content.

Cloning may give you quantity, but it truly is quality and individuality that count in SEO.