WP Plagiarism Pal Plugin

Google’s Panda algorithm update shook the blogging world, affecting the rankings in almost 12% of Google searches. WP Plagiarism Pal can keep you from upsetting Panda by catching inadvertent duplicate content.

In a Google Blog post on February 24, 2011, Amit Singhal and Matt Cutts wrote:

“This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.”

Amit Singhal also offered a blog post with advice on how sites can be evaluated in light of Panda. The questions appear to be the same list (or a subset of it) that was given to the quality raters last year.

  • Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
  • Does this article have spelling, stylistic, or factual errors?
  • Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
  • Does the article provide original content or information, original reporting, original research, or original analysis?
  • Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
  • Was the article edited well, or does it appear sloppy or hastily produced?

The emphasis has clearly shifted to high-quality, unique content.

Plagiarism Pal catches duplicate content

Plagiarism Pal searches the internet for phrases in your posts that may duplicate phrases on other sites. It doesn’t matter that you may never have heard of that site and that the phrase was your own. We constantly reuse information we’ve heard or read, and Google only cares about duplication. Many article sites are following Google’s lead and checking your articles for uniqueness.
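As a rough illustration of the idea, the sketch below cuts post text into fixed-length runs of words and wraps each run in double quotes, the usual way of asking a search engine for an exact phrase. It is not the plugin's actual code: the function name `sample_phrases`, the default of eight words per phrase, and the non-overlapping sampling are all assumptions made for the example.

```php
<?php
// Hypothetical sketch: cut a post into phrases for exact-match searches.
// Not taken from WP Plagiarism Pal; the name and phrase length are assumptions.

function sample_phrases(string $text, int $wordsPerPhrase = 8): array
{
    // Strip HTML tags and double quotes, then collapse runs of whitespace.
    $plain = str_replace('"', '', strip_tags($text));
    $plain = trim((string) preg_replace('/\s+/', ' ', $plain));
    if ($plain === '' || $wordsPerPhrase < 1) {
        return [];
    }

    $phrases = [];

    // Walk the text in consecutive, non-overlapping runs of words.
    foreach (array_chunk(explode(' ', $plain), $wordsPerPhrase) as $chunk) {
        // Skip a short trailing run: very short phrases match too many pages.
        if (count($chunk) < $wordsPerPhrase) {
            continue;
        }
        // Double quotes request an exact phrase match.
        $phrases[] = '"' . implode(' ', $chunk) . '"';
    }

    return $phrases;
}
```

Each returned string could then be sent to a search engine as one query; a phrase that returns hits on other sites is a candidate duplicate.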

Plagiarism Pal catches sites stealing your posts

Once you’ve created a unique post, there are thousands of other sites just waiting to scrape your site and post your articles as their own. Plagiarism Pal can find these sites for you. Just run the duplicate check after your post has been indexed and they will appear as duplicates.

Easy to use

After you save a draft or publish a post or page, click the “Check for Duplicate Content” button. A list of phrases from the post is created and each phrase is searched on the internet. The list is then displayed, showing which phrases may be troublesome.

WP Plagiarism Pal Screenshot

Note how Plagiarism Pal catches the quotation in this article as a duplicate.

Then click the link and you’ll be taken to the search for that phrase so you can check just what was duplicated.

To check for others copying your posts, wait until your post has been up for a while, then clear the results and re-check. Anyone with a copy on their site should appear in the search. You can check whether it was an outright copy or just a quotation.

By default, Plagiarism Pal searches with Google, but other search engines are supported:

  • Google Search
  • Google Blog Search
  • Yahoo Search
  • Bing Search
  • Bing Blog Search

A list of excluded sites can be entered, so you won’t search the originating site or any syndicated sites that may have your posts.
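One way an exclusion list like this could work is to drop any search hit whose host matches an excluded domain or one of its subdomains. The sketch below assumes that approach; the function name `is_excluded` and the idea of filtering results after the search (rather than changing the query) are assumptions, not a description of the plugin's code.

```php
<?php
// Hypothetical sketch: decide whether a search hit belongs to an excluded site.
// Not the plugin's implementation; names and approach are assumptions.

function is_excluded(string $url, array $excludedDomains): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // No recognisable host, so nothing to exclude.
    }
    $host = strtolower($host);

    foreach ($excludedDomains as $domain) {
        $domain = strtolower(trim($domain));
        if ($domain === '') {
            continue;
        }
        // Match the domain itself or any subdomain of it.
        $suffix = '.' . $domain;
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }

    return false;
}
```

Leaving your own domain out of the list means hits on your own site are reported too, which is useful for spotting internal duplicates.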

Update: Aug 2 2011. Version 1.10.02 fixes a problem with Iñtërnâtiônàlizætiøn, allowing proper searches with accented Latin characters. It also adds support for rotating proxies, so you can spread searches out over several proxies to avoid Google blocking a single IP.
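Proxy rotation of this kind is often done round-robin: each search takes the next proxy in the list and wraps around at the end. The sketch below shows that pattern only; the class name `ProxyRotator` is made up, the addresses are placeholders from the documentation IP range, and the plugin may rotate differently.

```php
<?php
// Hypothetical sketch of round-robin proxy rotation; not the plugin's real code.
// The proxy addresses below are placeholders, not working proxies.

final class ProxyRotator
{
    /** @var string[] */
    private array $proxies;
    private int $position = 0;

    public function __construct(array $proxies)
    {
        if ($proxies === []) {
            throw new InvalidArgumentException('At least one proxy is required.');
        }
        $this->proxies = array_values($proxies);
    }

    // Return the proxy for the next search, wrapping around at the end of the list.
    public function next(): string
    {
        $proxy = $this->proxies[$this->position];
        $this->position = ($this->position + 1) % count($this->proxies);
        return $proxy;
    }
}

$rotator = new ProxyRotator(['192.0.2.10:8080', '192.0.2.11:8080', '192.0.2.12:8080']);
```

With PHP's cURL extension, the value returned by `$rotator->next()` could be passed as the `CURLOPT_PROXY` option for each request.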

Update: Sep 20 2011. Version 1.11.00 adds an Editor Required feature. Authors and Contributors cannot Publish an article with duplicate content unless approved by an Editor or Administrator. The post will be set to Pending Review until the Duplicate Content is corrected or an Editor or Administrator publishes it.
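The rule can be summarised in a few lines. This is a sketch of the behaviour described above, not the plugin's code: the function name and parameters are assumptions, it ignores WordPress's own capability checks, and only the `publish` and `pending` strings are standard WordPress post statuses.

```php
<?php
// Hypothetical sketch of the Editor Required rule; names are assumptions.
// 'publish' and 'pending' are the standard WordPress post status strings.

function status_on_publish(string $role, bool $hasDuplicates, bool $editorRequired): string
{
    // Roles that may approve a post containing duplicate content.
    $approvers = ['administrator', 'editor'];

    if ($editorRequired && $hasDuplicates && !in_array($role, $approvers, true)) {
        // Held as Pending Review until corrected or published by an approver.
        return 'pending';
    }

    return 'publish';
}
```

For example, `status_on_publish('author', true, true)` returns `'pending'`, while the same post published by an editor, or a post with no duplicates, returns `'publish'`.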

SKU: WPPP
Description: WP Plagiarism Pal

Product Options

#   Option     Price    Download                      File Size
1   Download   $20.00   WP Plagiarism Pal v1.12.00    85.26KB

{ 38 comments }

Paul August 2, 2011 at 5:23 am

Hi,

I heard about your plugin.
I’m interested in it, but I see that it doesn’t take accented Latin characters (such as é, à, ô) into account and that it only looks for plagiarized content on English search engines.

Can you please let me know if there is a way to fix that? In that case I will definitely buy it!

Thanks,
Paul

Arky August 2, 2011 at 9:08 am

Yes, we just reworked the Iñtërnâtiônàlizætiøn in the plugin. It will be released today, Aug 2 2011, so check back soon or sign up for the notification list.

Abhinav September 16, 2011 at 3:21 pm

Hi,

I am running an article directory where users submit many articles, and most of them are junk. So I am looking for a script that can automatically check submissions and only accept original or unique content. Is this possible with your script?

Thanks
Abhinav.

Arky September 16, 2011 at 3:40 pm

Yes it is. This is the Editor Required feature. With it, a user with an Author or Contributor role cannot publish a post with duplicate content in it. The post stays in Pending Review status as long as there’s duplicate content. An Editor or Administrator can then publish it if they decide the dup is legitimate, like a quote or a common cliché.

Abhinav September 17, 2011 at 6:47 am

OK, thanks, but let me know one more thing.

Does it require the Copyscape API or any third-party tool to run? Or does it work by itself?

Arky September 17, 2011 at 7:51 am

No, it doesn’t use Copyscape. It does use search engines, primarily Google, but nothing that costs anything extra unless you want to use proxies.

Dude September 22, 2011 at 7:52 am

Hi,

Does your script work for French content?
Also, what about characters like é, à, ô? Is your new version available?

Thx

Arky September 22, 2011 at 8:26 am

Yes, the plugin has handled Iñtërnâtiônàlizætiøn since 1.10.02. The current 1.11.00 also has an Editor Required setting that prevents Authors or Contributors from publishing articles with duplicate content unless they are approved by an Editor or Administrator.

Dude September 22, 2011 at 8:45 am

Can you please confirm it works for texts in languages other than English?

Arky September 22, 2011 at 9:50 am

Yes, we have a number of French and Swedish users that I’m aware of. It can’t handle Chinese and similar languages, but anything in the Latin alphabet, yes.

Dude September 27, 2011 at 8:59 am

Hi,

I bought your plugin and installed it in WP.
However, how do I limit access to it on the article page to Authors and Admins only, and not Contributors?

Arky September 27, 2011 at 9:32 am

There are five roles that can be set in the user profile for each user:

Administrators
Editors
Authors
Contributors
Subscribers

All except Subscribers can edit posts. See

http://codex.wordpress.org/Roles_and_Capabilities#Contributor

for details on what each can do.

In Plagiarism Pal the same applies. If you set the Editor Required checkbox in settings, then Authors and Contributors keep the same abilities EXCEPT when they publish a post with duplicate content. In that case the post is saved with a Pending Review status, and an Editor or Admin has to publish it if it is to go out with the dup content.

If there is no dup content, both Authors and Contributors can publish normally.

If you don’t want Contributors to post, make them Subscribers.

Remguy October 10, 2011 at 9:33 am

How many installations can we do with one licence?

Arky October 10, 2011 at 9:53 am

For your own personally owned sites, as many as you like.

If you’re selling it to a customer as part of an SEO package or customer site you should buy a separate copy and bill them for it with your own markup, just like you would for a premium theme.

Hugh from DiscountCodes.ie November 4, 2011 at 11:05 am

Hi there!

Great product – I just bought it.

But… it doesn’t work with custom post types :( Will it in future?

Thanks,

Hugh

Arky November 4, 2011 at 11:58 am

The problem is that custom types have so many different slug names that the coding couldn’t cover them all.

You can however add what you need yourself fairly easily. In the wppp_plagiarismpal.php file there is a function “add_meta_boxes()”. For each custom type add the custom type slug name in place of the XXXXXX below and add a duplicate section for each custom type. Post and Page are of course already there.

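add_meta_box('wppp_id',                    // meta box ID
    __('WP Plagiarism Pal', self::LANG),   // box title (translatable)
    array(&$this, 'meta_inner_box'),       // callback that renders the box
    'XXXXXX',                              // your custom post type slug
    'advanced',                            // context
    'high'                                 // priority
);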

Hugh from DiscountCodes.ie November 4, 2011 at 6:44 pm

Cool, thanks for the super quick response!

Hugh

Jim November 9, 2011 at 3:04 am

Hi,

Does the plugin have the ability to run through old posts and determine which of them have duplicate content around the web?

I run an article directory too and would like to see and weed out unnecessary posts.

Thanks.

Jim

Arky November 9, 2011 at 8:13 am

Yes, but it doesn’t do it automatically. That would be an awful load of Google searches and would likely get searches from your IP blocked for a while.
But you can bring up any post and re-do the search on it to see what’s happening and who might be borrowing your content.

Panneaux solaires November 14, 2011 at 8:58 am

But how can we add an option to run this automatically? I saw that it can be done during the night: take posts at random and query the search engine every 2 or 3 minutes to check for duplicates without getting banned. How can we add that to the plugin?
This is very important for an article directory, but it’s not in your plugin :s

Arky November 14, 2011 at 9:09 am

Sorry, not with this plugin. You may have seen one of the plugins that use Copyscape, which can do this, but that’s a paid service. This one checks each post as it is entered, so if you keep up with approvals it shouldn’t get behind.

I could build a plugin that does what you ask using Google’s paid API, but that would cost you $5 per 1,000 searches, and a typical article takes on average 50 searches. So 100 articles would cost you $25. It could really add up on a large directory. Is this something there is a market for?
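The arithmetic behind that estimate, as a small sketch. The rate and the searches-per-article figure are the ones quoted above; the function name is made up for the example.

```php
<?php
// Worked version of the estimate above. The $5 per 1,000 searches rate and the
// 50 searches per article come from the comment; the function name is hypothetical.

function estimated_search_cost(int $articles, int $searchesPerArticle = 50, float $dollarsPerThousand = 5.0): float
{
    $searches = $articles * $searchesPerArticle;
    return $searches / 1000 * $dollarsPerThousand;
}

// 100 articles x 50 searches = 5,000 searches, at $5 per 1,000.
printf('%d articles: $%.2f' . PHP_EOL, 100, estimated_search_cost(100));
// prints "100 articles: $25.00"
```

At the same rates, a directory of 10,000 articles would come to $2,500 for a single pass.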

Fabien December 7, 2011 at 7:26 pm

Hi
Just bought your plugin and I have 2 questions:
– How can I automatically run a duplicate test when saving a draft?
add_filter('content_save_pre', 'which function ?');

– How can I then prevent a contributor from submitting the post for review until he has made some corrections?
???

A few hooks should do it, but I didn’t get into the code yet
Thanks

Arky December 7, 2011 at 9:04 pm

I originally had it search on save, but it tended to pound on Google and get searches blocked for a while. There’s a switch in settings to have it search on Publish automatically.

For contributors just check the Editor Required setting. It then requires an Editor or Administrator to Publish anything with dup content. Otherwise it’s saved Pending Review until corrected or overridden by an Editor.

DavidS December 27, 2011 at 1:41 am

I do not understand something. What is the point of searching the net for duplicate content? This is what the net is all about: content. If people have to go out looking to see if their content is on other sites, they would have to hire a team to deal with it, or waste many hours/days/months trying to get their dup content removed. I do not have the time or energy to chase down content thieves. If someone guards their content that closely, then they should not put it online to be taken.

My overall concern is with internal duplicate content. I have people trying to submit the same junk they submitted the week before. Because I cannot remember every single title that comes across, I end up accepting articles that have already been accepted before. We already know the article I have received has been submitted to 10,000 other PR 0 sites, if they accept them all, because higher-PR sites probably already rejected it. If you run an article directory, then you already have duplicate content from some place on the internet, and unless you are one of the most popular sites on the internet, you dare not delete anything somewhat reasonable, because no one will visit an empty/dead article directory….

DavidS December 27, 2011 at 1:47 am

I also have to say that my article directory has only been up for a week or two, and Google has been indexing it like wildfire. I have never had a site indexed so fast, and with so many pages so quickly. Is it unique content? Most likely not. I am sure the content that I get has already been submitted to 10,000 other directories, but I am clearly getting indexed. My concern is internal duplicate content; I cannot worry about my content vs. what is on the web, which is a complete waste of time if you ask me. I already know that I have duplicate content; I just want to make sure my directory is as clean of internal duplicate content as possible. I feel that the internet is made up of 90% duplicate content anyway….

Arky December 27, 2011 at 8:25 am

Plagiarism Pal can check your own site. Just leave your domain name out of the excluded domains list.

As to why duplicates are important: Google doesn’t like them and has said so repeatedly. That’s not the way they taught me in school, where you’re taught to use quotes and bibliographies referencing authorities. So if you’re not worried about getting Google’s traffic, you don’t have to worry about dups.

And if you are using quotes, put them in <blockquote> tags, or <q> tags for inline quotes, so Google will have some indication that “yes I know they’re quotes but I think they’re important.”

Where you won’t like duplicates is where someone spams your site with 100 badly spun versions of the same article just to get the backlinks.

DavidS December 28, 2011 at 9:43 pm

I clearly understand what you are saying, god, I understand… I do not like duplicate content either, but in reality it is a fact of life, more so if you are a brand new site. If you start deleting duplicates that you find already on the web, you are going to end up being one lonely site for sure. Google does not like anything that goes against their backward beliefs: they do not like duplicates, links, paid links, reciprocal links, and so forth. They do not like anything that may threaten Google, but some of it they have to bear; otherwise they would just close the internet down, and no one would submit anything if Google took the benefits out of submissions.

I have a small article directory, and 98% of my URLs are indexed. I do not suspect my content is unique either, very little of it, so why do I have a high indexing rate? But if your Pal helps me keep my blogs from having internal duplicates, then I am sold on that alone. It does not help with my non-WP sites, though, where I also need a solution, as these num-nuts keep sending the exact same articles daily.

Arky December 29, 2011 at 9:51 am

The technique is adaptable to any system that runs on PHP. I can do it in .NET as well but that’s all custom work.

DavidS December 28, 2011 at 9:49 pm

Does Pal let you set a % range of duplicate content to search for? I would rather allow something that is 75% unique than something that is maybe 5% unique….

Arky December 29, 2011 at 9:57 am

The samples used for comparison are displayed in a list. Dups are highlighted in red. You can click on any sample line to display what Google saw when it searched that sample. See the screenshot above.

Andre March 9, 2012 at 1:26 pm

I’m not sure if I understood the documentation and comments correctly. Is WPPP capable of performing an automatic DC check when an author publishes an article? Because it seems to just sit there and reject the article, prompting that it does so because there was either DC or no check for DC performed. The “Editor required” checkbox is enabled. Did I misunderstand how WPPP works or is there something else I can check?

Arky March 9, 2012 at 2:18 pm

If you have Editor Required set, the user must be an Editor or Administrator to publish an article when there is DC present. Otherwise it gets set to Pending Review when they click Publish, and an Editor or Admin then has to approve it. It may be legitimate DC, like a quote or a long book title being referenced.

The DC check is not automatic, because automatic checks can get you locked out of Google with too many searches.

Andre March 9, 2012 at 2:51 pm

Hrm, so what is the required order of steps to publish an article as an author when “Editor required” is checked? It seems that the article is always sent to the approval queue, even if the manually invoked DC check shows nothing but green…

1) Author writes Article
2) Author makes WPPP perform the DC check which says “all okay”
3) ???
4) Article is published

At present, it seems as if Step 4 is always “send to the pending queue” for authors, regardless of the DC status.

Arky March 9, 2012 at 3:44 pm

It should, if there has been a full check, no yellow and no red, go ahead and publish the article. Is it not doing so? They do have to push the Publish button.

Andre March 10, 2012 at 2:33 am

I’ve tried it myself with an author test account. Full check performed, everything green, publish button sends it to the pending-for-review queue.

Andre March 13, 2012 at 11:02 pm

Is this a bug in WPPP? If not, any idea how this could happen? Are there known side effects with other plugins?

Maxim July 23, 2012 at 2:51 am

Hi,

Does this license support multi-site? Does it automatically check existing posts after installation?

Arky July 23, 2012 at 9:19 am

Yes you can use it on as many sites as you personally own. If you are using it as part of a site you are selling to someone else, they should buy a separate copy.

It only checks posts as you create them. It won’t check older posts unless you go to them and explicitly check them.
