A content theft or a WordPress feature?

Tuesday, August 7, 2007 at 2:04 | Posted in Blogosphere, internet, WordPress | 13 Comments

I tend to get somewhat itchy each time I see my blog posts republished in full length somewhere. Even if there is a link back to my original post, it smells fishy when my creation is obviously being used for seeking commercial profit. The fair thing to do would be to just publish a brief quote and refer the visitor to read all of the post where I posted it.

There are no Google ads visible at Wpfind.com, but most of the links at the site (those that my text is supposed to attract the visitor to click on) look suspicious indeed. It looks even more suspicious that the site presents itself as a WordPress search engine designed to help users of WordPress.com:

Wpfind is a way of allowing the millions WordPress users to rapidly search WordPress including their blogs and friends blogs from one simple and easy to use interface. The secondary aim was to make these blogs very easy to spider by search engines so users journals can achieve more popularity and a higher readership.

There is even a link to a blog hosted by WordPress.com which is said to provide information on “recent progress on development”. Interestingly, however, clicking that link produces a familiar error message referring to a violation of the WordPress.com terms of service:

A WHOIS query on the site does not suggest that it is run, or even endorsed, by WordPress.com. So maybe the good folks of the WordPress team should have a look at the site and take appropriate measures. I have nothing against contributing to genuinely useful projects that actually benefit the community, but I have quite a lot against plain content theft.

So which one is it? A search engine targeted at one site does not sound like an effective idea, especially if it needs to publish the whole post, using a whole page for each one, when it returns “search results”. Just think of how time-consuming it would be to go through the search results for a popular query at Google or Technorati if they used the same method.



  1. Do you have any Creative Commons info in your sites? Seems like a breach of copyright to me.

  2. I would appreciate the parteibuch content to be stolen, mirrored and redistributed, but you don’t like it and they steal your content.

    And the worst of all, they do not even give me the possibility to add my site to this “search engine”.

    Something goes wrong.

  3. Geirsan, the only reference to Creative Commons you could find at my sites is when I use somebody else’s material under CC linking to the original source and reproducing their license as is required.

    Marcel, it is not only that they steal the contents and make money with it. This sort of operation is also often closely connected to spam. If you google what you get from a WHOIS query, it is not uncommon to find the same persons being covered by anti-spam sites. However, I did not find such connections in this case.

  4. You’re not the only one, Larko. Just today two referers from wpfind showed up in my statistics.

    Interestingly enough, the company behind this so-called service, Macranet, also runs a service called ljfind, which does the same with LiveJournal content, only with ads, conveniently placed on top of the search results.

    I wouldn’t mind this if not the whole content of my posts would be replicated, but this seems like a shady seo/linkfarm enterprise.

  5. You are absolutely right, Mark, which is why their bots should be blocked. That would, of course, just be a temporary solution because they would soon enough return under another IP but it would be better than doing nothing.

    We are talking about a weed of the Internet. I wish there were an effective pesticide.

  6. I guess as long as there are people clicking on ads or buying into Ponzi schemes, these niches will survive and even – when it’s a relatively new and uncharted medium for people – prosper.

    Just a quick thought: effectiveness in this case lies on the side of the users. If they can be taught how to spot and avoid such ‘click traps’, the raison d’etre for such exploits would diminish.

    Of course this would mean huge efforts in teaching media literacy.

  7. True, and any change in that unfortunately looks unachievable. A huge majority of web users are more or less trap illiterate. The same goes for traditional e-mail spam: everybody curses it but only the most advanced users, a very small minority, know how to use effective spam filters.

    Not only users should fight this sort of rubbish, the main responsibility lies on main providers. However, they are less than likely to do anything about weeds because many of them cut a slice of the revenue.

  8. There is no doubt in my mind this is a spam site. They seem to run at least one other site like this, ljfind.com and are billing themselves as a search engine when they are really scrapers.

    The difference is obvious when you compare how Google displays results versus how WPfind does. Google displays only a snippet of the site; these sites display all of the content.

    I’ll be contacting them about their sites and trying to get answers.

    In the meantime, anyone who wants their content removed from the site can send a DMCA notice to their host, The Planet at this email: copyright@theplanet.com

    If you need help filing a DMCA notice, let me know or check out the stock letters section on my site.

    Feel free to email me if I can help in any way.

  9. Hi Guys,

    I noticed your conversation in google. I run Wpfind.com. Although we do index the whole post and display it in the cache link, each post in our search results links back to the wordpress blog twice.

    We are not a scraper site or SPAM site as discussed above. In fact, a lot of technology goes into indexing millions of blogs and returning very rapid search results. We crawl the content the same way the other blog engines do, via ping update services, so we do not need users to submit to us. When they post, wordpress.com auto-pings and we crawl from these updated lists.

    Also the fact someone commented they noticed visitors from wpfind does show we are generating genuine search traffic.

    While we do place adverts at the top of the page for some terms, we still operate at a loss. Although long term obviously we would like to make a profit.

    If anyone wishes their post removed, simply contact us by clicking the link at the top right of wpfind.com and we will remove your content from the index.

    Thanks and I hope that helps


  10. Linking back to the post does not change the fact that you use blog contents by publishing a whole post without consent of the bloggers. Please regard this as a request to cease and desist.

  11. Ok Sure, I will remove your blog. One thing I will say is we only provide a link from the cache link to the full post much like google does. We do not advertise on post pages.

  12. […] also some users on .com who apparently are not happy about this SE at all, because it almost always shows a full post contents instead of usual practice […]

  13. Content grabbing is evil. Wise bloggers never use this bad “blogging method”. 😉
