Copyscape

Plagiarism 101 for Blogs and Online Publications

Plagiarism is the cardinal sin of online writing and publishing. No one wants it to happen on their website because it ruins credibility, quality, and search engine rankings. It's understood to be a huge ethical problem. However, with the Internet, it's easier than ever to plagiarize, while catching or stopping it remains just as difficult. Here's what every blogger or online publication needs to know about plagiarism and what constitutes it:

So, What Is Plagiarism?

According to Wikipedia, plagiarism is:

"the mere copying of text, but also the presentation of another's ideas as one's own, regardless of the specific words or constructs used to express that idea". Meaning, in order for text to be considered plagiarized, it needs to be a copy or close copy of the text AND lack attribution to the original author or source.

Yes, I copied and pasted that definition verbatim from Wikipedia. But it's not plagiarism, because I attributed the definition, placed it in quotes, and provided a hyperlink to the very web page I pulled it from. A mere word-for-word copy is NOT plagiarism, I repeat, it is NOT plagiarism. It only counts if it is not properly attributed and the author is trying to pass off the words and/or ideas as their own. There are many times when a word-for-word copy would be perfectly appropriate, or even preferable, such as a definition (especially a long or technical one), a direct quote, or a set of statistics.

Why are Plagiarism Detection Services Bad?

This distinction is important to remember because many plagiarism detection services can only detect blatant word-for-word copies of text and don't take those nuances into account when looking at a block of text. Services like Copyscape and Plagium would say that most of the above paragraph is plagiarism, despite the fact that I attributed the definition, quoted it (showing that I didn't write those words and that I am quoting someone else), and provided a link to the exact web page where I found it.
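To make the point about quoted text concrete, here is a minimal sketch of the step a human reviewer does automatically: setting quoted passages aside before judging the rest. The function name `split_quoted` and the sample sentence are made up for illustration, and this is not how Copyscape or Plagium work internally. It also does not check that a quote is actually attributed; it only separates what is in quotation marks from what isn't.

```python
import re

# Matches text inside straight or curly double quotes (an assumption:
# real articles may mark quotes in other ways, e.g. block quotes).
QUOTED = re.compile(r'"[^"]*"|“[^”]*”')


def split_quoted(text):
    """Return (quoted_passages, remaining_text) for a block of text."""
    # Keep the quoted passages, minus the surrounding quote marks.
    quoted = [m.group(0)[1:-1] for m in QUOTED.finditer(text)]
    # Drop the quoted spans and tidy up the leftover whitespace.
    remaining = " ".join(QUOTED.sub(" ", text).split())
    return quoted, remaining


# Hypothetical sample text, not taken from any real article.
sample = ('According to the style guide, "always credit the source" '
          'is the first rule. I agree with it.')
quoted, remaining = split_quoted(sample)
print(quoted)
print(remaining)
```

Only the remaining text would then need to be checked for copying; the quoted passages still need a human to confirm that the credit and the link are really there.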

Word-for-word matching also doesn't account for another form of plagiarism: taking another's idea without attribution. I can take someone's public policy idea, change around enough words to pass these services, and then write about the idea as if I came up with it all on my own. It's also possible to pass these services by changing every third or fourth word, so when using plagiarism detection services, it's important to exercise human judgement and intuition when evaluating an article. It's also important to let your writers know that passing these services isn't enough; they ought to know the difference as well.
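Why changing every third word can be enough is easy to see with a toy checker. The sketch below assumes a checker that compares overlapping three-word sequences (often called shingles); that is an illustrative assumption, not a description of how Copyscape or Plagium actually match text. The helper names and the sample sentence are made up.

```python
def shingles(text, n=3):
    """Return the set of n-word sequences (shingles) in text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(candidate, source, n=3):
    """Fraction of the candidate's shingles that also appear in the source."""
    cand = shingles(candidate, n)
    if not cand:
        return 0.0
    return len(cand & shingles(source, n)) / len(cand)


def swap_every_third_word(text, placeholder="SWAPPED"):
    """Stand-in for a rewriter who replaces every third word."""
    words = text.split()
    return " ".join(placeholder if i % 3 == 2 else w
                    for i, w in enumerate(words))


# Hypothetical source sentence, used only for the demonstration.
source = "the quick brown fox jumps over the lazy dog near the quiet river bank"
disguised = swap_every_third_word(source)

print(overlap(source, source))     # verbatim copy: every shingle matches
print(overlap(disguised, source))  # every shingle now contains a swapped word
```

Under these assumptions the verbatim copy scores 1.0 and the disguised copy scores 0.0, even though a human reader would recognize the second text as the first one with a few words changed.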

In our next post, we'll offer a few ways to catch plagiarism. In the meantime, you need to exercise your judgement by knowing what needs to be credited and what doesn't. Knowing this will make it easier to

What Needs, and Doesn't Need, Credit or Attribution

Here, then, is a brief list from the Purdue Online Writing Lab of what needs to be credited or documented:

  • Words or ideas presented in a magazine, book, newspaper, song, TV program, movie, Web page, computer program, letter, advertisement, or any other medium
  • Information you gain through interviewing or conversing with another person, face to face, over the phone, or in writing
  • When you copy the exact words or a unique phrase
  • When you reprint any diagrams, illustrations, charts, pictures, or other visual materials
  • When you reuse or repost any electronically-available media, including images, audio, video, or other media

Things that don't need documentation or credit, also taken from the Purdue Online Writing Lab's page on plagiarism, include:

  • Writing your own lived experiences, your own observations and insights, your own thoughts, and your own conclusions about a subject
  • When you are writing up your own results obtained through lab or field experiments
  • When you use your own artwork, digital photographs, video, audio, etc.
  • When you are using "common knowledge," things like folklore, common sense observations, myths, urban legends, and historical events (but not historical documents)
  • When you are using generally-accepted facts, e.g., pollution is bad for the environment, including facts that are accepted within particular discourse communities, e.g., in the field of composition studies, "writing is a process" is a generally-accepted fact.

Overcoming Content Marketing Challenges: Creating Original Content

Article first published as How to Ensure Originality in Your Content Marketing on Technorati.

If your biggest content marketing challenge is creating original content, then you're not alone. Almost 70% of B2B marketers said this was their biggest challenge, according to a B2B marketing trends survey from content curation platform Curata. The next two biggest challenges for B2B marketers were having the time to do it (65%) and finding high quality content (43%) to drive a content curation strategy.

As the use and importance of content marketing continue to rise, there's added pressure on content creators to come up with new and original content regularly. Duplicate content isn't cool, but creating original content can be a challenge when you are covering some of the same topics over and over again. So, just how do you ensure originality in your content marketing without resorting to copying or running out of ideas? Here are a few ways to do that:

  1. Involve More People in Your Content Marketing Process - If you have only one person writing for your blog, then creating original content is going to be a huge challenge. One person can only do so much. When it comes to your business blog, allow employees from sales and customer service to contribute as well. These folks have insight into customer questions and pain points, and can offer something incredibly valuable to the blog, something that your original blogger could have missed. For other forms of content marketing, involve a team or maybe outsource a project or two to a content marketing firm. A fresh perspective could be all it takes to get the original content you've been craving.
  2. Don't Rely So Much on Copyscape - Too many people think Copyscape is the magic wand for finding original content and banishing plagiarists. However, Copyscape is not a silver bullet. First of all, simply rewriting something in your own words doesn't absolve you of plagiarism. Think of this as putting someone's book or academic report into your own words, and then putting your name on it without giving credit to the original author. The ideas aren't your own, and without proper citation, even the rewrite is still plagiarism despite passing Copyscape. Second, there are things that ought to be cited and kept verbatim in content, such as a quote, a definition, a set of statistics, a phone number, or a book title. This is where human judgement comes in, as rewriting these things may make your content less powerful, not more. Third, Google hates duplicate content, but an article that's copied and pasted is very different from one that includes a quote or an excerpt from someone's book or blog post. Original content is much more than having unique text. It's about having unique ideas while being able to give credit where credit is due.
  3. Conduct Your Own Research - A great way to be original is to conduct your own research with a survey or analysis of data, and then to report the findings. This method may take a while, but the goal is to find something new and to have a lot to write about, more than just a single blog post or white paper.
  4. Update/Repackage Current Content - No one says that once you publish something, that's it. Get more mileage from your content by updating the information or repackaging the content. For example, if you've written 20 blog posts about anti-virus software, then take those 20 posts and turn them into a guide or an eBook about anti-virus software. You can make this original by adding an author page and an introduction at the beginning, a description of your company at the end, and updating any statistics you used in the posts. Okay, you've technically copied yourself, but you own that content. No one's going to ding you for that.

How Copyscape is Fraudulent on Plagiarism and Content Fraud (Part 1 of 3)

Copyscape is a popular plagiarism detection service that many folks use to see if their content is being stolen, as well as to see if prepared content has been plagiarized from other sources. Many are happy with Copyscape and the service it provides, presuming that it does a good job of catching plagiarism and content fraud. However, I hate the darn thing, and more professional writers ought to share my enmity. Copyscape does not do as good a job as people think it does. My rage stems from the fact that a few days ago a potential client falsely accused me of plagiarism because of the Copyscape results he received for my article. In our conversation, he never specified what it had flagged; he just said that "chunks" of it were copied. Since I didn't know what it caught, I had no idea how to defend myself. I guessed that Copyscape caught the survey statistics I mentioned, and offered that as the explanation, but he didn't like that. He said this whole thing was unprofessional and that he didn't want to take the risk of working with me. Obviously, I did not get the gig, and I did not appreciate the quick and harsh accusation.

Worried about the potential damage this could do to my career and credibility, I ran the article through Copyscape myself to see what it flagged. It flagged TWO sentences out of this 400-word article. To boot, these two sentences were meant to be a technical definition, something that you'd want to have verbatim to ensure accuracy. He also didn't see that I had included several hyperlinks throughout the article, including a hyperlink to the web page I got these two sentences from, because technical difficulties forced me to send him a text-only version instead of the actual document that included the hyperlinks (in my experience, one can't hyperlink in chat boxes). If he had been able to see the hyperlinks, he would have seen that I had hyperlinked this definition to the web page I got it from. I explained the technical difficulty to the client twice, but it didn't seem to matter. All that mattered was that some words matched some other words somewhere else online, and from that he concluded that the whole article was copied and that I'm not to be trusted.

Copyscape had also listed 20 results of copied content, except that these were 20 different sites that all had the same two sentences, so really it was one result instead of 20. Copyscape also didn't catch the survey statistics, which I actually did pull verbatim from the website. I don't think the client really perused these results, because he would have seen that they were a false positive.
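The "20 results that are really one" problem can be shown with a short sketch that groups hits by the passage they matched rather than counting each site separately. Everything here is an assumption for illustration: I don't know what format Copyscape reports its results in, so the `(url, matched_text)` pairs, the `example.com` addresses, and the placeholder passage are all made up.

```python
from collections import defaultdict


def group_by_passage(hits):
    """Group (url, matched_text) pairs by the passage they matched."""
    groups = defaultdict(list)
    for url, matched_text in hits:
        # Normalize case and whitespace so trivially different copies
        # of the same passage land in the same group.
        key = " ".join(matched_text.lower().split())
        groups[key].append(url)
    return dict(groups)


# Made-up data: 20 hypothetical sites that all carry the same passage.
passage = "A placeholder definition. It stands in for the two flagged sentences."
hits = [(f"https://example.com/site-{i}", passage) for i in range(1, 21)]

groups = group_by_passage(hits)
print(len(hits), "hits,", len(groups), "distinct passage")
```

Read this way, a long list of matching sites says only that one passage is widely reproduced online, which is exactly what you'd expect for a standard definition.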

And I am not the only one. A writer based in El Paso, Texas, who asked to remain anonymous, shared her story with me. Anonymous wrote a piece on gambling addiction, and the editor sent it back to her saying there was plagiarism. The results from Copyscape flagged a few phrases and a hotline number from a web page as the plagiarism. Her editor now wants her to rework the piece or write something entirely different. She could rework the piece, but Anonymous fears that the editor won't trust that the rest of her work is original.

I've since run a few more of my articles (ones that are published and live on the web) through the system, with mixed results. It caught some in their entirety. For others, it caught only sentences and statistics, not the whole article. There was one article where it didn't catch anything at all, leading me to believe that Copyscape isn't as reliable as people hope and think it is.

According to Wikipedia, plagiarism is "the mere copying of text, but also the presentation of another's ideas as one's own, regardless of the specific words or constructs used to express that idea". Meaning, in order for text to be considered plagiarized, it needs to be a copy or close copy of the text AND lack attribution to the original author or source. Yes, I copied that definition verbatim from Wikipedia, but it's not plagiarism, because I attributed the definition, placed it in quotes, and provided a hyperlink to the very web page I pulled it from. And lovely, lovely Copyscape flagged this paragraph as plagiarism, despite my extra efforts.

Attribution for online content is different from print content like an academic paper. It's not as if endnotes or footnotes really look great on a blog or web page. I think that proper online attribution means a hyperlink and/or a statement of the source, with quotation marks if the words are exact words. Since hyperlinks help in Google rankings, I don't think anyone would challenge 

In contrast, many so-called plagiarism detection services, LIKE COPYSCAPE, can only detect blatant word-for-word copies of text. A mere word-for-word copy is NOT plagiarism, I repeat, it is NOT plagiarism. It only counts if it is not properly attributed. There are many times when a word-for-word copy would be perfectly appropriate, like a definition, a direct quote, or a set of statistics.

Here, then, is a brief list from the Purdue Online Writing Lab of what needs to be credited or documented:

  • Words or ideas presented in a magazine, book, newspaper, song, TV program, movie, Web page, computer program, letter, advertisement, or any other medium
  • Information you gain through interviewing or conversing with another person, face to face, over the phone, or in writing
  • When you copy the exact words or a unique phrase (which means that a word-for-word copy is okay, as long as it is attributed)
  • When you reprint any diagrams, illustrations, charts, pictures, or other visual materials
  • When you reuse or repost any electronically-available media, including images, audio, video, or other media

There are, of course, certain things that do not need documentation or credit. This is important to note because services like Copyscape just look at the text; they don't look at how the text is used, what the text says, or whether the text comes with the proper attributions. Things that don't need documentation or credit, also taken from the Purdue Online Writing Lab's page on plagiarism, include:

  • Writing your own lived experiences, your own observations and insights, your own thoughts, and your own conclusions about a subject
  • When you are writing up your own results obtained through lab or field experiments
  • When you use your own artwork, digital photographs, video, audio, etc.
  • When you are using "common knowledge," things like folklore, common sense observations, myths, urban legends, and historical events (but not historical documents)
  • When you are using generally-accepted facts, e.g., pollution is bad for the environment, including facts that are accepted within particular discourse communities, e.g., in the field of composition studies, "writing is a process" is a generally-accepted fact.

I suspect that Anonymous and I aren't the only ones who've been wrongly accused of such an unethical deed. This one incident wouldn't be a big deal, except that for a professional writer, an accusation of plagiarism could have widespread and career-damaging consequences, whether the accusation is true or not. After all, a man cleared from death row after 20 years in prison doesn't suddenly have the ordeal over and done with. That sort of thing remains with you long after the whole thing is over, just like an "act" of plagiarism.

Writers who've been dealt an injustice because of faulty Copyscape results need to come forward with their stories, to show that they are not alone and that this is a problem. Those wanting our content need to understand what plagiarism really is, and realize that Copyscape shouldn't be taken as foolproof and absolute.

In Part II, I will complete a full statistical analysis of Copyscape, running all of my online articles through the system and summarizing the results. I have hundreds of articles live on the web, so the results should be valid. In Part III, I will offer alternatives to catching plagiarism and content fraud.