As the use of AI to generate text has grown, so has the number of AI detector tools designed to help mitigate potential risks and ensure responsible adoption. In response, tools that promise to make AI-generated text undetectable soon arrived on the scene. But are these undetectable AI tools worth it?
As it turns out, there are a few risks to be aware of.
As AI has continued to evolve, it has become clear that AI models can sometimes produce what are known as “hallucinations.” An AI hallucination occurs when the AI generates information that is factually incorrect. The same can happen with the tools that aim to make AI-generated text undetectable.
Generally speaking, undetectable AI tools are paraphrasing tools, meaning they take the text provided and rewrite it to sound more human without, theoretically, losing the meaning of the text. However, when the text is ‘humanized,’ the output can alter the original content’s meaning, including changing or omitting facts. In short, these tools paraphrase without being able to take the subject matter of the text into account.
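For readers who want a quick way to spot this kind of drift in their own text, here is a minimal Python sketch. It is not the method used in our tests. It only compares number-like tokens (years, counts) between an original passage and its rewrite, so it will miss changed names or reworded claims. The function names and the two sample sentences are hypothetical and written purely for illustration.

```python
import re


def numeric_facts(text: str) -> set[str]:
    """Collect number-like tokens (years, counts, decimals) from the text."""
    return set(re.findall(r"\d+(?:[.,]\d+)*", text))


def dropped_or_changed(original: str, rewritten: str) -> set[str]:
    """Return numbers that appear in the original but not in the rewrite."""
    return numeric_facts(original) - numeric_facts(rewritten)


# Illustrative sentences only; these are not outputs from the tools we tested.
original = "Ole Rømer was born in 1644 and estimated the speed of light in 1676."
rewritten = "Ole Rømer, born in 1644, worked out how fast light travels in 1675."

# Any number listed here was dropped or altered by the rewrite.
print(sorted(dropped_or_changed(original, rewritten)))
```

A non-empty result is only a prompt to reread the passage by hand; an empty result does not guarantee the facts survived.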
To determine the accuracy and other potential risks of these tools, we conducted tests between November 2023 and January 2024.
On November 2, 2023, we asked ChatGPT-3.5, the most widely used AI model, to generate content about Danish astronomer Ole Rømer. We then had some of the leading undetectable AI tools on the market ‘humanize’ it for us.
Here are the results.
Note that text highlighted in red marks where the undetectable AI tool altered facts contained in the original text.
Facts Changed or Omitted
First, we looked at the output of undetectable AI tools to determine whether the original text’s meaning was maintained or changed, including whether any facts were altered or omitted.
For this, we tested a few pieces of text. Here are the results.
Our testing uncovered another risk to be aware of.
Undetectable AI tools can often compromise sentence structure and overall grammar in an effort to generate more ‘humanized’ text.
On January 11, 2024, we resubmitted the initial ChatGPT-3.5-generated text to the same undetectable AI tools. This time, we noticed that while the facts stayed more intact than in the November outputs, the results had noticeable grammatical and structural errors, as shown in the examples below.
On January 11, 2024, we also asked ChatGPT-3.5 to generate a new essay, this time about the North Atlantic right whale.
We put portions of the new essay through the same undetectable AI tools as before, and the output showed both problems: facts were changed or omitted, and grammar and sentence structure were compromised.
Passing the AI Detector Test
But what about the whole point of these tools, which is to make AI-generated text read as human and so evade AI detectors?
To determine how effective these tools are for their intended purpose, on January 11, 2024, we put the ‘humanized’ text about the North Atlantic right whale through the Copyleaks AI Content Detector to see whether it would be detected as AI or human.
Clearly, there is still the possibility that text rendered as “undetectable” can, in fact, be flagged by AI detectors.
In the end, is it worth it?
As our tests show, the outputs from these tools can contain inaccurate facts and improper sentence structure and, most importantly, can still get flagged as AI.
Therefore, before deciding to spend the time having your AI text ‘humanized’ by undetectable tools, you should stop and consider whether the risk is worth it, especially since many of these tools require you to pay before you can even try them.
After all, if you’re going to end up rewriting your AI-generated content to fix what the undetectable tools generated, then why not just do it yourself in the first place?