There is a lot around generative AI that most of us don’t understand. Some parts of the AI world are a “black box” to most people, meaning they can’t get a clear and straightforward explanation of “why” an AI model performs a certain way.
It’s easy to get frustrated.
At Copyleaks, our AI Detector understands an AI model’s behavior, but our clients often ask us, “How does the AI Detector understand AI? How is it determining what is AI-generated and what is human-written?” The short answer is very advanced mathematics. But to the average user who doesn’t have a PhD in Mathematics or Computer Science, it’s challenging to work with and comprehend the mathematics behind AI detection. Therefore, if we were ever going to explain how the AI Detector essentially works, we would have to make it as direct and clear as possible.
Our answer? AI Insights.
But before we get to that, let’s talk about AI detection and start unpacking the black box.
AI-generated text detection is pattern recognition.
Why pattern recognition?
Because every writing style, yours and mine included, is a collection of many kinds of patterns. This is true for people, and it is especially true for AI, because AI is a computer program built to generate text in a certain way, with specific rules and a certain formula.
In short, AI text is formulaic.
Now, some of these patterns are easily detected by people, but most of them are not, and for this reason, we need the help of computers.
One pattern that indicates a writing style is sentence length. We can easily see it when looking at a document, but a computer can calculate it in milliseconds. Another pattern that we can “see” is the use of hyphens in the text. Again, the computer can do this faster and more accurately than we can, but we can definitely see it.
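To make these two patterns concrete, here is a minimal Python sketch of how a computer might measure them. It is an illustration only, not the AI Detector's actual code: the function names are hypothetical, sentences are naively split on periods, exclamation marks, and question marks, and words are split on whitespace.

```python
import re


def sentence_lengths(text: str) -> list[int]:
    """Word count of each sentence, splitting naively on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def average_sentence_length(text: str) -> float:
    """Mean number of words per sentence (0.0 for empty text)."""
    lengths = sentence_lengths(text)
    return sum(lengths) / len(lengths) if lengths else 0.0


def hyphens_per_word(text: str) -> float:
    """Number of hyphen characters divided by the number of words."""
    words = text.split()
    return text.count("-") / len(words) if words else 0.0


# Hypothetical sample text, used only to exercise the functions.
sample = "AI text is formulaic. It is a well-known, easy-to-spot pattern!"
print(sentence_lengths(sample))
print(average_sentence_length(sample))
print(hyphens_per_word(sample))
```

A real stylometry pipeline would use a proper sentence tokenizer and many more features, but the idea is the same: each pattern becomes a number the computer can calculate quickly.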
Thousands of patterns have been discovered over the years, and they are studied in a research field called “stylometry.”
Identifying these patterns is the foundation of our AI Detector. But there’s more to it.
Our AI Detector doesn’t simply analyze individual patterns. Instead, it examines their relationships. Let’s go back to the previous example. Suppose the average sentence length is more than 5, the normalized number of hyphens in the text is 0.003 per word, and the number of syllables per word is 2.3. The model learns that this combination means something and treats it as a pattern in its own right.
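The idea of a combination acting as its own pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the real model learns such relationships from data, whereas here the rule is written by hand. The `StyleFeatures` class, the `matches_combination` function, and the tolerance values are all hypothetical; only the three reference numbers come from the example above.

```python
from dataclasses import dataclass


@dataclass
class StyleFeatures:
    avg_sentence_length: float
    hyphens_per_word: float
    syllables_per_word: float


def matches_combination(f: StyleFeatures) -> bool:
    """True only when all three conditions hold at the same time.

    The tolerances (0.001 and 0.1) are assumptions chosen for illustration.
    """
    return (
        f.avg_sentence_length > 5
        and abs(f.hyphens_per_word - 0.003) <= 0.001
        and abs(f.syllables_per_word - 2.3) <= 0.1
    )


# All three conditions hold, so the combined pattern fires.
both = StyleFeatures(avg_sentence_length=12.0, hyphens_per_word=0.003, syllables_per_word=2.3)
# Same sentence length and hyphen rate, but different syllable count: no match.
partial = StyleFeatures(avg_sentence_length=12.0, hyphens_per_word=0.003, syllables_per_word=1.4)

print(matches_combination(both))
print(matches_combination(partial))
```

The point of the sketch is that no single value decides the outcome. The signal exists only when the values appear together.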
Great! But what does that mean?
Sure, a high level of accuracy for AI detection is what we all want, but if you can’t explain how you arrived at that accuracy, then you’re still just inside the black box.
Try telling the everyday student, educator, HR admin, or marketer that their text was flagged as AI-generated because of complex mathematical formulas involving logarithms of syllable counts and sentence length ratios.
You lost them at “complex mathematical.”
This lack of transparency frustrated our clients because they needed more than accuracy; they needed an in-depth, clear explanation.
Cue AI Insights.
AI Insights was developed to uncover any pattern within the text that significantly impacts the main model’s decision and can be easily presented to people alongside their texts in a clear, concise manner that removes any mention of “complex mathematical.”
Using this method, we can show our users the exact phrases appearing in text that are more likely to have been written by AI than humans. We can even show the exact ratio between the phrase occurrences from our AI and human databases.
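As a rough illustration of what such a ratio could look like, here is a short Python sketch that counts three-word phrases in two small collections of text and compares them. It is not the AI Insights implementation: the function names, the choice of three-word phrases, the add-one smoothing (used so that phrases never seen in human text do not cause a division by zero), and the sample sentences are all assumptions made for this example.

```python
from collections import Counter


def ngram_counts(texts: list[str], n: int = 3) -> Counter:
    """Count every run of n consecutive lowercase words across the texts."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


def phrase_ratios(ai_texts: list[str], human_texts: list[str],
                  n: int = 3, smoothing: float = 1.0) -> dict[str, float]:
    """Smoothed ratio of AI occurrences to human occurrences per phrase."""
    ai = ngram_counts(ai_texts, n)
    human = ngram_counts(human_texts, n)
    return {
        phrase: (count + smoothing) / (human[phrase] + smoothing)
        for phrase, count in ai.items()
    }


# Tiny hypothetical "databases", for illustration only.
ai_texts = ["it is important to note that", "it is important to remember"]
human_texts = ["it is important i think"]

ratios = phrase_ratios(ai_texts, human_texts)
for phrase, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{phrase!r}: {ratio:.1f}x")
```

A phrase with a high ratio appears far more often in the AI collection than in the human one, which is the kind of number that can be shown next to a highlighted phrase.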
The release of AI Insights gives our users a whole new AI-detection system. They can now get an explanation of the results that highlights the exact phrases that AI uses more frequently than people, with the actual numbers to back it up.
By removing the black box surrounding AI detection, our clients gain more confidence in the results. Now, they have a tool to use when discussing AI with students, parents, and writers. It saves them time by pointing out the exact places where AI text was found in high proportions.
AI Insights represents just the beginning.
We’re already developing new features and products based on AI Insights that will provide even more value to our clients. As AI continues to evolve, our commitment to transparent, effective detection methods grows even stronger.
The future of AI detection is not just in accuracy but in providing meaningful, actionable insights that help users understand and address the challenges of AI-generated content in their work and learning environments.
Head of Data Science at Copyleaks