Evaluating the Accuracy of the Copyleaks AI Image Detector

A Step-by-Step Methodology

We believe it is more important than ever to be fully transparent about the AI Image Detector’s accuracy, including rates of false positives and false negatives, as well as areas for improvement, to ensure responsible use and adoption. This analysis documents the full testing methodology behind our AI Image Detector V1.1 model.

Our model is designed to detect AI-manipulated portions of an image by producing an overlay of the detected areas. Testing verifies that the AI Image Detector achieves high detection accuracy in distinguishing between authentic human photos and AI-generated or AI-manipulated images, while maintaining an extremely low false positive rate.

Test date: February 1, 2026

Publish date: February 15, 2026

Model tested: V1.1

Methodology

We designed our evaluation process around a dual-team system to ensure top-level quality, standards, and reliability. Two independent departments evaluate the model: the Data Science team and the QA team. Each works with its own evaluation data and tools and has no access to the other team’s evaluation process. This separation keeps the evaluation results unbiased, objective, and accurate. It is also essential to note that all testing data is strictly separate from the training data: we test our models only on new images that have never interacted with our AI Image Detector. To keep our testing relevant and challenging, we continuously update our evaluation datasets to include images generated by the latest GenAI models.

Test-sets Construction

For every model release, the Copyleaks QA and Data Science teams independently gather and create a variety of testing datasets. Each dataset consists of a finite number of images with an expected label indicating its origin. The datasets are divided into two categories:

  • Fully-Human: Authentic images captured by a camera and not altered by generative AI. These were collected from verified datasets or manually created.

  • Fully-AI: Images generated entirely by the most up-to-date AI models.

AI-generated images were created using a wide variety of generative AI models. The tests were executed against the Copyleaks API, and we aggregated the scores to calculate the model’s performance. 

The evaluation was conducted exclusively on images that meet these technical requirements: a minimum dimension of 512×512 pixels, a file size under 32 MB, and a resolution under 16 megapixels, as defined in the documentation.
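These requirements can be expressed as a simple pre-flight check. The sketch below is illustrative only; the function name and structure are not part of the Copyleaks API, and whether "16 megapixels" is counted in decimal or binary units is an assumption here:

```python
# Illustrative pre-flight check for the documented input requirements:
# minimum dimension 512x512 pixels, file size under 32 MB, and
# resolution under 16 megapixels (assumed decimal megapixels).

MIN_DIMENSION = 512                  # pixels, per side
MAX_FILE_SIZE = 32 * 1024 * 1024     # 32 MB in bytes
MAX_RESOLUTION = 16 * 1_000_000      # 16 megapixels

def meets_requirements(width: int, height: int, file_size: int) -> bool:
    """Return True if an image satisfies the documented input constraints."""
    if width < MIN_DIMENSION or height < MIN_DIMENSION:
        return False
    if file_size >= MAX_FILE_SIZE:
        return False
    if width * height >= MAX_RESOLUTION:
        return False
    return True

# A 1920x1080 photo of ~4 MB qualifies; a 300x300 thumbnail does not.
print(meets_requirements(1920, 1080, 4 * 1024 * 1024))  # True
print(meets_requirements(300, 300, 100_000))            # False
```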

Evaluation Metrics

The product’s prediction takes the form of an overlay highlighting the AI-generated segments of an image. Overall performance is then evaluated by how accurately the model classifies each image relative to its ground-truth category.

Performance Metrics by Image Type

To provide a clear and robust measure of accuracy, we use different pixel-level metrics depending on the type of image being tested:

  • For Human Images: The key metric is the pixel-level False Positive Rate (FPR). For an image to be considered a successful detection (True Negative), the percentage of pixels incorrectly flagged as “AI” must be less than 5%. This stringent threshold ensures the model avoids falsely accusing authentic images.

  • For AI Images: The primary metric is the pixel-level True Positive Rate (TPR). For an image to be considered a successful detection (True Positive), the percentage of pixels correctly identified as “AI” must be greater than 95%. This ensures the model comprehensively recognizes fully generated content.
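The two per-image criteria above can be sketched in a few lines. Here the overlay is modeled as a flat list of 0/1 pixel flags (1 = flagged as AI); the helper names are illustrative and not part of any Copyleaks interface:

```python
# Per-image success criteria, with the overlay modeled as a flat list of
# 0/1 flags (1 = pixel flagged as AI). Thresholds follow the article:
# a human image passes if < 5% of pixels are flagged; an AI image
# passes if > 95% of pixels are flagged.

def ai_pixel_fraction(mask):
    """Fraction of pixels the model flagged as AI-generated."""
    return sum(mask) / len(mask)

def is_true_negative(mask):
    """Human image correctly cleared: fewer than 5% of pixels flagged."""
    return ai_pixel_fraction(mask) < 0.05

def is_true_positive(mask):
    """AI image fully recognized: more than 95% of pixels flagged."""
    return ai_pixel_fraction(mask) > 0.95

human_mask = [0] * 98 + [1] * 2   # 2% flagged -> successful True Negative
ai_mask = [1] * 97 + [0] * 3      # 97% flagged -> successful True Positive
print(is_true_negative(human_mask))  # True
print(is_true_positive(ai_mask))     # True
```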

Aggregated Reporting Metrics

The overall accuracy figures presented in the Results tables, such as TNR (Human Accuracy) and TPR (AI), are aggregated from these pixel-level success criteria. For example, the TNR is the percentage of all tested human images that successfully met the <5% false positive pixel threshold.
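Aggregating those per-image outcomes into the reported figures is then a simple ratio. The sketch below assumes each image has already been reduced to its flagged-pixel fraction; the data values are made up for illustration:

```python
# Aggregate pixel-level outcomes into the reported accuracy figures.
# Each list holds, per image, the fraction of pixels flagged as AI.

def tnr(human_fractions):
    """Share of human images meeting the <5% false-positive pixel threshold."""
    passed = sum(1 for f in human_fractions if f < 0.05)
    return passed / len(human_fractions)

def tpr(ai_fractions):
    """Share of AI images meeting the >95% true-positive pixel threshold."""
    passed = sum(1 for f in ai_fractions if f > 0.95)
    return passed / len(ai_fractions)

human = [0.00, 0.01, 0.03, 0.20]   # one image exceeds the 5% threshold
ai = [0.99, 0.97, 0.80, 1.00]      # one image falls below 95% coverage
print(f"TNR (Human Accuracy): {tnr(human):.0%}")  # 75%
print(f"TPR (AI): {tpr(ai):.0%}")                 # 75%
```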

Results

Data Science Team Test

The Data Science team conducted the following independent test on a large, diverse dataset containing images of varying resolutions, capturing devices, image generators, and content types.

Metric      Human Images (n=31,374)   AI Images (n=33,947)
Accuracy    98.6%                     97.6%

QA Team Test

The QA team conducted an independent test using images created explicitly for evaluation after the model was trained. The test dataset comprises images of varying resolutions, captured by different devices, generated by various image generators, and featuring diverse content types.

Metric      Human Images (n=10,000)   AI Images (n=10,000)
Accuracy    99.3%                     98%

Error Analysis

During the evaluation process, we identify and analyze incorrect assessments so the Data Science team can correct their underlying causes. All errors are systematically logged and categorized by type as part of a root cause analysis process, which aims to uncover underlying causes and recurring error patterns, ensuring the ongoing improvement and adaptability of our model. These insights feed directly into future versions of the model.

Limitations

While our model achieves state-of-the-art results, no detection system is perfect, and our model can make mistakes, such as misclassifying a specific set of pixels.

The AI Image Detector is specifically trained to identify manipulations from the latest generative AI tools. The system does not currently detect other common image alterations, including:

  • Manual Edits: Changes made by a person using traditional photo editing software like Photoshop.

  • Collages: Images created by combining parts of different authentic photos.

  • Simple Filters & Adjustments: Applying photographic effects (e.g., “vintage,” black & white) or basic adjustments (e.g., sharpening, blurring, cropping).