How AI Detectors Work: Perplexity, Burstiness & Classifiers Explained
If you've ever pasted text into an AI detector and gotten a confident "98% AI" score, you might assume the tool somehow knows who wrote it. It doesn't. AI detectors don't detect authorship — they measure statistical properties of text and estimate how likely a language model produced it. Understanding those properties demystifies the whole category.
Nearly every modern detector relies on some combination of three signals.
Signal 1: Perplexity
Perplexity measures how "surprised" a language model is by your text. Technically, it is the exponential of the average negative log-probability the model assigns to each token given the tokens before it, so it tracks how predictable each word is in context. When a model reads a sentence and every next word is exactly what it would have chosen, perplexity is low. When the text keeps making unexpected, less-probable choices, perplexity is high.
Here's the catch for AI text: large language models generate by picking statistically likely tokens, so their output naturally has low perplexity. Human writing, full of personal quirks and unconventional phrasing, tends to score higher. Detectors treat low perplexity as a sign of machine authorship.
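To make the idea concrete, here is a minimal sketch of a perplexity calculation. It assumes a toy bigram model built from word counts with add-one smoothing, which is a stand-in of our own choosing: real detectors score text with a large neural language model, and the function names and tiny corpus below are hypothetical. The arithmetic is the same, though: average the log-probability of each token, negate it, and exponentiate.

```python
import math
from collections import Counter

def train_bigram(corpus_tokens):
    """Count single tokens and adjacent token pairs in a training corpus."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams):
    """exp(-(1/N) * sum(log p(token | previous token))).

    Probabilities come from add-one (Laplace) smoothed bigram counts,
    so unseen word pairs get a small but non-zero probability.
    """
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to score a bigram")
    vocab_size = len(unigrams) + 1  # one extra slot for unseen tokens
    log_prob_sum = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob_sum += math.log(p)
    return math.exp(-log_prob_sum / (len(tokens) - 1))

# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
unigrams, bigrams = train_bigram(corpus)

predictable = "the cat sat on the mat".split()
surprising = "mat the on rug cat dog".split()

low = perplexity(predictable, unigrams, bigrams)
high = perplexity(surprising, unigrams, bigrams)
print(f"predictable: {low:.2f}  surprising: {high:.2f}")
```

The word order the toy model has seen before scores lower than the scrambled one. That is the same asymmetry a detector relies on, just at a much smaller scale.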
Signal 2: Burstiness
Burstiness measures variation — specifically, how much perplexity (and sentence length and structure) fluctuates across a document. Humans write in bursts: a long sentence with three clauses, then a short one, then a fragment. Older language models produced flatter, more uniform output, so low burstiness became another red flag.
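There is no single agreed formula for burstiness, so treat the following as one rough proxy rather than what any particular detector computes. This sketch assumes sentence length in words as the quantity being measured and uses the coefficient of variation (standard deviation divided by mean). The function names and the naive sentence splitter are hypothetical; a real tool would more likely track variation in per-sentence perplexity.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on ., ! and ? (a naive splitter) and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence length: 0.0 means perfectly uniform."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = (
    "I waited for nearly an hour in the rain before the bus finally showed up. "
    "Nothing. Then it left."
)

print(burstiness(uniform), burstiness(varied))
```

Three four-word sentences score exactly zero, while the long-short-short passage scores well above it.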
This is why one of the most effective ways to make text read as human is simply to vary sentence length dramatically — a technique we cover in our guide to making ChatGPT sound human.
Signal 3: Trained classifiers
The most modern detectors don't stop at computing perplexity directly: they train a classifier (often a transformer model) on large labeled sets of human and AI text. The classifier learns subtle "fingerprints": certain transitional structures, certain rhythm patterns, certain phrasings that correlate with machine output. This is how tools like Turnitin's AI model and GPTZero have improved at catching paraphrased AI content.
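The training idea can be sketched at toy scale. The example below is not how Turnitin or GPTZero work internally. It assumes a plain logistic regression over two hand-picked features (a scaled perplexity score and a burstiness score), where production detectors typically feed raw text to a transformer. The six training rows are made-up numbers for illustration, not measurements, and every name is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log-loss w.r.t. the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def predict_ai_probability(x, weights, bias):
    """Return the model's estimated probability that the sample is AI-written."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Invented training data: [scaled perplexity, burstiness]; label 1 = AI, 0 = human.
features = [
    [0.20, 0.10], [0.30, 0.20], [0.25, 0.15],  # flat, predictable samples
    [0.80, 0.70], [0.70, 0.90], [0.90, 0.60],  # varied, surprising samples
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(features, labels)
flat_score = predict_ai_probability([0.20, 0.15], weights, bias)
varied_score = predict_ai_probability([0.85, 0.80], weights, bias)
print(f"flat text: {flat_score:.2f}  varied text: {varied_score:.2f}")
```

Note that the output is a probability, not a verdict. The classifier only reports which side of a learned boundary the features fall on.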
Why detectors produce false positives
Because all three signals measure patterns, not origin, any human writing that happens to be very predictable can get flagged. This isn't hypothetical:
- A widely cited Stanford study found AI detectors misclassified a majority of essays written by non-native English speakers as AI-generated, because that writing tends to use more common vocabulary and simpler structures.
- Formal academic prose — literature reviews, methods sections — is naturally uniform and can score high on AI indicators.
- There are documented cases of students wrongly accused after a detector flagged original work, with the score later acknowledged as an error.
This is the core limitation: a detector can tell you text looks statistically like AI output, but it cannot prove who actually wrote it.
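A quick back-of-the-envelope calculation shows why a flag is not proof. The sketch below applies Bayes' rule to hypothetical rates that we picked purely for illustration (they are not taken from any study or vendor): 10% of submissions are AI-written, the detector catches 90% of those, and it wrongly flags 5% of human work.

```python
def prob_ai_given_flag(base_rate, true_positive_rate, false_positive_rate):
    """Bayes' rule: share of flagged texts that are actually AI-written."""
    flagged_ai = base_rate * true_positive_rate
    flagged_human = (1 - base_rate) * false_positive_rate
    return flagged_ai / (flagged_ai + flagged_human)

# All three rates are assumptions chosen for the example.
posterior = prob_ai_given_flag(
    base_rate=0.10,
    true_positive_rate=0.90,
    false_positive_rate=0.05,
)
print(f"{posterior:.2f}")  # about 0.67
```

Under those assumed numbers, roughly one flagged text in three was written by a human, even though the detector sounds accurate on paper.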
What this means in practice
For writers, three takeaways follow directly from the mechanics:
- Varied rhythm matters most. Burstiness is the easiest signal to influence with genuinely better writing.
- Generic phrasing hurts you. The more predictable your word choices, the lower your perplexity, and the more "AI" your text reads, even if you wrote every word.
- Scores aren't proof. If you're ever flagged unfairly, the false-positive research above is your evidence that detector scores are probabilistic, not definitive.
The takeaway
AI detectors are statistical instruments measuring predictability and uniformity. They're a useful signal and a flawed judge. Knowing how they work helps you write more naturally — and helps you push back when a tool gets it wrong. For a deeper look at the academic side, see our guide on whether Turnitin detects ChatGPT.
Frequently asked questions
What signals do AI detectors actually measure?
Most detectors combine three signals: perplexity (how predictable the word choices are), burstiness (how much sentence rhythm varies), and a trained classifier that has learned the statistical fingerprint of AI writing.
Why do AI detectors flag human writing?
Because they measure statistical patterns, not authorship. Writing that is very consistent and uses everyday vocabulary, as is typical of non-native English speakers and of formal academic prose, can look 'too predictable' and trigger a false positive.
Are AI detectors reliable?
They're useful as a signal but not as proof. Independent studies show accuracy varies widely and false-positive rates are non-trivial, which is why most institutions treat detector scores as one input, not a verdict.