What is cognitive atrophy? The MIT EEG finding, explained.

The MIT Media Lab fitted 54 adults with EEG sensors while they wrote essays. The ChatGPT group showed roughly 55% lower alpha-theta coupling in prefrontal deep-thinking regions, and more than 83% could not recall a sentence from what the machine wrote for them. Here is what the study actually measured, and what it did not.

Published: April 22, 2026 · Updated: May 19, 2026

The phrase “cognitive debt” entered the public vocabulary in June 2025, when a group at the MIT Media Lab uploaded a 200-page preprint to arXiv. Eight authors, 54 participants, four EEG sessions, and a phrase that — for the first time — named a thing many knowledge workers had started quietly suspecting about themselves. The paper’s title: Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task.

This essay is the primary-source explainer. It walks through what Kosmyna et al. did, what they measured, what the numbers actually mean, and the specific claims the study supports versus the ones it does not. If you have seen the headlines and wondered how seriously to take them — or wondered whether you qualify as a case study in your own life — this is the piece to read before the rest of the book.

What the study actually did

Fifty-four adults, ages 18 to 39, came into the MIT Media Lab to write short essays on SAT-style prompts — questions like “Do we ever make sacrifices that are worthwhile?” and “Is loyalty a virtue?” Participants were split into three groups. The first group wrote with ChatGPT (GPT-4o) in an adjacent browser window. The second group wrote with a conventional Google Search tool. The third group wrote unassisted — pen-on-screen, no external reference material, no autocomplete.

Each participant wore a multi-channel EEG cap. EEG — electroencephalography — measures the brain’s electrical activity in real time through scalp electrodes; it cannot localize signals as precisely as fMRI in three-dimensional space, but it captures the millisecond-by-millisecond rhythm of cortical activity better than any imaging technique that depends on blood flow. The cap was calibrated to cover the prefrontal cortex — the region underneath the forehead — and the parietal and temporal areas that the literature already associates with sustained attention, verbal working memory, and executive function. The key dependent measure was cross-frequency coupling — specifically the way slow theta rhythms (4–8 Hz) and alpha rhythms (8–13 Hz) interact in prefrontal channels, a signature long associated with effortful, integrative thought.
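As a rough illustration of what a coupling measure between the two bands can look like, the sketch below band-limits one channel to theta (4–8 Hz) and alpha (8–13 Hz), extracts each band's amplitude envelope, and correlates the two envelopes. It is a stand-in under stated assumptions, not the study's pipeline: envelope correlation is only one of several cross-frequency measures, and the function names, the 250 Hz sampling rate, and the synthetic test signal are all hypothetical.

```python
import numpy as np

THETA_BAND = (4.0, 8.0)   # Hz, the theta range given in the text
ALPHA_BAND = (8.0, 13.0)  # Hz, the alpha range given in the text


def bandpass(signal, fs, low, high):
    """Keep only frequency components in [low, high) Hz using an FFT mask."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= low) & (freqs < high)
    return np.fft.irfft(spectrum * mask, n=len(signal))


def amplitude_envelope(signal):
    """Instantaneous amplitude from the FFT-based analytic signal."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def theta_alpha_envelope_correlation(signal, fs):
    """Pearson correlation between the theta and alpha amplitude envelopes."""
    theta_env = amplitude_envelope(bandpass(signal, fs, *THETA_BAND))
    alpha_env = amplitude_envelope(bandpass(signal, fs, *ALPHA_BAND))
    return float(np.corrcoef(theta_env, alpha_env)[0, 1])


def synthetic_channel(fs, seconds, alpha_modulator_hz):
    """Hypothetical test signal: a 6 Hz theta and a 10 Hz alpha rhythm.

    The theta amplitude is modulated at 0.5 Hz. The alpha amplitude is
    modulated at alpha_modulator_hz, so 0.5 means the two bands rise and
    fall together and any other value decouples them.
    """
    t = np.arange(int(fs * seconds)) / fs
    theta_mod = 1.0 + 0.5 * np.sin(2 * np.pi * 0.5 * t)
    alpha_mod = 1.0 + 0.5 * np.sin(2 * np.pi * alpha_modulator_hz * t)
    theta = theta_mod * np.sin(2 * np.pi * 6.0 * t)
    alpha = alpha_mod * np.sin(2 * np.pi * 10.0 * t)
    return theta + alpha


FS = 250  # Hz, assumed sampling rate
coupled = theta_alpha_envelope_correlation(synthetic_channel(FS, 20, 0.5), FS)
uncoupled = theta_alpha_envelope_correlation(synthetic_channel(FS, 20, 0.3), FS)
print(f"shared modulator:   {coupled:.2f}")
print(f"separate modulator: {uncoupled:.2f}")
```

On the synthetic channel, a shared modulator should give a correlation close to 1 and separate modulators a correlation close to 0. Real EEG would need artifact rejection, proper filter design, and statistics across trials before a number like this meant anything.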

Participants wrote four essays over the course of the study, separated by roughly a week each. In sessions 1, 2, and 3, each participant used whichever tool their group was assigned. In session 4 — this is the part the headlines ignored — the groups swapped. The ChatGPT group was told to write unassisted for the first time in a month. The unassisted group was given ChatGPT.

Three measurements came out of this setup. The first was the neural signature: how engaged were the prefrontal networks, moment-to-moment, while the essay was being written? The second was behavioral: how well could the writer recall their own essay, verbatim, immediately after finishing? The third was judgment-based: a pool of English teachers and two separate LLMs scored the essays for originality, argument quality, and what the paper called “soul” — whether an essay read like a thinking human had written it.

What they found

The neural result is the one that caught the press. Across the first three sessions, the ChatGPT group showed measurably reduced alpha-theta coupling in the prefrontal, parietal, and temporal regions associated with effortful thought: about 55% lower overall than the unassisted group, with the sharpest reductions in the dorsolateral prefrontal cortex (the region most closely associated with executive control and working-memory maintenance). The Google-search group fell roughly in the middle, with more cognitive engagement than ChatGPT and less than unassisted, and its activity was distributed differently: more activation in visual-search and decision-making circuits, less in language-production circuits.

The recall result is the one that should have caught the press harder. Immediately after finishing a session, participants were asked to quote a single sentence from the essay they had just submitted. In the ChatGPT group, 83.3% could not: not a paraphrase, not a near-quote, just any sentence from the text that had been produced under their name five minutes earlier. In the unassisted group, 88.9% could, close to the mirror image. The Google-search group came in around 75% successful recall, closer to unassisted than to ChatGPT.
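The recall percentages are easier to read as headcounts. A minimal sketch, assuming the 54 participants were split evenly into three groups of 18 (the text gives the total and the number of groups, not the per-group size) and using a hypothetical helper name:

```python
TOTAL_PARTICIPANTS = 54
GROUPS = 3
GROUP_SIZE = TOTAL_PARTICIPANTS // GROUPS  # assumption: an even split, 18 per group


def headcounts_matching(reported_percent, group_size=GROUP_SIZE, tolerance=0.05):
    """Return every headcount whose percentage rounds to the reported figure."""
    return [
        count
        for count in range(group_size + 1)
        if abs(100 * count / group_size - reported_percent) < tolerance
    ]


chatgpt_failed_recall = headcounts_matching(83.3)
unassisted_recalled = headcounts_matching(88.9)

print(f"ChatGPT group, could not quote: {chatgpt_failed_recall} of {GROUP_SIZE}")
print(f"Unassisted group, could quote:  {unassisted_recalled} of {GROUP_SIZE}")
```

Under that assumption the figures correspond to 15 of 18 and 16 of 18. With groups this small, a single participant moves either percentage by more than five points.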

The judgment result is the one that muddies the commentary. The English teachers rated the ChatGPT essays as formally competent but “soulless” — a word several judges converged on without coordination. The LLM judges, by contrast, rated the ChatGPT essays highest on most rubric dimensions. The authors reported this as a finding in its own right: a generation of writers who use LLMs to produce and LLMs to grade will select for exactly the qualities that human readers already identify as the hollow center of AI prose.

And then there is session 4, the swap, which is where “cognitive debt” becomes a specific claim rather than a metaphor. When the ChatGPT-first group was asked to write unassisted for the first time, their neural activity did not recover to baseline. It stayed depressed. The pattern looked less like “took a tool away, brain fired back up” and more like “the circuits have not been loaded in a month and they did not load on cue.” The unassisted-first group, flipped to ChatGPT, also did not match the ChatGPT group’s baseline, but they missed it in the opposite direction: they kept working, kept activating, and produced essays that the English-teacher panel rated as the strongest of the four conditions. The tool was used; it was not confused for the writer.

What “55%” actually refers to

Every number in a brain-imaging paper is a composite, and the 55% figure that has traveled furthest in the press is no exception. It is not a single number from a single channel. It is the authors’ summary estimate across their three primary cortical regions of interest (prefrontal, parietal, temporal), averaged across sessions 1 through 3, of the relative reduction in task-related alpha-theta coupling in the ChatGPT condition compared with the unassisted condition. It is also statistically significant at their pre-registered threshold.
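To make the idea of a composite concrete, here is a sketch of how a single headline figure can be built from per-region, per-session values. Every number in it is a placeholder chosen for illustration, not data from the paper, and the aggregation rule (a plain mean of per-cell relative reductions) is an assumption about one reasonable way to summarize, not a description of the authors' method.

```python
from statistics import mean

# Placeholder coupling values, NOT taken from the paper.
# Each list holds sessions 1, 2, 3 with the unassisted condition normalized to 1.0.
UNASSISTED = {
    "prefrontal": [1.0, 1.0, 1.0],
    "parietal": [1.0, 1.0, 1.0],
    "temporal": [1.0, 1.0, 1.0],
}
CHATGPT = {
    "prefrontal": [0.35, 0.30, 0.25],
    "parietal": [0.50, 0.45, 0.40],
    "temporal": [0.60, 0.60, 0.60],
}


def relative_reduction(baseline, treated):
    """Fraction by which `treated` falls below `baseline`."""
    return (baseline - treated) / baseline


def per_region_reduction(baseline_by_region, treated_by_region):
    """Mean relative reduction across sessions, reported separately per region."""
    return {
        region: mean(
            relative_reduction(b, t)
            for b, t in zip(baseline_by_region[region], treated_by_region[region])
        )
        for region in baseline_by_region
    }


def composite_reduction(baseline_by_region, treated_by_region):
    """Mean relative reduction across every region-by-session cell."""
    return mean(
        relative_reduction(b, t)
        for region in baseline_by_region
        for b, t in zip(baseline_by_region[region], treated_by_region[region])
    )


by_region = per_region_reduction(UNASSISTED, CHATGPT)
composite = composite_reduction(UNASSISTED, CHATGPT)

for region, value in by_region.items():
    print(f"{region:>10}: {value:.0%}")
print(f"{'composite':>10}: {composite:.0%}")
```

With these placeholders the composite comes out at about 55% while the individual regions range from roughly 40% to 70%, which is the general caution with any summary figure: it can be accurate and still hide a wide spread underneath.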

What this number does support: the claim that using ChatGPT to write reduces the overall cognitive load of the writing task, measurably and repeatably, across adult participants. That is not a surprise. What is a surprise is the magnitude — a more-than-halving of measurable coupling — and the persistence of the reduction after the tool is removed.

What the number does not support: the claim that AI “shrinks your brain.” EEG measures real-time electrical activity, not tissue structure. The paper does not claim volumetric change. The weeks-to-a-month timescale covered by the experiment is also too short to demonstrate the kind of structural atrophy that long-term disuse produces in other contexts. The right read is: the circuit is not firing when it would otherwise fire, and after a month of not firing it is slower to reload. That is the beginning of atrophy, not the finished state, and it is exactly the finding that justifies a 30-day protocol as a preventive dose rather than a rescue.

Is this peer-reviewed?

Not yet. The preprint was uploaded to arXiv on June 10, 2025, under DOI 10.48550/arXiv.2506.08872, and entered peer review through a medical-informatics journal shortly after. Peer-review timelines for studies of this size routinely run nine to eighteen months, so a final published version should appear somewhere in late 2026 or 2027.

This is also why The Anti-AI Brain does not rest its structural argument on Kosmyna alone. The flagship study is the most vivid single piece of evidence — the EEG coupling drop is intuitive in a way survey data never is — but it is one of roughly a dozen anchor studies across four independent labs. Sparrow, Liu, and Wegner published the “Google Effect” on offloaded memory in Science in 2011 (DOI: 10.1126/science.1207745). Lee and Sarkar at Microsoft Research published on AI-induced critical-thinking reduction across 319 knowledge workers at CHI 2025. A four-university team (Carnegie Mellon, Oxford, MIT, UCLA) ran a randomized controlled trial across 1,222 participants and showed in 2026 that even brief AI assistance reduces persistence and hurts independent performance after the tool is removed. Maguire’s London taxi drivers showed hippocampal structural change from navigation in PNAS in 2000 (DOI: 10.1073/pnas.070039597). Ward, Duke, Gneezy, and Bos measured a working-memory cost from mere smartphone presence in 2017. If Kosmyna’s preprint does not survive peer review in its current form, the book’s argument loses its most vivid illustration, but none of its load-bearing claims.

The research page lists the full set with DOI links to each.

What “cognitive debt” adds to the vocabulary

Before Kosmyna, the phenomenon had no name. “Digital distraction” was close but wrong — it described a different circuit, attention rather than cognition. “Brain rot” was close and vernacular but imprecise. “The Google Effect” named one specific phenomenon — the failure to encode information you expect to retrieve — but not the broader class. “Outsourcing” was a metaphor that sat at the border between economics and neurology without fully landing in either.

“Cognitive debt” is a better word because it names three things at once. First, it names the accumulation — each session of delegated thinking deposits a little more into the account. Second, it names the balance — you do not owe nothing just because you feel fine today; you owe the sum of the deferred loads. Third, it names the interest rate — the longer a circuit goes unloaded, the more expensive each reload becomes, because the neural machinery that supports “hard cognitive work” is use-dependent at every time scale from the synapse to the myelin sheath.

It is a better frame than “atrophy” for a popular audience because debt is survivable. You can pay debt down. A circuit is not gone because it is quiet; it is quiet, and circuits respond to being loaded. The 30-day protocol in The Anti-AI Brain is named after the principle that you cannot recover what you do not reload, and that the reload has a structure — order, dose, duration — that is specific to what the brain actually does.

Why it matters more than Deep Work did

This is the line that separates the 2025+ literature from the 2010s literature on attention and focus. The key claim is not that you are distracted — that was Newport’s argument, and Hari’s, and it was correct. The key claim is that the thing AI can now do for you is not the same thing a notification feed could do for you. A phone that buzzes pulls your attention away from a task. An LLM that produces the output pulls your cognition away from the task. The first problem degrades how well you execute. The second degrades what you would have been capable of executing at all.

That distinction matters for the prescription. A focus-protection protocol (no notifications, deep-work blocks, time-boxing) is sufficient for distraction. It is not sufficient for debt. You can sit in a silent room for four hours with no phone, use ChatGPT to write every paragraph of a report, and emerge at the end of the day with no recall of the report, no argument you could defend, no judgment you could reproduce — but also no Twitter tabs open. The output exists; the cognition that would normally accompany producing it did not occur.

This is the reason The Anti-AI Brain is structured the way it is. Part I is the diagnostic — the recognition that you are in the new condition, not the old one. Part II is the 30-day protocol — the reload. Part III is the long game: what the cognitive practice looks like for someone who is going to use AI for the next thirty years and would rather not end those thirty years with nothing underneath the outputs.

What this essay does not claim

A few things the Kosmyna result does not permit anyone to say with a straight face:

It does not show that your IQ goes down from using ChatGPT. IQ is a different construct, measured differently, on different time scales.
It does not show that children are more vulnerable than adults. The study population was 18–39. Separate literature on pediatric screen use (Hutton et al., 2020, in JAMA Pediatrics) addresses that question with different methods.
It does not show that all LLM use causes the effect. The study tested one specific task — untimed essay writing on a prompt — with one specific tool used in one specific way (generate-and-accept, not sparring-partner mode). The book argues repeatedly that the dose and the pattern of use are what matter, and that is exactly the variable the Kosmyna design did not manipulate.
It does not show that the effect is permanent. The session-4 result suggests the effect is persistent on the time scale of a month, which is itself striking, but the longitudinal question — “what does six months look like, with and without reload practices” — is the next study someone needs to run.

FAQ

Is this the only study showing AI reduces brain activity? No. It is the most vivid single piece of evidence because EEG coupling is a clean, intuitive signal, but the surrounding literature is substantial. Lee and Sarkar (2025) at Microsoft Research surveyed 319 knowledge workers and found that higher trust in AI correlated with self-reported reductions in critical thinking; Shaw and Nave at Wharton found an “AI confidence” effect on task completion that inverts when participants are asked to defend their reasoning without the model; Messeri and Crockett (2024, Nature) argue more broadly for “illusions of understanding” as a risk in AI-mediated knowledge work; and Liu, Christian, Dumbalska, Bakker and Dubey (Carnegie Mellon / Oxford / MIT / UCLA, 2026) showed in a 1,222-participant RCT that even brief AI assistance reduces persistence and hurts independent performance after the tool is taken away.

Does it matter that the study is a preprint? Preprints are a normal first step in scientific publishing — they make a full paper immediately readable so the community can respond — and most eventually pass peer review. The appropriate posture with any preprint is provisional: take the result seriously, watch for the peer-reviewed version, and treat a claim that rests only on that single preprint with more skepticism than a claim that rests on a convergent body of work. The book uses this study as an anchor, not a pillar.

What about people who write code with AI — does the same thing happen? That specific comparison has not been run yet with EEG. Early survey and behavioral work on AI-assisted programming (GitHub Copilot studies, Cursor usage data) shows analogous patterns — reduced self-reported effort, degraded debugging performance on unfamiliar code — but the neuroimaging piece of the puzzle is not in place. The reasonable prior is that the finding generalizes, because the mechanism (delegation before articulation) is the same; the conservative posture is to treat the claim as an extrapolation until someone publishes.

How do I tell if I have cognitive debt? The nearest available instrument is the Anti-AI Brain Score, a five-minute diagnostic built around the four circuits the protocol treats. It is not the Kosmyna protocol — you cannot replicate EEG cross-frequency coupling with a browser quiz — but it flags the specific behaviors (offloaded recall, outsourced decisions, compressed focus windows, passive reading) that the research predicts will be elevated in a reader carrying debt.

Can I reverse it? The short answer is yes, on the evidence available. The neuroplasticity literature is broad and long-established: Maguire’s taxi drivers (2000, PNAS) showed hippocampal structural change in adults; McGill’s 2025 brain-training result showed a roughly ten-year reversal on aging-related indices in ten weeks; the Lambert 2006 review describes the effort-reward-resilience coupling that underwrites the effect. The question is not whether reversal is possible — the mechanism is well-documented — but whether you will do the practice long enough to see it. That is what the 30-day protocol is calibrated for.

Do I have to stop using AI? No, and the book is explicit about this. Quitting is not the prescription because quitting is not the answer for anyone whose work involves AI, which, in 2026, is roughly everyone. The argument is that the dose and the pattern determine whether the tool makes you sharper or thinner. The Anti-AI Brain teaches the dose.

The essay above walks through the 2025 MIT Media Lab preprint as a primary source. Every numerical claim is traceable back to the paper (arXiv:2506.08872) or to the surrounding literature catalogued on the research page. Definitions of terms used here — pharmakon, cognitive debt, the Ten-Minute Wall, the five-layer model — are on the glossary. Sample chapters of the book, including the full treatment of the four cognitive circuits the study maps to, are on the readers page.