Why Most Brain Training Fails — And What Effective Cognitive Training Needs to Do
Most brain training trains the game: the gains are largely practice effects. That does not mean the whole idea is wrong.
There is a version of brain training that feels like it works. You practise a working-memory task for a few weeks, your score goes up, and you feel sharper. The app shows you a graph. Progress.
The problem is that “progress on the graph” and “progress in your thinking” are not the same thing. And a large body of research now makes clear that most cognitive training — the kind sold in apps, subscriptions, and productivity programmes — mainly produces one thing: getting better at the training.
That sounds obvious, but it took researchers years of careful meta-analyses to establish just how hard it is to show that cognitive training improves anything outside the task being trained. One of the most thorough meta-analyses of working-memory training found reliable gains on trained and closely related tasks, but weak evidence for broad improvement in reasoning, learning, or intelligence (Melby-Lervåg et al., 2016). You get better at the game. The game does not reliably make you smarter.
This is worth sitting with, because the promise of brain training is not “you’ll get better at this specific game.” The promise is transfer: that practising something cognitively demanding will strengthen the underlying machinery of cognition. And transfer is genuinely difficult to produce, and even harder to measure honestly.
Near transfer is easy. Far transfer is hard.
The field distinguishes three levels of transfer that are worth knowing:
Near transfer is improvement on the trained task or tasks that look very similar. This is almost always achievable. It is also the least interesting outcome, because it mostly reflects task-specific learning — new routines for the specific format — rather than broader cognitive change.
Intermediate transfer is applying the same underlying structure in a new but related format. Harder to produce, but plausible with good design.
Far transfer is the real goal: applying what you have learned in genuinely different situations — new problem types, new domains, real decisions, real work. This is where the evidence is cautious. It is also where most brain-training marketing quietly skips past the science.
Most apps are implicitly promising far transfer while only reliably delivering near transfer. That gap is not a minor footnote. It is the entire argument.

Why training often fails to transfer
There are several reasons a trained skill does not transfer, but one of the clearest is surface overfitting. The learner gets fluent at the look and feel of the task — the colour scheme, the timing, the format — without building a cognitive structure that survives a change in wrapper.
Perceptual learning research shows that learning can be highly specific to surface features: location, orientation, modality, task format. Get better at a visual n-back task and you may not be substantially better at anything that doesn’t look exactly like a visual n-back task.
A second problem is that most working-memory training asks users to remember items (letters, locations, sounds) rather than relations. But fluid reasoning, the kind that matters for learning, problem-solving, and real-world judgment, is not mainly about holding more items. It is about holding, comparing, and transforming relations between things. Research consistently finds that relational integration predicts fluid reasoning over and above simpler working-memory measures. Training that stays at the item level is training at the wrong level.
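To make the item-versus-relation distinction concrete, here is a minimal Python sketch. Both trial types and both function names are hypothetical illustrations, not tasks taken from the research mentioned above or from any particular app. The item trial only asks whether a probe was in the memorised set. The relational trial asks for a conclusion that no single premise states, so it has to be derived by combining premises.

```python
from itertools import product


def item_trial(memorised, probe):
    """Item-level check: was the probe one of the memorised items?"""
    return probe in set(memorised)


def relational_trial(premises, query):
    """Relation-level check: does the query follow from the premises?

    Each premise (a, b) is read as "a is greater than b". The answer is
    found by chaining premises until nothing new can be derived, so no
    single premise has to contain it.
    """
    known = set(premises)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(known), repeat=2):
            if b == c and (a, d) not in known:
                known.add((a, d))
                changed = True
    return query in known


# Item trial: answered by looking up one stored item.
print(item_trial(["K", "T", "R"], "T"))  # True

# Relational trial: "A > C" appears in no premise; it must be integrated.
premises = [("A", "B"), ("B", "C")]
print(relational_trial(premises, ("A", "C")))  # True
print(relational_trial(premises, ("C", "A")))  # False
```

The point of the contrast is only that the second function cannot succeed by storage alone. Whether a human learner's relational trial taps the same demand is an empirical question the code does not settle.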
A third problem is strategy. Someone can become skilled at a cognitive game without understanding the transferable structure underneath it. Without explicit instruction on why a technique works and when to apply it outside the training context, the skill stays inside the app.
What better training would actually need to do
This is where I want to push past the usual critique-of-brain-training piece, because the failure of most apps does not mean the idea of structured cognitive training is wrong. It means that most implementations are too thin.
A training protocol that takes transfer seriously would need to combine several things:
Attention and cognitive control as a foundation. Before asking someone to reason, it is worth establishing whether they can extract relevant information under uncertainty and resist distraction. Research on the Majority Function Task (MFT) suggests that cognitive-control capacity — how much task-relevant information can be brought under control — is trainable, and shows selective transfer to attention and verbal-learning outcomes. Selective is the key word. Attention control helps prepare the system; it is not a complete solution.
Relational working memory, not just item rehearsal. Training should move explicitly towards binding, comparing, and transforming relations, because this is the level at which fluid reasoning operates. A useful design rule: train relations before merely increasing difficulty.
Explicit strategy prompts, not hints. The aim is not to nudge the user toward the right answer. The aim is to teach a reusable thinking structure. Prompts like “What must be true?”, “What relation is the same?”, and “What would make this wrong?” act as handles for coordinating attention, working memory, and reasoning around a transferable pattern. Meta-analytic evidence on both inductive-reasoning training and self-explanation supports this approach.
Wrapper variation. If a skill only works in the format it was trained in, it has not transferred. A better approach is to present the same underlying relational structure across different surfaces — a visual puzzle, then a verbal analogy, then a decision problem, then an applied scenario. The test is not “did the score improve?” but “does the same structure survive when the wrapper changes?”
Problem spaces, not only drills. Real thinking happens in situations with a current state, a goal, constraints, possible moves, and feedback. Training that includes structured problem-space tasks, where the learner asks “Where am I?”, “What is the goal?”, “What constraints matter?”, and “What is my next test?”, is more ecologically valid than isolated drills. Schema instruction and structured problem-solving training have reasonable evidence behind them, particularly in bounded domains.
Real-world cue-linking. A strategy you learned inside an app is only useful if it fires in the wild. Implementation intentions, cue-linked plans of the form “if I’m stuck on a problem for more than 30 seconds, then I will write: current state → goal → constraint → next test”, have consistent meta-analytic support for goal attainment. The aim is not only to practise a cognitive move inside the training session. The aim is to build a cue-response link that fires under real-world conditions.
Spacing, sleep, and delayed re-checking. Massed practice sessions are not the right format. A skill that only works immediately after a training block has not consolidated. Distributed practice and delayed re-tests are better indicators of durable learning. A relation that survives tomorrow, in a new wrapper, is more meaningful than a relation that only works right after drilling.
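The wrapper-variation and delayed re-check principles can be combined into a single pass criterion. The sketch below is one possible way to express that in Python. The wrapper names, the one-day minimum delay, the `Attempt` record, and the `structure_survives` function are all assumptions made for illustration, not part of any published protocol.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical surfaces for one underlying relational structure.
WRAPPERS = ("visual_puzzle", "verbal_analogy", "decision_problem", "applied_scenario")
MIN_DELAY = timedelta(days=1)  # assumed threshold for "survives tomorrow"


@dataclass(frozen=True)
class Attempt:
    structure_id: str   # which underlying relation was tested
    wrapper: str        # which surface it was presented in
    taken_at: datetime
    correct: bool


def structure_survives(attempts, structure_id, trained_wrapper, trained_at):
    """True only if the structure was answered correctly in every
    untrained wrapper, at least MIN_DELAY after training ended."""
    passed = {
        a.wrapper
        for a in attempts
        if a.structure_id == structure_id
        and a.correct
        and a.wrapper != trained_wrapper
        and a.taken_at - trained_at >= MIN_DELAY
    }
    required = set(WRAPPERS) - {trained_wrapper}
    return required <= passed


trained_at = datetime(2024, 1, 1, 9, 0)
attempts = [
    # Immediate success in the trained wrapper does not count.
    Attempt("same_ratio", "visual_puzzle", trained_at + timedelta(minutes=5), True),
    Attempt("same_ratio", "verbal_analogy", trained_at + timedelta(days=1), True),
    Attempt("same_ratio", "decision_problem", trained_at + timedelta(days=2), True),
    Attempt("same_ratio", "applied_scenario", trained_at + timedelta(days=2), False),
]
print(structure_survives(attempts, "same_ratio", "visual_puzzle", trained_at))  # False
```

The design choice worth noting is that the score in the trained wrapper is deliberately ignored. A stricter or looser criterion (for example, passing in most wrappers rather than all) would be equally easy to write; the all-wrappers rule here is just the simplest reading of “does the same structure survive when the wrapper changes?”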
What this means for IQ Mindware
I want to be transparent about where this leads me, as the cognitive neuroscientist building IQ Mindware.
The product is not built around the claim that it raises IQ. That claim is not well-supported by the evidence, and I am not willing to make it.
What I am building instead is what I would call an evidence-generating far-transfer protocol. The idea is to train component mechanisms — attention control, relational working memory, inductive reasoning, strategy use, implementation intentions, problem-space navigation — and then test whether those components transfer through changed task surfaces, delayed re-checks, and real-world deployment.
The strongest honest claim for IQ Mindware right now is this: it is designed around the conditions under which far transfer becomes more plausible and measurable. It is built to ask better questions of the evidence, not to assert answers the evidence does not yet support.
That is a harder sell than “seven minutes a day to a sharper mind.” It is also the only claim I think is defensible.
If you want the fuller evidence-based version of this article, with the complete reference list of what we know about effective brain training, a section-by-section breakdown of the seven training principles, and the IQ Mindware protocol overview, I’ve put that on the IQ Mindware site at iqmindware.com: “Does Brain Training Work? What the Evidence Really Says”.



Mark, this is the clearest explanation I've seen of why most cognitive training research is asking the wrong questions. The distinction you draw between near, intermediate, and far transfer is a point that's rarely articulated with such honesty in this field.
To perform effectively under pressure, one must be able to grasp, contrast, and reinterpret the relationships between different ideas. The current training is focused on the wrong level of abstraction, failing to address the cognitive demands of real-world performance.
Your seven design principles are a learning science manifesto disguised as a brain training specification. You have expertly applied established research: wrapper variation is encoding specificity; implementation intentions are Gollwitzer; spacing and delayed re-checking are Bjork and Cepeda; and strategy prompts are metacognitive monitoring. It seems you have distilled fifty years of cognitive psychology to answer the single, crucial question: "What would have to be true for far transfer to actually occur?"
IQ Mindware's transparency in not claiming to raise IQ is a commendable move. I wish more practitioners in related fields would adopt it. Much of what is currently marketed lacks evidential support, and the field would progress more rapidly if researchers were more willing to state this publicly.
Your piece raises a critical question that remains partially unanswered: How do we measure capability in domains where the very method of testing that capability is disputed? While brain training benefits from objective benchmarks, most workplace learning does not. The capability you're building is testable; the capabilities most L&D programs claim to build are not. This asymmetry warrants far more attention than it currently receives.