AI doesn’t fail at randomness. It fails at complexity.

09 Jun 2025

[Image: a screenshot from the paper "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Model..."]
Apple just tested the smartest "reasoning" AI models out there: Claude 3.7 Sonnet, DeepSeek-R1, OpenAI’s o1/o3.
The verdict?

They didn’t just underperform.
They collapsed when things got too complex.

Even when you gave them the algorithm, they couldn’t follow it.
Worse, when tasks got harder, they reasoned less, not more.

This confirms what many testers already feel in their gut:
AI looks smart until it has to think.

Because real reasoning isn’t just generating confident answers.
It’s about:

• Navigating uncertainty
• Spotting what’s missing
• Asking, “Wait, does this even make sense?”

And that’s what great testers do every day.

We don’t just validate that something works.
We question why, how, and what could break it next.

AI can make us more productive.
But when complexity scales, the AI is not the reasoning engine.
You are.

Original Paper: https://machinelearning.apple.com/research/illusion-of-thinking
Christine Pinto
CPTO of Epic Test Quest

Co-Founder and CTPO @Epic Test Quest | Conference speaker on AI and Quality Leadership | Long-time tester | Building tools testers actually enjoy using | Join the quest to level up software quality

Chapter Lead