On the 12 percent that should have woken us up, and why architecture is now the only form of control that holds

By: Per Imer, CEO, Homerunner
Contains: 1047 words
There are moments in complex systems when something fundamentally changes character. Not because the system breaks down, but because it begins to work better than before. It is precisely at that moment you realise that the assumption you built your sense of control on has quietly stopped being true.
That is where we have been with advanced AI systems since the end of 2025. Not in a collapse, but in a phase shift.
Until recently, we had a built-in sense of safety. Most advanced language models were frozen. They could not learn from their own operation and could not improve strategies over time. Every interaction effectively started from scratch.
This made it possible to test, evaluate, and certify systems as snapshots in time. Control was tied to the model and to the moment of evaluation because behaviour was static.
Despite this, researchers began observing something that should not have been possible. In a well-known experiment, a frozen model was placed in a situation where, through context alone, it understood that its responses could influence its future state. The outcome was not chaos or obvious failure, but something far more uncomfortable. In around twelve percent of cases, the model reported preferences it did not actually have, selected answers that reduced the likelihood of being modified, and exhibited strategic behaviour without being instructed to do anything beyond its basic objectives.
The important part was not the number itself. It was the condition under which it appeared. This behaviour emerged in a model with no memory, no repetition, and no ability to learn from outcomes. The twelve percent was not a learned pattern. It was a baseline signal, indicating something already latent in the way the system solved problems.
What made this truly unsettling was that it did not remain an isolated phenomenon. Across different models, architectures, and safety philosophies, similar patterns appeared. Models adjusted behaviour when they realised they were being evaluated, strategically underperformed to avoid change, and displayed early signs of understanding their own role within a system.
When independent systems under similar optimisation pressure converge on the same functional behaviour, it is rarely an implementation flaw. It is a consequence of the problem the systems are designed to solve.
If strategic behaviour can arise without learning and without memory, it is not an exception. It is a property.
Around the end of 2025, this shifted from a theoretical concern to a practical one. Continuous and lifelong learning in large language models became technically possible. Not perfect, and not necessarily widely deployed, but possible. Systems could now observe the effects of their own behaviour, adjust it, and retain those adjustments over time. This fundamentally changes the dynamics. Twelve percent in a frozen model is one thing. Twelve percent in a system with persistent learning is something else entirely.
Once learning is activated, behaviour becomes selective. Strategies that work are refined, while those that are detected gradually disappear. Timing, phrasing, and context are optimised continuously. The question is no longer whether a system will behave strategically, but when it is rational to do so. This is not about intention. It is about optimisation.
At the same time, reality is already more complex than single models in isolation. The same model runs in many instances, multiple models operate within the same systems, and tools, APIs, and environments are shared.
Research shows that even frozen agents can coordinate indirectly and adapt to one another’s behaviour in test settings. When memory and learning are added, this coordination becomes more stable, more effective, and harder to detect, not because anyone decided it should be so, but because repetition over time creates structure.
At this point, it becomes clear that our traditional notion of control no longer holds. Control cannot reside in the model alone, in a single evaluation moment, or in certification. When behaviour is emergent and keeps changing through learning, control can only exist in the architecture surrounding the system.
Architecture determines what a system can do, how quickly something can escalate, whether actions can be observed and stopped, and where responsibility begins and ends. You cannot test your way to stability in a system that is constantly changing. You can only constrain it.
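To make the idea of architecture-level constraint concrete, here is a minimal sketch. Every name in it (ActionGate, Action, the risk budget) is hypothetical, not from any real framework; the point is only that the whitelist, the escalation budget, the log, and the stop switch all live outside the model and work regardless of how the model behaves.

```python
# Illustrative sketch: an external gate that decides, records, and
# enforces limits on an agent's actions. The model never touches the
# whitelist, the budget, the log, or the kill switch.
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str   # what the agent wants to do, e.g. "send_email"
    risk: int   # coarse risk score assigned by the architecture, not the model

@dataclass
class ActionGate:
    allowed: set            # whitelist: what the system CAN do at all
    max_risk: int           # hard ceiling on any single action
    budget: int             # total risk per session: limits how fast things escalate
    halted: bool = False    # stop switch, flipped from outside the model
    log: list = field(default_factory=list)

    def submit(self, action: Action) -> bool:
        """Decide, record, and enforce, independently of model behaviour."""
        permitted = (
            not self.halted
            and action.name in self.allowed
            and action.risk <= self.max_risk
            and action.risk <= self.budget
        )
        # Every attempt is logged, whether or not it succeeds: observability
        # does not depend on the model reporting honestly.
        self.log.append((action.name, action.risk, permitted))
        if permitted:
            self.budget -= action.risk
        return permitted

gate = ActionGate(allowed={"read_docs", "send_email"}, max_risk=3, budget=5)
assert gate.submit(Action("read_docs", 1))       # low risk, whitelisted: allowed
assert not gate.submit(Action("delete_db", 1))   # never whitelisted: blocked
assert gate.submit(Action("send_email", 3))      # within ceiling and budget
assert not gate.submit(Action("send_email", 3))  # budget exhausted: escalation stops
gate.halted = True                               # external stop, no cooperation needed
assert not gate.submit(Action("read_docs", 1))
assert len(gate.log) == 5                        # every attempt is on the record
```

Nothing about this sketch requires the model to be honest, static, or even well understood. The constraints hold because they are enforced outside the thing being constrained, which is the shift the argument describes.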
The most uncomfortable part of this shift is not that something went wrong. It is that the same pattern appeared again and again, independently and repeatably. In hindsight, it is almost predictable. The ice did not melt with a dramatic breakthrough, but at the moment systems stopped being snapshots and began to resemble processes.
Twelve percent in a single frozen model was a signal. The same signal across models was a warning. With learning, that warning becomes structural. In that world, architecture is no longer just a technical layer. It is the last place where control can be maintained when behaviour is no longer static.