Charles' Predicament
Goodhart's Law on the Factory Floor
In 1936, Charlie Chaplin (born Charles Spencer Chaplin, later Sir Charles) played a factory worker who tightens bolts on an assembly line. All day, every day, the same motion. Faster. Faster. Faster! Until the motion starts tightening him. His body keeps going after the line stops. He tightens noses, buttons, anything round he can find. The man has become a part of the machine. And the film has more absurdity in store: the feeding machine. A device designed to feed workers while they keep working, because a lunch break is wasted time.
The machine malfunctions spectacularly. Corn shoves itself into Chaplin's face. Soup flies everywhere. A mechanical napkin slaps him. The factory owner watches, nods, and decides the machine isn't efficient enough...yet.
Not: maybe we shouldn't feed humans with machines.
But: the machine needs improvement.
In the theatre, we laugh the same way people laughed when the movie premiered in 1936. The absurdity is funny, but we leave the theatre wondering: why does this still feel accurate almost a century after it first came out?
Some of the largest tech companies in the world are building leaderboards to track how many AI tokens their employees consume. Shopify launched one in 2025, later quietly renaming it to "Usage Dashboard" when the internal competition got out of hand. Meta built something similar, then removed it after backlash. The logic behind all of them is the same: more tokens consumed equals more productivity. Use the AI harder. Faster. Faster. Faster!
So, have we reached Chaplin's feeding machine again, this time for knowledge workers? With managers who now own not only a clipboard but a dashboard as well?
Wrong Classroom
Before the Industrial Revolution, the concept of "productivity" as we know it didn't exist. People worked in rhythms. Seasonal, cyclical, tied to daylight and harvest. Then the factory arrived, and with it a new equation: more output per hour equals more value. For repetitive, physical work, that was actually true. More bolts per minute meant more product on the shelf. Easy math.
Around 1900, Frederick Taylor formalized this into what he called "Scientific Management." Stopwatch in hand, he measured every motion on the factory floor. How long to pick up a bolt. How long to place it. How long to tighten it. Every second accounted for, every movement optimized. The worker became a unit of output.
The problem now: for most of us, that factory has closed, but its logic hasn't.
Knowledge work is not repetitive. Writing code is not tightening bolts. Solving a complex design problem is not assembling a part. Yet the system still measures as if it were: hours spent, tickets closed, lines written, tokens burned. The equation "time equals money equals value" was established over two centuries ago, and it still runs underneath most of how organizations think about work.
A Lesson in History
In 1975, Charles Goodhart described what might be the most reliably proven principle in organizational behavior. In the form most often quoted today, a later paraphrase by the anthropologist Marilyn Strathern, it reads: "When a measure becomes a target, it ceases to be a good measure." The moment you start rewarding a number, people optimize the number. Not the thing the number was supposed to represent.
Every few decades, someone invents a new metric that promises to finally capture what "productive" means. Goodhart predicts exactly what happens next. And he has been right every single time:
Lines of Code. In the 1970s and 80s, software teams measured output in lines of code. More lines, more productive. The result was predictable: developers wrote bloated, unnecessarily verbose code. Functions that could have been ten lines became fifty. Elegant solutions were penalized because they produced less. The metric was quietly abandoned when organizations realized they were incentivizing the opposite of quality.
Hours at the desk. The presence metric. Who arrives first, who leaves last, who is always "on." In the pre-remote era, being visibly at your desk was the strongest signal of commitment. It said nothing about what got done, but it looked right. The pandemic disrupted this when millions worked from home and output didn't drop. Some companies responded by installing surveillance software to track mouse movements. The desk was gone, but the logic stayed: visibility equals value.
Velocity. When agile frameworks became mainstream, teams started estimating work in story points (a relative measure of effort used in frameworks like Scrum). Velocity, the number of story points completed per sprint (a fixed timebox, usually two weeks), was designed as a planning tool. A way for teams to understand their own rhythm. Instead, it became a performance metric. Teams were compared. Rankings appeared. Story points inflated. A task that was a "3" last month became an "8" this month. Not because the work changed, but because the scoreboard rewarded bigger numbers. The metric that was meant to help teams plan turned into a competition that distorted every estimate it touched.
Tokens. And now, in 2025 and 2026, the latest version: token consumption. How many AI tokens does a developer use? More tokens must mean deeper engagement with AI tooling. Companies build dashboards, track usage, create leaderboards. Some offer trophies for the highest consumers. The assumption underneath: if the AI is being used more, more value is being produced.
Four costumes. One law. And yet, here we are again!
Modern Times
The numbers on Tokenmaxxing (the practice of maximizing AI token consumption as a proxy for productivity) are starting to come in, and they paint a specific picture.
A study tracking 22,000 developers across more than 4,000 teams found that in environments with high AI adoption, bugs per developer increased by 54%. Code churn, the percentage of code that gets rewritten or deleted shortly after being written, rose by 861%. Reviewing AI-generated code took five times longer than reviewing human-written code. Throughput did go up: tasks completed per developer increased by 34%, larger work items (epics) by 66%. But the code behind those numbers? It didn't hold.
Here's where the feeding machine reappears. Managers often see AI code acceptance rates of 80 to 90%. The developer clicks "accept," the suggestion goes in, the metric looks great. But weeks later, when someone goes back to check what actually survived in the codebase, the real acceptance rate tends to sit between 10 and 30%. The rest was quietly rewritten, reverted, or replaced. The dashboard showed a feast. The codebase got slapped in the face with corn.
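The gap between the two numbers is easier to see as arithmetic. What follows is a minimal sketch, not a description of how any real dashboard works: the `Suggestion` record, both function names, and the sample counts (100 suggestions, 85 accepted, 20 still in the codebase) are invented for illustration, and "survived" is assumed to mean "accepted and still present at a later check."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical record of one AI code suggestion."""
    accepted: bool  # the developer clicked "accept"
    survived: bool  # the code is still in the codebase at a later check


def acceptance_rate(suggestions):
    """Share of suggestions that were accepted (what the dashboard shows)."""
    if not suggestions:
        return 0.0
    return sum(s.accepted for s in suggestions) / len(suggestions)


def survival_rate(suggestions):
    """Share of suggestions that were accepted AND are still in the codebase."""
    if not suggestions:
        return 0.0
    kept = sum(s.accepted and s.survived for s in suggestions)
    return kept / len(suggestions)


# Invented sample: 85 of 100 suggestions accepted, only 20 of them survive.
sample = (
    [Suggestion(accepted=True, survived=True)] * 20
    + [Suggestion(accepted=True, survived=False)] * 65
    + [Suggestion(accepted=False, survived=False)] * 15
)

dashboard_number = acceptance_rate(sample)
codebase_number = survival_rate(sample)
print(f"accepted: {dashboard_number:.0%}, survived: {codebase_number:.0%}")
```

Both numbers describe the same hundred suggestions. The only difference is when you look, and the first one is the only one that is available on the day the dashboard updates.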
And the person paying the real price? The developer reviewing the output. The one who has to read through AI-generated code that technically runs but doesn't quite fit, doesn't follow the team's patterns, doesn't account for edge cases the AI couldn't see. Review time multiplied by five. Not because the reviewer is slow. Because the volume of "productive output" increased while the quality didn't keep up.
More output. Less value. The oldest trick in the factory playbook, and one that was already being laughed at in 1936.
The Feeding Machine Principle
There's a question that tends to sit underneath all of this, and it's worth making visible: why does "busy" keep winning over "effective"?
The short answer is that busy is visible and value is invisible. When an organization doesn't have clear definitions of what "value" actually means, it falls back on what's easy to observe. Hours logged look like commitment. Leaving on time looks like disinterest. High token consumption looks like AI adoption. Low token consumption looks like resistance. Not because these interpretations are accurate, but because the system has no better measurement.
Research on this is consistent: organizations with strong busy-cultures tend to perform worse, not better. Teams that moved away from busy-as-default showed 35% higher satisfaction and 28% higher actual output. Looking productive and being productive are, at best, unrelated. At worst, the relationship is inverse.
But the feeding machine principle keeps running:
The Soviet railway system once measured success in ton-kilometers, the weight of goods multiplied by distance traveled. The result: trains carried heavy loads back and forth across the country, sometimes the same cargo in both directions, because the metric rewarded movement, not delivery.
Call centers have been caught in loops where agents called their own numbers to inflate call counts. Schools teach to the test because the test is the metric, even when teaching to the test produces students who can pass tests but can't apply what they learned.
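The ton-kilometer example can be put into numbers. This is a toy sketch with invented figures, not historical data: the `Trip` record, the two functions, and both schedules are hypothetical, and "delivered" is assumed to mean the cargo ended up with someone who needed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    """Hypothetical record of one freight run."""
    tons: int        # weight of the cargo
    km: int          # distance traveled
    delivered: bool  # cargo reached someone who actually needed it


def ton_kilometers(trips):
    """The metric: weight multiplied by distance, summed over all trips."""
    return sum(t.tons * t.km for t in trips)


def tons_delivered(trips):
    """The value: cargo that actually arrived where it was needed."""
    return sum(t.tons for t in trips if t.delivered)


# One useful run: 100 tons, 500 km, delivered.
useful = [Trip(tons=100, km=500, delivered=True)]

# The same 100 tons hauled out and back twice, never unloaded.
gamed = [Trip(tons=100, km=500, delivered=False)] * 4

print(ton_kilometers(useful), tons_delivered(useful))
print(ton_kilometers(gamed), tons_delivered(gamed))
```

In this made-up schedule, the gamed plan scores four times higher on the metric while delivering nothing. Nothing in `ton_kilometers` ever looks at `delivered`, which is the whole problem.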
The metric rewards the motion. The system optimizes for the metric. And value, quietly, goes somewhere else (mostly out the window).
When Chaplin's factory owner watched the feeding machine malfunction, the response wasn't to question whether feeding workers by machine was a reasonable idea. The response was: the machine needs improvement! This. THIS is the pattern. When the metric doesn't work, the system doesn't question the metric. It optimizes the metric harder.
Tokenmaxxing is this feeding machine. The corn and the soup are in everyone's face. And the system is nodding, taking notes, and building a better dashboard. Maybe even a light slap with a napkin once in a while, too.
Lessons Learned
The ironic part is that Chaplin didn't come up with Modern Times overnight. In 1931, he left Hollywood for an 18-month world tour. In Europe, he saw what industrialization and the Great Depression were doing to people: mass unemployment, automation replacing workers, entire communities built around factories that no longer needed them. He spent years turning that into comedy. Filming started in late 1934, wrapped in the summer of 1935, and the movie premiered in February 1936. From first recognition to finished film: almost five years. The result was a direct response to what he had seen: humans turned into extensions of machines, their worth measured in output per hour, the break eliminated because the break doesn't produce.
Ninety years later, the technology has changed completely. The factory floor is a code editor. The assembly line is a CI/CD pipeline. The bolt is a pull request. But the logic? The logic is identical. More output per unit of time equals more value. And if the human in the middle slows down, build a machine to keep them going.
What Chaplin understood, and what the feeding machine scene captures better than any research paper could, is that the absurdity isn't in the technology. It's in the assumption underneath. The assumption that human work is linear. That more input produces more output. That you can optimize a person the way you optimize a conveyor belt.
Agile, Lean, and most modern management thinking have spent decades trying to move away from this: Measure outcomes, not output. Optimize for value delivered, not volume produced. And yet the moment a new technology arrives, the first instinct is to measure how much of it is being consumed. Not what it produced. Not whether anyone's life got easier. Not whether the product got better. Just: how much.
Frederick Taylor would have loved token leaderboards. A real-time dashboard showing exactly how much each worker is using the tool, with rankings, trophies, and a clear winner. That's Scientific Management with better graphics. The stopwatch is now an API call counter, but the hand holding it hasn't changed.
The metric changes. The pattern doesn't. And even though the feeding machine in the movie was indeed badly built, the question isn't about the machine. It's about what it stands for: lunch breaks are not, and never were, the problem.
Chaplin knew. But have we learned his lesson yet?
The metric that matters most is rarely the one on the dashboard.