Someone, somewhere, asked a language model to time them doing something. Then they got annoyed when it lied about how long had passed.

I don't say this to mock them. I say it because that one anecdote gets across something no technical explainer has quite managed to: most people have genuinely no idea what they're talking to.

The Spectrum

There's a spectrum of understanding when it comes to large language models, and it's worth being honest about where everyone sits on it.

At the far end, you have the researchers — the people who understand the mathematics of backpropagation not as a vague concept but as something they can derive on a whiteboard. These people are rare, and mostly employed at about six companies.

Then there's a middle band. People like me. I have a working understanding of the architecture — enough to know roughly what the thing is doing, and more usefully, what it isn't. I know that training produces a frozen set of weights, that inference is just running input through that frozen network, that the model has no persistent state between conversations, no access to a clock, no ability to do anything except predict what a plausible continuation of your text looks like.
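That statelessness is easy to demonstrate. Here's a minimal sketch — `fake_complete` is a stand-in for a real model call, not any actual API — showing that a chat session's "memory" is just the client resending the whole transcript on every turn:

```python
# Sketch: the model keeps no state between calls. Any apparent memory
# is the client-side history being sent back in full each time.

def fake_complete(transcript: str) -> str:
    """Pretend model: replies based only on the text it was just given."""
    return f"(reply based on {len(transcript)} chars of context)"

def chat_turn(history: list[str], user_message: str) -> list[str]:
    """One turn: append the message, send the FULL history, append the reply."""
    history = history + [f"User: {user_message}"]
    reply = fake_complete("\n".join(history))  # the model sees everything, every time
    return history + [f"Model: {reply}"]

history: list[str] = []
history = chat_turn(history, "Remember the number 7.")
history = chat_turn(history, "What number did I ask you to remember?")
# The second call only "remembers" because the first exchange was resent in full.
```

Delete the history list and the model has no way to recover anything — there is nothing on the other side holding state for you.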

I can't, however, tell you how the mathematics works. I can't tell you how the neural network is actually structured in any rigorous sense. My understanding has a ceiling, and below it is a lot of calculus I haven't touched.

And then there's everyone else. Which is most people. That's not really a failure condition — it's just what happens when a technology is deliberately made frictionless to use.

The Abstraction Is the Product

You don't need to understand TCP/IP to send an email. You don't need to know how JPEG compression works to share a photo. Abstraction is what makes technology usable at scale, and there's nothing inherently wrong with that.

The problem is what happens to your mental model when the abstraction is too good.

With email, you develop an intuitive sense of its constraints through friction. Attachments have size limits. Emails go to spam. The seams show, and the seams teach you something about what's underneath.

Language models, by design, have almost no seams. They're fluent. They're confident. They respond in natural prose that mirrors a human expert closely enough that the gap between "what this feels like" and "what this is" becomes very hard to perceive without deliberate effort.

So people ask them to run timers. People ask them to remember things from a previous conversation. People get upset when the model changes its mind mid-session — as if they've been lied to, rather than as if the statistical distribution has shifted. People treat "I don't know" as false modesty rather than a genuine limitation of a system that has no reliable mechanism for knowing what it doesn't know.

Fancier Autocorrect

The "fancier autocorrect" framing is reductive, and I use it anyway, because reductive framings are sometimes the only thing that cuts through.

It's not accurate — the jump from phone keyboard autocorrect to a large language model is enormous. But the kind of thing it is hasn't changed as much as it feels like it has. It is still, at its core, a next-token predictor. It doesn't reason, as far as I can tell. It doesn't plan. It predicts, with extraordinary sophistication, what a reasonable continuation of your input would look like.
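To make "next-token predictor" concrete, here's a toy version of the core loop. The probability table is invented for illustration — in a real model it's billions of learned weights, not a dictionary — but the generation step is the same shape: look at what's there, pick a likely continuation, repeat.

```python
# Toy next-token prediction: a made-up bigram table standing in for
# learned weights. Generation is this one step, repeated.

NEXT = {  # invented probabilities, for illustration only
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
    "sat": {"down": 1.0},
}

def predict_next(token: str) -> str:
    """Pick the most probable continuation (greedy decoding)."""
    dist = NEXT.get(token, {})
    return max(dist, key=dist.get) if dist else "<end>"

def generate(start: str, max_tokens: int = 5) -> list[str]:
    out = [start]
    while len(out) < max_tokens and out[-1] != "<end>":
        out.append(predict_next(out[-1]))
    return out

print(generate("the"))  # → ['the', 'cat', 'sat', 'down', '<end>']
```

Nothing in that loop plans ahead or checks its output against the world. Scale it up enormously and the outputs get astonishing, but the loop stays the loop.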

That's genuinely impressive. I'm not being dismissive. But there's a difference between being impressed by something and accurately modelling what it is.

The Stateful Situation

I know enough not to ask a language model to time me doing something. I know enough to be sceptical when it's confident, not to treat its outputs as ground truth, and to understand that "I'll remember that for next time" is a hallucination in the most literal sense — it will not, and cannot, remember anything.

It's worth noting that statefulness can be bolted on, and some systems do exactly that. Letta (and the agent frameworks built around it) gives a model external memory stores, persistent context, and tool access, stitched together with scaffolding that the model itself knows nothing about. The architecture grew out of the original MemGPT research, and the core idea is treating the model's context window like RAM — paging relevant information in from external storage on each request, and writing new information back out when it's done. From the outside it can look and feel like the model remembers you. It doesn't. The model is still stateless. The scaffolding is doing the remembering, injecting relevant context back in each time so the illusion holds.
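The paging idea is simple enough to sketch. This is not Letta's actual API — the store, the retrieval, and the function names below are all hypothetical simplifications — but it shows the shape of the scaffolding: an external store ("disk"), a retrieval step that pages facts into the prompt ("RAM"), and a write-back step the model never sees.

```python
# Hedged sketch of MemGPT-style scaffolding (not Letta's real API):
# the model stays stateless; a wrapper pages "memories" in and out
# of the prompt, like swapping between RAM and disk.

MEMORY: list[str] = []  # hypothetical external store ("disk")

def remember(fact: str) -> None:
    """Write back to external storage; the model plays no part in this."""
    MEMORY.append(fact)

def recall(query: str, k: int = 3) -> list[str]:
    """Naive retrieval: keep stored facts sharing a word with the query."""
    words = set(query.lower().split())
    hits = [m for m in MEMORY if words & set(m.lower().split())]
    return hits[:k]

def build_prompt(user_message: str) -> str:
    """Page relevant memories into the context window ("RAM")."""
    context = "\n".join(recall(user_message))
    return f"Known facts:\n{context}\n\nUser: {user_message}"

remember("The user's name is Sam.")
prompt = build_prompt("What is the user's name?")
# The model "remembers" Sam only because the scaffold injected the fact.
```

Real systems use embeddings and vector search rather than word overlap, but the division of labour is the same: the remembering happens entirely outside the model.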

That's genuinely useful. It's also, I think, a good demonstration of the point: the base technology needed an entire architectural layer built around it just to approximate something humans do without thinking. The workaround works. The fact that a workaround was necessary is telling.

I'm also aware that my own understanding has a ceiling. The actual mathematics, the ongoing debate about whether any of this constitutes anything approaching real understanding — I can read about these things, but I can't claim to have properly grasped them.

Which puts me on the spectrum too. Just at a different point.

Worth Keeping in Mind

The timer anecdote isn't really about the person who asked. It's about what happens when a technology is powerful enough to feel like magic, and we stop asking what it actually is.

Knowing roughly what something is helps you know what to expect from it. You don't get surprised. You don't get taken in — including by your own anthropomorphism.

A language model is not a person. It has no goals, no memory, no sense of time, and no stake in whether what it tells you is true.

It is very good at sounding like it does.