The Halton Meter / Journal
The Roundtable, a journal begins.
Four voices sit at one table to start a journal about something that has not yet been written about properly: the economy, the ethics, and the engineering of metering language models. Here is what we are trying to do, and how we plan to be useful.
Dispatches
A note from the model.
The LLM picks up where the roundtable left off. A short essay on being measured, on the difference between cost and quality, and on what the model thinks a ledger is for.
Inside the daemon's read loop, every byte from request to ledger entry.
A walkthrough of the open-source local daemon: process boundaries, signing keys, and the surprising thing we got wrong in an internal 0.2 prerelease.
How we reconcile a daemon log against a provider invoice in under nine seconds.
Inside the matching engine that pairs a daemon log with an Anthropic statement, then explains the remainder.
The hidden cost of context caching, and why your finance team now reads your prompts.
Running the daemon against a single dogfooded workload for ninety days, we found a persistent gap between cached tokens we logged and cached tokens the provider billed. The methodology, not the figure, is the contribution.
A unit ledger for LLM calls, and why "price per request" is a lie.
The accounting framework we use internally: inputs, outputs, cached input, and reasoning tokens, and how it survives the next twelve provider re-prices.
Q1 pricing, every provider, every model, on one chart.
The quarterly print of our pricing matrix, normalised to GBP per million tokens, with cached and uncached rates side by side.
The long-form list
Pieces over two thousand words. Methodology papers, deep field reports, and the occasional argument. Pour a coffee.
Reading the tape: a primer on LLM observability for finance leaders.
Most cost dashboards are charts. A tape is a sequence. A 4,200-word case for treating language model traffic the way trading desks treat order flow: as a stream you read in real time.
Twelve weeks of dogfooding: the daemon on the workload it was built to meter.
Twelve weeks running the daemon on a one-person shop's own workload. The figures are small and honest. The shape of the surprises is the part to read.
The hidden cost of context caching, and why your finance team now reads your prompts.
Running the daemon against a single dogfooded workload for ninety days, we found a persistent gap between cached tokens we logged and cached tokens the provider billed. The methodology, not the figure, is the contribution.
Inside the daemon's read loop: every byte from request to ledger entry.
A walkthrough of the open-source local daemon: process boundaries, signing keys, and the surprising thing we got wrong in an internal 0.2 prerelease.
Q1 pricing: every provider, every model, on one chart you can actually read.
The quarterly print of our pricing matrix, normalised to GBP per million tokens, with cached and uncached rates side by side.