Concepts · 05

Pricing methodology

How Halton Meter computes cost — bundled dated rates, offline reproducibility, customer overrides, and the rates_source provenance trail recorded on every logged request.

macOS 12+ · Python 3.11+ Reading time 2 min Updated May 11, 2026

Halton Meter is a cost-attribution tool. If the totals it reports don’t match what the provider actually bills, the tool has no reason to exist. This page documents how cost is computed, where the rates come from, and how to inspect or override them.

The rule

Cost computation never makes a network call. Every figure Halton Meter prints — in halton-meter report, in the local HTTP API, in CSV exports — is derived from a pricing matrix bundled with the version of the daemon you have installed.

This means:

  • Costs are reproducible offline.
  • Historical reports stay accurate when a provider changes their rates.
  • A report run on an airgapped machine produces the same numbers as one with full internet access.

The same per-row rate source (bundled-YYYY-MM-DD or override) is what makes these numbers safely rolled up across machines in a Cloud workspace — each machine prices its own rows locally, Cloud only sums.

The bundle

Every release of halton-meter ships with a dated pricing matrix.

halton-meter pricing list

pricing list prints the full table: provider, model, input rate, output rate, any tiered surcharges, the bundle date, and any overrides configured.

The current bundle, shipped with v0.1.24 and dated 2026-05-01, covers Anthropic, OpenAI, Google Gemini, and xAI (Grok). Gemini’s

200k context surcharge is modelled directly. New model SKUs are added within roughly one week of the provider publishing them.

See the full rate table for the current bundle →

Freshness checks

halton-meter doctor runs a freshness probe against a public manifest of the latest bundle date. If your local bundle is older, the doctor prints a warning and points you at:

pipx upgrade halton-meter

The probe is fail-open: if the manifest is unreachable, doctor prints nothing and cost computation is unaffected. To disable the probe entirely (airgapped installs, strict egress policies):

HALTON_METER_NO_RATES_PROBE=1 halton-meter doctor

Custom rates

To override the bundled matrix with a negotiated rate:

halton-meter pricing set --provider anthropic --model claude-opus-4-7 --input 14.00 --output 70.00

Overrides:

  • Take precedence over the bundled matrix for the named provider/model.
  • Survive pipx upgrade halton-meter.
  • Are stored in ~/.halton-meter/ and inspectable via pricing list.
  • Are tagged distinctly in every artifact: every logged request priced by an override records rates_source: "override" rather than bundled-YYYY-MM-DD.

To remove an override and fall back to the bundle:

halton-meter pricing reset --provider anthropic --model claude-opus-4-7

Audit trail

Every row in the request log carries the exact rate source used to compute its cost. In CSV exports the column is rates_source; in the HTTP API and terminal report the same field appears verbatim. Possible values:

  • bundled-2026-05-01 — priced from the dated bundle shipped with v0.1.24.
  • override — priced from a customer-configured override.

A CSV pulled six months from now is self-describing. An auditor can read each row, identify the rate source, and reconcile it against the matrix that was published on that date.

Refreshes and historical reports

When a provider publishes new rates, the bundled matrix is refreshed and shipped as a new daemon release. Run pipx upgrade halton-meter to pick it up.

If you spot a rate change before a refresh ships — or you have a negotiated rate — patch it locally with halton-meter pricing set without waiting.

Until you upgrade or override, historical reports continue to use the bundle that was installed when each request was logged — which is the correct behaviour for reconciliation.