Halton Meter has two OpenAI adapters because OpenAI has two distinct call surfaces that don’t share auth, paths, or wire shapes:

- adapters/openai.py owns api.openai.com: the classic Bearer-token API surface used by the OpenAI Python and Node SDKs, Cursor’s GPT integration, and any direct HTTP client.
- adapters/openai_codex.py owns chatgpt.com: the OAuth surface that Codex (the ChatGPT-account version) routes through. Different paths, different auth, and different parsing, but the same provider = "openai" in the row so reports aggregate cleanly.

Both adapters share name="openai" so a report --by provider rolls
them up. The mode column distinguishes them at row level.
api.openai.com — costed paths
| Path | Modes |
|---|---|
| /v1/chat/completions | Standard, streaming |
| /v1/responses | The current-generation Responses API |
| /v1/embeddings | Embeddings (input tokens only; cost = input × rate) |
Paths outside /v1/chat, /v1/responses, and /v1/embeddings are observed
but not metered. Examples: /v1/models, /v1/files, and control-plane
endpoints.
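
The metering rule above can be sketched as a prefix check on the request path. This is only an illustration, not the adapter's actual code: `COSTED_PREFIXES` and `is_costed_path` are hypothetical names, and matching by path prefix is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical constant: the costed path families named in this section.
COSTED_PREFIXES = ("/v1/chat", "/v1/responses", "/v1/embeddings")


def is_costed_path(url: str) -> bool:
    """Return True if the request path falls under a costed path family."""
    path = urlsplit(url).path
    # Accept the prefix itself or any sub-path beneath it, but reject
    # look-alikes such as /v1/responses-foo.
    return any(path == p or path.startswith(p + "/") for p in COSTED_PREFIXES)
```

Under these assumptions, `/v1/chat/completions` is metered while `/v1/models` and `/v1/files` are only observed.
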
chatgpt.com — Codex OAuth surface
When OpenAI Codex runs under a ChatGPT account, it uses
chatgpt.com-prefixed endpoints with OAuth tokens (not API keys). The
Codex adapter is a sibling of the OpenAI one because the two surfaces
share neither auth shape nor URL space; broadening the main adapter
would have made both fragile. The Codex adapter maps each captured call
to the same requests row schema as the API adapter, so reports treat
them uniformly.
This is a Halton Meter differentiator. LiteLLM, Helicone, Langfuse, and OpenLLMetry capture API-key traffic only; Codex via ChatGPT auth is invisible to all of them. Halton Meter captures it because it intercepts at the network layer, not the SDK.
Captured fields
For both adapters:

- provider = "openai"
- model: from the response
- input_tokens, output_tokens: from the usage block
- cache_read_tokens: when usage.prompt_tokens_details.cached_tokens is present
- cost_usd_minor_units: against the active rate card

thinking_tokens and cache_write_tokens are zero for OpenAI; those
columns are Anthropic-shaped.
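
As a rough sketch of the field mapping, the function below turns a parsed /v1/chat/completions response body into row fields. The function name `row_fields_from_response` is hypothetical. The mapping of the wire names `prompt_tokens` and `completion_tokens` onto `input_tokens` and `output_tokens` is an assumption about how the adapter reads the usage block. `cost_usd_minor_units` is left out because the rate card format is not described here.

```python
from typing import Any


def row_fields_from_response(body: dict[str, Any]) -> dict[str, Any]:
    """Map a parsed chat completions response body to row fields (sketch)."""
    usage = body.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    return {
        "provider": "openai",
        "model": body.get("model"),
        # Assumed mapping from the chat completions wire names.
        "input_tokens": usage.get("prompt_tokens", 0),
        "output_tokens": usage.get("completion_tokens", 0),
        # Only populated when the provider reports cached prompt tokens.
        "cache_read_tokens": details.get("cached_tokens", 0),
        # Anthropic-shaped columns stay at zero for OpenAI.
        "thinking_tokens": 0,
        "cache_write_tokens": 0,
    }
```
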
Streaming
/v1/chat/completions and /v1/responses both support stream=true,
which emits text/event-stream chunks. The adapter buffers, parses
the final usage chunk, and writes the row. Partial streams
write tokens_complete = false.
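
The buffering step can be illustrated with a small parser over already-buffered text/event-stream lines. This is a sketch under stated assumptions, not the adapter's implementation: `usage_from_sse` is a hypothetical name, it handles only the chat completions chunk shape (a top-level `usage` object on the final chunk), and it treats "a usage chunk was seen" as the condition for `tokens_complete`.

```python
import json
from typing import Any, Iterable, Optional


def usage_from_sse(lines: Iterable[str]) -> tuple[Optional[dict[str, Any]], bool]:
    """Scan buffered SSE lines and return (usage, tokens_complete)."""
    usage: Optional[dict[str, Any]] = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and event: lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # e.g. a truncated last line from a partial stream
        if isinstance(chunk, dict) and chunk.get("usage"):
            usage = chunk["usage"]  # keep the latest non-empty usage block
    # Assumption: a stream without a usage chunk counts as partial.
    return usage, usage is not None
```

A stream that is cut off before the usage chunk yields `(None, False)`, which corresponds to a row written with tokens_complete = false.
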
Tools that route through these adapters
| Tool | Adapter | Path |
|---|---|---|
| OpenAI Python / Node SDK | openai.py | api.openai.com via certifi or NODE_EXTRA_CA_CERTS |
| Cursor with GPT back-end | openai.py | Same |
| Codex (ChatGPT account) | openai_codex.py | chatgpt.com via Node trust |
| curl https://api.openai.com/... | openai.py | system keychain or CURL_CA_BUNDLE |
Verifying API-key capture

    $ halton-meter run -- curl -sS https://api.openai.com/v1/chat/completions \
        -H "Authorization: Bearer $OPENAI_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{"model":"gpt-4.1-mini","messages":[...]}'
    $ halton-meter report --since 5m --by model

For Codex / ChatGPT capture, run Codex normally after init --apps;
the OAuth flow happens against chatgpt.com and is captured by the
Codex adapter without further setup.
Error classification
Both adapters classify OpenAI errors into the seven canonical buckets — see Error classification. Shipped in v0.3.0.
| HTTP | Provider error.type / code | error_class | retryable |
|---|---|---|---|
| 400 | invalid_request_error | bad_request | false |
| 401 | authentication_error | auth | false |
| 403 | permission_denied / country-blocked | auth | false |
| 404 | NotFoundError / model_not_found | bad_request | false |
| 408 | APITimeoutError | timeout | true |
| 409 | ConflictError | server_error | true |
| 422 | UnprocessableEntityError | bad_request | false |
| 429 | rate_limit_error (RPM / TPM throttle) | rate_limit | true |
| 429 | insufficient_quota (billing exhausted) | auth | false |
| 500 | APIError / InternalServerError | server_error | true |
| 502 | bad gateway | server_error | true |
| 503 | overloaded / slow_down | server_error | true |
| — | APIConnectionError | network | true |
The two HTTP 429 rows are the key distinction: a rate_limit_error is a
throttle (back off, retry), but insufficient_quota is exhausted billing
(auth, not retryable). See the judgement-call note on the
concept page.
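
The table can be read as a lookup keyed on HTTP status, with one override for the 429 case. The sketch below is a hypothetical illustration of that reading, not the shipped classifier: `classify` and `_BY_STATUS` are assumed names, and statuses missing from the table return `None` because this page does not say how they are bucketed.

```python
from typing import Optional

# (error_class, retryable) per HTTP status, copied from the table above.
_BY_STATUS: dict[int, tuple[str, bool]] = {
    400: ("bad_request", False),
    401: ("auth", False),
    403: ("auth", False),
    404: ("bad_request", False),
    408: ("timeout", True),
    409: ("server_error", True),
    422: ("bad_request", False),
    429: ("rate_limit", True),
    500: ("server_error", True),
    502: ("server_error", True),
    503: ("server_error", True),
}


def classify(status: Optional[int], code: Optional[str] = None) -> Optional[tuple[str, bool]]:
    """Return (error_class, retryable), or None for statuses not in the table."""
    if status is None:
        # No HTTP response at all: connection-level failure.
        return ("network", True)
    if status == 429 and code == "insufficient_quota":
        # Exhausted billing is treated as auth, so retrying will not help.
        return ("auth", False)
    return _BY_STATUS.get(status)
```

For example, `classify(429, "rate_limit_error")` gives a retryable rate_limit, while `classify(429, "insufficient_quota")` gives a non-retryable auth.
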
Host matching
api.openai.com and chatgpt.com are matched by exact equality (with
optional :port). Subdomains do not match.
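
A minimal sketch of this matching rule follows. `ADAPTER_HOSTS` and `adapter_for_host` are hypothetical names, and lowercasing the host before comparison is an assumption; the text above only specifies exact equality with an optional port.

```python
from typing import Optional

# Hypothetical routing table: exact hostnames only, no wildcard entries.
ADAPTER_HOSTS = {
    "api.openai.com": "openai.py",
    "chatgpt.com": "openai_codex.py",
}


def adapter_for_host(host_header: str) -> Optional[str]:
    """Return the adapter for a host[:port] value, or None if nothing matches."""
    host, sep, port = host_header.strip().lower().partition(":")
    if sep and not port.isdigit():
        return None  # a colon must be followed by a numeric port
    # Dictionary lookup is an equality test, so subdomains never match.
    return ADAPTER_HOSTS.get(host)
```

With this sketch, `api.openai.com` and `chatgpt.com:443` resolve to an adapter, while a subdomain such as `eu.api.openai.com` does not.
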