Drop-in proxy
The proxy records each AI call and forwards it to the real provider, so you get request-level attribution without touching application logic. You change one thing: the provider SDK's base URL.
Fail-open guarantee
The proxy runs on a hard latency budget. If Argmin is slow or unreachable, the request is forwarded to the provider anyway. The proxy can never block or fail your production traffic.
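The fail-open behavior can be pictured with a small sketch. This is not Argmin's implementation: `handle`, `record`, `forward`, and the 50 ms budget are names and values assumed purely for illustration. It shows the general pattern, where recording is best-effort within a time budget and forwarding happens regardless.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool for the best-effort recording step (size is arbitrary).
_pool = ThreadPoolExecutor(max_workers=4)


def handle(request, forward, record, budget_s=0.05):
    """Forward `request` to the provider; never let recording block or fail it.

    `forward` and `record` are hypothetical callables standing in for
    "send to the provider" and "emit the attribution event".
    """
    try:
        # Give the recording step at most `budget_s` seconds.
        _pool.submit(record, request).result(timeout=budget_s)
    except Exception:
        # Timeout or recording error: fail open and carry on.
        pass
    return forward(request)
```

In this sketch a slow or failing `record` costs the caller at most `budget_s` seconds, and the provider call is made either way.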
How it works
graph LR
App[Your app] -->|base_url = Argmin proxy| Px[Argmin proxy]
Px -->|forwards| Prov[OpenAI / Anthropic]
Px -.->|emits InvocationEvent| Pipe[Attribution pipeline]
Prov -->|response| Px --> App
The proxy passes your request through unchanged, including your provider API key, which Argmin does not store. It captures metadata (model, tokens, latency, cost) and emits an InvocationEvent. Content is never captured.
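To make "metadata only" concrete, here is a hypothetical shape for such an event. The field names, the `event_from_openai_response` helper, and the per-1k-token prices are all assumptions for illustration; the real InvocationEvent schema may differ. The only provider detail relied on is that OpenAI chat completion responses carry `model` and a `usage` object with `prompt_tokens` and `completion_tokens`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationEvent:
    # Hypothetical field names; the actual schema is not specified here.
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float


def event_from_openai_response(body, latency_ms, usd_per_1k_in, usd_per_1k_out):
    """Build a metadata-only event from an OpenAI-style chat completion body."""
    usage = body.get("usage") or {}
    tokens_in = usage.get("prompt_tokens", 0)
    tokens_out = usage.get("completion_tokens", 0)
    cost = tokens_in / 1000 * usd_per_1k_in + tokens_out / 1000 * usd_per_1k_out
    # Only metadata is read; body["choices"] (the content) is never touched.
    return InvocationEvent(
        provider="openai",
        model=body.get("model", ""),
        input_tokens=tokens_in,
        output_tokens=tokens_out,
        latency_ms=latency_ms,
        cost_usd=cost,
    )
```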
Endpoints
Relative to your Argmin instance host (provided during onboarding):
| Provider | Proxy base path |
|---|---|
| OpenAI | /api/v1/proxy/openai/v1 |
| Anthropic | /api/v1/proxy/anthropic/v1 |
OpenAI example
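A minimal sketch of the base-URL swap, assuming a placeholder host (`argmin.example.com`) and assuming the proxy mirrors OpenAI's paths under the base path from the table above. It uses only the standard library so the request shape is visible; `build_chat_request` is an illustrative helper, not part of any SDK.

```python
import json
import urllib.request

ARGMIN_HOST = "https://argmin.example.com"  # placeholder; use your instance host
OPENAI_BASE = ARGMIN_HOST + "/api/v1/proxy/openai/v1"


def build_chat_request(api_key, model, messages, base_url=OPENAI_BASE):
    """Build (but do not send) a chat completion request routed via the proxy."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=payload,
        headers={
            # Your provider key, passed through to OpenAI by the proxy.
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. With the official `openai` Python SDK, the equivalent change is passing the same URL as `base_url` to the `OpenAI(...)` client constructor and leaving the rest of your code as it is.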
Anthropic example
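The same idea for Anthropic, again as a sketch under assumptions: the host is a placeholder, the proxy is assumed to mirror Anthropic's Messages API under the base path from the table above, and `build_message_request` is an illustrative helper. The headers (`x-api-key`, `anthropic-version`) and the required `max_tokens` field follow Anthropic's public Messages API.

```python
import json
import urllib.request

ARGMIN_HOST = "https://argmin.example.com"  # placeholder; use your instance host
ANTHROPIC_BASE = ARGMIN_HOST + "/api/v1/proxy/anthropic/v1"


def build_message_request(api_key, model, messages, max_tokens=256,
                          base_url=ANTHROPIC_BASE):
    """Build (but do not send) a Messages API request routed via the proxy."""
    payload = json.dumps(
        {"model": model, "max_tokens": max_tokens, "messages": messages}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/messages",
        data=payload,
        headers={
            # Your provider key, passed through to Anthropic by the proxy.
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

If you use the official `anthropic` Python SDK instead, the change is the `base_url` argument of the `Anthropic(...)` client. That SDK adds the `/v1/...` path segment itself, so the value it needs may be the base path without the trailing `/v1`; confirm this for your instance.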
Attributing the call
To attribute calls at a finer grain than "this API key", forward identity or service hints in request headers that your platform team configures (for example, an identity header or your existing trace headers). Ask your onboarding contact for the header convention enabled on your instance.
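As an illustration of forwarding such hints, the helper below assembles extra headers for each call. The header name `X-Example-Identity` is a deliberate placeholder, not an Argmin convention; substitute whatever your instance is configured to read. `traceparent` is the standard W3C Trace Context header, shown here as one example of an existing trace header.

```python
def attribution_headers(identity=None, traceparent=None,
                        identity_header="X-Example-Identity"):
    """Return extra headers to send with each proxied call.

    `identity_header` is a placeholder name; use the convention
    enabled on your instance.
    """
    headers = {}
    if identity:
        headers[identity_header] = identity
    if traceparent:
        # W3C Trace Context header, if your services already propagate it.
        headers["traceparent"] = traceparent
    return headers
```

Both the `openai` and `anthropic` Python SDKs accept a `default_headers` mapping on the client constructor, which is one place a dict like this could be supplied.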
Streaming
Streaming responses (Server-Sent Events) are supported and forwarded incrementally; token counts are derived from the stream. Bedrock's binary event stream is only partially supported, so confirm coverage with your onboarding contact if you rely on it.
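To show what reading token counts from a stream can look like, here is a client-side sketch. It assumes an OpenAI chat completions stream requested with `stream_options={"include_usage": True}`, in which the last chunk before `data: [DONE]` carries a `usage` object. How the proxy itself derives counts is not described here, and `usage_from_sse` is an illustrative helper.

```python
import json


def usage_from_sse(lines):
    """Return the last `usage` object seen in an OpenAI-style SSE stream, or None."""
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

The function takes any iterable of decoded text lines, so it works the same on a live response body or on a recorded stream.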