Send events (optional)
Connecting your cloud (the onboarding step) gives Argmin billing- and usage-level attribution from your provider's own data. That's enough for most customers.
If you want request-level attribution (per-call cost, tokens, latency, and the identity or service that made the call), you can also send events to Argmin. There are two ways to do this, and they can be used together:
- Ingestion API. Emit an InvocationEvent (or a generic observability event) with a single HTTP POST. Best when you already have a place in your code or pipeline that sees each AI call.
- Drop-in proxy. Point your provider SDK's base URL at Argmin's proxy. Argmin records the call and forwards it, failing open if anything is slow. Zero application code changes beyond a base URL.
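As a rough sketch of the Ingestion API path, the snippet below builds a metadata-only event and wraps it in one HTTP POST. The endpoint URL, the bearer-token header, and every field name are placeholders assumed for illustration; the actual InvocationEvent schema is not defined on this page, so check the API reference before relying on any of them.

```python
import json
import urllib.request
from dataclasses import dataclass, asdict

# Hypothetical endpoint: the real ingestion URL is not given on this page.
INGEST_URL = "https://ingest.example.invalid/v1/events"


@dataclass
class InvocationEvent:
    """Illustrative shape only; field names are assumptions, not Argmin's schema."""
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float
    identity: str  # service or user that made the call


def build_request(event: InvocationEvent, api_key: str) -> urllib.request.Request:
    """Serialize the event as JSON and wrap it in a single POST request."""
    body = json.dumps(asdict(event)).encode("utf-8")
    return urllib.request.Request(
        INGEST_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme is assumed
        },
    )


def send_event(event: InvocationEvent, api_key: str, timeout_s: float = 0.5) -> bool:
    """Send the event; return False instead of raising so the caller keeps going."""
    try:
        request = build_request(event, api_key)
        with urllib.request.urlopen(request, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except Exception:
        return False  # fail open: a telemetry error is never the caller's problem
```

Calling `send_event(...)` right after each provider call is the simplest placement. The short timeout and the swallowed exception keep a slow or unreachable endpoint from surfacing as an application error.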
When you don't need this
- You only need cost/budget attribution at the team/account level → the cloud connector already covers it.
- Your inference runs entirely on Bedrock / Vertex AI / Azure OpenAI and you've enabled the relevant logging → Argmin reads that through the connector.
When you do want this
- You call providers directly (OpenAI, Anthropic, etc.) from application code and want each request attributed.
- You need decision-time signals (live cost, model recommendations) at the call site.
- You want latency and token counts per request, not just aggregate billing.
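With the Ingestion API route, per-request numbers have to be captured where the call happens. A minimal sketch, assuming a hypothetical `call_model` callable that returns a dict with a `usage` entry; real provider SDKs expose token usage in their own response shapes, so the lookups below would need adapting:

```python
import time
from typing import Any, Callable, Dict, Tuple


def measure_call(
    call_model: Callable[[], Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Invoke the model call once and return (response, per-request metrics)."""
    start = time.perf_counter()
    response = call_model()
    latency_ms = (time.perf_counter() - start) * 1000.0

    usage = response.get("usage", {})  # assumed response shape
    metrics = {
        "latency_ms": latency_ms,
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }
    return response, metrics
```

The returned metrics contain only numbers, so they can be attached to an event without touching prompt or completion text.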
Non-negotiables
- Fail-open. No ingestion or proxy path may block your production traffic. The interceptor runs on a hard latency budget and forwards regardless.
- No content capture. Argmin never stores prompt or completion text. Send metadata (tokens, model, cost, identity) — not message bodies.
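To make these two rules concrete on the client side, here is a minimal sketch, not Argmin's implementation: an allowlist that keeps only metadata fields, and a wrapper that runs the emit step on a daemon thread so the caller waits at most a fixed budget. The names `ALLOWED_FIELDS`, `to_metadata`, and `emit_fail_open`, the field list, and the 50 ms default budget are all assumptions.

```python
import threading
from typing import Any, Callable, Dict

# Assumed allowlist: metadata only, never prompt or completion text.
ALLOWED_FIELDS = {
    "provider", "model", "input_tokens", "output_tokens",
    "latency_ms", "cost_usd", "identity",
}


def to_metadata(record: Dict[str, Any]) -> Dict[str, Any]:
    """Keep allowlisted keys; anything else (e.g. 'prompt') is dropped."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def emit_fail_open(
    send: Callable[[Dict[str, Any]], None],
    record: Dict[str, Any],
    budget_s: float = 0.05,
) -> bool:
    """Run `send` off the caller's thread and wait at most `budget_s` seconds.

    Returns True only if the send had finished without error when the wait
    ended. Exceptions and timeouts are swallowed so production traffic continues.
    """
    outcome = {"ok": False}

    def worker() -> None:
        try:
            send(to_metadata(record))
            outcome["ok"] = True
        except Exception:
            pass  # fail open: telemetry errors must not reach the caller

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(budget_s)  # bounded wait; a slow send is simply abandoned
    return outcome["ok"] and not thread.is_alive()
```

An allowlist is used rather than a blocklist so that a newly added field carrying message text is excluded by default. A slow `send` keeps running on its daemon thread after the budget expires, but the caller has already moved on.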