Drop-in proxy

The proxy records each AI call and forwards it to the real provider, so you get request-level attribution without touching application logic. You change one thing: the provider SDK’s base URL. The <your-venturi-instance> placeholder below is your data-plane host in either deployment mode (see Deployment modes); your onboarding contact provides it.

Fail-open guarantee

The proxy runs on a hard latency budget. If Venturi is slow or unreachable, the request is forwarded to the provider anyway. The proxy can never block or fail your production traffic.
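
The fail-open pattern can be illustrated with a minimal conceptual sketch. This is not Venturi's implementation: the function name `call_fail_open`, the `forward` and `record` callables, and the 50 ms default budget are all hypothetical and chosen only to show the idea that recording gets a bounded amount of time and forwarding happens regardless.

```python
import concurrent.futures as cf

# Shared worker pool for the (hypothetical) recording step.
_pool = cf.ThreadPoolExecutor(max_workers=4)


def call_fail_open(forward, record, budget_s=0.05):
    """Give `record` at most `budget_s` seconds, then forward no matter what.

    Returns (provider_response, recorded), where `recorded` says whether the
    recording step finished successfully within the budget.
    """
    future = _pool.submit(record)
    try:
        future.result(timeout=budget_s)
        recorded = True
    except Exception:  # timeout or any failure inside `record`
        recorded = False
    # The provider call is made whether or not recording succeeded.
    return forward(), recorded
```

In this sketch a slow or failing `record` only costs the caller the budget, never the request itself.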

How it works

graph LR
  App[Your app] -->|base_url = Venturi proxy| Px[Venturi proxy]
  Px -->|forwards| Prov[OpenAI / Anthropic]
  Px -.->|emits InvocationEvent| Pipe[Attribution pipeline]
  Prov -->|response| Px --> App

The proxy passes your request through unchanged (including your provider API key, which Venturi does not store), captures metadata (model, tokens, latency, cost) and emits an InvocationEvent. Content is never captured.
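
As a rough picture of what a metadata-only event might contain, the sketch below derives one from an OpenAI-style response body. The field names, the `event_from_openai_body` helper, and the per-1k-token prices are assumptions made for illustration; the actual InvocationEvent schema is not specified here. Only the four metadata categories (model, tokens, latency, cost) come from the description above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InvocationEvent:
    """Hypothetical shape: metadata only, with no field for prompt or completion text."""
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float


def event_from_openai_body(body: dict, latency_ms: float,
                           usd_per_1k_input: float,
                           usd_per_1k_output: float) -> InvocationEvent:
    # OpenAI chat completion responses report token counts under "usage".
    usage = body.get("usage") or {}
    input_tokens = int(usage.get("prompt_tokens", 0))
    output_tokens = int(usage.get("completion_tokens", 0))
    # Prices are caller-supplied placeholders, not real rates.
    cost = (input_tokens * usd_per_1k_input
            + output_tokens * usd_per_1k_output) / 1000
    return InvocationEvent("openai", body.get("model", "unknown"),
                           input_tokens, output_tokens, latency_ms, cost)
```

Because the dataclass has no content field, message text in `body["choices"]` cannot end up in the event by accident.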

Endpoints

All paths are relative to your Venturi instance host (provided during onboarding):

Provider	Proxy base path
OpenAI	`/api/v1/proxy/openai/v1`
Anthropic	`/api/v1/proxy/anthropic/v1`
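
If you configure several services, a small helper can assemble the full base URL from the table above. This is a convenience sketch: `proxy_base_url` and `PROXY_BASE_PATHS` are hypothetical names, and `venturi.example.com` in the usage note stands in for your real host.

```python
# Base paths copied from the endpoints table.
PROXY_BASE_PATHS = {
    "openai": "/api/v1/proxy/openai/v1",
    "anthropic": "/api/v1/proxy/anthropic/v1",
}


def proxy_base_url(host: str, provider: str) -> str:
    """Return the HTTPS proxy base URL for `provider` on `host`."""
    try:
        path = PROXY_BASE_PATHS[provider.lower()]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider!r}") from None
    # Tolerate a stray trailing slash on the host.
    return f"https://{host.strip('/')}{path}"
```

For example, `proxy_base_url("venturi.example.com", "openai")` yields a URL to which you append the provider's endpoint path, such as `/chat/completions`, when calling over raw HTTP.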

OpenAI example

Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://<your-venturi-instance>/api/v1/proxy/openai/v1",
    api_key="<your OpenAI key>",   # forwarded as-is; not stored by Venturi
)

client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "…"}],
)

cURL:

curl https://<your-venturi-instance>/api/v1/proxy/openai/v1/chat/completions \
  -H "Authorization: Bearer <your OpenAI key>" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"…"}]}'

Anthropic example

Python:

from anthropic import Anthropic

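client = Anthropic(
    # The Anthropic SDK appends /v1/messages to base_url itself, so stop before /v1.
    base_url="https://<your-venturi-instance>/api/v1/proxy/anthropic",
    api_key="<your Anthropic key>",   # forwarded as-is; not stored by Venturi
)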

client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "…"}],
)

cURL:

curl https://<your-venturi-instance>/api/v1/proxy/anthropic/v1/messages \
  -H "x-api-key: <your Anthropic key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"…"}]}'

Attributing the call

To attribute a call to something more specific than the API key that made it, forward identity or service hints as request headers that your platform team configures (for example, an identity header or your existing trace headers). Ask your onboarding contact which header convention is enabled on your instance.
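
One way to attach such headers to every request is the OpenAI SDK's `default_headers` option, sketched below. The header names `X-Example-Identity` and `X-Example-Service` are placeholders invented for this example, not a Venturi convention; substitute whatever your instance is configured to read. The helper names are likewise hypothetical.

```python
def attribution_headers(identity: str, service: str) -> dict[str, str]:
    # NOTE: both header names are hypothetical placeholders. Use the header
    # convention enabled on your instance instead.
    return {
        "X-Example-Identity": identity,
        "X-Example-Service": service,
    }


def make_client(base_url: str, api_key: str, identity: str, service: str):
    # Imported lazily so attribution_headers() has no third-party dependency.
    from openai import OpenAI

    return OpenAI(
        base_url=base_url,
        api_key=api_key,
        # Sent with every request made through this client.
        default_headers=attribution_headers(identity, service),
    )
```

If the identity varies per call rather than per process, the SDK also accepts `extra_headers=` on individual requests such as `client.chat.completions.create(...)`.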

Streaming

Streaming responses (Server-Sent Events) are supported and forwarded incrementally; token counts are derived from the stream. Bedrock’s binary event stream is partially supported: confirm coverage with your onboarding contact if you rely on it.
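
On the client side, streaming through the proxy should look the same as streaming directly from the provider. The sketch below assumes an OpenAI SDK client that is already configured with the proxy base URL; `stream_text` is a hypothetical helper name.

```python
from typing import Any, Iterator


def stream_text(client: Any, prompt: str, model: str = "gpt-4o") -> Iterator[str]:
    """Yield text fragments from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask the provider for Server-Sent Events
    )
    for chunk in stream:
        # Some chunks carry no choices or no text; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

A caller can print fragments as they arrive, for example `for piece in stream_text(client, "…"): print(piece, end="")`.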

Confirm events are landing