Languages: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština
omniroute is a multi-provider AI proxy router. It sits between AI clients (Claude CLI, Codex, Cursor IDE, etc.) and AI providers (Anthropic, Google, OpenAI, AWS, GitHub, etc.) and solves one big problem:
Different AI clients speak different "languages" (API formats), and different AI providers expect different "languages" too. omniroute translates between them automatically.
graph LR
subgraph Clients
A[Claude CLI]
B[Codex]
C[Cursor IDE]
D[OpenAI-compatible]
end
subgraph omniroute
E[Handler Layer]
F[Translator Layer]
G[Executor Layer]
H[Services Layer]
end
subgraph Providers
I[Anthropic Claude]
J[Google Gemini]
K[OpenAI / Codex]
L[GitHub Copilot]
M[AWS Kiro]
N[Antigravity]
O[Cursor API]
end
A --> E
B --> E
C --> E
D --> E
E --> F
F --> G
G --> I
G --> J
G --> K
G --> L
G --> M
G --> N
G --> O
H -.-> E
H -.-> G
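All translation goes through the OpenAI format as the hub: every request is translated source → OpenAI → target.
This needs only N translators (one per format, covering the directions to and from OpenAI) instead of N² (one for every pair of formats).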
Provider configuration has a single source of truth.
| Module | Purpose |
|---|---|
| constants.ts | Defines the PROVIDERS object with base URLs, OAuth credentials (defaults), headers, and default system prompts for every provider. |
| credentialLoader | Reads data/provider-credentials.json and merges its values over the hardcoded defaults in PROVIDERS. Keeps secrets out of source control while maintaining backwards compatibility. |
flowchart TD
A["App starts"] --> B["constants.ts defines PROVIDERS\nwith hardcoded defaults"]
B --> C{"data/provider-credentials.json\nexists?"}
C -->|Yes| D["credentialLoader reads JSON"]
C -->|No| E["Use hardcoded defaults"]
D --> F{"For each provider in JSON"}
F --> G{"Provider exists\nin PROVIDERS?"}
G -->|No| H["Log warning, skip"]
G -->|Yes| I{"Value is object?"}
I -->|No| J["Log warning, skip"]
I -->|Yes| K["Merge clientId, clientSecret,\ntokenUrl, authUrl, refreshUrl"]
K --> F
H --> F
J --> F
F -->|Done| L["PROVIDERS ready with\nmerged credentials"]
E --> L
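The merge step in the flowchart above can be sketched as a small function. This is a minimal sketch under stated assumptions, not the project's credentialLoader: the `ProviderConfig` shape, the `mergeCredentials` name, and the choice to return warnings instead of logging them are all hypothetical. Only the five merged field names and the skip rules come from the flowchart.

```typescript
// Hypothetical shape: the real PROVIDERS entries carry more fields than this.
type ProviderConfig = {
  baseUrl: string;
  clientId?: string;
  clientSecret?: string;
  tokenUrl?: string;
  authUrl?: string;
  refreshUrl?: string;
};

// The credential fields named in the flowchart's merge step.
const CREDENTIAL_FIELDS = [
  "clientId",
  "clientSecret",
  "tokenUrl",
  "authUrl",
  "refreshUrl",
] as const;

// Merges parsed JSON over the defaults in place and returns the warnings
// that the flowchart would log for skipped entries.
function mergeCredentials(
  providers: Record<string, ProviderConfig>,
  external: Record<string, unknown>,
): string[] {
  const warnings: string[] = [];
  for (const [name, value] of Object.entries(external)) {
    // Own-property check so keys like "constructor" are not mistaken for providers.
    if (!Object.prototype.hasOwnProperty.call(providers, name)) {
      warnings.push(`unknown provider "${name}", skipped`);
      continue;
    }
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      warnings.push(`value for "${name}" is not an object, skipped`);
      continue;
    }
    const target = providers[name];
    const source = value as Record<string, unknown>;
    for (const field of CREDENTIAL_FIELDS) {
      const candidate = source[field];
      // Only string values overwrite a default; anything else is ignored.
      if (typeof candidate === "string") target[field] = candidate;
    }
  }
  return warnings;
}
```

Because fields absent from the JSON file are left untouched, a deployment without the file, or with a partial file, keeps working on the hardcoded defaults.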
Executors encapsulate provider-specific logic using the Strategy Pattern. Each executor overrides base methods as needed.
classDiagram
class BaseExecutor {
+buildUrl(model, stream, options)
+buildHeaders(credentials, stream, body)
+transformRequest(body, model, stream, credentials)
+execute(url, options)
+shouldRetry(status, error)
+refreshCredentials(credentials, log)
}
class DefaultExecutor {
+refreshCredentials()
}
class AntigravityExecutor {
+buildUrl()
+buildHeaders()
+transformRequest()
+shouldRetry()
+refreshCredentials()
}
class CursorExecutor {
+buildUrl()
+buildHeaders()
+transformRequest()
+parseResponse()
+generateChecksum()
}
class KiroExecutor {
+buildUrl()
+buildHeaders()
+transformRequest()
+parseEventStream()
+refreshCredentials()
}
BaseExecutor <|-- DefaultExecutor
BaseExecutor <|-- AntigravityExecutor
BaseExecutor <|-- CursorExecutor
BaseExecutor <|-- KiroExecutor
BaseExecutor <|-- CodexExecutor
BaseExecutor <|-- GeminiCLIExecutor
BaseExecutor <|-- GithubExecutor
| CursorExecutor | Most complex: SHA-256 checksum auth, Protobuf request encoding, binary EventStream → SSE response parsing |
| ), Google OAuth token refresh | ||
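The Strategy Pattern described above can be illustrated with a reduced class hierarchy. This is a sketch under assumptions: the method signatures are simplified from the class diagram (no `options`, `body`, or `execute`), `ExampleExecutor`, the `.invalid` URLs, and the `getExecutor` factory name are hypothetical, and the retry rules are illustrative only.

```typescript
type Credentials = { accessToken: string };

// Base strategy: sensible defaults that subclasses override as needed.
class BaseExecutor {
  constructor(protected readonly baseUrl: string) {}

  buildUrl(_model: string, _stream: boolean): string {
    return `${this.baseUrl}/chat/completions`;
  }

  buildHeaders(credentials: Credentials, stream: boolean): Record<string, string> {
    return {
      "Content-Type": "application/json",
      Authorization: `Bearer ${credentials.accessToken}`,
      Accept: stream ? "text/event-stream" : "application/json",
    };
  }

  shouldRetry(status: number): boolean {
    return status === 429 || status >= 500;
  }
}

// Uses every base method unchanged.
class DefaultExecutor extends BaseExecutor {}

// Hypothetical provider whose URL depends on the model and the stream mode.
class ExampleExecutor extends BaseExecutor {
  override buildUrl(model: string, stream: boolean): string {
    const action = stream ? "streamGenerateContent" : "generateContent";
    return `${this.baseUrl}/models/${model}:${action}`;
  }

  override shouldRetry(status: number): boolean {
    return status === 401 || super.shouldRetry(status);
  }
}

const executors = new Map<string, BaseExecutor>([
  ["example", new ExampleExecutor("https://example.invalid/v1")],
]);
const fallbackExecutor = new DefaultExecutor("https://default.invalid/v1");

// Factory: callers never branch on the provider name themselves.
function getExecutor(provider: string): BaseExecutor {
  return executors.get(provider) ?? fallbackExecutor;
}
```

The calling code only ever sees `BaseExecutor`, so adding a provider means adding one subclass and one map entry.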
The handler layer is the orchestration layer: it coordinates translation, execution, streaming, and error handling.
| chatCore | Central orchestrator (~600 lines). Handles the complete request lifecycle: format detection → translation → executor dispatch → streaming/non-streaming response → token refresh → error handling → usage logging. |
| Responses API handler | Converts SSE back to Responses format. |
sequenceDiagram
participant Client
participant chatCore
participant Translator
participant Executor
participant Provider
Client->>chatCore: Request (any format)
chatCore->>chatCore: Detect source format
chatCore->>chatCore: Check bypass patterns
chatCore->>chatCore: Resolve model & provider
chatCore->>Translator: Translate request (source → OpenAI → target)
chatCore->>Executor: Get executor for provider
Executor->>Executor: Build URL, headers, transform request
Executor->>Executor: Refresh credentials if needed
Executor->>Provider: HTTP fetch (streaming or non-streaming)
alt Streaming
Provider-->>chatCore: SSE stream
chatCore->>chatCore: Pipe through SSE transform stream
Note over chatCore: Transform stream translates<br/>each chunk: target → OpenAI → source
chatCore-->>Client: Translated SSE stream
else Non-streaming
Provider-->>chatCore: JSON response
chatCore->>Translator: Translate response
chatCore-->>Client: Translated JSON
end
alt Error (401, 429, 500...)
chatCore->>Executor: Retry with credential refresh
chatCore->>chatCore: Account fallback logic
end
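The first step of the sequence above, detecting the source format from the request body, might look like the following. The heuristics here are assumptions chosen for illustration (for example, treating a top-level `contents` array as Gemini and a top-level `system` field as Claude). The project's actual detection rules are not shown in this document and are likely more thorough.

```typescript
type SourceFormat =
  | "claude"
  | "openai"
  | "gemini"
  | "antigravity"
  | "openai-responses";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Deliberately crude structural checks, ordered from most to least specific.
function detectFormat(body: unknown): SourceFormat {
  if (!isRecord(body)) return "openai";

  // Assumption: Antigravity wraps a Gemini-style payload under `request`.
  const wrapped = body["request"];
  if (isRecord(wrapped) && Array.isArray(wrapped["contents"])) return "antigravity";

  // Assumption: Gemini requests carry a top-level `contents` array.
  if (Array.isArray(body["contents"])) return "gemini";

  // Assumption: Responses requests use `input` rather than `messages`.
  if ("input" in body && !("messages" in body)) return "openai-responses";

  // Assumption: Claude requests carry `messages` plus a top-level `system`
  // or `anthropic_version` field.
  if (
    Array.isArray(body["messages"]) &&
    ("system" in body || "anthropic_version" in body)
  ) {
    return "claude";
  }

  return "openai";
}
```

Falling back to `"openai"` matches the hub design: an unrecognized body is treated as already being in the hub format.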
| Concern | Description |
|---|---|
| Format detection | detectFormat() analyzes the request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats. |
| Model resolution | Alias resolution with collision detection, input sanitization (rejects path traversal and control characters), and model info resolution with async alias getter support. |
| Token refresh | Covers every provider: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes an in-flight promise deduplication cache and retry with exponential backoff. |
| Combo models | Chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, and so on. Returns the actual upstream status codes. |
| Provider/model mapping | Maps requested models to concrete provider/model pairs based on availability and priority. |
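Alias resolution with collision detection and input sanitization, as listed above, could be sketched like this. The `AliasRegistry` class, the `sanitizeModel` helper, and the `provider/model` string convention are assumptions for illustration. The async alias getter mentioned in the text is left out to keep the sketch short.

```typescript
// Rejects empty input, path traversal sequences, and ASCII control characters.
function sanitizeModel(input: string): string {
  if (/[\u0000-\u001f\u007f]/.test(input)) {
    throw new Error("model name contains control characters");
  }
  const trimmed = input.trim();
  if (trimmed === "" || trimmed.includes("..")) {
    throw new Error(`invalid model name: "${trimmed}"`);
  }
  return trimmed;
}

class AliasRegistry {
  private readonly aliases = new Map<string, string>();

  // Registering the same alias for a different target is a collision.
  add(alias: string, target: string): void {
    const existing = this.aliases.get(alias);
    if (existing !== undefined && existing !== target) {
      throw new Error(`alias "${alias}" already points to "${existing}"`);
    }
    this.aliases.set(alias, target);
  }

  // Resolves an alias (or a literal "provider/model" string) to its parts.
  resolve(input: string): { provider: string; model: string } {
    const clean = sanitizeModel(input);
    const full = this.aliases.get(clean) ?? clean;
    const slash = full.indexOf("/");
    if (slash <= 0 || slash === full.length - 1) {
      throw new Error(`expected "provider/model", got "${full}"`);
    }
    return { provider: full.slice(0, slash), model: full.slice(slash + 1) };
  }
}
```

Sanitizing before the alias lookup means a hostile string is rejected even when it happens not to match any alias.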
sequenceDiagram
participant R1 as Request 1
participant R2 as Request 2
participant Cache as refreshPromiseCache
participant OAuth as OAuth Provider
R1->>Cache: getAccessToken("gemini", token)
Cache->>Cache: No in-flight promise
Cache->>OAuth: Start refresh
R2->>Cache: getAccessToken("gemini", token)
Cache->>Cache: Found in-flight promise
Cache-->>R2: Return existing promise
OAuth-->>Cache: New access token
Cache-->>R1: New access token
Cache-->>R2: Same access token (shared)
Cache->>Cache: Delete cache entry
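The deduplication shown in the sequence diagram above comes down to caching the in-flight promise rather than the token. A minimal sketch follows. The cache key format and the injected `refresh` callback are assumptions, and the real `getAccessToken` presumably calls the OAuth provider directly.

```typescript
// Holds one promise per refresh that is currently in flight.
const refreshPromiseCache = new Map<string, Promise<string>>();

function getAccessToken(
  provider: string,
  refreshToken: string,
  // Hypothetical injection point for the actual OAuth call.
  refresh: (provider: string, refreshToken: string) => Promise<string>,
): Promise<string> {
  const key = `${provider}:${refreshToken}`;

  // A second caller arriving mid-refresh shares the first caller's promise.
  const inFlight = refreshPromiseCache.get(key);
  if (inFlight) return inFlight;

  // Delete the entry once the refresh settles, whether it succeeded or failed,
  // so the next expiry triggers a fresh refresh.
  const promise = refresh(provider, refreshToken).finally(() => {
    refreshPromiseCache.delete(key);
  });
  refreshPromiseCache.set(key, promise);
  return promise;
}
```

Storing the promise means concurrent requests cause one upstream call, and a failed refresh is not cached because the entry is removed on rejection too.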
stateDiagram-v2
[*] --> Active
Active --> Error: Request fails (401/429/500)
Error --> Cooldown: Apply backoff
Cooldown --> Active: Cooldown expires
Active --> Active: Request succeeds (reset backoff)
state Error {
[*] --> ClassifyError
ClassifyError --> ShouldFallback: Rate limit / Auth / Transient
ClassifyError --> NoFallback: 400 Bad Request
}
state Cooldown {
[*] --> ExponentialBackoff
ExponentialBackoff: Level 0 = 1s
ExponentialBackoff: Level 1 = 2s
ExponentialBackoff: Level 2 = 4s
ExponentialBackoff: Max = 2min
}
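The cooldown levels in the state diagram above follow a doubling schedule capped at two minutes. The sketch below reproduces that schedule. The `AccountState` shape and the function names are hypothetical, and only the durations come from the diagram.

```typescript
const BASE_COOLDOWN_MS = 1_000; // Level 0 = 1s
const MAX_COOLDOWN_MS = 2 * 60 * 1_000; // Max = 2min

// Level 0 = 1s, level 1 = 2s, level 2 = 4s, ... capped at two minutes.
function cooldownMs(level: number): number {
  const safeLevel = Math.max(0, Math.floor(level));
  return Math.min(BASE_COOLDOWN_MS * 2 ** safeLevel, MAX_COOLDOWN_MS);
}

type AccountState = { backoffLevel: number; cooldownUntil: number };

// Active -> Error -> Cooldown: apply the current level, then escalate.
function onFailure(state: AccountState, nowMs: number): AccountState {
  return {
    backoffLevel: state.backoffLevel + 1,
    cooldownUntil: nowMs + cooldownMs(state.backoffLevel),
  };
}

// Active -> Active: a successful request resets the backoff.
function onSuccess(): AccountState {
  return { backoffLevel: 0, cooldownUntil: 0 };
}

// Cooldown -> Active once the cooldown has expired.
function isAvailable(state: AccountState, nowMs: number): boolean {
  return nowMs >= state.cooldownUntil;
}
```

With these constants the cap is reached at level 7, since 2^7 seconds is 128 s and exceeds the 120 s maximum.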
flowchart LR
A["Request with\ncombo model"] --> B["Model A"]
B -->|"2xx Success"| C["Return response"]
B -->|"429/401/500"| D{"Fallback\neligible?"}
D -->|Yes| E["Model B"]
D -->|No| F["Return error"]
E -->|"2xx Success"| C
E -->|"429/401/500"| G{"Fallback\neligible?"}
G -->|Yes| H["Model C"]
G -->|No| F
H -->|"2xx Success"| C
H -->|"Fail"| I["All failed →\nReturn last status"]
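The combo flow above is a loop over the chain that stops on the first success or the first error that is not fallback-eligible. This is a sketch under assumptions: the `Attempt` shape, the `CallModel` callback, and the exact eligibility rule (401, 429, and 5xx) are illustrative, drawn from the status codes shown in the diagrams.

```typescript
type Attempt = { status: number; body: string };
type CallModel = (model: string) => Promise<Attempt>;

// Assumption: auth, rate-limit, and server errors may succeed on another model;
// a 400 would fail the same way everywhere.
function isFallbackEligible(status: number): boolean {
  return status === 401 || status === 429 || status >= 500;
}

async function runCombo(models: string[], call: CallModel): Promise<Attempt> {
  if (models.length === 0) throw new Error("combo must list at least one model");

  let last: Attempt | undefined;
  for (const model of models) {
    last = await call(model);
    const ok = last.status >= 200 && last.status < 300;
    // Stop on success, or on an error that another model cannot fix.
    if (ok || !isFallbackEligible(last.status)) return last;
  }
  // Every model failed: surface the last upstream status unchanged.
  return last as Attempt;
}
```

Returning the last attempt as-is keeps the real upstream status code visible to the client instead of replacing it with a generic error.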
The translator layer is the format translation engine, built on a self-registering plugin system.
graph TD
subgraph "Request Translation"
A["Claude → OpenAI"]
B["Gemini → OpenAI"]
C["Antigravity → OpenAI"]
D["OpenAI Responses → OpenAI"]
E["OpenAI → Claude"]
F["OpenAI → Gemini"]
G["OpenAI → Kiro"]
H["OpenAI → Cursor"]
end
subgraph "Response Translation"
I["Claude → OpenAI"]
J["Gemini → OpenAI"]
K["Kiro → OpenAI"]
L["Cursor → OpenAI"]
M["OpenAI → Claude"]
N["OpenAI → Antigravity"]
O["OpenAI → Responses"]
end
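Each translator module registers itself on import. Request translators handle format-specific details such as system prompt extraction and thinking config, or parts/contents mapping. Shared helper modules cover state management and the registry.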
// In each translator file (for example request/claude-to-openai.js): call register() at import time.
import { register } from "../index.js";
register("claude", "openai", translateClaudeToOpenAI);
// In index.js: importing every translator file triggers its registration.
import "./request/claude-to-openai.js"; // ← self-registers
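Behind `register()` there has to be some lookup structure, and the hub design implies a two-hop translation. The sketch below shows one way the two could fit together in a single file. The `Map` keyed by `from->to`, the `lookup` helper, and both toy translators are assumptions: they are not the project's mapping rules, only enough to make the two hops visible.

```typescript
type Body = Record<string, unknown>;
type Translator = (body: Body) => Body;
type Message = { role: string; content: string };

const registry = new Map<string, Translator>();

function register(from: string, to: string, fn: Translator): void {
  registry.set(`${from}->${to}`, fn);
}

function lookup(from: string, to: string): Translator {
  const fn = registry.get(`${from}->${to}`);
  if (!fn) throw new Error(`no translator registered for ${from} -> ${to}`);
  return fn;
}

// source -> OpenAI -> target, skipping a hop when one side is already OpenAI.
function translateRequest(source: string, target: string, body: Body): Body {
  if (source === target) return body;
  const viaHub = source === "openai" ? body : lookup(source, "openai")(body);
  return target === "openai" ? viaHub : lookup("openai", target)(viaHub);
}

// Toy translator: moves a top-level `system` string into the message list.
register("claude", "openai", (body) => {
  const { system, ...rest } = body;
  const raw = rest["messages"];
  const messages: Message[] = Array.isArray(raw) ? raw : [];
  const prefix: Message[] =
    typeof system === "string" ? [{ role: "system", content: system }] : [];
  return { ...rest, messages: [...prefix, ...messages] };
});

// Toy translator: maps each message to a `parts` entry under `contents`.
register("openai", "gemini", (body) => {
  const raw = body["messages"];
  const messages: Message[] = Array.isArray(raw) ? raw : [];
  return {
    contents: messages.map((m) => ({ role: m.role, parts: [{ text: m.content }] })),
  };
});
```

Note that no Claude-to-Gemini translator exists anywhere in the sketch: `translateRequest("claude", "gemini", body)` works by composing the two registered hops.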
SSE Transform Stream: the core streaming pipeline. Two modes: TRANSLATE (full format translation) and PASSTHROUGH (chunk normalization without translation).
SSE helpers: parseSSELine() (whitespace-tolerant) and hasValuableContent() (filters empty chunks for OpenAI/Claude/Gemini).
Logging: application logs, plus the call log pipeline for persisted request artifacts.
flowchart TD
A["Provider SSE stream"] --> B["TextDecoder\n(per-stream instance)"]
B --> C["Buffer lines\n(split on newline)"]
C --> D["parseSSELine()\n(trim whitespace, parse JSON)"]
D --> E{"Mode?"}
E -->|TRANSLATE| F["translateResponse()\ntarget → OpenAI → source"]
E -->|PASSTHROUGH| G["fixInvalidId()\nnormalize chunk"]
F --> H["hasValuableContent()\nfilter empty chunks"]
G --> H
H -->|"Has content"| I["extractUsage()\ntrack token counts"]
H -->|"Empty"| J["Skip chunk"]
I --> K["formatSSE()\nserialize + clean perf_metrics"]
K --> L["TextEncoder\n(per-stream instance)"]
L --> M["Enqueue to\nclient stream"]
style A fill:#f9f,stroke:#333
style M fill:#9f9,stroke:#333
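The buffering, parsing, and serializing stages of the pipeline above can be sketched independently of any stream API. The function names `parseSSELine` and `formatSSE` come from the diagram, but their bodies here are assumptions, as is the `LineBuffer` class. Byte decoding with a per-stream `TextDecoder` and the translation step are left out.

```typescript
// Keeps the trailing partial line between network chunks, so a JSON payload
// split across two chunks is only parsed once it is complete.
class LineBuffer {
  private pending = "";

  push(text: string): string[] {
    const parts = (this.pending + text).split("\n");
    this.pending = parts.pop() ?? "";
    return parts;
  }
}

// Whitespace-tolerant: accepts "data:{...}" and "  data: {...}  " alike.
// Returns null for blank lines, non-data fields, "[DONE]", and invalid JSON.
function parseSSELine(line: string): Record<string, unknown> | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("data:")) return null;
  const payload = trimmed.slice("data:".length).trim();
  if (payload === "" || payload === "[DONE]") return null;
  try {
    const parsed: unknown = JSON.parse(payload);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// Serializes one chunk as a complete SSE event (blank line terminates it).
function formatSSE(chunk: Record<string, unknown>): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}
```

A transform stream would call `push()` on each decoded chunk, run `parseSSELine()` on every returned line, and enqueue `formatSSE()` output for the chunks that survive filtering.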
, ), authentication, shared |
|
| ) | ||
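Hub-and-spoke translation: all formats translate through the OpenAI format as the hub. Adding a new provider only requires writing one pair of translators (to/from OpenAI), not N pairs.
Executor strategy pattern: each provider's executor extends BaseExecutor and overrides only what it needs. A factory selects the right one at runtime.
Self-registering translators: each translator file calls register() on import. Adding a new translator is just creating a file and importing it.
Combo models: a combo is a chain of provider/model strings. If the first fails, the router falls back to the next automatically.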
mechanism.
| header | |||
| header | |||
| endpoint | |||
flowchart LR
A["Client"] --> B["detectFormat()"]
B --> C["translateRequest()\nsource โ OpenAI โ target"]
C --> D["Executor\nbuildUrl + buildHeaders"]
D --> E["fetch(providerURL)"]
E --> F["createSSEStream()\nTRANSLATE mode"]
F --> G["parseSSELine()"]
G --> H["translateResponse()\ntarget โ OpenAI โ source"]
H --> I["extractUsage()\n+ addBuffer"]
I --> J["formatSSE()"]
J --> K["Client receives\ntranslated SSE"]
K --> L["logUsage()\nsaveRequestUsage()"]
flowchart LR
A["Client"] --> B["detectFormat()"]
B --> C["translateRequest()\nsource โ OpenAI โ target"]
C --> D["Executor.execute()"]
D --> E["translateResponse()\ntarget โ OpenAI โ source"]
E --> F["Return JSON\nresponse"]
flowchart LR
A["Claude CLI request"] --> B{"Match bypass\npattern?"}
B -->|"Title/Warmup/Count"| C["Generate fake\nOpenAI response"]
B -->|"No match"| D["Normal flow"]
C --> E["Translate to\nsource format"]
E --> F["Return without\ncalling provider"]
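The bypass flow above short-circuits the provider call by synthesizing a response in the hub format. The sketch below shows the idea. The rule list, the "Warmup" match, the reply text, and the `tryBypass` name are all hypothetical, since the actual title/warmup/count patterns are not spelled out in this document. The response object follows the general shape of an OpenAI chat completion.

```typescript
type ChatMessage = { role: string; content: string };

type ChatCompletion = {
  id: string;
  object: "chat.completion";
  created: number;
  model: string;
  choices: { index: number; message: ChatMessage; finish_reason: "stop" }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
};

type BypassRule = {
  name: string;
  reply: string;
  matches: (messages: ChatMessage[]) => boolean;
};

// Hypothetical rule: a lone "Warmup" message does not need a real model.
const bypassRules: BypassRule[] = [
  {
    name: "warmup",
    reply: "OK",
    matches: (messages) =>
      messages.length === 1 && messages[0]?.content.trim() === "Warmup",
  },
];

// Returns a synthetic OpenAI-format response, or null for the normal flow.
function tryBypass(
  model: string,
  messages: ChatMessage[],
  nowMs: number = Date.now(),
): ChatCompletion | null {
  const rule = bypassRules.find((candidate) => candidate.matches(messages));
  if (!rule) return null;
  return {
    id: `chatcmpl-bypass-${rule.name}`,
    object: "chat.completion",
    created: Math.floor(nowMs / 1000),
    model,
    choices: [
      { index: 0, message: { role: "assistant", content: rule.reply }, finish_reason: "stop" },
    ],
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```

Because the fake response is built in OpenAI format, it can go through the same response translation as a real one before being returned to the client.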