OpenAI-Compatible API

Orka exposes an OpenAI-compatible API at /openai/v1/chat/completions and /openai/v1/models, allowing any OpenAI-compatible client to use Orka as a provider. This includes tools like Continue, Cursor, and others.

Orka acts as a proxy to whichever LLM provider is configured in your cluster (Anthropic, OpenAI, Azure OpenAI, etc.), with credentials managed securely via Kubernetes Secrets and Provider CRDs.

Breaking change: These endpoints moved from /v1/ to /openai/v1/. Update your client configurations accordingly. See also Anthropic Compatibility for the Anthropic-native proxy.

Endpoints

Method  Path                          Description
POST    /openai/v1/chat/completions   Chat completions (streaming & non-streaming)
GET     /openai/v1/models             List available models from configured providers

Both endpoints require authentication via Authorization: Bearer <token> using a Kubernetes ServiceAccount token.

Model Name Format

The model field supports two formats:

  • provider/model — e.g., anthropic/claude-sonnet-4-20250514. The part before / matches a Provider CRD name, and the part after is the model name sent to that provider.
  • model: e.g., claude-sonnet-4-20250514. Uses the default provider (set by the --chat-provider flag or by a Provider CRD named default).
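
The two formats above can be illustrated with a small client-side sketch. It is not Orka's actual resolver: the function name split_model is hypothetical, splitting at the first "/" is an assumption (the page does not say how model names containing further slashes are handled), and the string "default" merely stands in for whatever default provider the server is configured with.

```python
from typing import Tuple

# Assumption: placeholder for the provider chosen via --chat-provider
# or a Provider CRD named "default".
DEFAULT_PROVIDER = "default"


def split_model(model: str, default_provider: str = DEFAULT_PROVIDER) -> Tuple[str, str]:
    """Split a model string into (provider, model_name).

    "provider/model" -> (provider, model); a bare "model" uses the default provider.
    Splits at the FIRST "/" only (an assumption, see the text above).
    """
    provider, sep, name = model.partition("/")
    if not sep:
        # No slash at all: bare model name, routed to the default provider.
        return default_provider, model
    if not provider or not name:
        raise ValueError(f"malformed model name: {model!r}")
    return provider, name
```

For example, split_model("anthropic/claude-sonnet-4-20250514") yields the provider name anthropic and the model name claude-sonnet-4-20250514, while a bare claude-sonnet-4-20250514 is paired with the default provider.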

Prerequisites

  1. Provider CRD configured in the cluster:
apiVersion: core.orka.ai/v1alpha1
kind: Provider
metadata:
  name: anthropic
  namespace: default
spec:
  type: anthropic
  secretRef:
    name: anthropic-secret
    key: api-key
  defaultModel: claude-sonnet-4-20250514
  2. Secret with the API key:
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-secret
  namespace: default
type: Opaque
stringData:
  api-key: sk-ant-...

Azure OpenAI provider example

If you use Azure OpenAI, configure a Provider with type: azure-openai:

apiVersion: core.orka.ai/v1alpha1
kind: Provider
metadata:
  name: azure-openai
  namespace: default
spec:
  type: azure-openai
  secretRef:
    name: azure-openai-secret
    key: api-key
  baseURL: https://<resource>.openai.azure.com
  defaultModel: gpt-4o-deployment
  azure:
    deploymentName: gpt-4o-deployment
    apiVersion: "2024-02-15-preview"
  3. ServiceAccount token for authentication:
# Create a service account
kubectl create serviceaccount orka-client

# Bind it to the orka viewer role (or a custom role)
kubectl create clusterrolebinding orka-client-binding \
  --clusterrole=orka-task-viewer \
  --serviceaccount=default:orka-client

# Get a token
export ORKA_TOKEN=$(kubectl create token orka-client)

Using with Continue

Configuration

Configure Continue to use Orka as an OpenAI-compatible provider. Add to your Continue configuration:

{
  "models": [
    {
      "title": "Claude Sonnet 4 (via Orka)",
      "provider": "openai",
      "model": "anthropic/claude-sonnet-4-20250514",
      "apiBase": "https://orka.example.com/openai/v1",
      "apiKey": "YOUR_ORKA_TOKEN"
    }
  ]
}

Environment

Set your Orka API token:

export ORKA_TOKEN=$(kubectl create token orka-client)
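
The configuration above holds a literal key, while the token lives in ORKA_TOKEN. One possible way to connect the two is to render the model entry from the environment, as in the sketch below. The helper names continue_config and render_from_env are hypothetical, the field names are copied from the example above, and where Continue expects the resulting JSON to be saved is not covered here.

```python
import json
import os
from typing import Dict, List


def continue_config(
    token: str,
    model: str = "anthropic/claude-sonnet-4-20250514",
    api_base: str = "https://orka.example.com/openai/v1",  # placeholder host from this page
    title: str = "Claude Sonnet 4 (via Orka)",
) -> Dict[str, List[dict]]:
    """Build the "models" section shown above, with the token filled in."""
    return {
        "models": [
            {
                "title": title,
                "provider": "openai",
                "model": model,
                "apiBase": api_base,
                "apiKey": token,
            }
        ]
    }


def render_from_env() -> str:
    """Render the config as JSON using the ORKA_TOKEN environment variable."""
    return json.dumps(continue_config(os.environ["ORKA_TOKEN"]), indent=2)
```

Tokens issued by kubectl create token expire, so an entry rendered this way would need to be regenerated whenever the token is renewed.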

Using with curl

Non-streaming

curl -X POST https://orka.example.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $ORKA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 1024
  }'
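
For clients that are not shell scripts, the same request can be sketched in Python with only the standard library. The helper names build_chat_request, first_reply, and send are hypothetical, orka.example.com is the placeholder host used on this page, and the response shape (choices[0].message.content) is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request

# Placeholder base URL taken from the examples on this page.
ORKA_BASE = "https://orka.example.com/openai/v1"


def build_chat_request(token, model, prompt, max_tokens=1024, base_url=ORKA_BASE):
    """Build (but do not send) a POST request for /chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_reply(response_body: dict) -> str:
    """Extract the assistant text, assuming the OpenAI response shape."""
    return response_body["choices"][0]["message"]["content"]


def send(request: urllib.request.Request) -> str:
    """Send the request and return the first reply (performs network I/O)."""
    with urllib.request.urlopen(request) as resp:
        return first_reply(json.load(resp))
```

A call such as send(build_chat_request(token, "anthropic/claude-sonnet-4-20250514", "Hello!")) would mirror the curl command above, with token holding the value of ORKA_TOKEN.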

List models

curl https://orka.example.com/openai/v1/models \
  -H "Authorization: Bearer $ORKA_TOKEN"
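
A client will usually want just the model identifiers from this response. The sketch below assumes the body follows the OpenAI list format, an object with a data array whose entries carry an id. The page does not show Orka's actual response or say whether ids include the provider/ prefix, and the helper name model_ids is hypothetical.

```python
from typing import List


def model_ids(listing: dict) -> List[str]:
    """Return the sorted model ids from an OpenAI-style model listing.

    Assumes the shape {"object": "list", "data": [{"id": "..."}, ...]}.
    Entries without an "id" field are skipped rather than raising.
    """
    return sorted(entry["id"] for entry in listing.get("data", []) if "id" in entry)
```

Any id returned this way could then be passed as the model field of a chat completion request.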

Supported Features

Feature                               Supported
Chat completions                      Yes
Streaming (SSE)                       Yes
Tool/function calling                 Yes
System messages                       Yes
Multi-part content                    Yes (text parts extracted)
max_tokens / max_completion_tokens    Yes
temperature                           Yes
stop sequences                        Yes
stream_options.include_usage          Yes
Image inputs                          Not yet (text extracted from multi-part)
Embeddings                            Not supported
Audio / Vision                        Not supported
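
To show how streaming and stream_options.include_usage fit together on the client side, here is a minimal sketch of consuming the SSE stream. It assumes Orka emits chunks in the OpenAI format: data: lines carrying JSON with choices[].delta.content, an optional usage object on a late chunk, and a final data: [DONE] marker. The function name collect_stream is hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate streamed text and the usage block from OpenAI-style SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        if chunk.get("usage"):
            # Present only when stream_options.include_usage was requested.
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
    return "".join(text_parts), usage
```

The lines argument can be any iterable of decoded text lines, for instance the lines of an HTTP response body read incrementally.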

Architecture

┌─────────────┐     ┌─────────────────────────────┐     ┌───────────────┐
│ Continue    │────▶│ Orka API Server             │────▶│ Anthropic API │
│ (or any     │◀────│ /openai/v1/chat/completions │◀────│ OpenAI API    │
│ OAI client) │     │                             │     │ Azure OpenAI  │
└─────────────┘     │ Provider resolution:        │     └───────────────┘
                    │ - Provider CRD lookup       │
                    │ - Secret-based API keys     │
                    │ - Model routing             │
                    └─────────────────────────────┘

Orka resolves the provider, attaches the credentials, and proxies each request to the backend LLM provider.

Note: By default, both the OpenAI and Anthropic endpoints inject Orka's built-in tools (web_search, code_exec, etc.) and run tool execution server-side. To use Orka as a transparent proxy instead, set the X-Orka-Tools: disabled header. In that mode Orka simply forwards the messages and tool definitions to the LLM and returns the response, and the client manages its own tool execution loop.
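
A client that runs its own tool loop would add the X-Orka-Tools header to every request. The sketch below builds the header set for both modes. The helper name orka_headers and its server_side_tools parameter are hypothetical, and only the header name and the disabled value come from this page.

```python
from typing import Dict


def orka_headers(token: str, server_side_tools: bool = True) -> Dict[str, str]:
    """Build request headers for Orka's OpenAI-compatible endpoints.

    With server_side_tools=False, the X-Orka-Tools: disabled header is added
    so that Orka acts as a transparent proxy.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if not server_side_tools:
        headers["X-Orka-Tools"] = "disabled"
    return headers
```

Leaving the header out keeps the default behavior, so existing clients only need this change if they execute tools themselves.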