Documentation — TokenFlow

1. Get an API key

Sign up free and you'll get an API key starting with tk_ right away. No card is required, and $5 of credit is included to play with.

Treat your API key like a password. Don't commit it to git. Use environment variables.
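One way to follow the environment-variable advice is sketched below. The variable name TOKENFLOW_API_KEY and the helper load_api_key are assumptions made for illustration, not names TokenFlow defines. The prefix check relies only on the tk_ key format described above.

```python
import os


def load_api_key(env_var: str = "TOKENFLOW_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it.

    The variable name is a hypothetical choice, not something TokenFlow mandates.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    if not key.startswith("tk_"):
        raise RuntimeError(f"{env_var} does not look like a TokenFlow key (expected a tk_ prefix).")
    return key
```

You would then pass api_key=load_api_key() when constructing the client, so the key never appears in source control.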

2. Python — the OpenAI SDK works as-is

If you already use OpenAI's Python SDK, you're almost done. Only two arguments change: api_key and base_url.

from openai import OpenAI

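client = OpenAI(
    api_key="tk_your_key_here",  # placeholder; load the real key from an environment variable
    base_url="https://api.tokenflow.dev/v1",  # point the SDK at TokenFlow instead of OpenAI
)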

resp = client.chat.completions.create(
    model="smart-chat",
    messages=[
        {"role": "user", "content": "Write me a haiku about Python."}
    ],
)
print(resp.choices[0].message.content)

Node.js — same SDK, same change

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "tk_your_key_here",
  baseURL: "https://api.tokenflow.dev/v1",
});

const resp = await client.chat.completions.create({
  model: "smart-chat",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(resp.choices[0].message.content);

Ollama-style? Also works.

If your code talks to a local Ollama, point it at TokenFlow instead. The /api/chat, /api/generate, and /api/tags endpoints all work the same way — just add your API key in the Authorization header.

curl https://api.tokenflow.dev/api/chat \
  -H "Authorization: Bearer tk_your_key_here" \
  -d '{
    "model": "smart-chat",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
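For Python code that does not go through an Ollama client library, the same request can be sketched with the standard library alone. The helper name build_chat_request is hypothetical, and the Content-Type header is an assumption that the curl example above does not send. The function only builds the request; it does not send it.

```python
import json
import urllib.request

OLLAMA_STYLE_BASE = "https://api.tokenflow.dev"  # host taken from the curl example


def build_chat_request(api_key: str, prompt: str, model: str = "smart-chat") -> urllib.request.Request:
    """Build (but do not send) a POST to /api/chat carrying a Bearer token."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{OLLAMA_STYLE_BASE}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the curl example sends no explicit content type;
            # declaring JSON is a conventional addition.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is a matter of passing the result to urllib.request.urlopen. The response format is not spelled out on this page, so inspect the raw body before parsing it.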

Streaming responses

Streaming works the same as with OpenAI. Set stream to true (stream=True in Python) and iterate over the result. Chunks arrive as server-sent events in the same chunk format.

stream = client.chat.completions.create(
    model="fast-chat",
    messages=[...],
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no choices or an empty delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Function calling / tools

Pass tools the same way you would with OpenAI. The response uses the same tool_calls structure. Use the coder-pro or smart-chat alias for best tool-use behavior.

resp = client.chat.completions.create(
    model="smart-chat",
    messages=[...],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {"type": "object", ...},
        },
    }],
)
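Once a response comes back with tool_calls, your code has to run each tool and send the results back as tool messages. The sketch below shows one way to do that, assuming the OpenAI-style shape (id, function.name, and function.arguments as a JSON string). The get_weather implementation, its city argument, and the run_tool_calls helper are all hypothetical, since the tool's parameters are elided above.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> dict:
    """Hypothetical local implementation of the get_weather tool."""
    return {"city": city, "forecast": "sunny"}  # canned data for illustration


TOOLS = {"get_weather": get_weather}


def run_tool_calls(message, registry=TOOLS) -> list:
    """Execute each requested tool and build the follow-up 'tool' messages."""
    results = []
    for call in message.tool_calls or []:
        func = registry.get(call.function.name)
        if func is None:
            output = {"error": f"unknown tool: {call.function.name}"}
        else:
            # Arguments arrive as a JSON-encoded string, not a dict.
            args = json.loads(call.function.arguments or "{}")
            output = func(**args)
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(output),
        })
    return results


# Offline stand-in for resp.choices[0].message, so the sketch runs without a network call.
fake_message = SimpleNamespace(tool_calls=[
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_weather", arguments='{"city": "Oslo"}'),
    ),
])
tool_messages = run_tool_calls(fake_message)
```

In a real exchange you would append the assistant message and then these tool messages to messages, and call the API again to get the final answer.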

Error handling

TokenFlow uses standard HTTP status codes. The ones you'll actually see:

401 — your API key is wrong or revoked
402 — you hit your budget cap, or your balance is empty
404 — model alias doesn't exist (typo? check the list below)
429 — you hit a rate limit (per second or per day)
503 — temporarily can't fulfil the request, retry with backoff

The response body is always JSON: {"error": {"message": "...", "type": "..."}}. Same shape as OpenAI.
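The status codes above suggest a simple policy: retry transient failures with backoff and fail fast on everything else. The sketch below is one possible version, not an official client behavior. Retrying 429 as well as 503 is an assumption, the names call_with_backoff and error_message are hypothetical, and send stands for whatever function performs your HTTP call.

```python
import json
import random
import time

RETRYABLE = {429, 503}  # 503 per the list above; retrying 429 is an assumption


def error_message(body: str) -> str:
    """Pull the message out of the documented {"error": {...}} JSON body."""
    try:
        return json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        return body  # fall back to the raw text if the shape is unexpected


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds, backing off exponentially on retryable statuses.

    send is any zero-argument callable returning (status_code, body_text).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"HTTP {status}: {error_message(body)}")
        # Exponential backoff plus a little jitter: about 0.5s, 1s, 2s, ...
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Errors such as 401, 402, and 404 raise immediately, because retrying will not fix a bad key, an empty balance, or a misspelled alias.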

Available aliases

Use these as the model field. We pick the underlying provider — you don't have to.

smart-chat — general-purpose chat, balanced cost & quality
fast-chat — speed-first chat, optimized for low TTFT
coder-pro — code generation, refactoring, tool-use
deep-reasoning — long-context, multi-step problems
vision-pro — image understanding
embed-default — vector embeddings

See the pricing page for per-token rates on each.
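Since a mistyped alias only surfaces as a 404 at request time, you may want to catch typos locally first. The sketch below hard-codes the aliases listed above, so it can go stale if the list changes. check_alias is a hypothetical helper, not part of any SDK.

```python
import difflib

# Copied from the alias list above; update this if the list changes.
KNOWN_ALIASES = (
    "smart-chat",
    "fast-chat",
    "coder-pro",
    "deep-reasoning",
    "vision-pro",
    "embed-default",
)


def check_alias(model: str) -> str:
    """Return the alias unchanged if it is known, else raise with a suggestion."""
    if model in KNOWN_ALIASES:
        return model
    guesses = difflib.get_close_matches(model, KNOWN_ALIASES, n=1)
    hint = f" Did you mean {guesses[0]!r}?" if guesses else ""
    raise ValueError(f"Unknown model alias {model!r}.{hint}")
```

Calling check_alias on the model name before each request keeps the typo out of your error logs.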
