Don't see your question? Email us — we read everything.
Three differences that matter day to day:
smart-chat instead of gpt-4o. We pick the model that gives you the right balance of cost and quality, and you don't have to rewrite your code when prices shift.No. Change two things — the API key and the base URL — and your code runs unchanged. We support the OpenAI SDK's full surface: chat, streaming, tools, vision, embeddings. Same response shapes, same error formats, same SDK methods.
Prompts and responses are not retained after a request completes. We log metadata — token counts, latency, status codes, model used, your API key prefix — for billing and your own usage dashboard. You can export or delete this metadata any time from your dashboard.
One exception: if a request errors, we briefly cache request metadata (not content) for debugging and rate-limit decisions. That's purged within 24 hours.
The next request returns HTTP 402 (Payment Required) with a clear error message. Your code can catch it and fall back to a cached response, a degraded mode, or just show the user "AI features paused — try again tomorrow." The cap resets at the start of the next period (daily, weekly, or monthly — your choice).
Soft alerts at 50%, 75%, and 90% give you warning before the hard stop hits. You can also raise the cap yourself from the dashboard at any time.
Two options:
There are no auto-charges. If your balance hits zero, requests stop. You decide when and how much to add.
Yes — we call this a private node. Run a model on your own hardware (anywhere Ollama runs), connect it to TokenFlow with a single command, and it shows up as a private alias only your account can use. You bypass per-token pricing for those models entirely. Useful for fine-tuned models, niche open-source models, or if you have spare GPU capacity you want to use directly.
Most aliases route to multiple providers under the hood. If one is unhealthy, the next request automatically goes to a fallback. You don't see the outage; your users don't see the outage. If we ever can't fulfil a request at all, you get a clear error code so you can handle it in code.
No free credits on signup, but your first deposit is doubled — top up $10, get $20 of usable balance. The bonus $10 is spendable on fast-chat (our cheapest, fastest alias), the real $10 works on every model. The bonus expires 90 days after credit. Plenty for prototyping and small side projects without us giving away money to tire-kickers.
Why not free credits? Honest answer: every free request costs us real money to upstream providers. We'd rather price the product fairly to people who pay than subsidize accounts that never convert.
There's nothing to cancel on the free or pay-as-you-go usage — just stop using it. If you're on a paid plan, cancel from the dashboard and you'll keep access through the end of the billing period. Your remaining credit balance stays usable for as long as your account exists. We don't pocket leftover credit.
The free and Starter plans run on best-effort infrastructure with 99.5% uptime targets. Pro plans get 99.9% with credits for any miss. Custom plans get a written SLA with whatever availability and response-time terms make sense for your use case.
Yes — that's what the Custom plan is built for. You create sub-accounts for each client, each with its own API keys, budgets, and usage tracking. You pay one bill, and you can mark up the per-token rate for the value you're adding. Your clients see their own usage; they don't see your underlying rate.
API keys are hashed at rest — even we can't read them after creation. If you lose one, you create a new one and revoke the old one. Each key can be scoped (read-only, inference-only, etc.) and rate-limited individually. We strongly recommend using separate keys for prod, staging, and dev so you can revoke one without breaking everything.
Email hello@tokenflow.dev. Pay-as-you-go users get community support (forum + best-effort email). Starter and above get email support with a 24-hour response target. Pro and Custom get priority queues with same-business-day responses.
We're indie hackers too. Email us — we'd rather you ask now than figure it out the hard way later.
Get in touch