Same AI answers, 90% cheaper.
One API in front of OpenAI, Anthropic, Google, DeepSeek, and 5 more. We classify each prompt, route it to the right model, and cache anything we can — pay-as-you-go, no subscription.
Real savings on real prompts
Live numbers from production. Same answer quality, fraction of the cost.
"Explain quantum entanglement in three sentences."
"Write a Python function to sort a list."
"What is the current price of Bitcoin?"
Numbers include our 30% markup. Direct pricing reflects May 2026 public rates.
How smart routing works
Behind the scenes — four steps every request takes through our gateway.
1. Classify
2. Route
3. Cache + Compress
4. Verify
Built for real workloads
Plug it into your code or use it as a chat — same backend, same savings.
160+ models
OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek, xAI, Perplexity — one OpenAI-compatible endpoint.
Cascade routing
Cheap model first, escalate only when our verifier flags low confidence. Pay top tier only when you need it.
Real-time tools
Stock prices, weather, web search, Wikipedia, crypto, currency — wired in out of the box.
Honest pricing
Every response tells you what you paid and what you saved. No mystery middleware fees.
Plug into your agents in one line
LangChain, Continue.dev, Aider, OpenWebUI — anything that speaks the OpenAI API. One-line installers for popular tools when we ship them; copy-paste base URL + API key works today for everything else.
Pay only for what you use.
No subscriptions. $0.50 free credit on signup. Then top up as needed. No hidden fees.
Illustrative examples — actual usage varies by model and tokens.
Basic chat
gpt-4o-mini with cache-friendly prompts
~$0.0001 per message
- Pay-as-you-go credits
- One OpenAI-compatible API
Advanced chat
With tools: search, weather, market data
~$0.001 per message
- Pay-as-you-go credits
- One OpenAI-compatible API
Premium reasoning
gpt-4o-class and reasoning models
~$0.005 per message
- Pay-as-you-go credits
- One OpenAI-compatible API
Common questions
How can you be 90% cheaper if I pick the same model?+
Three layers stacked on top of the same model: (1) semantic cache reuses answers across users for repeated questions (free hits), (2) prompt compression shrinks token counts before we pay the provider, (3) Anthropic & OpenAI prompt-caching gives us 10× cheaper input. Even with our 30% markup, you typically end up paying 5–10× less than going direct.
What does "Auto" routing actually do?+
A classifier reads your prompt and decides task type (code, math, image, real-time, chat). Then a router picks the cheapest model that handles that task well — DeepSeek for code, Perplexity Sonar for "what is the current X", Gemini for long context, Claude Haiku for general chat. You always see what model was used in the response.
Do I have to use Auto?+
No. You can pick any of 160+ models explicitly via the OpenAI-compatible API or in the chat UI. Auto is just the default that tends to save the most money.
Is it OpenAI-compatible?+
Yes — `POST https://api.mycheapai.com/v1/chat/completions` with the same payload shape, plus streaming. Use the openai SDK, LangChain, or anything else by changing one env var.
How are payments handled?+
Stripe Checkout for top-ups. $1 minimum, $1000 maximum per top-up. Free $0.50 on signup. No subscription, no monthly minimums.
Where do you store my data?+
Logs of your prompts and responses live in our database for billing & verifier feedback. We never train on your data and don't share it. You can delete your account anytime — credits expire with it.
Try it free
$0.50 of credit on signup. No card required. Pay-as-you-go via Stripe when you want more.