October 12, 2025·2 min read·Alex Kargin

When to stop paying OpenAI — the break-even math

Cloud APIs win at volume zero. Local models win past a certain point. Here's the specific number where the line crosses for most small businesses.

pricing · economics · local-llm

Every small-business AI conversation eventually hits this question: should I pay per use (cloud API), or pay once to own (local model)?

The honest answer depends on one number: how many calls per month you make.

The setup

We'll use round numbers for clarity. Assume a typical chatbot reply: 5,000 tokens in (system prompt + context + user message), 300 tokens out.

OpenAI gpt-4o-mini — $0.15 per 1M input + $0.60 per 1M output. Per reply: ~$0.0009. 1,000 replies = $0.90.
Anthropic Haiku — comparable, maybe 10–20% different either way.
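The per-reply figure is simple enough to check by hand, but a small sketch makes it easy to plug in your own token counts. The prices and token counts are the ones quoted above; the function and constant names are just illustrative.

```python
# Prices in USD per 1M tokens, as quoted in this post for gpt-4o-mini.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def cost_per_reply(tokens_in: int = 5_000, tokens_out: int = 300) -> float:
    """Cloud cost in USD for one chatbot reply."""
    total = tokens_in * INPUT_PRICE_PER_M + tokens_out * OUTPUT_PRICE_PER_M
    return total / 1_000_000


per_reply = cost_per_reply()
print(f"per reply: ${per_reply:.5f}")
print(f"per 1,000 replies: ${per_reply * 1000:.2f}")
```

Unrounded, this comes to about $0.00093 per reply, which rounds to the ~$0.0009 used here. Note that the input side dominates: a long system prompt costs more than the answer itself.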

For modest volume (under 500 replies/month), the cloud bill is under 50 cents. Don't overthink it.

Running a small model yourself has three costs:

Hardware: a $20/month VPS covers it for small models
Electricity (if self-hosted on-prem): negligible for a 1B model on a modern CPU
Human time: ~1 hour/month watching for issues

Translate to dollars: ~$30/month, all in, flat.

At current prices, you'd have to run ~30,000 chatbot replies a month before OpenAI gpt-4o-mini costs more than a self-hosted small model.

That's roughly 1,000 replies per day.
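Here is a rough sketch of that break-even calculation, assuming the same numbers as above: 5,000 tokens in and 300 out per reply at the quoted gpt-4o-mini prices, against a flat ~$30/month for the local setup. The names are illustrative, and a 30-day month is assumed for the per-day figure.

```python
import math

# Cloud cost per reply in USD: 5,000 tokens in at $0.15/1M, 300 out at $0.60/1M.
CLOUD_COST_PER_REPLY = (5_000 * 0.15 + 300 * 0.60) / 1_000_000
# Flat all-in monthly cost of the self-hosted small model, per this post.
LOCAL_COST_PER_MONTH = 30.0


def break_even_replies(local_monthly: float = LOCAL_COST_PER_MONTH,
                       cloud_per_reply: float = CLOUD_COST_PER_REPLY) -> int:
    """Smallest monthly reply count at which the cloud bill reaches the local bill."""
    return math.ceil(local_monthly / cloud_per_reply)


def cheaper_option(replies_per_month: int) -> str:
    """Return 'cloud' or 'local' based on monthly cost alone."""
    cloud_bill = replies_per_month * CLOUD_COST_PER_REPLY
    return "cloud" if cloud_bill < LOCAL_COST_PER_MONTH else "local"


replies = break_even_replies()
print(f"break-even: {replies:,} replies/month (~{replies // 30:,} per day)")
print(cheaper_option(500), cheaper_option(50_000))
```

With unrounded prices the crossover lands a little above 32,000 replies a month, close to the ~30,000 quoted above. Swap in your own local cost (a bigger VPS, more of your time) and the line moves accordingly.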

So for 99% of small businesses, cloud is cheaper on paper.

Three reasons the paper math loses:

Privacy. A dental practice that sends even one insurance question through OpenAI has a HIPAA discussion on its hands. Local = no discussion.
Reliability. A cloud API can go down on the busiest Saturday of your year, and all you can do is wait. Your local model doesn't depend on someone else's uptime.
Pricing risk. Cloud APIs can raise prices with 30 days' notice. Your local model costs what it cost a year ago.

For high-volume businesses (anyone doing lead capture at scale, or a real-time voice application), local wins on cost too.

Under 500 chats/month: use a cloud API. Save yourself the complexity.
500–5,000/month and privacy matters: go local, pay the one-time setup, move on.
Over 5,000 chats/month: at six or more replies per chat you're past the ~30,000-reply break-even, so local is cheaper AND more reliable. No excuse.

For small businesses, we build local by default. Your chatbot, your server, your data. See the live demo running on exactly this setup.