February 10, 2025·2 min read·Alex Kargin

The DeepSeek moment, and what it meant for your coffee shop

When a Chinese lab shipped a reasoning model you could run on your own hardware, the ground moved for everyone — especially anyone paying per token.

open-source · local-llm · small-business

In late January, DeepSeek released R1, an open-weights reasoning model that matched the commercial giants on several benchmarks and whose smaller distilled versions run on commodity hardware. Overnight, the "only Big AI can do this" argument got quieter.

For small businesses, the shift is not abstract. It means three things.

1. Your chatbot no longer has to phone home to California

Before: every customer question routed through OpenAI, metered in tokens, priced per thousand. After: a 2GB model on a $20/month VPS answers the same questions for a rounding error.

We ran the numbers on an answering-service-style use case: 500 conversations a month at ~5k tokens each, or about 2.5M tokens a month. Cloud-API cost: $40–80. Local model cost: ~$0 per conversation once you own the machine, or a flat ~$20/month if you rent the VPS. The math gets harder to argue with the more you scale.
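For anyone who wants to rerun that arithmetic with their own volumes, here is a minimal sketch. It only uses the figures quoted in this post (500 conversations, ~5k tokens each, a $40–80 API bill, a $20/month VPS). The per-million-token rates are backed out from that bill, not taken from any provider's price list, and the function names are ours, purely for illustration. It also assumes the rented machine can actually handle the load and ignores setup and maintenance time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Monthly chatbot volume (figures below come from the post)."""
    conversations_per_month: int
    tokens_per_conversation: int

    @property
    def tokens_per_month(self) -> int:
        return self.conversations_per_month * self.tokens_per_conversation


def implied_rate_per_million(monthly_bill: float, workload: Workload) -> float:
    """Back out the per-million-token rate implied by a monthly API bill."""
    return monthly_bill * 1_000_000 / workload.tokens_per_month


def api_cost(workload: Workload, rate_per_million: float) -> float:
    """Metered cost for a workload at a given per-million-token rate."""
    return workload.tokens_per_month * rate_per_million / 1_000_000


def breakeven_conversations(flat_monthly_cost: float,
                            tokens_per_conversation: int,
                            rate_per_million: float) -> float:
    """Conversations per month at which a flat-fee box matches the metered bill."""
    return flat_monthly_cost * 1_000_000 / (tokens_per_conversation * rate_per_million)


# Figures quoted in the post; everything else is derived from them.
workload = Workload(conversations_per_month=500, tokens_per_conversation=5_000)
low_rate = implied_rate_per_million(40.0, workload)
high_rate = implied_rate_per_million(80.0, workload)
vps_monthly = 20.0  # the $20/month VPS mentioned above

print(f"Tokens per month: {workload.tokens_per_month:,}")
print(f"Implied API rate: ${low_rate:.2f} to ${high_rate:.2f} per million tokens")
for rate in (low_rate, high_rate):
    n = breakeven_conversations(vps_monthly, workload.tokens_per_conversation, rate)
    print(f"At ${rate:.2f}/M tokens, the VPS pays for itself past ~{n:.0f} conversations/month")
```

Swap in your own conversation counts and your provider's actual rate. The flat fee stays put while the metered bill grows with every conversation, which is the whole argument in one function.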

2. "Data leaves your business" stopped being inevitable

The thing that actually scared the dentist, the accountant, and the attorney (patient and client names passing through a third-party API) is now optional. With an on-prem model, the conversation stays in your building.

3. The pitch shifted from "use AI" to "own your AI"

The interesting question for small business owners is no longer "can I afford AI?" but "who do I want on the hook if it breaks, and who do I want owning the data?"

What we're doing about it

We've been watching model quality on the small-business use cases we care about: answering service, review replies, appointment booking. DeepSeek wasn't the right fit for every one of those — small models win on speed and cost, bigger ones win on nuance. But the door opened.

We'll keep writing honestly about which model is actually winning each week. Most of it is undramatic: check benchmarks, run the model on your actual data, keep what earns its keep.

For the deeper technical takes, our engineering writing lives at kargin-utkin.com — we've been tracking the ML landscape there since 2025.

If you're curious whether a local model would help your business, book a 30-minute call. No pitch if it doesn't fit.
