Integrating AI into web apps without the hype
I've shipped four AI features into production this year. Two of them delighted users. One was ignored. One was actively mistrusted. The difference wasn't the model — it was how we presented the output and what we promised it could do.
Every product manager I've worked with this year has asked the same question: "Can we add AI to this?" Sometimes the answer is yes, and the result is genuinely useful. Other times, the answer is also yes — but the result is a feature nobody trusts, nobody uses, and nobody wants to maintain. The difference comes down to knowing when AI is the right tool and when it's just the trendy one.
Not every feature needs AI
This is the hammer-nail problem at industry scale. Before reaching for an LLM, ask yourself some honest questions: could a search bar solve this? Could a filter solve this? Could a well-written template solve this? If the answer is yes, you don't need a language model; you need better product design.
AI should handle tasks where the input is ambiguous, the output is creative, or the rules are too complex to codify. Everything else is over-engineering with a higher API bill. Summarizing a long document? Good use case. Classifying customer intent from freeform text? Strong use case. Auto-filling a form with structured options? That's a dropdown menu, not a machine learning problem.
API vs. local models: the tradeoff matrix
The first real decision is where your model runs. Cloud APIs from OpenAI, Anthropic, or Google are fast to integrate — you're making HTTP requests and parsing JSON. The tradeoffs are ongoing cost per token, latency that varies with load, and data privacy concerns that your legal team will eventually ask about.
Local models via Ollama, llama.cpp, or similar runtimes flip those tradeoffs: no per-token fees, data that never leaves your infrastructure, and latency that depends on your own hardware rather than a provider's load. In exchange, you take on higher setup complexity, hardware requirements that most web servers don't meet, and models that are generally less capable than the frontier APIs.
For most web applications, the pragmatic path is: start with APIs, move to local when you have a concrete reason — whether that's privacy regulations, cost at scale, or latency requirements that cloud can't meet. Don't optimize for a problem you don't have yet.
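One way to keep that decision honest is to write the "concrete reasons" down as data and check them explicitly. The sketch below is only an illustration: the `DeploymentConstraints` fields and the `chooseBackend` function are hypothetical names, and the numbers you feed in would come from your own billing and monitoring, not from this code.

```typescript
type Backend = "cloud-api" | "local";

// Hypothetical shape: every field is an assumption made for this example.
interface DeploymentConstraints {
  // True when regulation or contract forbids sending user data to a third party.
  dataMustStayInHouse: boolean;
  // Projected monthly API spend and the budget ceiling, in the same currency.
  projectedMonthlyApiCost: number;
  monthlyBudget: number;
  // Latency target and the p95 latency actually measured against the cloud API.
  requiredP95LatencyMs: number;
  measuredCloudP95LatencyMs: number;
}

interface BackendDecision {
  backend: Backend;
  reasons: string[];
}

// Defaults to the cloud API and only recommends a local model when at
// least one concrete reason (privacy, cost, latency) is actually present.
function chooseBackend(c: DeploymentConstraints): BackendDecision {
  const reasons: string[] = [];
  if (c.dataMustStayInHouse) {
    reasons.push("privacy");
  }
  if (c.projectedMonthlyApiCost > c.monthlyBudget) {
    reasons.push("cost");
  }
  if (c.measuredCloudP95LatencyMs > c.requiredP95LatencyMs) {
    reasons.push("latency");
  }
  return { backend: reasons.length > 0 ? "local" : "cloud-api", reasons };
}
```

An empty `reasons` list is the signal that moving to a local model would be optimizing for a problem you don't have yet.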
Prompt engineering is product design
Your prompt is your product specification. The quality of your AI feature depends directly on how well you've defined what you want in the prompt. Vague prompts produce vague outputs. Specific prompts produce useful ones.
Be specific about format, tone, and constraints. Use system prompts to establish consistent behavior across sessions. Include examples — few-shot prompting — for complex tasks where the model needs to understand your expected output structure. Version your prompts like you version code, because a prompt change is a product change. And test your prompts against edge cases before shipping, not after users find them.
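A minimal sketch of what "version your prompts like code" can look like in practice. It assumes a generic role-based chat message shape (system, user, assistant) of the kind most chat-style APIs accept; `PromptSpec`, `buildMessages`, and the example prompt are hypothetical names invented for illustration, not part of any SDK.

```typescript
interface FewShotExample {
  input: string;
  output: string;
}

// A prompt treated as a versioned artifact that lives in source control.
interface PromptSpec {
  id: string;
  version: string;
  system: string;
  examples: FewShotExample[];
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Expands a spec into the message list sent to the model: the system
// prompt first, then each few-shot example as a user/assistant pair,
// then the real user input last.
function buildMessages(spec: PromptSpec, userInput: string): ChatMessage[] {
  const messages: ChatMessage[] = [{ role: "system", content: spec.system }];
  for (const example of spec.examples) {
    messages.push({ role: "user", content: example.input });
    messages.push({ role: "assistant", content: example.output });
  }
  messages.push({ role: "user", content: userInput });
  return messages;
}

// Hypothetical prompt: format, tone, and constraints are spelled out.
const summarizeTicketV2: PromptSpec = {
  id: "summarize-ticket",
  version: "2.0.0",
  system:
    "Summarize the support ticket in one neutral sentence of at most 25 words. " +
    "Do not speculate about causes.",
  examples: [
    {
      input: "App crashes every time I open settings on my phone!!",
      output: "The app crashes when the user opens the settings screen on mobile.",
    },
  ],
};
```

Logging `id` and `version` alongside each model response makes it possible to trace a bad output back to the exact prompt that produced it, and gives edge-case tests a stable target.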
The teams that treat prompt engineering as an afterthought ship features that feel like demos. The teams that treat it as product design ship features that feel like tools.
Streaming responses and perceived speed
Nobody wants to wait eight seconds staring at a loading spinner for an AI response. The single most impactful UX improvement you can make is streaming tokens to the UI as they're generated. Show typing indicators. Use progressive rendering. A five-second response that streams word by word feels like one second. A five-second response behind a spinner feels like ten.
Use Server-Sent Events or WebSocket connections for the transport layer. Both work well; SSE is simpler for unidirectional streaming from server to client, which is the common case. Handle errors mid-stream gracefully: if the model fails halfway through a response, don't leave a broken partial output on screen as if it were complete. Hold back a small buffer of tokens before committing content to the screen, so an early failure is caught before the user sees anything, and mark the response as failed when an error arrives later.
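The two halves of that advice can be sketched roughly as follows. `encodeSseEvent` writes one event in the Server-Sent Events wire format on the server side, and `BufferedStreamView` is one possible interpretation of "buffer enough to detect errors" on the client side. The event names (`token`, `error`, `done`), the class, and the hold-back size are all assumptions made for this example rather than part of any standard or library.

```typescript
// One Server-Sent Event on the wire: an "event:" line, one "data:" line
// per line of payload, and a blank line that terminates the event.
function encodeSseEvent(event: string, data: string): string {
  const dataLines = data
    .split(/\r?\n/)
    .map((line) => `data: ${line}`)
    .join("\n");
  return `event: ${event}\n${dataLines}\n\n`;
}

// Hypothetical event vocabulary for this sketch.
type StreamEvent =
  | { type: "token"; text: string }
  | { type: "error"; message: string }
  | { type: "done" };

type StreamStatus = "streaming" | "complete" | "failed";

// Keeps the most recent characters off screen until more arrive, so a
// failure discards the unshown tail and flags the response as failed.
class BufferedStreamView {
  private readonly holdBackChars: number;
  private pending = "";
  private committed = "";
  private status: StreamStatus = "streaming";

  constructor(holdBackChars: number) {
    this.holdBackChars = holdBackChars;
  }

  handle(event: StreamEvent): void {
    if (this.status !== "streaming") return;
    switch (event.type) {
      case "token": {
        this.pending += event.text;
        // Commit everything except the last holdBackChars characters.
        const commitLength = Math.max(0, this.pending.length - this.holdBackChars);
        this.committed += this.pending.slice(0, commitLength);
        this.pending = this.pending.slice(commitLength);
        break;
      }
      case "done":
        // Stream finished cleanly: flush whatever was held back.
        this.committed += this.pending;
        this.pending = "";
        this.status = "complete";
        break;
      case "error":
        // Drop the held-back tail; the UI decides how to present the rest.
        this.pending = "";
        this.status = "failed";
        break;
    }
  }

  // What the UI should render right now.
  snapshot(): { text: string; status: StreamStatus } {
    return { text: this.committed, status: this.status };
  }
}
```

The hold-back size is a tradeoff: a larger buffer hides more early failures but delays the first visible token, which works against the perceived-speed benefit of streaming in the first place.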
Building trust in AI output
This is where most AI features fail, and it's rarely a technical problem. Users don't trust what they don't understand, and they especially don't trust what they can't correct.
Label AI-generated content clearly. Provide sources or reasoning when possible. Let users edit AI output before it's finalized — treat it as a draft, not a decree. Add confidence indicators where appropriate. Never present AI output as fact without human review. The best AI features feel like a helpful first draft, not an authoritative answer.
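The "draft, not a decree" idea can also be enforced in the data model, so that unreviewed model output cannot be published by accident. This is a sketch under assumed names: `AiDraft`, `ReviewedContent`, `finalize`, and the label strings are hypothetical and would need adapting to your own content types.

```typescript
// Raw model output: always labeled as AI-originated and always a draft.
interface AiDraft {
  origin: "ai";
  status: "draft";
  text: string;
  sources: string[];
}

// Content a named person has reviewed, and possibly edited.
interface ReviewedContent {
  origin: "ai-assisted";
  status: "final";
  text: string;
  sources: string[];
  reviewedBy: string;
  editedByHuman: boolean;
}

function createDraft(modelOutput: string, sources: string[] = []): AiDraft {
  return { origin: "ai", status: "draft", text: modelOutput, sources };
}

// The only path from draft to final goes through a named reviewer.
function finalize(
  draft: AiDraft,
  reviewedText: string,
  reviewedBy: string
): ReviewedContent {
  if (reviewedBy.trim() === "") {
    throw new Error("AI drafts need a named human reviewer before they are finalized");
  }
  return {
    origin: "ai-assisted",
    status: "final",
    text: reviewedText,
    sources: draft.sources,
    reviewedBy,
    // Records whether the reviewer changed the model's wording.
    editedByHuman: reviewedText !== draft.text,
  };
}

// Label shown next to the content in the UI.
function displayLabel(content: AiDraft | ReviewedContent): string {
  return content.status === "draft"
    ? "AI-generated draft: review before use"
    : `AI-assisted, reviewed by ${content.reviewedBy}`;
}
```

Because anything that publishes content can be typed to accept only `ReviewedContent`, the compiler helps enforce the rule that AI output is never presented as fact without human review.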
The best AI feature I've shipped doesn't feel like AI. It feels like the product is just... smarter. That's the goal — augmentation so natural that users forget there's a model behind it.
Closing thought
AI in web apps is 20% model selection and 80% product design. The hard problems aren't technical — they're about trust, UX, and knowing when AI helps versus when it gets in the way. Start with one feature, ship it, watch how users actually interact with it, and iterate. The teams building the best AI products aren't the ones with the most sophisticated models. They're the ones who understand their users well enough to know where intelligence helps and where it just adds noise.