Why voice + RAG
Speech lowers friction for end users, but raw LLMs hallucinate and drift from organizational truth. Coupling automatic speech recognition with semantic retrieval constrains generation to ingested documents (per avatar / tenant), so answers stay traceable to source material while remaining conversational when delivered via TTS.
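A minimal sketch of what "constrained to ingested documents" can look like at the prompting step. The `Chunk` shape, function name, and prompt wording below are illustrative assumptions, not the project's actual code:

```ts
// Illustrative: build a retrieval-constrained, citation-oriented prompt.
// Chunk shape and wording are assumptions, not the project's real code.
interface Chunk {
  text: string;
  source: string; // e.g. document title or file name of the ingested material
}

function buildGroundedPrompt(question: string, chunks: Chunk[]): string {
  // Number each retrieved chunk so the model can cite it as [1], [2], ...
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join("\n\n");

  return [
    "Answer using only the context below. If the context does not contain the answer, say so.",
    "Cite sources by their bracketed numbers.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```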
Architecture
A layered monolith: HTTP adapters delegate to stateless services that wrap provider SDKs. Cross-cutting concerns (auth, validation, rate limiting, structured logging) sit in middleware; orchestration for the primary user journey lives in `askService` (STT → language handling → RAG+LLM → TTS → persistence).
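A sketch of that pipeline shape. Every dependency name here (`transcribe`, `detectLanguage`, `retrieve`, `generateAnswer`, `synthesize`, `saveTurn`) is a hypothetical stand-in for the stateless services that wrap provider SDKs:

```ts
// Sketch of the askService pipeline: STT → language handling → RAG+LLM → TTS → persistence.
// All dependency names are hypothetical stand-ins, injected so the sketch stays self-contained.
interface AskDeps {
  transcribe(audio: Buffer): Promise<string>;                                  // STT
  detectLanguage(text: string): Promise<string>;                               // language handling
  retrieve(query: string, tenantId: string): Promise<string[]>;                // tenant-scoped RAG search
  generateAnswer(q: string, chunks: string[], lang: string): Promise<string>;  // grounded LLM call
  synthesize(text: string, lang: string): Promise<Buffer>;                     // TTS
  saveTurn(turn: { tenantId: string; question: string; answer: string }): Promise<void>; // persistence
}

export async function ask(deps: AskDeps, audio: Buffer, tenantId: string) {
  const question = await deps.transcribe(audio);
  const language = await deps.detectLanguage(question);
  const chunks = await deps.retrieve(question, tenantId);
  const answer = await deps.generateAnswer(question, chunks, language);
  const speech = await deps.synthesize(answer, language);
  await deps.saveTurn({ tenantId, question, answer });
  return { answer, speech, language };
}
```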
There is no message broker on the default path: each `/ask` request runs as an async pipeline of awaited I/O steps. That trade-off favours operational simplicity; horizontal scale comes from running more stateless app instances, with Pinecone and MongoDB as shared backends.
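A sketch of how that wiring can look, assuming Express; the middleware and `askService` bodies below are hypothetical stand-ins, not the project's real modules:

```ts
// Sketch of the default /ask path: middleware for cross-cutting concerns,
// then one handler that awaits the pipeline. All names below are hypothetical.
import express, { type Request, type Response, type NextFunction } from "express";

// Stand-ins for auth / rate-limit middleware (validation and logging would sit here too).
function requireAuth(_req: Request, _res: Response, next: NextFunction) { next(); }
function rateLimit(_req: Request, _res: Response, next: NextFunction) { next(); }

// Stand-in for the orchestration service described above.
async function askService(body: unknown, tenantId: string) {
  return { answer: "stub", tenantId, body };
}

const app = express();
app.use(express.json());
app.use(requireAuth);
app.use(rateLimit);

// No broker: the handler is one sequence of awaited I/O steps, so scaling out
// means running more stateless instances of this app behind a load balancer.
app.post("/ask", async (req, res, next) => {
  try {
    const tenantId = String(req.header("x-tenant-id") ?? "default");
    res.json(await askService(req.body, tenantId));
  } catch (err) {
    next(err); // defer to a central error handler
  }
});

app.listen(3000);
```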
What I learned
Voice is a transport layer: the trust boundary for factual answers is still retrieval plus citation-oriented prompting, not the modality.
Orchestration clarity beats clever abstractions for small teams: one explicit pipeline (`askService`) is easier to operate than scattered triggers.
Multitenancy belongs in retrieval metadata: namespaces answer "which knowledge base" a query runs against, independently of the embedding math (see the sketch below).
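A sketch of namespace-scoped retrieval with the Pinecone TypeScript client; the index name, top-k value, and function name are assumptions for illustration:

```ts
// Namespace-scoped query: each avatar/tenant gets its own namespace in the index,
// so "which knowledge base" is decided at query time, not inside the embeddings.
import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("avatars"); // hypothetical index name

async function retrieveForTenant(queryEmbedding: number[], tenantId: string) {
  const result = await index.namespace(tenantId).query({
    vector: queryEmbedding,
    topK: 5,
    includeMetadata: true, // keep source metadata for citation-oriented prompting
  });
  return result.matches ?? [];
}
```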