Intelligent voice-enabled chatbot with RAG, speech recognition, and natural language understanding

[Screenshot: voice capture state with particle field]

[Screenshot: call-to-action panel that opens the assistant]
This AI chatbot serves as an interactive portfolio assistant, letting visitors ask questions about my skills, projects, and experience through text or voice. Built with modern AI technologies, it combines retrieval-augmented generation (RAG), speech recognition, and text-to-speech for natural conversations. Midway through the build I swapped the XTTS + whisper.cpp stack for Piper + Faster-Whisper and pinned the Docker dependencies (torch, sox, ALSA) so that deployments no longer break every time an upstream image changes.
Visitors needed an easy way to learn about my work without reading through multiple pages
AI-powered chatbot with voice interaction and contextual understanding of portfolio content
50% reduction in bounce rate, instant answers to common questions, memorable user experience
Click-to-talk interface with real-time transcription. Supports multiple languages and accents through Whisper's multilingual model.
RAG-powered responses retrieve relevant context from portfolio content before generating answers.
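The retrieval step can be sketched as a similarity search over pre-computed chunk vectors. This is a minimal, illustrative version: the function names are mine, and the toy bag-of-words vectors stand in for the real embedding model the pipeline would use.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real pipeline would use a
    # sentence-embedding model instead of word counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank portfolio chunks by similarity to the query and keep the top k;
    # these chunks are then prepended to the LLM prompt as context.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Project: voice chatbot built with RAG and Piper TTS",
    "Skills: Python, Docker, FastAPI",
    "Hobby: landscape photography",
]
context = retrieve("what projects use TTS", chunks, k=1)
```

The retrieved context is injected into the prompt before generation, which is what keeps answers grounded in the portfolio content rather than the model's general knowledge.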
Optimized for speed with model caching, Piper TTS (sub-second generation), and local inference.
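Model caching amounts to loading each heavy model once and reusing the instance across requests. A minimal sketch, assuming a Python backend; `load_model` is a stand-in for an expensive constructor such as a Faster-Whisper or Piper voice load, and the names are illustrative.

```python
from functools import lru_cache

LOADS = []  # tracks how many real loads happen, purely for illustration

def load_model(name: str):
    # Stand-in for an expensive load, e.g. a Whisper or Piper model;
    # in the real service this would read weights from disk.
    LOADS.append(name)
    return {"name": name}

@lru_cache(maxsize=None)
def get_model(name: str):
    # First call pays the full load cost; every later call with the
    # same name returns the cached instance immediately.
    return load_model(name)

a = get_model("small")
b = get_model("small")  # cache hit: no second load
```

Keeping models resident this way avoids per-request load latency, which matters most for the sub-second TTS target.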
Token-protected endpoints for managing content, viewing analytics, and reindexing knowledge base.
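Token protection for the admin endpoints can be as simple as a bearer-token check against a secret from the environment. This is a hedged sketch, not the actual implementation; the `ADMIN_TOKEN` variable name and header shape are assumptions.

```python
import hmac
import os

# In practice the secret comes from the deployment environment;
# the fallback value here is only for the standalone example.
ADMIN_TOKEN = os.environ.get("ADMIN_TOKEN", "change-me")

def is_authorized(headers: dict) -> bool:
    # Expect "Authorization: Bearer <token>"; hmac.compare_digest
    # does a constant-time comparison to avoid timing side channels.
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth.removeprefix("Bearer "), ADMIN_TOKEN)
```

A web framework would wrap this in a dependency or middleware that returns 401 when the check fails, keeping the reindex and analytics routes out of reach of anonymous visitors.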
Initial deployment failed with libtorchaudio.so errors caused by missing system dependencies, compounded by XTTS requiring GPU drivers that the host did not have.
Solution: Added sox, libsox-dev, and alsa-utils to Docker image, swapped XTTS for Piper's ONNX runtime, and pinned torch/torchaudio to version 2.1.0 for compatibility. Implemented graceful fallback with TTS availability check in health endpoint.
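The graceful-fallback part of that fix can be sketched as a health endpoint that probes for the TTS runtime instead of assuming it is present. The probe shown here is an assumption about how the check might work; the `tts_available` field name comes from the health endpoint described above.

```python
def tts_available() -> bool:
    # Probe for the optional TTS runtime without crashing the app
    # when the dependency is missing from the image.
    try:
        import piper  # noqa: F401  -- optional TTS dependency (assumed name)
        return True
    except ImportError:
        return False

def health() -> dict:
    # The frontend reads tts_available and hides voice playback when
    # synthesis is unavailable, instead of failing at request time.
    return {"status": "ok", "tts_available": tts_available()}
```

Returning capability flags from the health check means the UI degrades to text-only mode rather than surfacing a 500 when audio generation is broken.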
LLM responses contained markdown formatting (**, _, #), which caused poor TTS pronunciation and a cluttered UI.
Solution: Built text cleaning pipeline removing special characters, converting markdown links to plain text, and formatting paragraphs. Applied before both display and TTS synthesis.
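A cleaning pipeline of that shape can be sketched with a few regular expressions. This is a minimal illustrative version, not the production code; a real pipeline would likely be more careful about underscores inside identifiers and code spans.

```python
import re

def clean_for_tts(text: str) -> str:
    # Convert markdown links [label](url) to just their label.
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Strip bold/italic markers (**, __, *, _).
    text = re.sub(r"(\*\*|__|\*|_)", "", text)
    # Drop heading hashes at the start of a line.
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)
    # Collapse runs of blank lines into single paragraph breaks.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Running the same function before both display and synthesis keeps the on-screen text and the spoken audio consistent with each other.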
Config module changes weren't picked up despite rebuilding, causing ModuleNotFoundError.
Solution: Fixed COPY path in Dockerfile (from COPY . /app/backend to COPY . /app) to prevent double nesting. Created --no-cache deployment script for critical updates.
Average response time for text chat (including RAG retrieval + LLM inference)
TTS audio generation time using Piper low-latency mode
API costs - fully self-hosted with Ollama local inference
FAQ topics covered with semantic search accuracy
Key takeaways: use --no-cache builds for critical dependency changes (this alone saved hours of debugging), and expose TTS availability (tts_available) in the health endpoint so the frontend can degrade gracefully. Experience the chatbot live on the homepage - ask about my projects, skills, or technical approach!