ayushv.dev — The Portfolio Platform
Full-Stack Architecture: AI Bot · WebGL Scene · Voice Avatar · WhatsApp CTA
A production-grade personal platform engineered from first principles. Not a template—every layer from the procedural WebGL sky to the FAISS RAG pipeline was built to demonstrate what a principal engineer actually ships.
System Context
Six external systems integrate into the platform. Visitors interact via browser, WhatsApp, and voice. Ayush manages the knowledge base via token-protected admin endpoints. No third-party CMS — all content is code.
Container Architecture
The frontend is a Next.js App Router SPA deployed on Vercel with static export for blazing CDN delivery. The backend is a standalone FastAPI service on Heroku — independently deployable, independently scalable.
Why separate backend?
FAISS indexing, Groq calls, and Simli streaming are CPU/IO-bound tasks unsuitable for Edge Runtime. A persistent Heroku dyno keeps the FAISS index warm in memory between requests.
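That keep-warm behaviour can be sketched as a process-level singleton: load once, reuse for the life of the dyno. A minimal sketch, assuming a hypothetical `load_faiss_index()` loader in place of the real `faiss.read_index` call:

```python
from functools import lru_cache

def load_faiss_index() -> dict:
    # Hypothetical loader: in the real service this would be
    # faiss.read_index(...); a dict is enough to show the lifecycle.
    return {"vectors": 500, "dim": 384}

@lru_cache(maxsize=1)
def get_index() -> dict:
    # First call pays the load cost; every later request reuses the
    # same in-memory object for the life of the process.
    return load_faiss_index()

# Both calls return the identical object: no reload between requests.
assert get_index() is get_index()
```

An Edge function would pay the load cost on every cold start; a persistent process pays it once.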
Static export + streaming
Next.js `output: export` gives a pure static CDN site. Dynamic behaviour (chat, voice) proxies to the Heroku backend — best of both worlds: sub-50ms TTFB + real-time AI.
Three.js / WebGL Hero Scene
The entire hero background is a 9-layer WebGL pipeline written in React Three Fiber. It runs at adaptive DPR (1–1.5) with no external assets — everything is procedurally generated in GLSL.
MilkyWay Shader
- 6-octave FBM star density
- Domain-warped nebula clouds (4 colours)
- Galactic core glow + dust lane absorption
- 8 hand-placed hero stars with 4-ray diffraction spikes
- 3 layers of fine star spray
Meteor System
- Outer glow quad — Gaussian width profile
- Inner core quad — power-curve sharp fade
- Nucleus sphere — blooms through post-processing
- Cross halo — 4-spike star at head
- Dual-colour sparks: hot orange + cold blue
Post-Processing Pipeline
- Bloom — luminance threshold 0.7, mipmap blur
- Vignette — 0.3 offset, 0.5 darkness
- Film Grain — 2% additive noise overlay
- ACES Filmic Tone Mapping
- Multisampling=0 for perf (MSAA replaced by bloom blur)
AI Chat Bot — RAG Flow
The chat widget is not a wrapper around a generic chatbot. It's a three-tier RAG system: FAISS vector retrieval → Groq LLaMA generation → navigation command injection. Every response is grounded in portfolio data.
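The retrieval tier can be illustrated with a toy in-memory version. A minimal sketch with a hypothetical three-chunk corpus and hand-picked vectors, standing in for FAISS and the real embedder:

```python
import math

# Toy corpus standing in for the FAISS-indexed portfolio chunks.
CHUNKS = {
    "projects": [0.9, 0.1, 0.0],
    "resume":   [0.1, 0.9, 0.0],
    "contact":  [0.0, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, k=2):
    """Rank chunks by cosine similarity, the role FAISS plays in production."""
    scored = sorted(CHUNKS.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in scored[:k]]

def build_prompt(system, context_ids, question):
    # Retrieved chunks are inlined so the answer is grounded in portfolio data.
    context = "\n".join(f"[{cid}]" for cid in context_ids)
    return f"{system}\n\nContext:\n{context}\n\nQ: {question}"

hits = top_k([0.8, 0.2, 0.0])  # a query vector near the "projects" chunk
```

The production pipeline swaps the toy dict for a FAISS index and the hand-picked vectors for embedder output; the shape of the flow is the same.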
Navigation Command Protocol
The LLM can emit [[NAVIGATE:/projects]] tokens in its response. The widget parses these, renders a navigation button, and performs client-side routing — no surprise redirects.
```typescript
// Navigation commands emitted by the LLM: [[NAVIGATE:/projects]]
// Widget parses and pushes via Next.js router — no surprise redirects
const navMatch = response.match(/\[\[NAVIGATE:(.*?)\]\]/);
if (navMatch) {
  const target = normalize_navigation_target(navMatch[1], userMessage);
  router.push(target); // smooth SPA navigation, not a full reload
}
```
JSON-LD Graph Injector
A tiny utility that keeps structured data composable across 20+ pages. Strips nested @context nodes and emits a single canonical @graph.
```tsx
// Single canonical @context — no duplicates across pages
export function JsonLd({ data }: { data: Record<string, unknown> | Array<Record<string, unknown>> }) {
  const payload = Array.isArray(data) ? data : [data];
  const graphNodes = payload.map(({ "@context": _ctx, ...rest }) => rest);
  const json =
    graphNodes.length === 1
      ? { "@context": "https://schema.org", ...graphNodes[0] }
      : { "@context": "https://schema.org", "@graph": graphNodes };
  return <script type="application/ld+json" dangerouslySetInnerHTML={{ __html: JSON.stringify(json) }} />;
}
```
Voice Avatar — End-to-End
A single POST /converse round-trip handles: audio upload → Groq Whisper STT → RAG → LLaMA LLM → ElevenLabs TTS streaming → Simli lip-synced avatar. Transcripts arrive in response headers.
/converse endpoint
```python
# /converse — full voice-to-voice in one round trip
@app.post("/converse")
async def converse_endpoint(file: UploadFile = File(...)):
    # 1. STT: Groq Whisper
    user_text = await groq_voice.transcribe_audio(file)

    # 2. RAG context retrieval
    _, context = get_rag_context(user_text, config.RAG_TOP_K)

    # 3. LLM generation (Groq LLaMA → OpenAI fallback)
    ai_text = generate_answer(config.SYSTEM_PROMPT, context, user_text)

    # 4. Stream TTS (ElevenLabs → Groq fallback) with transcripts in headers
    return StreamingResponse(
        synthesize_speech_stream(ai_text),
        media_type="audio/mpeg",
        headers={"X-Transcript": user_text[:2000], "X-Response": ai_text[:2000]},
    )
```
TTS Fallback Chain
ElevenLabs is primary (neural quality). On failure, Groq TTS takes over. The `synthesize_speech_stream()` function handles provider switching transparently — the widget never knows which provider responded.
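The switching logic can be sketched as a generator that tries each provider in order and only commits once the first audio chunk arrives, so a provider that fails mid-handshake never reaches the client. The provider functions below are illustrative stand-ins, not the real ElevenLabs or Groq clients:

```python
from typing import Callable, Iterator

class ProviderError(Exception):
    pass

def elevenlabs_tts(text: str) -> Iterator[bytes]:
    # Stand-in for the primary provider; fails here to force the fallback.
    raise ProviderError("quota exceeded")

def groq_tts(text: str) -> Iterator[bytes]:
    # Stand-in fallback provider: yields fake audio frames.
    for word in text.split():
        yield word.encode()

def synthesize_speech_stream(text: str) -> Iterator[bytes]:
    """Try providers in order; the caller never learns which one answered."""
    providers: list[Callable[[str], Iterator[bytes]]] = [elevenlabs_tts, groq_tts]
    for provider in providers:
        try:
            stream = provider(text)
            first = next(stream)  # force the first chunk so failures surface here
        except (ProviderError, StopIteration):
            continue
        yield first
        yield from stream
        return
    raise ProviderError("all TTS providers failed")

chunks = list(synthesize_speech_stream("hello world"))
```

Pulling the first chunk before yielding anything is what makes the switch transparent: by the time the widget sees audio, the provider decision is already final.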
Simli Video Avatar
Simli receives the MP3 audio stream and returns a WebRTC video feed with frame-accurate lip-sync. Integrated via a dedicated WebSocket router in the FastAPI backend. Face ID configurable via env var.
WhatsApp CTA — Payload & UX
The floating WhatsApp button uses the official wa.me deep-link protocol. No backend involved — a pure client-side Framer Motion component.
```tsx
// wa.me deep-link — zero setup, direct to WhatsApp chat
const phoneNumber = "918287048587"; // E.164 without +
const whatsappLink = `https://wa.me/${phoneNumber}`;

// Framer Motion spring animations for polish
<motion.a
  href={whatsappLink}
  initial={{ scale: 0, opacity: 0 }}
  animate={{ scale: 1, opacity: 1 }}
  whileHover={{ scale: 1.1 }}
  whileTap={{ scale: 0.9 }}
/>;
```
wa.me Link Anatomy
Animation Stack
- Scale-in on mount via initial + animate
- Hover: 1.1× scale (spring physics)
- Tap: 0.9× scale feedback
- Color: #25D366 → #128C7E on hover
- z-index 50 — above all overlays
SEO Architecture
A composable SEO layer that prevents the most common structured-data bugs. One @context per page head, always.
Admin & RAG Management
Token-protected admin endpoints allow hot-reindexing the FAISS knowledge base and querying structured chat logs — all without restarting the dyno.
- Rebuild FAISS index from data/ sources. Reloads into memory in-process.
- Partial ingestion with custom payload (subset of sources or URLs).
- Returns mode (remote-vector / local-vector / lexical-fallback), metadata count, index path.
- Paginated chat log query from SQLite. Includes retrieved chunk IDs and scores.
Key Design Decisions
Retrieval degrades gracefully through four tiers: remote-vector → local-vector (FAISS) → lexical keyword match → extractive fallback. The bot never goes down, even if the embedder fails to load.
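The four-tier degradation can be sketched as a chain that walks each retriever until one yields results. Every tier function here is an illustrative stand-in; the first two fail on purpose to show the chain walking down:

```python
def remote_vector(q):
    raise ConnectionError("embedding service unreachable")

def local_vector(q):
    raise RuntimeError("FAISS index failed to load")

def lexical(q):
    # Keyword overlap against a toy corpus: always available, no model needed.
    corpus = {"projects": "webgl three.js portfolio", "voice": "simli avatar tts"}
    hits = [k for k, text in corpus.items() if any(w in text for w in q.lower().split())]
    return hits or None

def extractive_fallback(q):
    return ["about"]  # last resort: canned extracts so the bot never goes down

def retrieve(q):
    """Walk the tiers in order; report which one answered."""
    for tier in (remote_vector, local_vector, lexical, extractive_fallback):
        try:
            result = tier(q)
        except Exception:
            continue  # a dead tier is skipped, not fatal
        if result:
            return tier.__name__, result
    return "none", []

mode, hits = retrieve("tell me about the WebGL projects")
```

Reporting the mode alongside the hits mirrors the admin status endpoint, which exposes which tier the live index is running in.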
Static CDN gives sub-50ms TTFB worldwide. The streaming voice + chat backend runs on a separate, always-warm dyno. Pages load instantly; AI features stream in.
Instead of parsing free-text intent on the client, the LLM is instructed to emit structured [[NAVIGATE:...]] tokens that the widget handles deterministically. No surprise page changes.
ElevenLabs has the best voice quality but can be unavailable. Groq TTS is the instant fallback. Both use streaming (StreamingResponse) so audio playback starts within 300ms of the LLM finishing.
For a ~500 chunk knowledge base, in-process FAISS is faster than a Postgres round-trip and requires zero additional hosting. The index fits in memory on the smallest Heroku dyno.
The WhatsApp Business API requires a verified business account and carries per-message costs. wa.me is instant, free, and opens the app natively on any device — perfect for a portfolio CTA.