AI Customer Support Chatbot – SaaS Platform with Live Agent Handoff
Sole Developer & Architect

Technologies Used
Interactive Demo
Experience the full platform in action — watch as a visitor opens the chat widget, interacts with the AI assistant, and gets seamlessly handed off to a live operator. Both sides update in real-time.
A production-ready, multi-tenant SaaS platform that enables businesses to deploy AI-powered customer support chatbots on their websites in minutes. The system combines real-time AI conversations powered by Anthropic Claude with seamless live agent handoff, ensuring every customer inquiry is resolved — whether by AI or a human operator.
This is the same chatbot powering the live chat widget on this very website. It handles initial customer inquiries with AI, intelligently routes complex questions to human operators via real-time WebSocket connections, and collects contact information through natural conversation when no one is available.
The Problem
Most AI chatbot solutions charge unpredictable per-resolution fees ($0.99–$2.00 per AI interaction), require weeks of setup, or lock businesses into expensive per-seat licenses. Small and medium businesses need affordable, intelligent customer support automation without surprise bills or complex onboarding.
The Solution
I designed and built a complete SaaS platform from scratch — backend, AI engine, embeddable widget, operator dashboard, billing system, and deployment infrastructure. Businesses sign up, customize their AI's personality and knowledge base, copy a single script tag, and they're live.
Core Features
AI-Powered Conversations
- Real-time streaming responses powered by Anthropic Claude, delivering natural and context-aware conversations
- Per-tenant customizable AI personality via editable system prompts — no restart required
- Dynamic knowledge base that businesses can populate with FAQs, product details, and policies
- Configurable AI message limits per conversation with graceful upgrade prompts
Intelligent Human Handoff
- AI detects when a conversation needs human intervention and triggers handoff automatically
- Keyword detection ("talk to a person", "real agent") for immediate escalation
- Real-time Web Push notifications alert operators on both mobile and desktop
- Live wait timer shows visitors exactly how long they've been waiting
- Configurable timeout with automatic email fallback if no operator responds
Operator Dashboard (Progressive Web App)
- Progressive Web App installable from any browser — no app store required
- Real-time conversation list with status filters (AI active, awaiting human, resolved)
- Full conversation history visible when claiming a chat
- Contact management with CSV export for CRM integration and lead tracking
- Settings panel for AI prompts, knowledge base, business hours, operator management, and notification preferences
Conversational Lead Capture
- Collects visitor name, email, and phone through natural conversation — not intrusive form overlays
- Triggers automatically during handoff or when business is offline
- Fallback email notification sent to the business when no operator is available
- All collected contacts visible in the dashboard with timestamps and conversation context
Business Hours & Offline Handling
- Timezone-aware scheduling with per-day configuration for each day of the week
- AI adapts its behavior based on business hours — never promises a human connection when offline
- Offline mode triggers contact collection and email notifications automatically
Architecture & Technical Decisions
The entire platform runs as a single Docker container on a Hetzner Cloud VPS, consuming approximately 30–50MB of RAM. I chose this architecture deliberately to minimize operational costs while maintaining production reliability.
- Backend: Python FastAPI with async WebSocket support for real-time bidirectional communication between visitors and operators
- Database: SQLite with WAL mode — zero-config, no extra containers, with tenant-scoped queries for complete data isolation
- Chat Widget: ~8KB vanilla JavaScript with zero framework dependencies — loads instantly on any website without affecting page performance
- Real-time Communication: WebSocket connections for both visitor chat streaming and operator dashboard live updates
- Push Notifications: Self-hosted Web Push using VAPID encryption — no third-party notification service dependency
- AI Integration: Anthropic Claude SDK with streaming for token-by-token delivery, creating a responsive and natural chat experience
- Billing: Stripe Checkout with webhook handling for the full subscription lifecycle — signup, upgrades, cancellations, and failed payments
- Deployment: Docker Compose + Caddy reverse proxy with automatic HTTPS via Let's Encrypt
Multi-Tenant SaaS Design
Every customer operates in a fully isolated tenant with their own AI personality, knowledge base, operators, contacts, and usage limits. Tenant isolation is enforced at the database query level, and an in-memory caching layer handles widget token lookups to avoid database hits on every incoming message. Each tenant gets their own embeddable widget token, configurable branding, and independent billing through Stripe.
What Makes This Different
- Transparent pricing: Flat monthly plans instead of unpredictable per-resolution charges that can cost hundreds or thousands per month
- Minutes to deploy: Sign up, customize the AI, paste one script tag — live on your website
- Lightweight widget: ~8KB total vs. 50–100KB+ for competitors like Intercom or Zendesk
- No app store dependency: Operators install a Progressive Web App directly from their browser on any device
- Self-hosted option: Full data sovereignty — deploy on your own infrastructure for GDPR, PIPEDA, or CCPA compliance
- AI-native design: AI is the first responder by design, not a feature bolted onto a legacy ticketing system
Technologies Used
Python, FastAPI, WebSockets, Anthropic Claude AI, SQLite, Docker, Stripe, Progressive Web App (PWA), Web Push API, VAPID Encryption, Vanilla JavaScript, Caddy, Hetzner Cloud