AI Customer Support Chatbot – SaaS Platform with Live Agent Handoff

Sole Developer & Architect

Year: 2025

Technologies Used

Python, FastAPI, WebSockets, Anthropic Claude, SQLite, Docker, Stripe, PWA, Web Push API, Vanilla JavaScript

Interactive Demo

Experience the full platform in action: watch as a visitor opens the chat widget, interacts with the AI assistant, and is handed off to a live operator. Both sides update in real time.

A production-ready, multi-tenant SaaS platform that enables businesses to deploy AI-powered customer support chatbots on their websites in minutes. The system combines real-time AI conversations powered by Anthropic Claude with seamless live agent handoff, ensuring every customer inquiry is resolved — whether by AI or a human operator.

This is the same chatbot powering the live chat widget on this very website. It handles initial customer inquiries with AI, intelligently routes complex questions to human operators via real-time WebSocket connections, and collects contact information through natural conversation when no one is available.

The Problem

Most AI chatbot solutions charge unpredictable per-resolution fees ($0.99–$2.00 per AI interaction), require weeks of setup, or lock businesses into expensive per-seat licenses. Small and medium businesses need affordable, intelligent customer support automation without surprise bills or complex onboarding.

The Solution

I designed and built a complete SaaS platform from scratch — backend, AI engine, embeddable widget, operator dashboard, billing system, and deployment infrastructure. Businesses sign up, customize their AI's personality and knowledge base, copy a single script tag, and they're live.

Core Features

AI-Powered Conversations

Real-time streaming responses powered by Anthropic Claude, delivering natural and context-aware conversations
Per-tenant customizable AI personality via editable system prompts — no restart required
Dynamic knowledge base that businesses can populate with FAQs, product details, and policies
Configurable AI message limits per conversation with graceful upgrade prompts
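
As an illustration of the per-tenant prompt and message-limit features above, here is a minimal sketch. It assumes the system prompt is rebuilt from stored settings on every request, which is one way edits could take effect without a restart. The `TenantConfig` fields and both helper functions are hypothetical names, not the platform's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class TenantConfig:
    """Hypothetical per-tenant settings; field names are illustrative."""
    system_prompt: str
    knowledge_base: list[str] = field(default_factory=list)
    max_ai_messages: int = 20  # assumed default, not a documented value


def build_system_prompt(cfg: TenantConfig) -> str:
    """Combine the tenant's personality prompt with its knowledge base entries."""
    if not cfg.knowledge_base:
        return cfg.system_prompt
    facts = "\n".join(f"- {entry}" for entry in cfg.knowledge_base)
    return f"{cfg.system_prompt}\n\nKnowledge base:\n{facts}"


def ai_may_reply(cfg: TenantConfig, ai_messages_sent: int) -> bool:
    """True while the conversation is still under the tenant's AI message limit."""
    return ai_messages_sent < cfg.max_ai_messages
```

The resulting string would be passed as the system prompt of each model call; once `ai_may_reply` returns False, the widget could show the upgrade prompt instead of calling the model.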

Intelligent Human Handoff

AI detects when a conversation needs human intervention and triggers handoff automatically
Keyword detection ("talk to a person", "real agent") for immediate escalation
Real-time Web Push notifications alert operators on both mobile and desktop
Live wait timer shows visitors exactly how long they've been waiting
Configurable timeout with automatic email fallback if no operator responds
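
The keyword escalation and timeout fallback above can be sketched as follows. This is a simplified illustration: the phrase list (beyond the two phrases quoted above), the 120-second default, and the function names are assumptions.

```python
import re

# The first two phrases come from the feature list; the third is illustrative.
HANDOFF_PHRASES = ("talk to a person", "real agent", "speak to a human")

# One case-insensitive pattern that matches any of the phrases.
_HANDOFF_RE = re.compile(
    "|".join(re.escape(phrase) for phrase in HANDOFF_PHRASES),
    re.IGNORECASE,
)


def wants_human(message: str) -> bool:
    """True if the visitor's message contains an escalation phrase."""
    return _HANDOFF_RE.search(message) is not None


def handoff_timed_out(waited_seconds: float, timeout_seconds: float = 120.0) -> bool:
    """True once the visitor has waited at least the configured timeout."""
    return waited_seconds >= timeout_seconds
```

When `handoff_timed_out` returns True, the conversation would switch to the email fallback rather than keep the visitor waiting.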

Operator Dashboard (Progressive Web App)

Progressive Web App installable directly from supported browsers, with no app store required
Real-time conversation list with status filters (AI active, awaiting human, resolved)
Full conversation history visible when claiming a chat
Contact management with CSV export for CRM integration and lead tracking
Settings panel for AI prompts, knowledge base, business hours, operator management, and notification preferences
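
For the CSV export mentioned above, a minimal sketch using Python's standard `csv` module might look like this. The column names in `FIELDS` and the function name are hypothetical; the real export format is not specified here.

```python
import csv
import io

# Hypothetical export columns; the actual dashboard fields may differ.
FIELDS = ["name", "email", "phone", "captured_at"]


def contacts_to_csv(contacts: list[dict]) -> str:
    """Serialize contact records to CSV text with a header row.

    Missing fields are written as empty cells and unknown keys are ignored,
    so internal columns do not leak into the export.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(contacts)
    return buf.getvalue()
```

Using `csv.DictWriter` rather than joining strings by hand means values containing commas or quotes are escaped correctly for CRM imports.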

Conversational Lead Capture

Collects visitor name, email, and phone through natural conversation — not intrusive form overlays
Triggers automatically during handoff or when business is offline
Fallback email notification sent to the business when no operator is available
All collected contacts visible in the dashboard with timestamps and conversation context
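
One simple way to pull contact details out of free-form chat messages is pattern matching, sketched below. This is an assumption for illustration only: the platform may rely on the AI model itself for extraction, and the regular expressions here are deliberately loose rather than full validators.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Loose patterns for illustration; they do not validate addresses or numbers.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


@dataclass
class Contact:
    """Hypothetical container for details found in one message."""
    email: Optional[str] = None
    phone: Optional[str] = None


def extract_contact(message: str) -> Contact:
    """Return the first email address and phone number found, if any."""
    email = EMAIL_RE.search(message)
    phone = PHONE_RE.search(message)
    return Contact(
        email=email.group(0) if email else None,
        phone=phone.group(0) if phone else None,
    )
```

Anything extracted this way would still need confirmation in the conversation before being stored as a lead.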

Business Hours & Offline Handling

Timezone-aware scheduling with per-day configuration for each day of the week
AI adapts its behavior based on business hours — never promises a human connection when offline
Offline mode triggers contact collection and email notifications automatically
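
The timezone-aware, per-day schedule check described above could be sketched like this. The `Schedule` shape and the `is_open` function are hypothetical; the key idea is converting the current UTC time into the business's timezone before looking up that day's hours.

```python
from datetime import datetime, time, timezone, tzinfo
from typing import Optional

# Weekday index (Monday=0) -> (opens, closes) in the business's local time.
# Days missing from the mapping are treated as closed.
Schedule = dict[int, tuple[time, time]]


def is_open(schedule: Schedule, tz: tzinfo, now_utc: Optional[datetime] = None) -> bool:
    """True if the business is open at now_utc, judged in its own timezone."""
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    local = now_utc.astimezone(tz)  # may land on a different weekday than UTC
    hours = schedule.get(local.weekday())
    if hours is None:
        return False
    opens, closes = hours
    return opens <= local.time() < closes
```

In practice `tz` would typically be a `zoneinfo.ZoneInfo` for the tenant's configured region so that daylight saving changes are handled. When `is_open` returns False, the AI would skip the live handoff and collect contact details instead.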

Architecture & Technical Decisions

The entire platform runs as a single Docker container on a Hetzner Cloud VPS, consuming approximately 30–50MB of RAM. I chose this architecture deliberately to minimize operational costs while maintaining production reliability.

Backend: Python FastAPI with async WebSocket support for real-time bidirectional communication between visitors and operators
Database: SQLite with WAL mode — zero-config, no extra containers, with tenant-scoped queries for complete data isolation
Chat Widget: ~8 KB of vanilla JavaScript with zero framework dependencies, small enough to load with negligible impact on page performance
Real-time Communication: WebSocket connections for both visitor chat streaming and operator dashboard live updates
Push Notifications: Self-hosted Web Push sender authenticated with VAPID keys, so no third-party notification vendor is required (delivery still goes through each browser's own push service)
AI Integration: Anthropic Claude SDK with streaming for token-by-token delivery, creating a responsive and natural chat experience
Billing: Stripe Checkout with webhook handling for the full subscription lifecycle — signup, upgrades, cancellations, and failed payments
Deployment: Docker Compose + Caddy reverse proxy with automatic HTTPS via Let's Encrypt
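
To make the database choice above concrete, here is a minimal sketch of SQLite in WAL mode with tenant-scoped queries, using only the standard library. The `messages` table, its columns, and the helper names are assumptions for illustration, not the platform's actual schema.

```python
import os
import sqlite3
import tempfile


def open_db(path: str) -> sqlite3.Connection:
    """Open the database, enable WAL mode, and create a demo table."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def add_message(conn: sqlite3.Connection, tenant_id: str, body: str) -> None:
    """Insert one message for a tenant inside a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO messages (tenant_id, body) VALUES (?, ?)",
            (tenant_id, body),
        )


def messages_for(conn: sqlite3.Connection, tenant_id: str) -> list[str]:
    """Every read filters on tenant_id, so tenants never see each other's rows."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = open_db(os.path.join(tempfile.mkdtemp(), "demo.db"))
    add_message(demo, "acme", "hello")
    print(messages_for(demo, "acme"))
    demo.close()
```

Routing all reads and writes through helpers that require a `tenant_id` argument is one way to keep the isolation rule from being forgotten in individual queries.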

Multi-Tenant SaaS Design

Every customer operates in a fully isolated tenant with their own AI personality, knowledge base, operators, contacts, and usage limits. Tenant isolation is enforced at the database query level, and an in-memory caching layer handles widget token lookups to avoid database hits on every incoming message. Each tenant gets their own embeddable widget token, configurable branding, and independent billing through Stripe.
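
The in-memory cache for widget token lookups could be sketched as a small time-to-live (TTL) cache in front of the database. The `TokenCache` class, its 60-second default, and the `loader` callback (standing in for the real database query) are all hypothetical.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Maps widget tokens to tenant IDs, calling a loader only on a miss."""

    def __init__(
        self,
        loader: Callable[[str], Optional[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, str]] = {}

    def tenant_for(self, token: str) -> Optional[str]:
        """Return the tenant ID for a token, or None if the token is unknown."""
        now = self._clock()
        hit = self._entries.get(token)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh entry: no database access
        tenant_id = self._loader(token)
        if tenant_id is not None:
            self._entries[token] = (now + self._ttl, tenant_id)
        else:
            self._entries.pop(token, None)  # do not cache unknown tokens
        return tenant_id
```

A short TTL bounds how long a revoked or rotated token could keep resolving from memory, at the cost of an occasional extra query.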

What Makes This Different

Transparent pricing: Flat monthly plans instead of unpredictable per-resolution charges that can cost hundreds or thousands per month
Minutes to deploy: Sign up, customize the AI, paste one script tag — live on your website
Lightweight widget: ~8KB total vs. 50–100KB+ for competitors like Intercom or Zendesk
No app store dependency: Operators install a Progressive Web App directly from their browser on any device
Self-hosted option: Keep full control of your data by deploying on your own infrastructure, which helps with GDPR, PIPEDA, or CCPA compliance
AI-native design: AI is the first responder by design, not a feature bolted onto a legacy ticketing system

Technologies Used

Python, FastAPI, WebSockets, Anthropic Claude AI, SQLite, Docker, Stripe, Progressive Web App (PWA), Web Push API, VAPID, Vanilla JavaScript, Caddy, Hetzner Cloud
