Technology Behind Ask Zen
Ask Zen is a conversational app built for speed, security, and scale. It uses Google's AI and cloud infrastructure to deliver real-time, context-aware responses.
Knowledge Base
The Ask Zen system is powered by a structured, schema-driven knowledge base (KB) that supports deep world modeling, narrative logic, and high-precision AI grounding. Each knowledge domain—characters, scenes, events, buildings, legislation, and more—is defined by its own strict JSON Schema, ensuring consistency, completeness, and long-term maintainability.
- Schema-first design – Each object type follows a validated JSON Schema to enforce structure, validate input, and support intelligent editing tools
- Rational chunking – Schema categories allow for intelligent, context-aware chunking of information. Instead of arbitrary text blocks, embeddings are generated from logical groupings such as a character's affiliations, story arc, or personal timeline
- Embedding-aware schemas – An embedding-specific schema identifies which fields to encode for semantic search and context injection in AI queries
- Timeline-integrated data – Many objects include embedded timelines, allowing the system to reason across events and sequences while preserving historical context
- Consistency and validation – The KB supports automated consistency checks across categories, ensuring, for example, that characters do not appear in scenes where they should not be present and that events do not contradict established world rules
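As a rough illustration of the chunking and embedding-aware schema ideas above, the sketch below groups a record's fields into logical chunks before embedding. The layout of `EMBEDDING_SCHEMA`, the field names, and the sample character are all hypothetical; the actual Ask Zen schemas are not shown in this document.

```python
import json
from typing import Any

# Hypothetical embedding schema: "context_fields" are repeated in every chunk,
# and each entry in "groups" becomes one chunk if the record has data for it.
EMBEDDING_SCHEMA: dict[str, Any] = {
    "context_fields": ["name"],
    "groups": {
        "affiliations": ["affiliations"],
        "story_arc": ["story_arc"],
        "timeline": ["timeline"],
    },
}


def chunk_record(record: dict[str, Any], schema: dict[str, Any]) -> list[dict[str, str]]:
    """Return one text chunk per logical field group, skipping empty groups."""
    context = {f: record[f] for f in schema["context_fields"] if f in record}
    chunks = []
    for group_name, fields in schema["groups"].items():
        body = {f: record[f] for f in fields if f in record}
        if not body:
            continue  # the record has nothing for this group
        lines = [
            f"{key}: {json.dumps(value, ensure_ascii=False)}"
            for key, value in {**context, **body}.items()
        ]
        chunks.append({"id": f"{record['id']}#{group_name}", "text": "\n".join(lines)})
    return chunks


# Made-up sample record, for illustration only.
character = {
    "id": "character/zen",
    "name": "Zen",
    "affiliations": ["Guild of Cartographers"],
    "timeline": [{"year": 12, "event": "Joins the guild"}],
}

chunks = chunk_record(character, EMBEDDING_SCHEMA)
for chunk in chunks:
    print(chunk["id"])
```

Each chunk carries the shared context fields, so a retrieved chunk still says which object it belongs to even when read in isolation.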
This structured approach allows Ask Zen to serve as more than a chatbot—it becomes a reasoning engine grounded in a consistent, evolving knowledge graph. The result is a system that can generate accurate, context-aware, and canon-compliant answers at scale.
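One way a cross-category consistency check might look is sketched below: it flags scenes that list a character outside the years that character is present in the world. The `Character` and `Scene` shapes, the year-based timeline, and the function name are assumptions made for this example, not the real KB model.

```python
from dataclasses import dataclass


@dataclass
class Character:
    id: str
    first_year: int  # first year the character is present (hypothetical field)
    last_year: int   # last year the character is present (hypothetical field)


@dataclass
class Scene:
    id: str
    year: int
    character_ids: list[str]


def find_presence_conflicts(characters: list[Character], scenes: list[Scene]) -> list[str]:
    """Report scenes that reference unknown characters or characters absent that year."""
    by_id = {c.id: c for c in characters}
    problems = []
    for scene in scenes:
        for cid in scene.character_ids:
            character = by_id.get(cid)
            if character is None:
                problems.append(f"{scene.id}: unknown character {cid}")
            elif not (character.first_year <= scene.year <= character.last_year):
                problems.append(f"{scene.id}: {cid} is not present in year {scene.year}")
    return problems


# Made-up sample data, for illustration only.
characters = [Character("zen", first_year=10, last_year=40)]
scenes = [
    Scene("scene/arrival", year=12, character_ids=["zen"]),
    Scene("scene/epilogue", year=55, character_ids=["zen"]),
]

conflicts = find_presence_conflicts(characters, scenes)
for line in conflicts:
    print(line)
```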
Frontend
- HTML/CSS/JavaScript – Lightweight, no framework, built for fast loading
- Google Identity Services – Secure sign-in using OAuth 2.0 and ID tokens
- marked.js – In-browser Markdown rendering for clean, styled answers
- Responsive design – Optimized layout for mobile and desktop
Backend
- FastAPI – Fast, modern web API framework (Python 3.11)
- Uvicorn – Lightweight ASGI server for high-performance async support
- Firestore – Secure, scalable NoSQL database for conversation history and embeddings
- Vertex AI Gemini – Google’s multimodal generative model for real-time, grounded AI responses
- Vertex AI Embeddings (Gecko) – Semantic embedding of user input for contextual grounding
- Embedding distillation – Powered by Gemini, compresses structured input into high-quality semantic summaries before embedding, improving retrieval accuracy and minimizing hallucinations
- Query enrichment – Leverages prior user queries distilled with Gemini to capture user intent and context, enriching current queries with semantically related terms from earlier interactions
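The grounding step described above comes down to comparing the embedding of the user's input against stored chunk embeddings. The document does not say which similarity measure or index Ask Zen uses, so the sketch below assumes plain cosine similarity over an in-memory dictionary. The chunk ids and the tiny two-dimensional vectors are made up for readability and are not real model output.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], stored: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Return the k stored chunk ids most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id) for chunk_id, vec in stored.items()]
    scored.sort(reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]


# Made-up embeddings, for illustration only.
stored_embeddings = {
    "character/zen#affiliations": [1.0, 0.0],
    "building/archive#history": [0.0, 1.0],
    "event/founding#summary": [0.7, 0.7],
}

results = top_k([1.0, 0.1], stored_embeddings, k=2)
for chunk_id, score in results:
    print(f"{chunk_id}: {score:.3f}")
```

The text of the top-ranked chunks would then be injected into the model prompt as grounding context.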
Infrastructure
- Google Cloud Run – Fully managed serverless backend with autoscaling
- Artifact Registry – Container image storage and versioning
- Cloud Build – Automated Docker image builds and deployment
- Google OAuth Consent Screen – Secure access without collecting PII
Security & Privacy
- OAuth 2.0 with sub ID only – No email, name, or PII stored or processed
- Token verification via Google Auth – All tokens verified before processing
- App-wide HTTPS – Enforced via Cloud Run & domain configuration
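The sub-only approach above can be sketched as a small helper that runs after signature verification and keeps nothing but the `sub` claim. In a real backend the claims would typically come from google-auth's `id_token.verify_oauth2_token`; here they are passed in as a plain dictionary so the example stays self-contained. The function name, the client ID, and the sample claims are hypothetical.

```python
from typing import Any, Mapping

# Issuer values Google documents for its ID tokens.
GOOGLE_ISSUERS = {"accounts.google.com", "https://accounts.google.com"}


def user_key_from_claims(claims: Mapping[str, Any], expected_audience: str) -> str:
    """Check issuer and audience on already-verified claims, then return only `sub`."""
    if claims.get("iss") not in GOOGLE_ISSUERS:
        raise ValueError("unexpected issuer")
    if claims.get("aud") != expected_audience:
        raise ValueError("token was issued for a different client")
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise ValueError("missing sub claim")
    # Email, name, and picture are deliberately ignored and never returned.
    return sub


# Made-up claims, for illustration only.
CLIENT_ID = "example-client-id.apps.googleusercontent.com"
sample_claims = {
    "iss": "https://accounts.google.com",
    "aud": CLIENT_ID,
    "sub": "1234567890",
    "email": "someone@example.com",
    "name": "Someone",
}

user_key = user_key_from_claims(sample_claims, CLIENT_ID)
print(user_key)
```

Returning a bare string instead of the full claims dictionary makes it harder for downstream code to store PII by accident.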
DevOps & Local Environment
- CentOS-based local development environment – Mirrors production for consistent testing and behavior
- Apache reverse proxy – Proxies external access to a local Uvicorn server for full-stack integration testing
- Uvicorn with hot reload – Runs the FastAPI backend with auto-reload enabled, allowing real-time updates during development
- Identical runtime and structure – Local and production environments share the same codebase, config, and containerized deployment model for minimal drift
- Python virtualenv – Isolated package management to ensure repeatable local builds and dependency control
Ask Zen is built for real-world use, balancing performance, privacy, and deep AI integration. Proudly running on Google Cloud.