Architecture
Three planes, connected by signed tokens and mTLS.
At a Glance
| Layer | Components | Stack |
|---|---|---|
| Edge | Auth, rate limiting, routing, caching | Edge workers, distributed KV, SQLite at the edge, global CDN |
| Application | Business logic, inference, agent orchestration | Python, FastAPI |
| Data | Relational store, cache, object storage | Managed Postgres, Redis, object storage |
Edge Plane
The only plane that touches the public internet.
| Component | Purpose |
|---|---|
| Frontend | React 19 + Vite 6 SPA with immutable static assets |
| Gateway | First point of contact for all API traffic — auth, rate limiting, request routing |
| Auth | JWT issuance and verification; refresh tokens stored in distributed KV |
| Distributed KV | Read-heavy cache: feature flags, rate-limit counters, public config |
| SQLite at the edge | Small relational datasets: feature-flag history, search autocomplete |
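
As a rough illustration of how the Gateway could keep rate-limit counters in the distributed KV, here is a fixed-window limiter sketch. The source does not say which algorithm is used, so the fixed-window approach is an assumption, the in-memory dict merely stands in for the KV store, and the names `FixedWindowLimiter`, `limit`, `window_seconds`, and `clock` are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Fixed-window rate limiter; a plain dict stands in for the distributed KV."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit                    # max requests per client per window
        self.window_seconds = window_seconds  # window length in seconds
        self.clock = clock                    # injectable so the limiter can be tested
        # Key: (client id, window index) -> request count in that window.
        self.counters: Dict[Tuple[str, int], int] = {}

    def allow(self, client_id: str) -> bool:
        """Count the request and return True if the client is under its limit."""
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        count = self.counters.get(key, 0)
        if count >= self.limit:
            return False
        self.counters[key] = count + 1
        return True
```

Against a real KV store, the read-then-write in `allow` would need an atomic increment, and each counter key would need a TTL of roughly one window so stale entries expire; neither is modeled here.
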
Application Plane
Stateless replicas behind an internal load balancer.
| Service | Purpose |
|---|---|
| FastAPI Hub | Primary API — user accounts, billing, tenant management |
| Inference Worker | Routes prompts to the cheapest provider within the latency budget |
| Search Worker | Multi-provider search with classification and caching |
| Chat Worker | Conversational AI with streaming responses |
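
The Inference Worker's routing rule (cheapest provider within the latency budget) can be sketched as a filter followed by a minimum. This is a minimal sketch, not the actual implementation: the `Provider` fields, the provider names, the pricing unit, and the use of a p95 latency estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_1k_tokens: float  # hypothetical pricing unit
    p95_latency_ms: float      # hypothetical latency estimate


def pick_provider(providers: Iterable[Provider],
                  latency_budget_ms: float) -> Optional[Provider]:
    """Return the cheapest provider whose latency fits the budget, else None."""
    eligible = [p for p in providers if p.p95_latency_ms <= latency_budget_ms]
    if not eligible:
        return None
    # Break cost ties by preferring the lower latency.
    return min(eligible, key=lambda p: (p.cost_per_1k_tokens, p.p95_latency_ms))


providers = [
    Provider("alpha", cost_per_1k_tokens=0.50, p95_latency_ms=900),
    Provider("beta", cost_per_1k_tokens=0.20, p95_latency_ms=2500),
    Provider("gamma", cost_per_1k_tokens=0.80, p95_latency_ms=400),
]

choice = pick_provider(providers, latency_budget_ms=1000)
print(choice.name if choice else "no provider fits")  # prints "alpha"
```

With a 1000 ms budget, "beta" is the cheapest overall but too slow, so the cheaper of the two remaining providers wins. Returning `None` leaves the fallback policy (reject, queue, or relax the budget) to the caller.
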
Data Plane
| Store | Purpose |
|---|---|
| Managed Postgres | Tenant content, audit logs, user uploads |
| Managed Redis | Application cache, session state |
| Object Storage | User uploads, backups |

Every inter-service call carries a signed JWT and travels over mTLS. Data at rest is encrypted.
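
To make the signed-token part concrete, the following standard-library sketch issues and verifies an HS256 JWT. It is an illustration under stated assumptions: the source does not name the signing algorithm or the claims used, so HS256 with a shared secret, the `iss`/`aud`/`exp` claims, and the helper names `sign_token` and `verify_token` are all hypothetical. A production service would more likely rely on a vetted JWT library and possibly asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: Dict[str, Any], secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256 (assumed algorithm)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_signature(signing_input, secret))}"


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Optional[Dict[str, Any]]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = _signature(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # token has expired
    return claims
```

A caller would mint a short-lived token per request, for example `sign_token({"iss": "gateway", "aud": "hub", "exp": time.time() + 60}, secret)`, and the receiving service would reject any call for which `verify_token` returns `None`. The mTLS layer is separate and would be handled by the transport, not by this code.
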