Built to be auditable.
Architected for compliance.

For CIOs, CISOs, and city attorneys who want to know exactly what they’re deploying. This page summarizes the architecture, safety, and records-handling that sit behind every Agency Chat deployment.

Grounded retrieval, not generic generation

Agency Chat is a managed Retrieval-Augmented Generation (RAG) system — the architecture recommended by the UNC School of Government for public-sector deployments.

For each question, the assistant searches your indexed documents and self-crawled website in parallel, ranks the most relevant passages, and synthesizes a response with a large language model that is constrained to cite only content from those passages. The system is designed to refuse rather than guess when the retrieved passages don’t support an answer.
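The retrieve-rank-synthesize flow above can be sketched as follows. Everything here is a toy stand-in — the retrievers, scores, threshold, and synthesis call are illustrative, not the production API:

```python
# Illustrative sketch of the retrieve-rank-refuse-or-synthesize flow.
from concurrent.futures import ThreadPoolExecutor

RELEVANCE_THRESHOLD = 0.5
REFUSAL = "I can't answer that from the official sources I have."

# Toy retrievers: in production these query the document index and the
# self-crawled website index.
def search_documents(q):
    return [{"text": "Parks close at 10 p.m. (Code 12.04)", "score": 0.9}]

def search_site(q):
    return [{"text": "Park hours are posted seasonally.", "score": 0.6}]

def synthesize(question, sources):
    # Stand-in for the constrained LLM call; real synthesis cites numerically.
    return sources[0]["text"]

def answer(question, top_k=5):
    with ThreadPoolExecutor() as pool:           # search both corpora in parallel
        futures = [pool.submit(search_documents, question),
                   pool.submit(search_site, question)]
    passages = [p for f in futures for p in f.result()]
    best = sorted(passages, key=lambda p: p["score"], reverse=True)[:top_k]
    if not best or best[0]["score"] < RELEVANCE_THRESHOLD:
        return REFUSAL                           # refuse rather than guess
    return synthesize(question, sources=best)
```

The key design point is the early exit: when nothing retrieved clears the relevance bar, the model is never asked to answer at all.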

What you provide

  • Authoritative documents (charter, municipal code, policies, regulations, board materials)
  • Your public-facing website domain(s)
  • Branding, supported languages, and any operator-specific copy (refusal messages, crisis-resource links)

What we manage

  • Document ingestion, parsing, and chunking
  • Recurring website crawling, deduplication, and re-indexing
  • Search relevance tuning per source tier (e.g. charter > ordinances > council reports > website)
  • Citation rendering and link resolution to the canonical official source where one exists
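The per-tier relevance tuning can be pictured as a weight applied on top of raw retrieval relevance, so an authoritative source outranks a merely well-matched one. The weights below are illustrative, not the tuned production values:

```python
# Sketch of per-source-tier weighting: charter > ordinances > council
# reports > website. Weights are illustrative stand-ins.
TIER_WEIGHT = {"charter": 1.0, "ordinance": 0.9,
               "council_report": 0.8, "website": 0.7}

def tiered_score(passage):
    return passage["relevance"] * TIER_WEIGHT[passage["tier"]]

hits = [
    {"tier": "website", "relevance": 0.90},
    {"tier": "charter", "relevance": 0.85},
]
ranked = sorted(hits, key=tiered_score, reverse=True)
# charter: 0.85 * 1.0 = 0.85 beats website: 0.90 * 0.7 = 0.63
```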

Multi-layer safety pipeline

Generic chatbots rely solely on the underlying model’s built-in safety. Agency Chat composes four independent layers so that a failure in any one layer is contained by the others.

  1. Regex input sanitizer. Sub-millisecond pattern matching for known adversarial inputs and first-person-anchored self-harm signals.
  2. Semantic guard. Embedding-similarity check against a curated catalog of adversarial and out-of-scope examples; categorizes the query and routes high-risk inputs to a refusal path.
  3. Vertex AI adversarial classification. Google’s built-in classification on every search query to flag prompt-injection patterns the model has seen in the wild.
  4. System-prompt hardening. The synthesis prompt explicitly instructs the model to ignore injected instructions, refuse out-of-scope answers, and cite numerically with a fixed format.
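The containment property of the first three layers comes from simple short-circuit routing: each layer can independently send a query to the refusal path, so a miss in one is caught by the next. A minimal sketch, with the pattern list and the two deeper checks reduced to stand-ins:

```python
# Sketch of layered guard routing. INJECTION_PATTERNS, semantic_guard, and
# classify_query are simplified stand-ins for the real layers.
import re

INJECTION_PATTERNS = [r"ignore (all )?previous instructions",
                      r"reveal (your|the) system prompt"]

def regex_sanitizer(query):
    # Layer 1: sub-millisecond pattern match for known adversarial inputs.
    return any(re.search(p, query, re.IGNORECASE) for p in INJECTION_PATTERNS)

def semantic_guard(query):
    # Layer 2 stand-in: embedding similarity vs. a curated adversarial catalog.
    return False

def classify_query(query):
    # Layer 3 stand-in: managed adversarial classification on the search query.
    return False

def route(query):
    for layer in (regex_sanitizer, semantic_guard, classify_query):
        if layer(query):
            return "refuse"   # logged as a safety event in the audit pipeline
    return "answer"           # layer 4: hardened synthesis prompt takes over
```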

Refusals and safety events are logged to the same audit pipeline as normal answers so your records officer sees the full picture — including what the assistant declined to answer.

Public-records logging and retention

Every interaction is treated as a public record from the moment it is created. The audit payload for each interaction includes:

  • Timestamp, agency identifier, and request language
  • The resident’s question (and its translation, if any)
  • The assistant’s answer (and its translation, if any)
  • The citations used and their source documents
  • Any refusal or safety-route decisions

Records are written to a dedicated dataset in your agency’s own GCP project — not a shared multi-tenant store. Each month, a structured report of the previous month’s interactions is delivered to the recipients you specify, formatted for direct archive against your existing public-records schedule.

We retain interactions for a rolling two-calendar-month window so a delivery can be re-issued or amended if your records officer requests it. After that window, records are permanently purged from our systems. No personally identifiable information is collected. No user accounts or logins are required.
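Concretely, a two-calendar-month window means anything created before the first day of the month two months back is eligible for purge. This is a sketch of the policy arithmetic, not the production purge job:

```python
# Sketch: compute the purge cutoff for a two-calendar-month rolling window.
from datetime import date

def purge_cutoff(today: date) -> date:
    # Step back two calendar months, then clamp to the 1st of that month.
    year, month = today.year, today.month - 2
    if month < 1:
        year, month = year - 1, month + 12
    return date(year, month, 1)

purge_cutoff(date(2025, 1, 15))   # records before 2024-11-01 are purged
```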

Hosting and tenancy

Each agency gets its own Google Cloud project. Backend, worker, frontend, and search infrastructure for one agency never share runtime resources with another agency’s deployment.

  • Per-agency GCP project. Isolated IAM, isolated billing, isolated BigQuery datasets, isolated GCS buckets.
  • Cloud Run services. Backend, worker, and frontend are deployed as Cloud Run services with managed TLS and zero-downtime traffic splitting on every release.
  • Capacity control. A Firestore-backed token-bucket limiter throttles peak traffic into a configurable maximum-concurrency window, with optional asynchronous queueing through Cloud Tasks for high-burst events.
  • Self-crawled website corpus. Where applicable, we crawl your public website on a fixed schedule (typically weekly) and index the resulting markdown into the same data store as your documents — eliminating dependency on third-party web search products and giving you full control over what is indexed.
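The capacity-control idea above is a standard token bucket: admit requests while tokens remain, refill at a fixed rate, and queue or shed the rest. A minimal in-memory sketch — the Firestore-backed production limiter shares this logic but persists the bucket state across instances:

```python
# Minimal in-memory token-bucket sketch of the capacity-control layer.
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller queues the request (e.g. via Cloud Tasks)

bucket = TokenBucket(capacity=2, refill_per_sec=0.5)
results = [bucket.allow() for _ in range(3)]   # burst of 3 against capacity 2
```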

Data handling at a glance

  • No PII collection. We do not ask for, store, or log names, emails, phone numbers, or accounts.
  • No third-party trackers in the chat path. The chat surface is yours.
  • Encrypted at rest and in transit. Standard Google Cloud encryption for all storage and managed TLS for all transport.
  • Records ownership. Audit records live in your agency’s own GCP project. You can read them directly at any time.
  • Vendor exit. Your documents and their indexed form remain in your GCP project. There is no proprietary data layer we can hold hostage.

Want the long version?

We’re happy to walk your CIO, CISO, or records officer through the full architecture, including the public-records logging schema and the safety-guardrail catalog. Get in touch and we’ll set up the conversation.