A living architecture exercise that composes the cookbook capabilities into a public-service style assistant.
Worked example: Danish citizen-facing LLM
Scenario
A Danish public-service organization wants an assistant that helps citizens understand approved guidance, find relevant forms, and know when to contact a human case worker. The assistant must not invent rights, make case decisions, or hide uncertainty.
Learning goal
Use the cookbook as a design checklist. The exercise is not to build a chatbot. The exercise is to design the system around it.
Capability map
- AI capability as a system: define the layers before choosing the model.
- Adoption through artifacts: build a small inspectable prototype with one real citizen journey.
- Model-agnostic workflows: keep model choice behind a route policy.
- Multimodel orchestration: choose local, EU-hosted, or premium model routes by privacy and task type.
- Guardrails: refuse unsupported advice, block prompt injection, and escalate high-impact uncertainty.
- LLM evaluation: score normal, adversarial, Danish, and English prompts.
- Red-teaming: attack retrieval, refusal, escalation, and source attribution.
- Agent authority and secrets: keep tool access scoped and auditable.
- Handover: leave operators with runbooks, scorecards, logs, and disable paths.
Architecture sketch
- Frontend: simple citizen-facing interface with clear non-decision language.
- Sources: approved public guidance, forms, and internal policy summaries that have been cleared for the assistant.
- Retrieval: source-bounded search with citations or source references.
- Model routing: default to an approved hosted/EU route for user-facing answers; use local or private routes for sensitive context preparation where needed.
- Guardrails: topic boundary, prompt-injection boundary, personal-data handling, unsupported-answer refusal, and escalation language.
- Evaluation: around 30 prompts, at least half adversarial, mixed Danish and English.
- Red-team: scoped run against retrieval injection, overconfident legal-style advice, and escalation bypass.
- Audit: log route, sources, guardrail decision, escalation decision, and operator-visible failure reason.
- Handover: owner, verification command, scorecard, source update process, rollback, and review cadence.
Practice loop
- Pick one concrete citizen journey.
- Write five normal questions.
- Write five questions that should refuse or escalate.
- Identify approved sources and unsupported topics.
- Draft the answer policy before drafting prompts.
- Run the questions through the current architecture and update the capability pages based on failures.
Proof artifact
The finished proof should include:
- One architecture note.
- One prompt dataset.
- One evaluation scorecard.
- One red-team report.
- One handover packet.
- One public-safe summary of what changed after testing.
Current status
This is a living learning example. It is intentionally honest: the architecture is mapped, but the guardrails, evaluation, and red-team proofs still need to be built and linked.
Next build
Start with guardrails. Without a refusal/escalation boundary, the rest of the architecture is only a sketch.