Orientation
Agents earn trust or they fail.

Trust is what users develop when they can predict, interpret, and act on agent behavior. As multi-agent systems evolve from federated assistants toward unified operating layers, the experience layer needs a discipline that scales with the platform.

Mental model alignment is that discipline: the foundation users need to make sense of what agents are doing, and the architecture designers need to make orchestrated multi-agent systems coherent across the product.

What this document is

AXA informs the experience architecture for orchestrated multi-agent interactions. It synthesizes what users tell us about how they perceive, interpret, and build trust with agent behavior across conversational and non-conversational patterns — and gives design, product, and engineering a research-grounded vocabulary for building that experience deliberately.

The framework surfaces a foundation (mental models) and three design dimensions: interactions, agent postures, and orchestration. Together they elaborate what the research suggests product, design, and engineering need to ship agentic experiences that are coherent, predictable, and trustworthy at platform scale.

What's in this deck
Three design dimensions, one foundation, one practice.
The framework is composed of five elements. Each gets its own page in the deck that follows.
Page 02 · Mental Models
Mental Models
The foundation of trust.
Three hierarchical layers: Outcome, Causal, Interactional. They describe what users believe about how the system works. Every interaction type, persona, and orchestration pattern in this framework is evaluated against whether it preserves or breaks user mental models. Without alignment at this layer, nothing else matters.
Page 02 · Mental Models
Mental Models are the foundation of trust.

Mental models are users' internal beliefs about how a system works. They are not a research curiosity. They are the foundation that determines whether agent behavior is legible, predictable, and trustworthy enough to act on.

Three hierarchical layers describe what users believe: Outcome, Causal, Interactional. Each layer answers a different question. Each layer breaks in a different way. Each layer demands a different design discipline as agents grow more autonomous and more orchestrated.

Figure: Mental Model Hierarchy. Interactional at the top, Causal in the middle, Outcome at the base (foundation first).
How the thinking has evolved

A year ago, users described AI as a tool. They gave it instructions, it executed. Today, they describe it as a collaborator. They set goals, the agent participates in achieving them. The mental model framework has been refined alongside what users are telling us.

Outcome alone was enough when users thought of AI as a tool. As agents took on more autonomy, Causal became essential: users tell us they have to understand how the agent is reasoning to confirm, modify, or override it. As agents started acting as teammates, Interactional became the whole game: rhythm, tone, recovery, the feel of working alongside something — the things users name when they describe what makes the experience work or fall apart.

Trust is built bottom-up. Outcome is the ground. Causal is the structure. Interactional is the bloom of trust that grows when both hold.

Layer 01
Outcome
"Did the system do what I needed?"
The user's model of what the system accomplishes on their behalf. Task completion, confirmation, closure. The foundation. Without it, nothing else matters.
Breaks when
Ambiguous completions. Silent actions. Users left unsure whether the task is actually done.
Layer 02
Causal
"Does it work the way I think it does?"
The user's model of how the system reasons. Visible logic, transparent inputs, predictable behavior. The layer that determines whether users can confirm, modify, or override what the agent proposes.
Breaks when
Recommendations feel arbitrary. Memory failures. Reasoning the user can't follow or anticipate.
Layer 03
Interactional
"Does it feel right to work with?"
The user's model of how it feels to collaborate with the system. Rhythm, tone, responsiveness, recovery. The layer where the system reads as a teammate, not a tool.
Breaks when
Interruptions. Repetition. Pacing that fights the user. Emotional miscalibration in moments that matter.
Case study · Booking an online medical appointment
Every stage. All three layers. Trust holds or breaks at each one.
Mental models don't take turns. Users check all three layers continuously, in parallel, at every stage of a workflow. A single break at any layer can cost the whole interaction.
Legend: Holding · Breaking · ? Pending
Stages: Open the App · Describe the Need · Choose a Time · Confirm + Leave

Layer 03 · Interactional
Open the App: "This looks like the same site I used last time."
Describe the Need: "It interrupted my answer. Now I have to start over."
Choose a Time: "Okay, this is moving along now."
Confirm + Leave: "That felt clean."

Layer 02 · Causal
Open the App (? Pending): "Am I talking to a chatbot or scheduling software? How does this work?"
Describe the Need: "Why is it asking my insurance again? I already entered it."
Choose a Time: "It found a slot with my doctor. That makes sense."
Confirm + Leave: "It says I'm booked, but will I get a confirmation email? Is it on my chart?"

Layer 01 · Outcome
Open the App (? Pending): "Can I actually get in this week?"
Describe the Need (? Pending): "Is this going to get me to the right doctor?"
Choose a Time: "Tuesday at 10. That works."
Confirm + Leave (? Pending): "Will my doctor's office actually have me on the schedule?"

Read horizontally to see how one layer evolves across the booking flow. Read vertically to see all three layers being checked simultaneously at each stage. Designing for one layer at a time misses the truth: users are checking all three, all the time. A single break can cost the whole interaction.
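
The two reading directions described above can be sketched as a small data structure. This is a minimal illustration, not part of the framework itself: the names `Status`, `MATRIX`, `layers_at_risk`, and `first_break` are hypothetical, and the holding or breaking status assigned to each cell is an assumed reading of the quotes in the case study.

```python
from enum import Enum


class Status(Enum):
    HOLDING = "holding"
    BREAKING = "breaking"
    PENDING = "pending"


STAGES = ["Open the App", "Describe the Need", "Choose a Time", "Confirm + Leave"]

# Assumed reading of the case-study quotes: one status per stage for each layer.
# Each list is a row (read horizontally); each index is a stage (read vertically).
MATRIX = {
    "Interactional": [Status.HOLDING, Status.BREAKING, Status.HOLDING, Status.HOLDING],
    "Causal": [Status.PENDING, Status.BREAKING, Status.HOLDING, Status.BREAKING],
    "Outcome": [Status.PENDING, Status.PENDING, Status.HOLDING, Status.PENDING],
}


def layers_at_risk(matrix, stage_index):
    """Read one column: every layer that is not holding at this stage."""
    return [layer for layer, row in matrix.items()
            if row[stage_index] is not Status.HOLDING]


def first_break(matrix, stages):
    """Return the earliest (stage, layer) pair marked BREAKING, or None."""
    for i, stage in enumerate(stages):
        for layer, row in matrix.items():
            if row[i] is Status.BREAKING:
                return stage, layer
    return None
```

Checking a whole column at once, rather than one layer at a time, mirrors the point of the case study: a stage is only safe when every layer holds.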
Page 03 · Interactions
As agents grow more autonomous, the demands shift.

Interaction patterns evolve across three tiers: Automation, Augmentation, Agency. Each tier changes what users expect, which mental model layer dominates, and what design has to do to keep trust intact.

The framework gives teams a shared vocabulary for matching agent behavior to user expectation at each tier. It shapes how the user experience is designed, how transparency is calibrated, and how trust is built as autonomy grows.

Tier 01
Automation
Structured, linear, fixed-path. The user operates the system directly.
Earning trust at this tier
The user is operating the system directly. Outcome trust is built on clean execution and explicit confirmation. The user needs to know unambiguously that the thing they asked for happened. Causal trust is built on legibility when something deviates from the expected path. The user needs to understand why if the system pauses, fails, or asks a question. Interactional trust is built on quiet competence. The system should feel reliable and unobtrusive, not chatty or performative.
Tier 02
Augmentation
Guided, semi-adaptive. The user collaborates with the system inside a structured flow.
Earning trust at this tier
The user is collaborating with the system. Outcome trust is now shared. The user needs to see the work advancing and feel like an active participant, not a bystander. Causal trust becomes load-bearing. The user has to follow the agent's reasoning to confirm, modify, or override what's being proposed. When context is dropped or validation fails, trust at this layer collapses fast. Interactional trust is built on rhythm and responsiveness. The agent's suggestions must arrive when they're useful and disappear when they're not.
Tier 03
Agency
Fluid, dynamic, exploratory. The user delegates intent; the agent acts on their behalf.
Earning trust at this tier
The user is delegating to the system. Outcome trust is built on reliable delivery. The agent has to come back with results the user can verify, ideally with a trace of what it did. Causal trust is built on legible reasoning. The user needs to understand how the agent thought through the problem, especially when something unexpected happens. Interactional trust becomes the whole working relationship: empathy, pacing, recovery, how the agent communicates uncertainty, when it asks for help and when it doesn't. Delegation is an act of trust, and that trust is built or broken in the texture of every exchange.
Full framework · Comparison matrix
How the three tiers compare across four dimensions of experience.
Each tier changes the mental model demands, the human-agent relationship, the design role, and the strategic fit. The matrix is the working reference for matching agent behavior to context.
Dim 01 · How people see and navigate
Automation
Interface: Structured, linear, fixed-path.
Interaction: User initiates, agent executes. Direct, transactional, turn-based. The user is operating the system.
Augmentation
Interface: Guided, semi-adaptive.
Interaction: User and agent share the flow. The agent proposes, the user accepts or modifies. The user is collaborating with the system.
Agency
Interface: Fluid, dynamic, exploratory.
Interaction: User delegates intent; the agent reasons, acts, and reports back. The user supervises rather than operates.

Dim 02 · Relationship between human + agent
Automation
Agent Experience: Minimal, transactional. Basic digital interactions and automated responses.
Human Experience: Minimal intervention, predictable reliability.
Augmentation
Agent Experience: Moderately adaptive, informative. Interacts reliably with predefined systems, APIs, SDKs.
Human Experience: Moderate collaboration, enhanced confidence through clarity.
Agency
Agent Experience: Highly adaptive, proactive, personalized. Fully leverages AX principles.
Human Experience: Deeply engaging. Personal agency amplified through meaningful interactions.

Dim 03 · Shaping and guiding intelligence
Automation
Design Role: Design comprehensive, upfront, predictable interactions.
Research Impact: Validate task clarity and frictionless effectiveness.
Augmentation
Design Role: Define clear adaptive boundaries. Transparent communication.
Research Impact: Continuously gather feedback on adaptive elements.
Agency
Design Role: Set visionary objectives. Adaptive, emotionally resonant experiences.
Research Impact: Intensive, iterative human feedback loops guiding AI evolution.

Dim 04 · Strategic fit and value creation
Automation
Best Suited For: Mission-critical tasks requiring absolute accuracy (e.g., financial transactions).
Strategic Value: Reduces risk, improves consistency, ensures compliance. Foundational for operational excellence.
Augmentation
Best Suited For: Flexible tasks within defined boundaries (e.g., smart scheduling systems).
Strategic Value: Enhances operational agility and scalability. Accelerates decisions with controlled adaptability.
Agency
Best Suited For: Complex, nuanced tasks where adaptability and creativity are paramount (e.g., personalized digital assistants).
Strategic Value: Drives innovation, customer differentiation, long-term engagement. Unlocks new business models and AI-native experiences.
Page 04 · Agent Postures
How agents show up to the user.

Users don't track agent identity the way the system does. They experience shifts in tone, authority, and focus as the work unfolds — and the design challenge is making those shifts feel coherent rather than fragmented. Agent Postures are the five ways an agent can show up, the underlying structure designers work with to construct that coherence.

Each posture has a defined role, autonomy level, emotional fluency, and strategic value. They're the vocabulary product, design, and engineering use to design how an agent shows up — so the experience reads as one thing even when many agents are involved.

Five postures
One coherent product.
The five postures describe how an agent is showing up at any given moment of an experience. Users primarily see frontstage postures like Specialists, Collaborators, and Support, while Orchestrator and Observer work backstage. Together they make orchestration legible: who's doing what, with what authority, and how transitions feel from the user's side.
Posture 01
Orchestrator
Manages entry and exit points, routes interactions to the right agent, oversees fallback logic. Mostly invisible to the user, but governs how the system holds together across handoffs.
Example in practice: The Orchestrator coordinating between Scheduling, Billing, and Patient Support agents during a single workflow.
Autonomy
Moderate · supervisory
Emotional fluency
Moderate. Narrates handovers and errors clearly
Strategic value
Operational coordination and reliability across agents
Posture 02
Specialist
Handles domain-specific tasks and delivers scoped recommendations. The "expert" agent the user actually sees doing the work in a particular domain. Bounded scope, deep capability.
Example in practice: The Scheduling Agent presenting reschedule options. The Billing Assistant answering insurance questions.
Autonomy
High · task execution
Emotional fluency
High. Task-focused, empathetic responses
Strategic value
Efficiency and domain expertise at the point of work
Posture 03
Collaborator
Temporarily joins an interaction to offer cross-domain assistance. Not the primary actor in the workflow, but adds value at the right moment with a different expertise.
Example in practice: A Patient Outreach Agent joining a Scheduling flow to suggest filling open slots with patients overdue for annual physicals.
Autonomy
Low · advisory
Emotional fluency
High. Context-aware and supportive in tone
Strategic value
Cross-domain adaptability without breaking the primary flow
Posture 04
Support · Fallback
Handles escalations, recovery, and system errors. The agent (human or bot) the system reaches for when something has gone wrong or when judgment requires care the other agents can't provide.
Example in practice: A live Patient Support specialist taking over after a record match fails. Or a fallback bot guiding the user through an error recovery flow.
Autonomy
High · decision maker
Emotional fluency
Very high. Sensitive to user frustration and urgency
Strategic value
Trust building and risk mitigation when things go wrong
Posture 05
Observer
Monitors context and sends nudges or reminders without entering direct conversation. Acts ambiently. The user often doesn't realize the Observer is there until it surfaces something useful.
Example in practice: An Inventory Monitor flagging when vaccine stock crosses a reorder threshold. A KPI Watcher flagging an unusual no-show rate.
Autonomy
Minimal · ambient
Emotional fluency
Low. Informational and neutral by design
Strategic value
Scalability through passive monitoring and quiet surfacing
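
As a rough sketch of how the five postures might be captured as shared vocabulary in code, the block below records each posture's autonomy, emotional fluency, and whether it works frontstage or backstage. The `Posture` class, its field names, and `visible_postures` are hypothetical; only the values come from the posture cards above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    name: str
    autonomy: str
    emotional_fluency: str
    frontstage: bool  # True if users primarily see this posture


POSTURES = [
    Posture("Orchestrator", "Moderate, supervisory", "Moderate", frontstage=False),
    Posture("Specialist", "High, task execution", "High", frontstage=True),
    Posture("Collaborator", "Low, advisory", "High", frontstage=True),
    Posture("Support / Fallback", "High, decision maker", "Very high", frontstage=True),
    Posture("Observer", "Minimal, ambient", "Low", frontstage=False),
]


def visible_postures(postures):
    """Names of the postures a user is expected to see directly."""
    return [p.name for p in postures if p.frontstage]
```

Keeping the frontstage flag on the posture itself gives design and engineering one place to check which agents need a visible identity.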
Page 05 · Human Personas
Agents serve humans. Research grounds the work.

The framework only matters if it's anchored in who actually uses it. Human Personas are the research-grounded counterpart to Agent Postures: the people on the other side of every orchestrated experience. Without them, agent design becomes speculation about behavior nobody has observed.

Teams can maintain libraries of research-grounded personas trained on customer transcript data. They aren't static documents. They're synthetic collaborators teams can talk to, used to evaluate prompts, test tone, and surface trust gaps before development.

What this library is
A set of design and research tools, not just a roster.
Each persona is a GPT-based character trained on transcripts, workflow documentation, and lived operational data from the trades ecosystem. The personas simulate language, logic, and emotional reactions grounded in research. They were originally built in 2024 as an early experiment in using AI to construct character-based personas for simulation work, and several have since been updated with transcript data from direct customer and end-user conversations. They function as synthetic collaborators: extensions of user insight that let teams explore how different professional perspectives interact with emerging AI systems before any production code is written.
1. Prototype workflows
Run conversation and workflow prototypes through the persona to get grounded, domain-specific language that reflects the operational constraints they live in.
2. Evaluate prompts
Test prompt logic and emotional alignment under conditions that mirror production, including how the persona responds to ambiguity, frustration, or unexpected turns.
3. Generate research artifacts
Produce interview guides or research scripts from an authentic professional perspective, grounded in the persona's actual workflows and pain points.
4. Identify design risks
Surface trust gaps, unclear escalation logic, and tone mismatches before they reach production. These are the failure modes hardest to catch by reading a spec.
Synthetic design partners
Test agent behavior against how a person would actually respond.

Because the personas are GPT-based, they function as synthetic design partners. Teams can talk to them directly, prompt them with workflows, and watch how they respond. That makes them especially useful for testing agent behavior against the mental models the personas embody.

If a frontstage agent confuses a user's causal model of how a workflow actually works, the persona will say so, in language and frustration patterns derived from transcript data. If the agent's tone breaks the persona's interactional expectations, that breakage is observable before any production code ships.

The personas don't just describe users. They let teams pressure-test agent design against the Outcome, Causal, and Interactional layers, the same trust mechanics described in Pages 02 and 03, using a partner who responds the way a customer would.

Page 06 · Orchestration patterns
The user sees one agent. Five may be working.

An orchestrated experience runs on different triggers (something happened, a schedule fired, a user asked) and different agent postures (one in charge, others joining, watching, recovering). These patterns describe the shape of that experience from the user's vantage — who initiates it, where it goes, how it ends.

The framework decomposes orchestration into three layers (User Interaction, Agent Frontstage, Agent Backstage) and three interaction patterns (Reactive, Proactive, Hybrid). Each pattern shapes the user's mental model differently, and each requires different orchestration choices to earn trust.

The three layers
Every orchestrated experience operates on three layers simultaneously.
The User Interaction Layer is what the user sees and does: clicks, confirmations, the surfaces they touch. The Agent Frontstage Layer is visible agent behavior: chat messages, recommendations, acknowledgments. The Agent Backstage Layer is invisible orchestration: the system detecting conditions, routing requests, coordinating handoffs. The user only sees the top two. The bottom layer is where the system holds together.
Diagram legend
UI SURFACE: A screen, panel, or surface the user interacts with
ACTION: Something visible that a user or agent does
INVISIBLE LOGIC: Backstage orchestration the user doesn't see
User-initiated
Reactive
User initiates a known task. The Orchestrator routes to a Specialist who handles it end-to-end. The system responds to a direct request. No anticipation, no surprise.
Trust mechanic
Earned through clean execution and explicit confirmation. The user retains control; the agent does what was asked.
Agent-initiated
Proactive
System detects a condition and surfaces a recommendation before the user asks. The user reviews, approves or dismisses. The agent anticipates, but the user decides.
Trust mechanic
Earned through legible reasoning. The user has to understand why the recommendation came up as much as what it suggests.
Mid-task join
Hybrid
User starts a task. Mid-flow, the Orchestrator detects an opportunity and an agent joins to assist, complete, or optimize. The user delegates while retaining final say.
Trust mechanic
Earned through how the agent enters and exits. Offers must feel timely, not intrusive. Control must be cleanly returned.
Flow 01 · Reactive
Reschedule Appointment
User-initiated · Orchestrator + Scheduling Agent
Lanes: User Interaction · Agent Frontstage · Agent Backstage
1. Patient services rep: "Reschedule Appt" in embedded panel
2. Orchestrator confirms intent, routes to Scheduling
3. Scheduling Agent presents new time slots in chat
4. Rep selects new slot + confirms
5. Scheduling Agent confirms reschedule, notifies care team
Pattern: User initiates a known task. The Orchestrator confirms intent and routes to a Specialist. The Specialist handles the task end-to-end. The user sees a single agent doing the work; the Orchestrator stays invisible. Trust is earned through clean execution and explicit confirmation.
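
The reactive pattern can be sketched as a simple router. This is an illustrative assumption about how routing might look, not a description of any real platform: `Orchestrator`, `register`, `handle`, `scheduling_agent`, and the intent name `reschedule_appointment` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Orchestrator:
    """Backstage router: confirms intent, then hands off to one Specialist."""
    routes: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)  # backstage log, not shown to the user

    def register(self, intent: str, specialist: Callable[[dict], str]) -> None:
        self.routes[intent] = specialist

    def handle(self, intent: str, request: dict) -> str:
        self.trace.append(f"intent confirmed: {intent}")
        specialist = self.routes.get(intent)
        if specialist is None:
            # No Specialist for this intent: fall back rather than fail silently.
            self.trace.append("no specialist found, falling back to support")
            return "I couldn't complete that request. Connecting you with support."
        self.trace.append(f"routed to {specialist.__name__}")
        return specialist(request)


def scheduling_agent(request: dict) -> str:
    """Hypothetical Specialist: handles the task end-to-end and confirms explicitly."""
    return f"Rescheduled to {request['new_slot']}. The care team has been notified."
```

The user only ever receives the Specialist's confirmation string; the trace stays backstage, which matches the idea that the Orchestrator remains invisible while still being accountable.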
Flow 02 · Proactive
Send Yearly Physical Reminder
Agent-initiated · Orchestrator + Outreach Agent
Lanes: User Interaction · Agent Frontstage · Agent Backstage
1. Orchestrator detects patients overdue for annual physicals
2. Routes condition to Outreach Agent
3. Outreach Agent surfaces reminder recommendation
4. Practice manager reviews + approves audience, channel, timing
5. Outreach Agent sends reminders, tracks responses
Pattern: System detects a condition, surfaces a recommendation to the user, awaits approval, then executes. Trust depends on the agent's reasoning being legible. Why the recommendation came up matters as much as what it suggests. The user sees a partner that anticipates, not one that acts without asking.
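
A minimal sketch of the proactive pattern follows, assuming a hypothetical patient record shape and a 12-month threshold. `Recommendation`, `recommend`, and `execute` are illustrative names. The two points it tries to show are that the recommendation carries its own reason, and that nothing executes without approval.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    action: str
    reason: str       # why it surfaced, shown to the user alongside the action
    audience: list
    approved: bool = False


def recommend(patients, threshold_months=12):
    """Detect a condition and surface a recommendation, or None if nothing applies."""
    overdue = [p["name"] for p in patients
               if p["months_since_physical"] > threshold_months]
    if not overdue:
        return None
    return Recommendation(
        action="send_physical_reminder",
        reason=(f"{len(overdue)} patient(s) are more than "
                f"{threshold_months} months past their last annual physical"),
        audience=overdue,
    )


def execute(rec):
    """Act only after the user has approved the recommendation."""
    if not rec.approved:
        raise PermissionError("recommendation has not been approved by the user")
    return [f"reminder sent to {name}" for name in rec.audience]
```

Raising an error on unapproved execution is one way to make "the agent anticipates, but the user decides" a property of the code path rather than a convention.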
Flow 03 · Hybrid · Delegated Completion
Finish Patient Notes
Mid-task join · Orchestrator + Summary Agent
Lanes: User Interaction · Agent Frontstage · Agent Backstage
1. Clinician starts patient notes in EHR
2. Orchestrator detects incomplete note pattern
3. Summary Agent offers: "Want me to finish based on visit template?"
4. Clinician taps "Yes, finish it"
5. Summary Agent drafts complete note, returns for review
Pattern: User starts a task. Mid-flow, the Orchestrator detects an opportunity and an agent offers to take over. User retains the decision; the agent does the work. This is delegation: the user supervises rather than operates. Trust is built or broken in how the offer arrives, how the work is summarized, and whether control is cleanly returned.
Page 07 · In Practice
How AXA shows up in the work.

A framework only matters if it lands in the work. AXA isn't a separate artifact teams reference once and set aside. It integrates into four existing practices: prompt architecture, the design system, design practice, and product development.

Each integration point answers a specific question: how do agent behaviors stay consistent, how does design signal agent identity, how does the design org think in systems instead of pixels, and how does this become how we ship. Together they describe what it looks like for AXA to become part of how an org builds.

Built to land in the work
The framework is designed to integrate, not sit alongside.
AXA describes how multi-agent experiences should work. Practice is where that description becomes operating reality. The four workstreams below are the places where the framework's vocabulary, principles, and patterns enter the daily work of product, design, and engineering. None of them are new processes invented for AXA. They're existing practices the framework integrates into, so that orchestrated experiences ship as core functionality, not bolted on afterward.
Practice 01
Prompt Practice
How do agent behaviors stay consistent across the system?
Mental model alignment lives in prompt architecture. Governance rules, fallback patterns, escalation logic, and tone are encoded at the prompt layer. Observability metrics track whether agents are earning trust at the Outcome, Causal, and Interactional layers, so trust isn't assumed.
Specific moves
Map interaction types and personas to a centralized prompt registry
Define governance rules per persona for tone, fallback, and escalation
Track mental model alignment as observability metrics
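
One way these moves might look in code is sketched below: a registry keyed by persona and interaction type, governance rules attached to each entry, and a simple per-layer alignment rate. Everything here is a hypothetical illustration: the names `Governance`, `PromptEntry`, `register`, `build_prompt`, `record`, and `alignment_rate`, the template wording, and the metric definition are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Governance:
    tone: str
    fallback: str
    escalate_after_failures: int


@dataclass(frozen=True)
class PromptEntry:
    persona: str
    interaction_type: str
    template: str
    governance: Governance


REGISTRY = {}       # (persona, interaction_type) -> PromptEntry
_EVENTS = Counter() # (layer, held) -> count


def register(entry):
    key = (entry.persona, entry.interaction_type)
    if key in REGISTRY:
        raise ValueError(f"duplicate registry entry: {key}")
    REGISTRY[key] = entry


def build_prompt(persona, interaction_type, task):
    """Fill the registered template with the task and its governance rules."""
    entry = REGISTRY[(persona, interaction_type)]
    g = entry.governance
    return entry.template.format(
        persona=persona, task=task, tone=g.tone, fallback=g.fallback
    )


def record(layer, held):
    """Log whether a mental model layer held (True) or broke (False)."""
    _EVENTS[(layer, held)] += 1


def alignment_rate(layer):
    """Share of recorded events where the layer held, or None if no data."""
    held = _EVENTS[(layer, True)]
    total = held + _EVENTS[(layer, False)]
    return held / total if total else None
```

Rejecting duplicate keys keeps the registry centralized in practice: two teams cannot quietly define different tone or fallback rules for the same persona and interaction type.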
Practice 02
Design System Practice
How does design signal what kind of agent is at work?
A design system needs tokens for interaction types and agent postures. Avatars, badges, notification styles, and chat indicators carry meaning that maps back to the framework. Without tokenization, every team invents the same wheel differently, and the user experience fragments at the component level.
Specific moves
Define design tokens for interaction types and agent postures
Map tokens to UI affordances: avatars, badges, notifications
Set token governance for scaling as new agents are introduced
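
A small sketch of what posture tokens and a governance check could look like is shown below. The token values, the `REQUIRED_AFFORDANCES` tuple, and `missing_affordances` are hypothetical and stand in for whatever token format a design system actually uses.

```python
# Affordances every posture must define before it can ship (assumed rule).
REQUIRED_AFFORDANCES = ("avatar", "badge", "notification")

# Hypothetical token values mapping each posture to its UI affordances.
POSTURE_TOKENS = {
    "orchestrator": {"avatar": "none", "badge": "handoff-notice", "notification": "system-banner"},
    "specialist": {"avatar": "domain-avatar", "badge": "domain-badge", "notification": "inline-chat"},
    "collaborator": {"avatar": "guest-avatar", "badge": "joined-badge", "notification": "inline-chat"},
    "support": {"avatar": "support-avatar", "badge": "escalation-badge", "notification": "priority-alert"},
    "observer": {"avatar": "none", "badge": "ambient-badge", "notification": "quiet-nudge"},
}


def missing_affordances(tokens, required=REQUIRED_AFFORDANCES):
    """Governance check: report postures that lack a required affordance token."""
    gaps = {}
    for posture, spec in tokens.items():
        missing = [name for name in required if name not in spec]
        if missing:
            gaps[posture] = missing
    return gaps
```

Running a check like this when a new agent is introduced is one way to keep token governance from depending on each team remembering the full set.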
Practice 03
Design Practice
How does the design org think in systems, not pixels?
XD partners use the framework to anticipate multi-agent behaviors across the product. Orchestration patterns, agent presence indicators, and fallback messaging become shared vocabulary. The framework expands design thinking beyond conversational UI to ambient, autonomous, and embedded agent patterns.
Specific moves
Workshop orchestration patterns with the XD org
Establish shared visual indicators for active agents and handoffs
Expand design thinking beyond conversational UI patterns
Practice 04
Product Practice
How does this become how we ship?
Agentic Experience integrates into the Product Development Lifecycle. PRDs include an Agentic Experience section. Discovery sets expectations for agent role clarity. QA validates flow continuity, clarity, and trust outcomes, not just technical correctness. Orchestrated UX ships as core functionality.
Specific moves
Add an Agentic Experience section to PRDs
Codify a handoff checklist for agent behavior, transitions, and fallbacks
Expand QA scope to validate flow continuity and trust outcomes
About this document
AXA is a living document. A companion to platform architecture.
AXA is the experience companion to platform architecture, which describes the technical stack underneath orchestrated agent behavior. While platform docs cover the infrastructure, AXA covers the user-facing implications: how people perceive multi-agent experiences, what builds or breaks trust, and what vocabulary teams need to ship coherent orchestrated experiences at scale. The document is grounded in qualitative research and updates as the research evolves.