Blog · March 25, 2026 · Design

Designing Trust Levels for Autonomous AI Agents

The hardest design problem in AI agents isn't the model, the tools, or the infrastructure. It's the autonomy boundary: how much should the agent do on its own, and when should it stop and ask?

Get this wrong in either direction and the product fails. Too much autonomy and users feel out of control. Too little and the agent is just a chatbot with extra steps. The solution is a trust gradient: a system where users explicitly set their comfort level, and the agent calibrates its behavior accordingly.

We implemented a four-level trust system in Planck. This post covers the design rationale, the behavioral differences across levels, and the patterns we've found for helping users migrate up the gradient.

The autonomy spectrum

Every action an AI agent takes falls somewhere on a spectrum from fully supervised to fully autonomous:

Supervised ←————————————————————————→ Autonomous

Ask before     Propose and      Do it and       Do it
everything     wait             notify          silently

Most AI products pick a fixed point on this spectrum and build their entire UX around it. Copilot-style products sit on the left (suggest, human executes). Fully autonomous agents sit on the right (execute, human reviews). Both are wrong for a general-purpose agent because users don't have a fixed comfort level. It varies by:

  • Action type: Most users are comfortable with the agent reading their calendar autonomously but not deleting events without confirmation.
  • Familiarity: A new user wants oversight on everything. After two weeks of correct behavior, they want less friction.
  • Stakes: Moving a 1:1 with a colleague feels low-stakes. Rescheduling a board meeting is not.
  • Context: During a busy week, users want more autonomy (just handle it). During a light week, they want more control (I have time to be involved).

A single autonomy setting can't capture this nuance. But a coarse gradient, combined with per-action-type overrides, gets remarkably close.
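As a rough sketch of how such a gradient might be modeled (the names and types here are illustrative, not Planck's actual API): a single global level, plus a sparse map of per-action-type overrides that win when present.

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    ESCALATE = 0   # propose every action, wait for approval
    PROPOSE = 1    # propose with reasoning, execute on confirmation
    NOTIFY = 2     # execute autonomously, notify after the fact
    AUTO_ACT = 3   # execute silently; user reviews the action log

def effective_level(global_level: TrustLevel,
                    overrides: dict[str, TrustLevel],
                    action_type: str) -> TrustLevel:
    """Per-action-type overrides win over the coarse global setting."""
    return overrides.get(action_type, global_level)

# A user at Notify globally, but Escalate for anything external:
overrides = {"decline_external": TrustLevel.ESCALATE}
assert effective_level(TrustLevel.NOTIFY, overrides, "reschedule") == TrustLevel.NOTIFY
assert effective_level(TrustLevel.NOTIFY, overrides, "decline_external") == TrustLevel.ESCALATE
```

The override map stays small because most action types simply inherit the global setting; only the exceptions the user has expressed get an entry.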

The four levels

Level 0: Escalate

The agent proposes every action and waits for explicit approval before executing anything.

Behavior:

  • Creates events: "I'd like to schedule [meeting details]. Should I create this event?"
  • Reschedules: "Moving this would protect your focus block. Want me to proceed?"
  • Declines meetings: "This conflicts with your focus time. Would you like me to decline?"
  • Updates preferences: "Based on your pattern, I'd suggest changing your focus preference to mornings. Should I update this?"

Best for: New users, users with complex/sensitive calendars, users who've been burned by automation before.

Trade-off: Maximum control, maximum friction. Every interaction requires a round-trip confirmation. This gets tedious quickly, which is by design. It motivates users to move to Level 1 once they trust the agent's judgment.

Level 1: Propose

The agent auto-proposes actions with its reasoning, executes on confirmation, and explains if declined.

Behavior:

  • Creates events: "I'll schedule [meeting details] for Tuesday at 2pm. This slot has the least impact on your focus time and gives you a 30-minute buffer before your 3pm. [Create / Change time / Cancel]"
  • Reschedules: "I suggest moving your 1:1 with Alex from 10am to 2pm. This would give you a 3-hour focus block in the morning. Alex is also free at 2pm. [Move / Keep as is]"
  • Declines meetings: "This meeting would break your last remaining focus block this week. I recommend declining with the note 'Can we do this async?' [Decline / Accept anyway]"

Best for: Most users; this is the default level. It provides transparency and control while reducing the coordination burden.

Key design detail: The agent always explains why it's proposing what it proposes. This isn't just good UX. It's how users calibrate their trust. If the reasoning is consistently sound, users naturally migrate to Level 2.

Level 2: Notify

The agent executes most actions autonomously and notifies the user after the fact.

Behavior:

  • Creates events: "Done: scheduled [meeting] for Tuesday at 2pm. [Undo]"
  • Reschedules flexible events: "Moved your 1:1 with Alex to 2pm to protect your morning focus block. [Undo]"
  • Protects focus time: Automatically declines or proposes alternatives for low-priority meetings that conflict with focus blocks. Notifies after.
  • Escalates for: Declining meetings with external attendees, rescheduling non-flexible events, any action involving the user's manager.

Best for: Users who've used the agent for 1-2 weeks and trust its judgment. Power users who want speed.

The escalation boundary: Even at Level 2, certain actions always require confirmation. These are hard-coded, not configurable:

  • Declining meetings with external (non-org) attendees
  • Deleting events (vs. rescheduling)
  • Changing team-level policies
  • Exporting or deleting user data

This is a critical design decision. The trust level governs the default behavior, but certain actions are inherently high-stakes and should never be fully autonomous regardless of the trust setting.
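One way to make that non-negotiability concrete in code (action-type names here are illustrative): check the hard-coded set before the trust level is consulted at all, so no configuration path can route around it.

```python
# Hard-coded escalation boundary: these action types never run
# autonomously, whatever the trust level. Not user-configurable.
ALWAYS_CONFIRM = frozenset({
    "decline_external_meeting",
    "delete_event",
    "change_team_policy",
    "export_or_delete_user_data",
})

def requires_confirmation(action_type: str, trust_level: int) -> bool:
    if action_type in ALWAYS_CONFIRM:
        return True          # safety rail checked before trust level
    return trust_level <= 1  # Escalate (0) and Propose (1) confirm first
```

Keeping the rail in a separate, early check (rather than folding it into the per-user settings) is what makes it auditable: a reviewer can verify the boundary without reasoning about any user's configuration.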

Level 3: Auto-act

The agent handles everything autonomously. The user checks in when they want to, not when prompted.

Behavior:

  • All scheduling, rescheduling, and focus time protection happens without notification
  • The user sees the results in their calendar and can review the agent's action log
  • End-of-day summary includes a recap of actions taken
  • The same hard-coded escalation boundaries from Level 2 still apply

Best for: Users with high meeting volumes who genuinely want hands-off calendar management. Executives with EAs who want the AI to function as a first-pass filter.

Risk: Users at Level 3 can lose situational awareness of their calendar. The end-of-day summary partially mitigates this, but it's a real trade-off. We recommend this level only for users who have spent 2+ weeks at Level 2 with consistently positive outcomes.

The migration pattern

Users don't typically set their trust level once and leave it. They migrate. The pattern we've observed:

Week 1:     Level 0 → Level 1  (after ~20 correct proposals)
Week 2-3:   Level 1 → Level 2  (after ~50 executed proposals without correction)
Week 4+:    Level 2 → Level 3  (subset of users, usually high meeting volume)

The agent doesn't automatically upgrade the trust level. That would undermine the entire concept (the user should feel in control of the autonomy boundary). But the agent can suggest it:

"Over the past two weeks, you've approved 47 of my 49 proposals without changes. Would you like to switch to notify mode, where I'll execute actions and let you know after? You can always switch back."

This is a nudge, not a push. The user explicitly opts in.

Per-action-type overrides

The four-level system is the coarse control. Per-action-type overrides are the fine control.

A user might be at Level 2 (notify) globally but want Level 0 (escalate) for any action involving external attendees. Or Level 3 (auto-act) for focus time protection but Level 1 (propose) for rescheduling.

We don't expose this as a settings page with 20 toggles. Instead, the agent learns overrides from corrections:

  • If the user undoes an autonomous action, the agent asks: "Would you like me to confirm before doing this in the future?"
  • If the user consistently approves a particular action type without changes, the agent asks: "I notice you always approve these. Want me to just do them automatically?"

The overrides are inferred from behavior and confirmed by the user. This avoids the configuration burden of a complex settings matrix while still allowing fine-grained control.

What we learned

Level 1 is the right default

We originally defaulted new users to Level 0 (escalate everything). This was too friction-heavy. Users felt like they were training the agent rather than being helped by it. Defaulting to Level 1 (propose with reasoning) strikes the right balance: users feel assisted from the first interaction, and the reasoning transparency builds trust naturally.

The "undo" button is more important than the "confirm" button

At Level 2 and above, the undo button is the primary trust mechanism. Users are willing to let the agent act autonomously as long as they can reverse anything with one click. A fast, reliable undo reduces the perceived risk of autonomy and accelerates trust migration.

We made undo work by keeping a 24-hour action log with full reversal capability. If the agent created an event, undo deletes it. If it rescheduled, undo moves it back. If it declined, undo re-accepts. Every action has a corresponding reverse action.
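In sketch form, that pairing of every action with its inverse might look like the following (a simplified, in-memory version; real calendar API calls and persistence are out of scope):

```python
import time

class ActionLog:
    """Action log where every entry carries its own reverse action."""

    TTL = 24 * 3600  # 24-hour reversal window

    def __init__(self):
        self._entries = {}  # action_id -> (timestamp, undo_fn)

    def record(self, action_id: str, undo_fn) -> None:
        self._entries[action_id] = (time.time(), undo_fn)

    def undo(self, action_id: str) -> bool:
        entry = self._entries.pop(action_id, None)
        if entry is None:
            return False
        timestamp, undo_fn = entry
        if time.time() - timestamp > self.TTL:
            return False  # outside the reversal window
        undo_fn()         # e.g. delete the created event, move it back
        return True

# Creating an event registers its inverse (here, a stubbed deletion):
log = ActionLog()
deleted = []
log.record("evt-1", lambda: deleted.append("evt-1"))
assert log.undo("evt-1") and deleted == ["evt-1"]
```

The invariant worth enforcing at review time is that no action can be recorded without an undo function; an action with no inverse has to be escalated instead of automated.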

Transparency scales trust

The single biggest factor in trust migration isn't accuracy (though that's necessary). It's transparency. Users who can see why the agent made a decision are far more comfortable with autonomy than users who just see the outcome.

This means the agent should explain its reasoning even when not asked. Not in a verbose way, but enough to make the logic visible. "Moved to 2pm (protects your focus block)" is a one-line explanation that provides enough context for the user to evaluate the decision.

Trust is earned per-domain, not globally

A user might fully trust the agent with scheduling but not with declining meetings. Trust doesn't transfer automatically across action types. The per-action-type override system acknowledges this reality.

Some users never leave Level 1

And that's fine. Not everyone wants an autonomous agent. Some users genuinely prefer the propose-and-confirm model because it keeps them engaged with their schedule. The system should support this indefinitely without making these users feel like they're using the product "wrong."

Principles for trust design

If you're building an agent with autonomous capabilities:

  1. Default to transparent, not autonomous. Let users see the reasoning before seeing the results. Trust is built through observation, not assumption.

  2. Make the autonomy boundary explicit. Users should always know what the agent will do on its own vs. what it will ask about. Implicit autonomy boundaries create anxiety.

  3. Hard-code safety rails. Some actions should never be fully autonomous regardless of trust level. Identify these early and make them non-negotiable.

  4. Build undo before you build autonomy. If you can't reliably reverse an action, you can't safely automate it. The undo system is the foundation that makes trust levels viable.

  5. Nudge migration, don't force it. Suggest trust level changes based on observed behavior. Never auto-upgrade. The user's feeling of control over the boundary is more important than optimizing for speed.

  6. Track trust signals. Every undo, correction, and approval is a trust signal. Use these to refine the agent's behavior and to surface migration suggestions at the right time.

The goal isn't to make the agent as autonomous as possible. It's to make the agent as autonomous as the user wants it to be, and to make the process of discovering that boundary feel natural, not stressful.