Chat mode vs Agent mode (and effort)

Halo’s workspace has two modes, switched with the Chat / Agent toggle in the composer. Below the toggle, Agent mode adds two more controls, effort and run mode, that decide how hard the agent works and how often it checks in with you.

Halo is in alpha on Base mainnet with real USDC.

Illustration: the composer with the Chat / Agent toggle, the effort chip (Auto, Quick, Standard, Deep), and the run mode chip.

Chat vs Agent

Chat — direct model chat, no tools: one paid inference per message. Best for quick questions and back-and-forth where you just want the model’s answer.
Agent — the full loop: it plans, uses tools, draws on memory, and may make several paid calls to complete one request. Best for research, multi-step tasks, and anything that needs the web or data tools.

The mode is remembered between visits.

Effort: Auto, Quick, Standard, Deep

In Agent mode, the effort chip sets how much work a request gets:

Quick — a fast answer to a fact, price, or definition: a slim prompt, at most one tool call, no plan. Lowest latency and cost.
Standard — a normal work session: a few tool calls, no multi-task plan.
Deep — a multi-task plan with synthesis and self-verification: the most tool use and paid calls, for reports and thorough research.
Auto — classifies each request for free (no inference) and routes it to Quick, Standard, or Deep for you.
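
One way to picture the effort levels is as a small table of limits plus a router. The sketch below is illustrative only. The names `EffortProfile`, `PROFILES`, and `classify`, the cap of 3 tool calls for Standard, and the keyword hints are all assumptions, not Halo's actual implementation. This page says only that Auto classifies without inference; it does not say how.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EffortProfile:
    """Hypothetical limits for one effort level (not Halo's real config)."""
    max_tool_calls: Optional[int]  # None means no fixed cap in this sketch
    multi_task_plan: bool


PROFILES = {
    # Quick: at most one tool call, no plan (as described above).
    "quick": EffortProfile(max_tool_calls=1, multi_task_plan=False),
    # Standard: "a few" tool calls, no plan; 3 is an assumed placeholder.
    "standard": EffortProfile(max_tool_calls=3, multi_task_plan=False),
    # Deep: multi-task plan, the most tool use.
    "deep": EffortProfile(max_tool_calls=None, multi_task_plan=True),
}

# Assumed keyword hints, chosen only to make the example concrete.
DEEP_HINTS = ("report", "compare", "research", "thorough")
QUICK_HINTS = ("what is", "price of", "define", "definition")


def classify(request: str) -> str:
    """Route a request to an effort level with local rules only (no model call)."""
    text = request.lower()
    if any(hint in text for hint in DEEP_HINTS):
        return "deep"
    if any(hint in text for hint in QUICK_HINTS):
        return "quick"
    return "standard"
```

Because the router here is plain string matching, it costs nothing to run, which is the property the Auto setting is described as having.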

If a request lands lighter than you wanted, you can push it further rather than re-asking.

Run mode: Auto-run vs Ask first

The run mode chip controls how autonomous the agent is:

Auto-run — plans, executes, verifies its own answer, and pursues the top follow-up; it only stops to ask when it’s genuinely blocked.
Ask first — proposes its plan for approval before spending, and offers follow-ups as choices.

(These are separate from per-tool permissions — each tool can still be set to Auto / Ask / Off in Tools & permissions.)
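
As a rough illustration of how the two run modes differ, the sketch below asks for approval only in Ask first mode. The function `run_request`, the `approve` callback, and the mode identifiers `"auto-run"` and `"ask-first"` are hypothetical names used for illustration; they are not part of Halo, and per-tool permissions are left out.

```python
from typing import Callable, List


def run_request(plan: List[str], run_mode: str,
                approve: Callable[[List[str]], bool]) -> List[str]:
    """Return the steps that would execute under the given run mode.

    run_mode is "auto-run" or "ask-first" (assumed identifiers).
    """
    if run_mode == "ask-first":
        # Ask first: show the plan and wait for approval before spending.
        if not approve(plan):
            return []  # nothing runs, nothing is spent
    elif run_mode != "auto-run":
        raise ValueError(f"unknown run mode: {run_mode!r}")
    # Auto-run, or an approved plan: execute every step.
    return list(plan)
```

In this sketch, Auto-run never consults the `approve` callback, while a declined plan in Ask first mode returns an empty list before any step runs.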

Which should I use?

Quick question → Chat, or Agent + Quick.
Task that needs the web or data → Agent + Standard.
Report or deep research → Agent + Deep.
Want to watch what it spends → Ask first; want it to just run → Auto-run.
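
The rules of thumb above can be written as a small lookup. This is a sketch for illustration; the function `recommend` and its parameter names are assumptions, not a Halo API.

```python
from typing import Dict, Optional


def recommend(needs_tools: bool, is_report: bool,
              watch_spend: bool) -> Dict[str, Optional[str]]:
    """Map the rules of thumb above to a mode, effort, and run mode."""
    if is_report:
        mode, effort = "Agent", "Deep"
    elif needs_tools:
        mode, effort = "Agent", "Standard"
    else:
        # Quick question: Chat has no effort or run mode controls.
        # Agent + Quick would be an equally valid answer here.
        return {"mode": "Chat", "effort": None, "run_mode": None}
    run_mode = "Ask first" if watch_spend else "Auto-run"
    return {"mode": mode, "effort": effort, "run_mode": run_mode}
```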

Related

What the agent remembers between chats: memory & context.
The tools the agent can pay for: enable tools with x402.
