Design and Implement Agentic Loops for Autonomous Task Execution
What You Need to Know
An agentic loop is the core execution cycle that drives autonomous Claude-based agents. Understanding every phase of this cycle—and how to control it correctly—is fundamental to the entire exam.
The lifecycle works as follows: your application sends a request to Claude, then inspects the response's stop_reason field. When the value is "tool_use", Claude is requesting to call a tool. Your code executes that tool, appends the result back into the conversation history, and sends another request. When stop_reason is "end_turn", Claude has finished its work and the loop terminates.
A critical detail: tool results must be appended to the conversation history before sending the next request. This is what allows the model to reason about what happened and decide its next action. Without this, the agent loses its chain of reasoning.
Decision-making within the loop is model-driven. Claude decides which tool to invoke based on the accumulated context—there is no hardcoded sequence of tool calls. The model evaluates the situation, selects the most appropriate tool, observes the result, and repeats until it determines the task is complete.
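The lifecycle above can be sketched as a minimal loop. This is an illustrative offline sketch, not the SDK's API: `stub_model` stands in for a real Messages API call, and the tool name `lookup` and the message shapes are hypothetical. The only control signal is `stop_reason`; the iteration cap is a safety net.

```python
def run_agent_loop(model, tools, user_prompt, max_iterations=25):
    """Drive the loop on stop_reason alone; max_iterations is only a guardrail."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_iterations):
        response = model(messages)
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] == "end_turn":
            return messages  # the model signalled completion
        if response["stop_reason"] == "tool_use":
            # Execute every requested tool and append the results
            # BEFORE the next request, preserving the chain of reasoning.
            results = [
                {"type": "tool_result", "tool_use_id": block["id"],
                 "content": tools[block["name"]](**block["input"])}
                for block in response["content"] if block["type"] == "tool_use"
            ]
            messages.append({"role": "user", "content": results})
    raise RuntimeError("safety cap reached without end_turn")

def stub_model(messages):
    # Fake model: first turn requests a tool, second turn finishes.
    turns = sum(1 for m in messages if m["role"] == "assistant")
    if turns == 0:
        return {"stop_reason": "tool_use",
                "content": [{"type": "tool_use", "id": "t1",
                             "name": "lookup", "input": {"key": "x"}}]}
    return {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "done"}]}

history = run_agent_loop(stub_model, {"lookup": lambda key: f"value-for-{key}"}, "find x")
```

Note that the loop never inspects the assistant's text: even when a response contains both text and tool-use blocks, `stop_reason` alone decides whether to continue.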
Critical Anti-Patterns
- Parsing natural language to determine loop termination — Scanning the assistant's text for words like "done" or "complete" is unreliable. The model may use such words while still intending to call more tools.
- Arbitrary iteration caps as the primary stopping mechanism — A hard limit of, say, 10 iterations may cut the agent off mid-task. Iteration caps should be safety nets, not the main control signal.
- Checking assistant text content as a completion indicator — Examining the content type of the response (e.g., looking for text blocks) rather than the stop_reason field leads to premature or missed termination.
stop_reason is the only reliable signal for controlling the agentic loop. When its value is "tool_use", continue the loop. When it is "end_turn", stop. Every other approach—text parsing, iteration counting, content-type inspection—is fragile and will eventually cause unexpected behavior.
An answer choice that suggests checking whether the assistant's response contains a text block (rather than a tool-use block) to decide when to stop the loop. This sounds plausible but is incorrect—the model can return both text and tool-use blocks in a single response. Always rely on stop_reason.
Setting max_iterations = 5 as the primary loop termination strategy. This truncates complex tasks arbitrarily. The correct approach is to use stop_reason for termination and treat iteration limits as emergency guardrails only.
Your customer-support agent intermittently terminates its loop before completing multi-step refund workflows. Debugging shows the loop exits when the assistant's response includes a text block, even if it also contains a pending tool call. What is the root cause and correct fix?
The stop_reason field is the only authoritative signal. Option B addresses a symptom, not the cause. Options C and D are irrelevant to the control-flow problem.
Orchestrate Multi-Agent Systems with Coordinator–Subagent Patterns
What You Need to Know
Multi-agent systems on the exam follow a hub-and-spoke architecture. A single coordinator agent acts as the central hub, managing all communication between specialized subagents. Subagents never communicate directly with each other—every message flows through the coordinator.
Each subagent operates in isolated context. It does not automatically inherit the coordinator's conversation history, tool definitions, or system prompt. Whatever information a subagent needs must be explicitly provided when it is spawned.
The coordinator bears several critical responsibilities:
- Task decomposition — Breaking a complex goal into discrete, assignable sub-tasks
- Delegation — Selecting the right subagent for each sub-task and providing it with precisely the context it needs
- Result aggregation — Collecting outputs from subagents, resolving conflicts, and synthesizing a coherent final deliverable
- Dynamic subagent selection — Choosing which subagent to invoke based on the evolving state of the task, not a fixed sequence
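The coordinator's three core duties can be sketched as a hub-and-spoke loop. All names here are illustrative: the subagents are plain functions standing in for spawned agents, and they never exchange messages with each other.

```python
# Hub-and-spoke sketch: every message flows through the coordinator.
SUBAGENTS = {
    "visual_arts": lambda brief: f"findings on visual arts: {brief}",
    "music":       lambda brief: f"findings on music: {brief}",
    "writing":     lambda brief: f"findings on writing: {brief}",
}

def coordinate(goal, decompose):
    # 1. Task decomposition: break the goal into (subagent, brief) assignments.
    assignments = decompose(goal)
    # 2. Delegation: each subagent receives only the context in its brief.
    results = {name: SUBAGENTS[name](brief) for name, brief in assignments}
    # 3. Result aggregation: the coordinator alone synthesizes the deliverable.
    return "\n".join(results.values())

report = coordinate(
    "AI in creative industries",
    lambda goal: [(name, f"{goal} / {name}") for name in SUBAGENTS],
)
```

Note that coverage is fixed entirely by the `decompose` callable: if it never emits a category, no subagent will ever see it, which is exactly the narrow-decomposition failure described below.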
The Narrow Decomposition Failure
A common failure pattern tested on the exam: the coordinator breaks a broad topic into too few or too narrow categories. For example, asked to research "AI in creative industries," the coordinator might decompose only into visual arts—missing music, writing, film, game design, and other domains. The resulting report would have significant coverage gaps, not because any subagent failed, but because the coordinator's initial decomposition was incomplete.
All communication flows through the coordinator. Subagents never talk directly to each other. The coordinator is the single point of control for decomposition, delegation, and aggregation. A poorly designed coordinator decomposition strategy is the most common root cause of incomplete multi-agent outputs.
When a multi-agent research report is missing entire categories, a tempting wrong answer is that the subagents' tools are misconfigured or that the model hallucinated. The actual root cause is almost always the coordinator's task decomposition—it never assigned those categories to any subagent.
Allowing subagents to share a global conversation state or to pass messages directly between each other. This violates the hub-and-spoke pattern and creates unpredictable cross-contamination of context.
You built a multi-agent research system to produce a report on "AI applications in creative industries." The coordinator spawns three subagents: one for visual arts, one for music composition, and one for creative writing. The final report is well-written and accurate for those three domains—but a reviewer notes it completely ignores film production, game design, advertising, and architecture. What is the most likely root cause?
Configure Subagent Invocation, Context Passing, and Spawning
What You Need to Know
Subagents are spawned using the Task tool. For an agent to be able to create subagents, "Task" must be included in its allowedTools configuration. Without it, the agent has no mechanism to delegate work.
The most important rule about subagent context: nothing is automatically inherited. A subagent does not receive the coordinator's conversation history, system prompt, or tool definitions unless you explicitly include them in the subagent's prompt. Every piece of information the subagent needs—background facts, specific instructions, relevant data, output format expectations—must be written into the prompt that spawns it.
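Because nothing is inherited, a common defensive pattern is to assemble the subagent's prompt from an explicit checklist. This is a sketch with hypothetical field names, not an SDK API:

```python
def build_subagent_prompt(task, context):
    """Inline every required fact; fail loudly if any would be missing."""
    required = ("customer_id", "background", "output_format")
    missing = [key for key in required if key not in context]
    if missing:
        # Better to fail at spawn time than to run a subagent blind.
        raise ValueError(f"subagent would be missing: {missing}")
    return (
        f"Task: {task}\n"
        f"Customer ID: {context['customer_id']}\n"
        f"Background: {context['background']}\n"
        f"Output format: {context['output_format']}\n"
    )

prompt = build_subagent_prompt(
    "verify refund eligibility",
    {"customer_id": "C-42",
     "background": "customer reports a duplicate charge",
     "output_format": "JSON with fields eligible, reason"},
)
```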
Fork-Based Session Management
When an agent needs to explore multiple divergent approaches without polluting its main context, it can use fork-based sessions. Each fork creates an independent branch from a shared baseline state. This is useful when the agent needs to evaluate several strategies in parallel, then select the best outcome.
Parallel Subagent Spawning
A coordinator can spawn multiple subagents simultaneously by emitting multiple Task tool calls in a single response. This enables parallel execution of independent sub-tasks—significantly reducing total latency compared to sequential delegation.
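The latency benefit of parallel delegation can be sketched with a thread pool standing in for concurrently running subagents (the real mechanism is multiple Task tool calls in one response; this is only an illustration of the execution pattern):

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_parallel(subtasks, run_subagent):
    # Run independent subtasks concurrently, mirroring a coordinator
    # that emits several Task tool calls in a single response.
    with ThreadPoolExecutor() as pool:
        return dict(zip(subtasks, pool.map(run_subagent, subtasks)))

results = spawn_parallel(
    ["film production", "game design"],
    lambda topic: f"report on {topic}",   # stand-in for a spawned subagent
)
```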
Every piece of information a subagent needs must be explicitly included in its prompt. There is no automatic context inheritance. If you forget to pass the customer ID, the research topic, or the output format requirements, the subagent simply will not have them.
Assuming that subagents inherit the coordinator's system prompt or conversation history automatically. An answer choice stating "the subagent will use the coordinator's tools and context" is always wrong. Context must be passed explicitly.
Forgetting that "Task" must appear in allowedTools for an agent to spawn subagents. If this is missing, the agent physically cannot delegate—no amount of prompt engineering will enable it.
Implement Multi-Step Workflows with Enforcement and Handoff Patterns
What You Need to Know
The exam draws a sharp line between two enforcement strategies: programmatic enforcement (hooks, prerequisite gates, code-level checks) and prompt-based guidance (system prompt instructions, few-shot examples).
When a business rule requires deterministic compliance—meaning it must be enforced 100% of the time with zero exceptions—prompt instructions are insufficient. Prompts are probabilistic: Claude follows them most of the time, but with a non-zero failure rate. For rules where even a single violation has serious consequences (financial limits, data access controls, regulatory requirements), you must use programmatic enforcement.
Programmatic Prerequisites
A prerequisite gate is a code-level check that blocks a downstream tool call until a required prior step has been completed. For example, process_refund cannot execute until get_customer has returned a verified customer ID. The gate is not a suggestion—it physically prevents the tool from being called.
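A prerequisite gate can be sketched as a small stateful wrapper. The tool names follow the example above; the state handling is deliberately simplified:

```python
class PrerequisiteError(Exception):
    """Raised when a gated tool is called before its prerequisite."""

class RefundGate:
    def __init__(self):
        self.verified_customer = None

    def get_customer(self, customer_id):
        # Stand-in for a real verification step.
        self.verified_customer = customer_id
        return {"customer_id": customer_id}

    def process_refund(self, amount):
        if self.verified_customer is None:
            # A hard, code-level block -- not a prompt-level suggestion.
            raise PrerequisiteError("get_customer must succeed before process_refund")
        return {"refunded": amount, "customer": self.verified_customer}
```

The gate guarantees ordering deterministically: no matter what the model attempts, `process_refund` cannot execute before `get_customer` has returned a verified ID.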
Structured Handoff Protocols
When an agent needs to escalate to a human or transfer to another system, it should follow a structured handoff protocol that includes:
- Customer details and verified identifiers
- A summary of the root cause or issue discovered
- What actions have already been taken
- Recommended next steps for the receiving party
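The four handoff fields above map naturally onto a structured payload. This is a sketch with hypothetical field names, not a prescribed schema:

```python
from dataclasses import dataclass, asdict

@dataclass
class Handoff:
    customer: dict        # details and verified identifiers
    root_cause: str       # summary of the issue discovered
    actions_taken: list   # what has already been done
    next_steps: list      # recommendations for the receiving party

payload = asdict(Handoff(
    customer={"id": "C-88", "verified": True},
    root_cause="duplicate charge from a retried payment webhook",
    actions_taken=["verified identity", "located both charges"],
    next_steps=["refund the duplicate charge", "confirm with customer"],
))
```

Serializing the handoff as structured data, rather than free text, lets the receiving system or human triage it without re-reading the whole transcript.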
For critical business rules, use hooks, not prompts. Programmatic enforcement (hooks, prerequisite gates) provides deterministic guarantees. Prompt-based instructions are valuable for guidance but have a non-zero failure rate and should never be the sole mechanism for high-stakes compliance requirements.
A system prompt that says "Never process refunds exceeding $500." This sounds like a reasonable safeguard, but it is prompt-based enforcement for a critical financial limit. The correct approach is a PostToolUse hook that programmatically intercepts refund calls exceeding the threshold and routes them to a human agent.
Adding few-shot examples showing the agent always calling get_customer before lookup_order. While helpful, this does not guarantee compliance. A programmatic prerequisite that blocks lookup_order until a verified customer ID exists is the only deterministic solution.
Production logs show that your customer-service agent occasionally processes refunds over $500 without human approval, violating company policy. The current implementation relies on a system prompt instruction: "Do not process refunds exceeding $500 without supervisor approval." What is the most effective fix?
Apply Agent SDK Hooks for Tool Call Interception and Data Normalization
What You Need to Know
Hooks in the Agent SDK are code-level interception points that run before or after tool calls. They provide deterministic guarantees—unlike prompts, which are suggestions the model follows probabilistically, hooks execute as code and are 100% reliable.
PostToolUse Hooks for Data Normalization
When your agent calls tools from multiple sources (different APIs, MCP servers, internal services), the returned data may use inconsistent formats—different timestamp representations, varying status-code conventions, mixed casing for identifiers. A PostToolUse hook can normalize this data automatically after each tool call, transforming heterogeneous outputs into a consistent schema before the model sees them.
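A normalization hook might look like the following sketch. The hook signature is illustrative, not the SDK's; it assumes tool results arrive as dicts with optional `timestamp` and `status` fields:

```python
from datetime import datetime, timezone

def normalize_result(tool_name, result):
    """Run after every tool call; emit a consistent schema."""
    out = dict(result)
    if "timestamp" in out:
        ts = out["timestamp"]
        # Accept epoch seconds or ISO strings; always emit ISO 8601 UTC.
        if isinstance(ts, (int, float)):
            ts = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
        out["timestamp"] = ts
    if "status" in out:
        out["status"] = str(out["status"]).lower()  # unify casing conventions
    return out
```

Because the hook runs as code on every response, the model only ever sees the normalized schema, regardless of which API or MCP server produced the data.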
Tool Call Interception for Policy Enforcement
A PreToolUse hook can inspect the tool name and parameters before execution and block calls that violate policy. For example, intercepting a delete_account call that targets an admin account, or blocking a send_email call with more than 100 recipients.
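Both policy checks from that example can be sketched as a single predicate; again, the signature is illustrative rather than the SDK's, and returning False stands in for blocking the call:

```python
def allow_tool_call(tool_name, params):
    """Inspect name and parameters before execution; False means block."""
    if tool_name == "delete_account" and params.get("role") == "admin":
        return False  # never delete admin accounts
    if tool_name == "send_email" and len(params.get("recipients", [])) > 100:
        return False  # enforce the bulk-mail recipient limit
    return True
```

The check runs on every call regardless of what the model intends, which is what makes it a hard guarantee rather than probabilistic compliance.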
Hooks run as code, not as suggestions. They provide 100% reliable enforcement. When the exam asks about guaranteeing a behavior (blocking an action, normalizing a format, enforcing a limit), hooks are the correct mechanism. Prompt instructions—no matter how strongly worded—offer probabilistic compliance at best.
Using a system prompt instruction like "Always normalize timestamps to ISO 8601 format" instead of a PostToolUse hook. The prompt approach will sometimes fail silently, producing inconsistent data that breaks downstream systems. A hook guarantees normalization on every single tool response.
Relying on model self-policing for policy-violating tool calls—e.g., trusting the model to "never call delete_account on admin users." A PreToolUse hook programmatically inspects every call and provides a hard block, regardless of what the model intends.
Design Task Decomposition Strategies for Complex Workflows
What You Need to Know
The exam tests your ability to choose the right decomposition pattern for a given workflow. There are two fundamental approaches:
Fixed Sequential Pipelines (Prompt Chaining)
In a prompt chain, each step's output feeds directly into the next step's input, following a predetermined sequence. This works well when the workflow is predictable—you know in advance exactly what steps are needed and in what order.
A classic example: a code review pipeline that first analyzes each file individually (parallel per-file passes), then runs a cross-file integration pass that checks for data-flow consistency across all files. The stages are known ahead of time, so a fixed pipeline is efficient and straightforward.
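A fixed pipeline reduces to feeding each stage's output into the next. In this sketch the stage functions stand in for individual model calls in the code-review example:

```python
def chain(stages, initial_input):
    """Run a predetermined sequence; each output becomes the next input."""
    value = initial_input
    for stage in stages:
        value = stage(value)
    return value

review = chain(
    [lambda files: {f: f"notes on {f}" for f in files},       # per-file pass
     lambda notes: f"integration pass over {len(notes)} files"],  # cross-file pass
    ["a.py", "b.py"],
)
```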
Dynamic Adaptive Decomposition
When the workflow is exploratory or the next steps depend on intermediate discoveries, adaptive decomposition is appropriate. The agent generates an initial investigation plan, executes the first steps, then refines or extends the plan based on what it finds. Subtasks are created dynamically in response to emerging information.
For example, a debugging agent that starts by reading error logs, discovers a dependency conflict, pivots to analyze the package manifest, finds a version mismatch, and then generates a fix—none of which could have been predicted at the start.
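The debugging example can be sketched as a work queue that discoveries extend at runtime. The `FOLLOW_UPS` table is a hypothetical stand-in for what the model would actually discover:

```python
def investigate(first_step, discover):
    """Execute steps, letting each result add steps unknown at planning time."""
    plan, completed = [first_step], []
    while plan:
        step = plan.pop(0)
        completed.append(step)
        plan.extend(discover(step))  # new subtasks emerge from findings
    return completed

FOLLOW_UPS = {
    "read error logs": ["analyze package manifest"],
    "analyze package manifest": ["generate fix"],
    "generate fix": [],
}
steps = investigate("read error logs", lambda step: FOLLOW_UPS[step])
```

Contrast this with the fixed pipeline: here the second and third steps did not exist until the first step's result revealed them.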
Choose decomposition pattern based on workflow predictability. If you know all the steps in advance, use a fixed sequential pipeline (prompt chaining). If the steps depend on intermediate results and discoveries, use dynamic adaptive decomposition. Using a fixed pipeline for unpredictable work causes rigidity; using adaptive decomposition for simple sequential work adds unnecessary complexity.
Applying dynamic decomposition to a straightforward, well-defined sequential workflow (e.g., extract → validate → transform → load). This adds unnecessary complexity. A fixed pipeline handles this more reliably and with less overhead.
Using a fixed prompt chain for an open-ended investigation task where the required steps depend on what the agent discovers along the way. This creates rigidity and prevents the agent from adapting its plan when findings change the scope of the problem.
Manage Session State, Resumption, and Forking
What You Need to Know
Long-running or interrupted agent tasks need strategies for preserving and restoring state. The exam tests three mechanisms:
Named Session Resumption
Using --resume with a session identifier lets an agent pick up exactly where it left off. The full conversation history, tool results, and reasoning context are preserved. This is ideal when the prior context is still valid and the agent needs to continue its work.
Fork Session
The fork_session mechanism creates an independent branch from a shared baseline. Both the original and forked sessions retain the same history up to the fork point, then diverge independently. This is useful for exploring alternative approaches without polluting the main session's state.
Fresh Start with Structured Summary
When prior context has become stale—tool results are outdated, intermediate states no longer reflect reality, or the conversation has grown too long—starting a new session with a structured summary of key findings is more reliable than resuming. The summary captures the essential facts (decisions made, results found, current state) while discarding stale intermediate details.
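The fork and fresh-start mechanisms above can be sketched over a plain message history. This is an illustration of the state semantics, not the SDK's session API:

```python
import copy

def fork_session(history):
    # Both branches share the same history up to the fork point,
    # then diverge independently.
    return copy.deepcopy(history)

def fresh_start(findings, current_state):
    # Discard stale intermediate details; inject only the essentials.
    summary = (
        "Summary of prior session:\n"
        + "\n".join(f"- {finding}" for finding in findings)
        + f"\nCurrent state: {current_state}"
    )
    return [{"role": "user", "content": summary}]
```

The deep copy is what guarantees divergence: appending to the fork leaves the baseline untouched, while `fresh_start` trades full history for a compact, accurate snapshot.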
When prior context is mostly valid, resume. When it is stale, start fresh with injected summaries. Session resumption preserves everything, which is powerful when context is current. But stale tool results and outdated intermediate states can mislead the agent. A fresh start with a carefully constructed summary gives the agent clean, accurate context.
Always resuming sessions even when tool results have become outdated (e.g., API responses from hours ago that no longer reflect current state). Resuming with stale context misleads the agent into making decisions based on obsolete data. A fresh start with an injected summary of the current state is more reliable.
Starting fresh every time and discarding all prior context. This wastes the agent's accumulated reasoning and forces it to re-derive conclusions it already reached. When prior context is valid, --resume is strictly more efficient.