How AWS Strands Agent Loop Works
7th March 2026
The Pattern: ReAct (Reason + Act) — Not Prompt Chaining
Strands uses the ReAct pattern — a recursive loop where the LLM reasons, optionally calls tools, observes results, then reasons again. It is NOT prompt chaining (where you have a fixed sequence of prompts). The loop is open-ended and driven by the model’s decisions.
The Core Flow
```
User prompt
    │
    ▼
┌───────────────────────────────────┐
│        event_loop_cycle()         │
│                                   │
│  1. Send to LLM:                  │
│     - system_prompt               │
│     - messages (full history)     │
│     - tool_specs (available tools)│
│                                   │
│  2. LLM responds with either:     │
│     a) stop_reason="end_turn" ────┼──► Done! Return response
│     b) stop_reason="tool_use" ────┼──► Execute tools, then RECURSE ↓
│     c) stop_reason="max_tokens" ──┼──► Exception                   │
└───────────────────────────────────┘                                │
          ▲                                                          │
          └──────────────────────────────────────────────────────────┘
```
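The flow above can be compressed into a simplified Python sketch. The `run_loop`, `call_model`, and `run_tools` names and signatures are hypothetical stand-ins, not the actual Strands API (which streams events and handles retries and hooks):

```python
def run_loop(call_model, run_tools, messages, system_prompt, tool_specs):
    """Simplified sketch of the recursive agent loop (hypothetical signatures).

    call_model(messages, tool_specs, system_prompt) -> {"stop_reason", "message"}
    run_tools(assistant_message) -> a role="user" message carrying toolResults
    """
    response = call_model(messages, tool_specs, system_prompt)
    messages.append(response["message"])       # assistant turn joins the history
    if response["stop_reason"] == "end_turn":
        return response["message"]             # done: final answer
    if response["stop_reason"] == "max_tokens":
        raise RuntimeError("model stopped at max_tokens")
    # stop_reason == "tool_use": run the requested tools, append the
    # results as a user message, then recurse with the grown history
    messages.append(run_tools(response["message"]))
    return run_loop(call_model, run_tools, messages, system_prompt, tool_specs)
```

Note that `messages` is mutated in place on every cycle, so the history the model sees grows with each recursion.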
What Exactly Gets Sent to the LLM Each Turn
Every single call to the LLM sends the ENTIRE conversation history. Here's exactly what `stream_messages()` sends:

```python
model.stream(
    messages,       # THE FULL message history (all turns)
    tool_specs,     # all available tool definitions
    system_prompt,  # system prompt (same every time)
)
```
So for a 3-tool-call interaction, the LLM calls look like:
| Call # | What’s sent |
|---|---|
| 1 | system_prompt + [user_msg] |
| 2 | system_prompt + [user_msg, assistant_msg_with_toolUse, user_msg_with_toolResult] |
| 3 | system_prompt + [user_msg, asst_toolUse, toolResult, asst_toolUse_2, toolResult_2] |
Yes — the full history grows with each cycle. The system prompt is sent every time (LLMs are stateless).
How Context is Maintained
Context is maintained through agent.messages — a single mutable list that accumulates all turns:
- User message → appended at invocation start (`_append_messages`)
- Assistant message (LLM response) → appended in `_handle_model_execution` after streaming completes
- Tool result message → appended in `_handle_tool_execution` after tools run
```python
# In _handle_model_execution:
agent.messages.append(message)  # Assistant's response

# In _handle_tool_execution:
tool_result_message = {
    "role": "user",
    "content": [{"toolResult": result} for result in tool_results],
}
agent.messages.append(tool_result_message)  # Tool results as "user" role
```
The conversation format follows the standard alternating role pattern that LLMs expect:
user → assistant (with toolUse) → user (with toolResult) → assistant (with toolUse) → user (with toolResult) → assistant (final answer)
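Concretely, a one-tool interaction accumulates a list shaped like this (content-block format as in the Bedrock converse API; the tool name and values are invented for illustration):

```python
messages = [
    {"role": "user", "content": [{"text": "What's the weather in Paris?"}]},
    {"role": "assistant", "content": [
        {"toolUse": {"toolUseId": "t1", "name": "get_weather",
                     "input": {"city": "Paris"}}},
    ]},
    {"role": "user", "content": [
        {"toolResult": {"toolUseId": "t1", "status": "success",
                        "content": [{"text": "18°C, sunny"}]}},
    ]},
    {"role": "assistant", "content": [{"text": "It's 18°C and sunny in Paris."}]},
]
```

Each `toolResult` references the `toolUseId` of the `toolUse` it answers, which is how the model pairs results with requests.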
The Recursion Mechanism
The loop is recursive, not iterative:
```python
# In _handle_tool_execution, after tools complete:
events = recurse_event_loop(agent=agent, invocation_state=invocation_state)
```
Each recursion is a new event_loop_cycle() call. The recursion continues until the LLM returns stop_reason="end_turn" (meaning it’s done and doesn’t want to call more tools).
Context Window Management
When the accumulated messages exceed the model’s context window:
```python
# In _execute_event_loop_cycle:
except ContextWindowOverflowException as e:
    self.conversation_manager.reduce_context(self, e=e)  # Trim history
    # Then retry with reduced context
```
The ConversationManager strategies handle this:
- SlidingWindowConversationManager (default): Drops oldest messages
- SummarizingConversationManager: Summarizes old messages into a compact form
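A minimal sliding-window reduction might look like the sketch below. This is not the actual `SlidingWindowConversationManager` implementation, just the idea; the real manager is also careful about message-level truncation:

```python
def sliding_window(messages, window_size):
    """Keep only the most recent window_size messages (simplified sketch).

    One real-world wrinkle is modeled: the window must not begin with a
    dangling toolResult whose matching toolUse was trimmed away, because
    models reject orphaned tool blocks.
    """
    if len(messages) <= window_size:
        return messages
    trimmed = messages[-window_size:]
    # Drop leading toolResult messages whose toolUse fell outside the window.
    while trimmed and any("toolResult" in block
                          for block in trimmed[0].get("content", [])):
        trimmed = trimmed[1:]
    return trimmed
```

The trade-off versus summarization: trimming is cheap and deterministic, but information in dropped turns is simply gone; summarization preserves it in compressed form at the cost of an extra model call.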
Key LLM Concepts Used
| Concept | How Strands Uses It |
|---|---|
| ReAct Loop | LLM reasons → calls tools → observes → reasons again (recursive) |
| Tool Calling | LLM returns structured toolUse blocks; framework executes and returns results |
| Stateless LLM Calls | Full history + system prompt sent on every call (LLMs have no memory between calls) |
| Streaming | Responses streamed chunk-by-chunk via model.stream() → process_stream() |
| Context Window | Managed by ConversationManager — sliding window or summarization when full |
What It’s NOT
- Not prompt chaining: There’s no fixed pipeline of prompts. The LLM decides when to stop.
- Not RAG: Though tools can do retrieval, the loop itself is pure ReAct.
- Not multi-turn memory: The agent doesn't have persistent memory between `agent()` calls unless you keep the same `agent.messages` list (which it does by default).
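That default is easy to demonstrate with a toy stand-in (a hypothetical class, not the Strands `Agent` API — the point is only that the "memory" is nothing more than the surviving list):

```python
class ToyAgent:
    """Toy stand-in: context persists only because self.messages persists."""

    def __init__(self):
        self.messages = []  # the only state carried between calls

    def __call__(self, prompt):
        self.messages.append({"role": "user", "content": [{"text": prompt}]})
        reply = {"role": "assistant",
                 "content": [{"text": f"(answer to: {prompt})"}]}
        self.messages.append(reply)
        return reply

agent = ToyAgent()
agent("First question")
agent("Follow-up")  # the second call sees the first turn in self.messages
# agent.messages now holds all four turns; clearing it wipes the "memory"
```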
Summary
Strands implements a pure ReAct agent loop: send the full conversation (system prompt + all history + tool specs) to the LLM every turn, let it decide whether to call tools or answer, execute tools if requested, append results to history, and recurse. The agent.messages list IS the context — it grows with every turn and gets sent in full each time. Context window overflow is handled by the conversation manager trimming or summarizing history.