Blog Post #17: ReAct In-Depth: The Foundational Reason -> Act Loop

How do you solve a problem you don’t know the answer to? Consider a deceptively simple question: “What was the first movie directed by the person who won the Best Actress Oscar in 1998?”

You don’t just guess. Your internal monologue probably goes something like this:

  1. Thought: “First, I need to figure out who won Best Actress in 1998. I’ll search for that.”
  2. Action: You open a web browser and type “Best Actress Oscar 1998”.
  3. Observation: You read the search results: “Helen Hunt won for As Good as It Gets.”
  4. Thought: “Okay, the person is Helen Hunt. Now I need to find the first movie she directed. I’ll search for that.”
  5. Action: You type “first movie directed by Helen Hunt”.
  6. Observation: You see the result: “Then She Found Me (2007).”
  7. Thought: “I have all the pieces. The answer is Then She Found Me.” You then state the final answer.

This iterative loop of thinking, taking an action, and observing the result is the natural human way of solving problems that require external information. It’s also the foundational design pattern behind most modern AI agents, formalized in the 2022 paper “ReAct: Synergizing Reasoning and Acting in Language Models.”

What is ReAct?

ReAct, which stands for “Reason + Act,” is a framework that synergizes two of the greatest strengths of Large Language Models: their ability to reason (as in Chain-of-Thought prompting) and their ability to act by calling external tools.

Instead of trying to solve a complex problem in one go, a ReAct agent breaks it down into a series of thought-action-observation steps. This makes the agent more reliable, transparent, and capable of tackling dynamic problems.

The ReAct Loop: A Visual Breakdown

The entire process can be visualized as a loop that continues until the agent is confident it has the final answer.

      +--------------------------+
.---> |         USER QUERY       |
|     +--------------------------+
|                |
|                v
|     +----------------------------+
|     |      LLM (with history)    | --.
|     |      Generates Thought     |   |
|     |        & Action            |   |
|     +----------------------------+   | (Is the answer found?)
|                |                     |
|                v                     |
|     +----------------------------+   |
|     |       ACTION (Tool Call)   | --' (Yes) --> FINAL ANSWER
|     | (e.g., search('query'))    |
|     +----------------------------+
|                |
|                v
|     +----------------------------+
|     |  OBSERVATION (Tool Result) |
|     +----------------------------+
|                |
'----------------' (Feed back into LLM context)

  1. Query: The user provides the initial problem.
  2. LLM (Reasoning): The LLM receives the query, plus the history of all previous steps. Its core task is to generate two things:
    • Thought: An internal monologue explaining its reasoning. “I need to find X, so I should use tool Y.” This is the “Reason” part.
    • Action: A specific, machine-parseable command to execute a tool, e.g. search('Best Actress 1998'). This is the “Act” part.
  3. Tool (Acting): The agent framework parses the Action and calls the corresponding tool (e.g., a web search API, a calculator, a database query).
  4. Observation: The tool returns a result. This result is the Observation.
  5. Feedback Loop: The Observation is added to the history and fed back into the LLM for the next turn, allowing it to refine its next Thought. The sketch after this list shows the loop in code.
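
To make the loop concrete, here is a minimal sketch in Python. It is illustrative, not production code: call_llm stands in for a real LLM API call, tools is a plain dict of Python functions, and the regex assumes the model follows the Thought:/Action: format exactly.

  import re

  def react_loop(query, tools, call_llm, max_turns=10):
      """Run a minimal ReAct loop until the model emits a finish(...) action."""
      history = f"Query: {query}\n"
      for _ in range(max_turns):
          # 1. The LLM sees the full history and produces a Thought and an Action.
          response = call_llm(history)
          history += response + "\n"

          # 2. Parse the Action line, e.g. Action: search('Best Actress Oscar 1998').
          #    Naive parsing for clarity; real frameworks use structured output.
          match = re.search(r"Action:\s*(\w+)\('(.*)'\)", response)
          if not match:
              break  # Model broke the format; a real agent would re-prompt here.
          tool_name, tool_input = match.groups()

          # 3. The special finish action ends the loop with the final answer.
          if tool_name == "finish":
              return tool_input

          # 4. Execute the tool and feed the Observation back into the history.
          observation = tools[tool_name](tool_input)
          history += f"Observation: {observation}\n"
      return None  # No answer within max_turns.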

A Step-by-Step Example in Pseudo-Code

The magic of ReAct is powered by a clever prompt that instructs the LLM to respond only in the Thought: ... Action: ... format.
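
An instruction along these lines does the work. The wording below is an illustrative sketch, not the exact prompt from the ReAct paper:

  You are a question-answering agent with one tool:
    search(query) - runs a web search and returns a short text snippet.

  Answer the user's query by alternating these two lines:
    Thought: your reasoning about what to do next
    Action: exactly one tool call, e.g. search('some query')

  After each Action you will receive an Observation containing the tool's result.
  When you know the final answer, respond with:
    Action: finish('the final answer')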

Let’s walk through our initial query:

Query: "What was the first movie directed by the person who won the Best Actress Oscar in 1998?"

Turn 1

The agent starts with an empty history.

  • Prompt to LLM: Query: "What was the first movie directed by the person who won the Best Actress Oscar in 1998?" You have a 'search' tool. Respond with your reasoning in a 'Thought:' and the tool to use in an 'Action:'.
  • LLM Response:
    Thought: I need to solve this in two steps. First, find the winner of the Best Actress Oscar in 1998. Then, find the first movie directed by that person. I will start with the first step.
    Action: search('Best Actress Oscar 1998')
  • Agent executes: The framework calls the search tool with the query 'Best Actress Oscar 1998', as shown in the parsing sketch below.
  • Observation: The tool returns: "Helen Hunt won the Best Actress Oscar in 1998 for 'As Good as It Gets'."
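
“Parses the Action” can be as simple as a regular expression. This hypothetical helper mirrors the parsing step in the loop sketch above:

  import re

  line = "Action: search('Best Actress Oscar 1998')"
  tool_name, tool_input = re.search(r"Action:\s*(\w+)\('(.*)'\)", line).groups()
  # tool_name == 'search', tool_input == 'Best Actress Oscar 1998'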

Turn 2

The agent appends the observation to the history and calls the LLM again.

  • Prompt to LLM: Query: "What was the first movie directed by the person who won the Best Actress Oscar in 1998?" ... Observation: "Helen Hunt won the Best Actress Oscar in 1998 for 'As Good as It Gets'."
  • LLM Response:
    Thought: I have identified the person as Helen Hunt. Now I need to find the first movie she directed. I will use the search tool for this.
    Action: search('first movie directed by Helen Hunt')
  • Agent executes: Calls the search tool with the new query.
  • Observation: The tool returns: "Helen Hunt's directorial debut was the film 'Then She Found Me' (2007)."

Turn 3

The agent appends the new observation and calls the LLM a final time.

  • Prompt to LLM: Query: "What was the first movie directed by the person who won the Best Actress Oscar in 1998?" ... Observation: "Helen Hunt won..." Observation: "Helen Hunt's directorial debut was..."
  • LLM Response:
    Thought: I have found both pieces of information. The winner was Helen Hunt, and her first directed movie was 'Then She Found Me'. I now have the final answer.
    Action: finish('The first movie directed by the person who won the Best Actress Oscar in 1998 (Helen Hunt) was "Then She Found Me".')
  • Agent executes: The framework sees the special finish action and returns its argument as the final answer to the user. The loop terminates.
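
Tying it together: with a stub search tool hard-coded to return the two snippets above, the react_loop sketch from earlier reproduces this exact trace. The stub and my_llm_client below are hypothetical stand-ins for a real search API and a real LLM call.

  def stub_search(query):
      # Hard-coded stand-in for a real search API, just for this walkthrough.
      if "Best Actress" in query:
          return "Helen Hunt won the Best Actress Oscar in 1998 for 'As Good as It Gets'."
      return "Helen Hunt's directorial debut was the film 'Then She Found Me' (2007)."

  answer = react_loop(
      query="What was the first movie directed by the person who won the Best Actress Oscar in 1998?",
      tools={"search": stub_search},
      call_llm=my_llm_client,  # hypothetical: takes the history string, returns Thought/Action text
  )
  print(answer)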

Why ReAct is a Game-Changer

  1. Transparency & Debuggability: The stream of Thought: entries provides a clear, human-readable log of the agent’s reasoning. When something goes wrong, you can see exactly where its logic failed.
  2. Reliability & Self-Correction: If a tool fails or returns useless information, the agent can “observe” this failure and use its next “thought” to try a different approach, like rephrasing a search query (the sketch after this list shows one way frameworks surface failures).
  3. Enhanced Capability: It allows agents to solve dynamic, multi-step problems that are impossible to answer from the LLM’s static training data alone.
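
One common way to enable that self-correction is for the framework to catch tool errors and hand them back as Observations instead of crashing. A sketch of the execution step from the loop above:

  # Surface tool failures to the model as Observations so its next
  # Thought can change course (e.g., rephrase the search query).
  try:
      observation = tools[tool_name](tool_input)
  except KeyError:
      observation = f"Error: unknown tool '{tool_name}'."
  except Exception as exc:
      observation = f"Error: tool '{tool_name}' failed with: {exc}"
  history += f"Observation: {observation}\n"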

Frameworks like LangChain and LlamaIndex have pre-built “ReAct Agents” that handle this logic for you. But by understanding the fundamental loop that powers them, you are better equipped to build, debug, and truly master the art of creating intelligent agents.
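
For instance, a LangChain ReAct agent can be assembled in a few lines. This sketch reflects the LangChain 0.1.x API; the library evolves quickly, so treat it as a starting point and check the current docs:

  from langchain import hub
  from langchain.agents import AgentExecutor, create_react_agent
  from langchain_community.tools import DuckDuckGoSearchRun
  from langchain_openai import ChatOpenAI

  llm = ChatOpenAI(model="gpt-4o")
  tools = [DuckDuckGoSearchRun()]       # a ready-made web search tool
  prompt = hub.pull("hwchase17/react")  # a community-published ReAct prompt

  agent = create_react_agent(llm, tools, prompt)
  executor = AgentExecutor(agent=agent, tools=tools, verbose=True)  # verbose=True prints each Thought/Action
  result = executor.invoke({"input": "What was the first movie directed by the "
                                     "person who won the Best Actress Oscar in 1998?"})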

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, and Shell, and theoretical knowledge of Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com.
