Blog Post #20: The Art of Task Decomposition: How Agents Learn to Plan

If you ask a person to “plan a vacation,” they don’t just stare blankly. Their mind instantly begins a process of task decomposition. A large, vague goal (“plan a vacation”) is broken down into a sequence of smaller, concrete, and achievable steps:

  1. Decide on a destination and budget.
  2. Research flights and accommodations.
  3. Book the flight.
  4. Book the hotel.
  5. Plan a daily itinerary.
  6. Pack bags.

This ability to create a structured plan is a hallmark of intelligence. For AI agents to move beyond simple, one-shot questions and tackle complex, multi-step goals, they too must master this art. But how does an agent learn to plan?

From Reactive to Proactive

So far, we’ve explored the ReAct (Reason -> Act) loop, which is fantastic for tasks where the next step is obvious. The agent thinks one step at a time. This is a reactive model.

But for a goal like, “Analyze our top three competitors and produce a summary report,” a simple reactive loop can get lost. The agent needs a higher-level strategy. This is where planning comes in. A planning agent is proactive. It formulates a multi-step strategy before it begins to act.

The Planner and The Executor

The magic of modern planning agents lies in a clever separation of duties, often orchestrated by a single, powerful prompt. Instead of asking the LLM, “What is the answer?”, we first ask, “What is the plan to find the answer?”

This creates two distinct phases:

  1. The Planning Phase: The agent makes a dedicated call to its core LLM. This “planner” prompt provides the LLM with the high-level user goal and, crucially, a list of all available tools with their detailed descriptions. The prompt instructs the LLM to return not an answer, but a structured, step-by-step plan.
  2. The Execution Phase: Once the plan is generated, an “executor” component in the agent’s code takes over. It reads the plan and carries out each step in sequence, often using the ReAct loop for each individual step. The output of one step is saved and made available as context for the next.
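The two phases above can be sketched as one small loop. This is a minimal illustration, not a production agent: `call_llm` and `run_tool` are hypothetical stand-ins for a real LLM client and a real tool registry.

```python
import json

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call. Here it returns a canned JSON plan.
    return json.dumps({"plan": [{"step": 1, "task": "demo", "tool": "echo",
                                 "parameters": {"text": "hello"}}]})

def run_tool(name: str, parameters: dict) -> str:
    # Stand-in for a real tool dispatcher; a real agent would look up a registry.
    tools = {"echo": lambda p: p["text"]}
    return tools[name](parameters)

def plan_and_execute(goal: str, tool_docs: str) -> list:
    # Planning phase: ask the LLM for a plan, not an answer.
    planner_prompt = (f"Goal: {goal}\nAvailable tools:\n{tool_docs}\n"
                      "Return a structured, step-by-step JSON plan.")
    plan = json.loads(call_llm(planner_prompt))["plan"]

    # Execution phase: run each step in sequence, saving outputs as context.
    results = []
    for step in plan:
        output = run_tool(step["tool"], step["parameters"])
        results.append({"step": step["step"], "output": output})
    return results
```

The key design point is the separation: the planner only ever produces structured data, and the executor only ever consumes it.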

What a Plan Looks Like

Let’s imagine our agent is given the goal: “Write a brief report on the current weather and local time in Sodepur, West Bengal.”

The agent has two tools at its disposal: get_current_weather(city: str) and get_current_time(timezone: str).
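In Python, those two tools might be declared like this. The bodies are stubs for illustration; a real implementation would call a weather API, and the docstrings are exactly what the planner LLM would read.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def get_current_weather(city: str) -> str:
    """Return the current weather for a city, e.g. 'Sodepur, West Bengal'.
    Use this whenever the user asks about weather conditions."""
    # Stub: a real implementation would call a weather API here.
    return f"28°C and partly cloudy in {city}"

def get_current_time(timezone: str) -> str:
    """Return the current local time for an IANA timezone name,
    e.g. 'Asia/Kolkata'. Use this for any question about local time."""
    return datetime.now(ZoneInfo(timezone)).strftime("%H:%M")
```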

The “planner” LLM call would produce a structured plan that might look like this in JSON:

{
  "plan": [
    {
      "step": 1,
      "task": "Find the current local time in Sodepur. The city is in India, so the correct IANA timezone is 'Asia/Kolkata'.",
      "tool": "get_current_time",
      "parameters": {
        "timezone": "Asia/Kolkata"
      }
    },
    {
      "step": 2,
      "task": "Find the current weather in Sodepur, West Bengal.",
      "tool": "get_current_weather",
      "parameters": {
        "city": "Sodepur, West Bengal"
      }
    },
    {
      "step": 3,
      "task": "Synthesize the gathered information (time and weather) into a single, coherent report for the user.",
      "tool": "final_answer_formatter",
      "parameters": {
        "input_from_step_1": "...",
        "input_from_step_2": "..."
      }
    }
  ]
}

The executor would then proceed:

  • Execute step 1, call get_current_time('Asia/Kolkata'), and store the result.
  • Execute step 2, call get_current_weather('Sodepur, West Bengal'), and store the result.
  • Execute step 3, feeding the results from the first two steps into a final LLM call to generate the polished report.
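A minimal executor for a plan in this shape might look like the sketch below. It follows the JSON structure shown above, including the 'input_from_step_N' placeholders; the tool lookup and result threading are deliberately simplified.

```python
import json

def execute_plan(plan_json: str, tools: dict) -> dict:
    """Run each plan step in order, storing outputs so later steps can use them."""
    plan = json.loads(plan_json)["plan"]
    results = {}
    for step in plan:
        params = dict(step["parameters"])
        # Replace 'input_from_step_N' placeholders with earlier results.
        for key in params:
            if key.startswith("input_from_step_"):
                params[key] = results[int(key.rsplit("_", 1)[1])]
        results[step["step"]] = tools[step["tool"]](**params)
    return results
```

For example, with a plan whose step 2 declares `"input_from_step_1"`, the executor substitutes step 1's stored output before calling the step 2 tool.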

The Art of a Good “Planner” Prompt

The quality of the agent’s plan is directly proportional to the quality of the planning prompt and the clarity of its tool descriptions.

  • Provide a Role: Starting your prompt with “You are an expert project manager…” can prime the LLM to think in a more structured, logical way.
  • Be Clear About the Goal: “Write a report” is good. “Write a three-paragraph summary report for a non-technical audience” is much better, leading to a more tailored plan.
  • Give Constraints: “You can only use the web search tool a maximum of 5 times,” or “Prioritize using internal database tools before public web searches.”
  • Perfect Your Tool Docs: As we learned in our last post, if a tool’s description is vague, it will never be selected for the plan. The LLM relies entirely on these descriptions to understand its capabilities.
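Putting these four tips together, a planner prompt builder might look like the sketch below. The wording is one possible template, not a canonical one.

```python
def build_planner_prompt(goal: str, tool_docs: str) -> str:
    # Role + clear goal + constraints + tool docs, combined into one prompt.
    return (
        "You are an expert project manager.\n"
        f"Goal: {goal}\n"
        "Constraints: use the web search tool at most 5 times; "
        "prioritize internal database tools over public web searches.\n"
        f"Available tools:\n{tool_docs}\n"
        "Return a structured, step-by-step JSON plan, not the final answer."
    )
```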

Beyond Simple Lists: Advanced Planning

While a linear, step-by-step list is the most common planning pattern, more advanced agents can create sophisticated plans:

  • DAGs (Directed Acyclic Graphs): The agent can identify steps that don’t depend on each other and plan to execute them in parallel, dramatically speeding up the process.
  • Hierarchical Planning: A “manager” agent can create a high-level plan and delegate the individual steps as sub-tasks to different “worker” agents, as seen in frameworks like CrewAI.
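To see why a DAG helps, here is a small sketch that groups steps into "waves" that can run in parallel. It assumes each step declares the set of steps it depends on; this is a simplified topological layering, not any particular framework's scheduler.

```python
def schedule_waves(steps: dict) -> list:
    """steps maps step id -> set of step ids it depends on.
    Returns lists of step ids that can run in parallel, in execution order."""
    done, waves = set(), []
    while len(done) < len(steps):
        # A step is ready once all of its dependencies have finished.
        ready = [s for s, deps in steps.items()
                 if s not in done and deps <= done]
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(sorted(ready))
        done.update(ready)
    return waves
```

In the weather-and-time example, steps 1 and 2 are independent, so they land in the same wave and could be executed concurrently, with step 3 waiting on both.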

Conclusion

Task decomposition is the bridge between a simple reactive tool and a truly strategic AI agent. By prompting the LLM to first think and create a plan, we give it a structured path to follow, making its behavior more predictable, its reasoning more transparent, and its ability to solve complex, multi-step problems exponentially greater.

This shift from a purely reactive Reason -> Act loop to a proactive Plan -> Execute model is a fundamental leap in agent intelligence, allowing them to tackle the ambitious, real-world tasks we envision for them.

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge of Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com.
