How AI agents use reasoning loops, memory, and tools to complete complex multi-step tasks autonomously — going far beyond a single prompt-response exchange.
A single LLM in a loop with access to tools. On each turn it writes a reasoning trace, picks one tool, observes the result, and repeats. Simple to implement and surprisingly capable for well-defined tasks.
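A minimal sketch of that loop in Python. `call_llm` and the tiny `add` tool are stand-ins (a real agent would wrap a model API and expose real tools); the loop structure itself (reason, pick one tool, observe, repeat) is the point.

```python
import json

# Toy tool registry; a real agent would expose search, code execution, etc.
TOOLS = {"add": lambda args: str(args["a"] + args["b"])}

def call_llm(messages):
    """Stand-in for a real model call, scripted so the loop runs end to end.
    First turn: request a tool. Second turn: read the result and finish."""
    if not any(m["role"] == "tool" for m in messages):
        return {"thought": "I should compute 2 + 3 with the add tool.",
                "tool": "add", "args": {"a": 2, "b": 3}}
    return {"thought": "The observation has the result; I can answer.",
            "final": "2 + 3 = 5"}

def run_agent(goal, max_steps=5):
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):                        # hard cap on turns
        reply = call_llm(messages)                    # think + pick an action
        if "final" in reply:
            return reply["final"]                     # model decided it is done
        result = TOOLS[reply["tool"]](reply["args"])  # host executes the tool
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        messages.append({"role": "tool", "content": result})  # observation
    return "Stopped: step limit reached."

print(run_agent("What is 2 + 3?"))  # -> 2 + 3 = 5
```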
An orchestrator agent decomposes the task and delegates sub-tasks to specialized agents — a coder, a researcher, a critic — each with their own tools and context. Enables parallelism and deep specialization.
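A sketch of the delegation pattern, with the specialists stubbed as plain functions; each would really be its own agent loop with its own tools and context. Independent sub-tasks can run concurrently, which `ThreadPoolExecutor` illustrates. The role names and the fixed plan are assumptions for the sake of a runnable example.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub specialists; each would really be a full agent loop with its own
# tools (coder: an interpreter; researcher: web search; critic: none).
def researcher(task): return f"[research notes for: {task}]"
def coder(task):      return f"[draft code for: {task}]"
def critic(task):     return f"[review of: {task}]"

AGENTS = {"research": researcher, "code": coder, "review": critic}

def orchestrate(goal):
    # A real orchestrator would ask an LLM to decompose the goal;
    # this fixed plan keeps the sketch self-contained.
    plan = [("research", goal), ("code", goal)]
    with ThreadPoolExecutor() as pool:   # independent sub-tasks in parallel
        drafts = list(pool.map(lambda step: AGENTS[step[0]](step[1]), plan))
    return AGENTS["review"]("\n".join(drafts))  # critic sees both outputs

print(orchestrate("parse RSS feeds into a daily digest"))
```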
A planning phase first generates the full list of steps without executing them. An executor then works through the plan step by step. Separating planning from doing reduces mid-task drift on long-horizon tasks.
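A sketch of plan-then-execute under the same stubbing assumptions: `make_plan` would be one LLM call that emits the whole step list up front, and `execute_step` one call per step. Neither re-plans mid-run, which is what limits drift.

```python
def make_plan(goal):
    """Stand-in for the planner call: returns the full step list before
    anything executes. A real planner would be a single LLM request."""
    return [f"outline an approach to: {goal}",
            f"carry out the approach for: {goal}",
            f"verify the result for: {goal}"]

def execute_step(step, prior_results):
    """Stand-in for the executor call: one LLM turn per step, seeing
    only the current step and the results accumulated so far."""
    return f"done({len(prior_results) + 1}): {step}"

def plan_then_execute(goal):
    plan = make_plan(goal)      # planning phase: nothing runs yet
    results = []
    for step in plan:           # execution phase: work the fixed list
        results.append(execute_step(step, results))
    return results

for line in plan_then_execute("summarize this week's arXiv papers"):
    print(line)
```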
The LLM answers in a single shot from its training data. It cannot look anything up, run code to verify, or break a complex task into steps. Multi-step problems require the user to manually chain prompts, copy outputs between steps, and babysit each stage. Errors compound silently.
The LLM plans, acts, and self-corrects in a loop. It searches for current information, writes and runs code to test its answers, reads files, calls APIs, and tries alternative approaches when one fails — all without human intervention at each step. Complex tasks become tractable.
The agent receives the user's goal in its system prompt, alongside descriptions of every available tool. These tool schemas (name, parameters, description) tell the LLM what it can do without hard-coding any logic. The agent also loads relevant long-term memories — past task outcomes, user preferences — so it starts informed, not blank. This context setup is the "working memory" the agent will reason over.
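A sketch of that setup. The schema shape below is one common convention (exact field names vary by provider), and `MemoryStore` is a toy stand-in for whatever retrieval layer actually holds long-term memories.

```python
import json

# One common shape for a tool schema: a name, a description, and a
# JSON Schema for the arguments. Field names vary across providers.
WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web and return the top results.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Search terms"}},
        "required": ["query"],
    },
}

class MemoryStore:
    """Toy long-term memory; real agents would use a vector store or DB."""
    def __init__(self, records):
        self.records = records
    def lookup(self, query):
        words = set(query.lower().split())
        return [r for r in self.records if words & set(r.lower().split())]

def build_context(goal, tool_schemas, memory):
    memories = memory.lookup(goal)   # start informed, not blank
    system = (f"Goal: {goal}\n"
              f"Tools available: {json.dumps(tool_schemas)}\n"
              f"Relevant memories: {memories}")
    return [{"role": "system", "content": system}]

memory = MemoryStore(["User prefers concise answers",
                      "Last search task succeeded with narrower queries"])
print(build_context("search for agent papers", [WEB_SEARCH_SCHEMA], memory))
```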
The LLM generates a thought: a scratchpad sentence explaining what it knows, what's missing, and what to try next. Then it emits a structured tool_use block — tool name and arguments — which the host intercepts and routes. The agent never runs code itself; it outputs structured intent and the host executes it. This separation keeps the agent auditable: every decision is written out in the transcript before it happens.
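A sketch of that separation. The `thought`/`tool_use` shape is illustrative (real APIs emit their own structured blocks); what matters is that the model only describes the call, and `dispatch` on the host side actually runs it.

```python
def web_search(query):
    """Stub tool body; a real one would call a search API."""
    return f"top results for {query!r}"

REGISTRY = {"web_search": web_search}

def dispatch(turn, registry):
    """Host-side execution: the model emitted structured intent; the host
    looks the tool up and runs it. Nothing executes inside the model."""
    call = turn["tool_use"]
    tool = registry[call["name"]]    # unknown name -> error, fed back to model
    return tool(**call["arguments"])

turn = {  # illustrative shape of one assistant turn
    "thought": "This needs current information; search before answering.",
    "tool_use": {"name": "web_search",
                 "arguments": {"query": "LLM agent design patterns"}},
}
print(dispatch(turn, REGISTRY))
```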
The tool result is appended to the context as an observation message. The LLM now has new information: a search result, code output, a file's contents, an API response. It reads this, updates its reasoning, and decides whether the goal is met or another step is needed. This Observe → Think → Act loop is what makes agents adaptive — they respond to what actually happened, not what they assumed would happen.
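A sketch of the observe step. The role name and the truncation cap are assumptions; capping oversized tool outputs so they don't flood the context window is common practice, but the exact limit here is arbitrary.

```python
MAX_OBS_CHARS = 4000   # illustrative cap, not a standard value

def append_observation(messages, tool_name, result):
    """Feed the tool result back into the transcript so the next model
    call can read it, update its reasoning, and pick the next action."""
    text = str(result)
    if len(text) > MAX_OBS_CHARS:    # keep the context window manageable
        text = text[:MAX_OBS_CHARS] + "[truncated]"
    messages.append({"role": "tool", "name": tool_name, "content": text})
    return messages

messages = [{"role": "user", "content": "What changed in Python 3.13?"}]
append_observation(messages, "web_search", "search result text")
print(messages[-1])
```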
After each observation the LLM checks a stopping condition: Is the goal achieved? Have all sub-tasks completed? Has a maximum step limit been hit? When it determines the task is done, it emits a final answer message — synthesizing everything in the context into a coherent response. Good agents also report what they did (tool calls, sources) so the user can verify the work, not just trust the conclusion.
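A sketch of the stopping check, plus the kind of verifiable report the paragraph describes. The `final` key, the step budget, and the transcript shape are conventions carried over from the earlier sketches, not a fixed standard.

```python
import json

def should_stop(reply, step, max_steps=20):
    """Stop on a final answer, or when the step budget runs out."""
    if "final" in reply:
        return True, reply["final"]
    if step >= max_steps:
        return True, "Stopped: step limit reached before the goal was met."
    return False, None

def final_report(answer, messages):
    """Summarize what the agent actually did so the user can verify it:
    list every tool call recorded in the transcript next to the answer."""
    calls = [json.loads(m["content"]).get("tool")
             for m in messages if m["role"] == "assistant"]
    return f"{answer}\n\nTool calls made: {[c for c in calls if c]}"

transcript = [
    {"role": "user", "content": "What is 2 + 3?"},
    {"role": "assistant", "content": json.dumps({"tool": "add"})},
    {"role": "tool", "content": "5"},
]
done, answer = should_stop({"final": "2 + 3 = 5"}, step=2)
if done:
    print(final_report(answer, transcript))
```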