Beyond the Magic: Unpacking the True Architecture of AI Agentic Systems
The seemingly magical ability of AI coding agents like Claude Code or Cursor to execute complex tasks, such as creating a PostgreSQL database in AWS, relies not on a single monolithic intelligence but on a precisely orchestrated architecture involving three distinct components: the user, the agent, and the Large Language Model (LLM). Contrary to popular belief, the LLM, while providing the “brain” for reasoning and decision-making, cannot directly interact with files, run commands, or act autonomously; its function is to process context and generate responses or requests. The agent acts as the crucial intermediary and orchestrator, sitting between the user’s intent and the LLM and facilitating both communication and execution. It manages the iterative loop of planning, acting, observing, and repeating, and it is responsible for building and maintaining the context sent to the LLM, which includes system prompts and tool descriptions.
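To make that division of labor concrete, here is a minimal sketch in Python of how an agent might assemble the context it sends to the LLM: a system prompt, descriptions of the tools the agent can execute, and the conversation so far. The tool names, the `TOOL:<name>:<argument>` reply convention, and `call_llm` are all invented for illustration (the latter stands in for whatever chat-completion API a real agent would call); the point is only that the LLM ever sees nothing but this text.

```python
# Minimal sketch (not any vendor's actual API) of the context an agent
# assembles for the LLM on every turn. `call_llm` is a hypothetical
# stand-in for whatever chat-completion API the agent talks to.

TOOLS = {
    "read_file": "read_file(path) -> contents of the file at `path`",
    "run_bash":  "run_bash(command) -> stdout of running `command` in a shell",
    "edit_file": "edit_file(path|old|new) -> replace `old` with `new` in the file at `path`",
}

SYSTEM_PROMPT = (
    "You are a coding agent. You cannot act directly. "
    "To use a tool, reply with a single line of the form TOOL:<name>:<argument>. "
    "Available tools:\n" + "\n".join(f"- {desc}" for desc in TOOLS.values())
)

def build_context(history: list[dict]) -> list[dict]:
    """The agent, not the LLM, owns and rebuilds the full context each turn."""
    return [{"role": "system", "content": SYSTEM_PROMPT}, *history]

# First request after the user states their intent:
history = [{"role": "user", "content": "Create a PostgreSQL database in AWS."}]
# response = call_llm(build_context(history))   # hypothetical LLM call
```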
Actual task accomplishment hinges on “tools”: functions the agent can execute on the LLM’s behalf, such as reading files, executing bash commands, or editing code. The LLM analyzes the user’s intent and the tool descriptions provided in the context to request specific tool executions from the agent; it never executes them directly. This cycle runs inside an “agent loop”: the agent repeatedly sends the updated context to the stateless LLM, processes its tool requests, and incorporates the results until a final answer is produced. For extending capabilities, the Model Context Protocol (MCP) enables dynamic integration of external tools. Practical limitations remain, however, chiefly context size, which caps how much information the LLM can process per request. When that limit is reached, compaction techniques such as summarization or truncation are applied, potentially degrading the LLM’s decision quality. Ultimately the agent, despite being a “dumb actor” that only executes instructions, is indispensable for translating the LLM’s intelligence into real-world actions.
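The loop itself can be sketched just as simply. The following continues the sketch above (reusing the hypothetical `call_llm`, `build_context`, and the invented `TOOL:<name>:<argument>` convention): the agent resends the full context to the stateless LLM each turn, executes any tool the LLM requests, appends the result, and falls back to a crude truncation when the context would exceed an illustrative character budget.

```python
import subprocess

MAX_CONTEXT_CHARS = 40_000   # illustrative budget, not any real model's limit

def execute_tool(name: str, arg: str) -> str:
    """The agent runs tools on the LLM's behalf; the LLM only requests them."""
    if name == "read_file":
        return open(arg).read()
    if name == "run_bash":
        return subprocess.run(arg, shell=True, capture_output=True, text=True).stdout
    if name == "edit_file":
        path, old, new = arg.split("|", 2)       # invented "path|old|new" convention
        open(path, "w").write(open(path).read().replace(old, new)) if False else None
        return f"edited {path}"
    return f"unknown tool: {name}"

def compact(history: list[dict]) -> list[dict]:
    """Naive compaction: drop the oldest messages until the context fits.
    Real agents typically summarize instead, trading detail for space."""
    while len(history) > 1 and sum(len(m["content"]) for m in history) > MAX_CONTEXT_CHARS:
        history.pop(0)
    return history

def agent_loop(user_request: str) -> str:
    history = [{"role": "user", "content": user_request}]
    while True:
        # The LLM is stateless: the agent resends the (compacted) context every turn.
        reply = call_llm(build_context(compact(history)))
        history.append({"role": "assistant", "content": reply})
        if reply.startswith("TOOL:"):
            _, name, arg = reply.split(":", 2)   # the LLM *requested* a tool...
            result = execute_tool(name, arg)     # ...and the agent executes it
            history.append({"role": "user", "content": f"TOOL RESULT: {result}"})
        else:
            return reply                         # no tool request: final answer
```

In this sketch the tool set is hard-coded; with MCP the agent would instead discover tool descriptions from external servers at runtime and splice them into the same context. Likewise, a production agent would usually ask the LLM to summarize older turns rather than bluntly dropping them, which is exactly where the decision-quality trade-off mentioned above comes from.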