Building AI Agents With Tool Use — From Zero to Working Code

See how AI agents think in loops, call tools, and make decisions, with a full execution trace, concrete tool definitions, and a clear architecture you can build today.

An LLM that can call functions is an agent.

That’s it. That’s the whole concept. A regular LLM takes text in and gives text out. An AI agent takes text in, decides which tools to use, calls them, reads the results, and generates an answer from real data. No magic. Just a loop with function calls.


1. The Loop: How Agents Think

Every agent runs the same cycle: Think about what to do. Act by calling a tool. Observe the result. Decide if you’re done or need another loop. This is the ReAct pattern — and it’s the foundation of every agent framework.

The Agent Loop: Think → Act → Observe → Repeat

Every AI agent follows this cycle. The loop is the intelligence.

The loop, repeated until done: THINK (plan the next action) → ACT (call a tool) → OBSERVE (check the result) → DECIDE (done, or loop again?). One full pass looks like this:

1. Think: "I need to look up the user's order status. I'll call the orders API."
2. Act: Calls get_order_status(order_id="12345")
3. Observe: API returns "shipped, tracking: UPS1234, arrives Thursday"
4. Decide: "I have what I need. Format the answer and respond."
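The whole cycle fits in a few lines of Python. This is a minimal sketch: call_llm here is a hard-coded stand-in for a real model client, and the get_order_status tool returns canned data — in production both would be real.

```python
# Minimal ReAct-style loop. call_llm is a scripted stand-in for a real
# model client; TOOLS maps tool names to plain Python functions.
TOOLS = {
    "get_order_status": lambda order_id: {"status": "shipped", "tracking": "UPS1234"},
}

def call_llm(messages):
    # Stand-in "model": if a tool result is already in the history, answer;
    # otherwise request the order-status tool. A real LLM makes this choice.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "final", "content": "Your order shipped (tracking UPS1234)."}
    return {"type": "tool", "name": "get_order_status",
            "arguments": {"order_id": "12345"}}

def run_agent(user_message, max_iterations=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):              # DECIDE: loop or stop
        action = call_llm(messages)              # THINK: plan the next step
        if action["type"] == "final":
            return action["content"]             # done: return the answer
        result = TOOLS[action["name"]](**action["arguments"])  # ACT
        messages.append({"role": "tool", "name": action["name"],
                         "content": str(result)})              # OBSERVE
    return "Stopped: hit the iteration limit."
```

Swap in a real LLM call and real tool functions and the structure stays exactly the same: the loop is the agent.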

Each pass through the loop makes the answer more accurate. A simple question might take one loop. A complex task might take five. The agent decides when it’s done.


2. The Tool Box: What Agents Can Call

Tools are just functions with descriptions. You define them. The LLM reads the descriptions and picks the right one based on the user’s question. The four most common tool types are below.

The Tool Box: What Agents Can Call

Tools are just functions. The agent picks which one to use based on the task.

Search Tool: RAG, docs, knowledge base

Searches your vector database or knowledge base. Returns relevant chunks. The most common tool type — almost every agent has one.

API Tool: REST calls, external services

Calls external APIs — order lookup, weather, CRM, payments. The agent gets real-time data it couldn't possibly know from training.

Code Execution: Python, JavaScript, calculations

Runs code in a sandboxed environment. For math, data processing, chart generation — anything the LLM can write but shouldn't hallucinate.

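A code-execution tool can be sketched as a subprocess with a hard timeout. This is illustrative only: a real sandbox needs process isolation (containers, seccomp, resource limits); a timeout alone is not a sandbox.

```python
import subprocess
import sys
import tempfile

def run_python(code: str, timeout_s: int = 5) -> str:
    # Execute model-written code in a separate interpreter process.
    # NOTE: a timeout is NOT real isolation; production sandboxes add
    # containers/jails on top of this.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, "-I", path],   # -I: isolated mode, no user site dirs
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.stdout if result.returncode == 0 else result.stderr
    except subprocess.TimeoutExpired:
        return "Error: execution timed out."
```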
Database Query: SQL, read-only access

Runs read-only SQL against your database. The agent generates the query from natural language. Always enforce read-only permissions.

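One way to enforce the read-only rule in code, as a sketch. The real protection is a read-only database role; the safe_query helper below is an illustrative defense-in-depth check, not a complete SQL validator.

```python
# Fail fast on model-generated SQL that isn't a plain SELECT.
# Defense in depth: the database user should ALSO be read-only.
WRITE_KEYWORDS = {"insert", "update", "delete", "drop", "alter", "truncate", "grant"}

def safe_query(sql: str) -> str:
    tokens = [t.strip("();,") for t in sql.strip().lower().split()]
    if not tokens or tokens[0] != "select":
        raise PermissionError("Only SELECT statements are allowed.")
    if any(t in WRITE_KEYWORDS for t in tokens):
        raise PermissionError("Query contains a write keyword.")
    return sql  # hand off to a read-only connection
```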

The key insight: tool descriptions matter more than tool code. If the description is vague, the agent picks the wrong tool. Write descriptions like you’re explaining it to a smart coworker who’s never seen the codebase.
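Here is what a well-described tool might look like. The field layout follows the common JSON-Schema function-calling convention used by major LLM APIs; the example tool and its wording are our own:

```python
# A tool is just a function plus a description the model can read.
# The description tells the agent WHEN to use it, not just what it does.
get_order_status_tool = {
    "name": "get_order_status",
    "description": (
        "Look up the current status of a customer order. "
        "Use this whenever the user asks where their order is, "
        "when it will arrive, or whether it has shipped. "
        "Do NOT use it for returns or refunds."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The order ID, e.g. 'ORD-7891'. Ask the user if it is missing.",
            },
        },
        "required": ["order_id"],
    },
}
```

Notice that the description covers both when to use the tool and when not to. That boundary is what keeps the agent from reaching for the wrong tool.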


3. Watch an Agent Execute: Full Trace

Here’s a real execution trace of an agent handling “What’s the status of my order?” Every thought, every tool call, every decision — timestamped and logged.

Agent Trace: Watch It Think

Real execution trace showing every thought, tool call, and decision.

0ms THINK: User wants to know their order status. I need the order ID. Let me check the conversation — they said order #ORD-7891.
120ms TOOL CALL: get_order_status(order_id="ORD-7891")
340ms TOOL RESULT: status: "in_transit", carrier: "UPS", tracking: "1Z999AA...", eta: "2026-05-15"
360ms THINK: Got the status. It's in transit with UPS. ETA is May 15th. Let me also check if there are any delivery exceptions.
380ms TOOL CALL: get_tracking_details(tracking="1Z999AA...")
580ms TOOL RESULT: exceptions: none, last_scan: "Memphis, TN — 6:42am"
600ms DECIDE: I have everything I need. No exceptions. Will format a clear response with status, tracking link, and ETA.
620ms RESPOND: Your order ORD-7891 is in transit via UPS. It was last scanned in Memphis, TN this morning. Expected delivery: May 15th. No delays detected.

Totals: 620ms total time · 2 tool calls · 2 think steps · $0.003 cost

Notice the two tool calls happened back-to-back. The agent didn’t wait for a human — it decided on its own that it needed tracking details after getting the order status. That’s the intelligence. Not any single tool call, but the sequence of decisions.


4. The Architecture: Three Layers

Every production agent has three layers. The brain (LLM) decides. The hands (tools) execute. The guardrails (safety) prevent damage. Skip any layer and you’ll regret it.

The Architecture: 3 Layers of Every Agent

Brain picks the action. Hands do the work. Guardrails keep it safe.

🧠 The Brain — LLM + System Prompt. Decides what to do next: a system prompt with role and rules, the conversation history, tool definitions (function schemas), and the ReAct reasoning pattern.

🔧 The Hands — Tools + APIs. Executes actions in the real world: 🔍 search, 🌐 API calls, 💻 code execution, 🗄️ database queries, 📧 email, 📁 files.

🛡️ The Guardrails — Safety + Limits. Prevents damage and runaway costs: max loop iterations (prevent infinite loops), a token budget per request, a tool permission allowlist, human-in-the-loop for destructive actions, and output validation with content filtering.

The guardrails layer is non-negotiable. Without it, an agent can loop forever (burning tokens), call destructive APIs, or generate harmful output. Max iterations, token budgets, and tool allowlists are the minimum.
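Those minimums can be wired straight into the loop as a check that runs on every pass. A sketch: the limits, the tool names, and the check_guardrails helper are all illustrative, not from any particular framework.

```python
# Hard limits checked on every pass through the agent loop.
MAX_ITERATIONS = 8
TOKEN_BUDGET = 20_000
ALLOWED_TOOLS = {"search_docs", "get_order_status"}   # allowlist, not blocklist
DESTRUCTIVE_TOOLS = {"issue_refund"}                   # require human sign-off

def check_guardrails(iteration, tokens_used, tool_name):
    if iteration >= MAX_ITERATIONS:
        raise RuntimeError("Guardrail: max loop iterations reached.")
    if tokens_used > TOKEN_BUDGET:
        raise RuntimeError("Guardrail: token budget exhausted.")
    if tool_name is not None and tool_name not in ALLOWED_TOOLS | DESTRUCTIVE_TOOLS:
        raise RuntimeError(f"Guardrail: tool '{tool_name}' is not allowlisted.")
    if tool_name in DESTRUCTIVE_TOOLS:
        return "needs_human_approval"                  # pause the loop, ask a human
    return "ok"
```

Calling this at the top of every loop iteration means a runaway agent fails loudly and cheaply instead of silently burning tokens.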


5. When Agents Are Worth It — And When They’re Not

Agents add latency and cost. Every tool call is another API round-trip. Every think step burns tokens. If the task doesn’t need external data or multiple steps, a direct LLM call is better.

Agent vs Direct LLM — When Agents Win

Agents add overhead. Here's when they're worth it — and when they're not.

Where the agent wins:
- Multi-step data lookup: Direct 25% vs Agent 94%
- Complex calculations: Direct 42% vs Agent 98%
- Real-time data access: Direct 0% vs Agent 91%

Where the direct LLM wins:
- Simple Q&A: Direct 92% vs Agent 90%
- Creative writing: Direct 88% vs Agent 85%
- Text summarization: Direct 95% vs Agent 93%

The rule: If the task needs external data or multiple steps, use an agent. If the task is pure language (writing, summarizing, translating), a direct LLM call is faster, cheaper, and just as good.

The sweet spot: tasks where the agent needs to look things up, combine data from multiple sources, or take actions. Order tracking, data analysis, multi-step workflows — that’s where agents shine. For simple Q&A or writing tasks, skip the agent overhead.