The Agentic Revolution: Building Autonomous AI

Lecture 3

Giving the Agent Hands: Tools and Function Calling

Transcript

Your agent can reason perfectly. It can plan steps. But if it cannot touch anything outside its own context window, it is still just talking to itself. Think of a chess grandmaster locked in a room with no board. The thinking is brilliant. The outcome is nothing. That gap, between reasoning and doing, is exactly what tool use solves.

Tool use lets a model call external functions and services, such as APIs, databases, and calculators, extending its capabilities beyond text generation. Without it, your agent cannot check a live price, run a calculation, or fetch a current document. It is frozen at its training cutoff, guessing at facts it could simply look up.

Last time, the key idea was that frameworks give you pre-built abstractions for the hardest parts of the agent loop. Now, the loop itself (perception, reasoning, action, observation) is the engine. Tool use is what makes the action step real. The ReAct pattern exemplifies this: the model alternates between natural-language reasoning and tool calls, observing each result before it proceeds. That cycle is not optional. It is the mechanism that turns a language model into something that can complete a multi-step mission in the world.

So how does the agent actually call a tool? The model is prompted with a structured list of available functions and their arguments. For example, you might define a function called search_web that takes a single string parameter called query. The model does not execute that function itself. Instead, it outputs a machine-readable JSON object describing which tool to call and with what parameters. Your orchestration code reads that JSON, runs the actual function, and feeds the result back into the model's context. Schemas such as JSON Schema or OpenAPI-style descriptions are commonly used to specify tool signatures, including names, parameter types, and descriptions. That structure is what makes the output reliably parseable and executable.
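To make that hand-off concrete, here is a minimal sketch of the search_web example. Everything beyond the function name and its query parameter is an assumption for illustration: the stubbed function body, the layout of the tool description, the `REGISTRY` and `dispatch` names, and the hand-written JSON that stands in for a model's output. Real providers wrap tool calls in their own envelope, so treat the field names as placeholders.

```python
import json

def search_web(query: str) -> str:
    """Hypothetical tool: a stub standing in for a real web search."""
    return f"stub results for: {query}"

# Tool signature in a JSON Schema style. The outer field names
# ("name", "description", "parameters") are an assumed layout.
TOOLS = [
    {
        "name": "search_web",
        "description": "Search the web and return a short text summary.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms"}
            },
            "required": ["query"],
        },
    }
]

# Maps tool names the model may emit to the Python functions that run them.
REGISTRY = {"search_web": search_web}

def dispatch(model_output: str) -> str:
    """Parse the model's JSON tool call, run the function, return the result."""
    call = json.loads(model_output)
    func = REGISTRY[call["name"]]
    return func(**call["arguments"])

# Hand-written stand-in for what a model might emit; no model is called here.
model_output = '{"name": "search_web", "arguments": {"query": "current EUR/USD rate"}}'
observation = dispatch(model_output)
print(observation)
```

In a full loop, `observation` would be appended to the model's context so the next reasoning step can use it.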
Here is where it gets interesting for you, Yasser. A core design decision is how much autonomy the model has in choosing tools. Some systems let the LLM decide when to call a tool and which one. Others enforce deterministic calls based on rules. Most production agents sit somewhere in between. Remarkably, research suggests that language models can use tools they never saw during training, as long as they are given clear natural-language descriptions. That means a well-written tool description is not just documentation. It is a capability grant.

Reported results indicate that tool use can significantly improve factual accuracy on knowledge-intensive tasks by offloading computation and retrieval to external systems. Retrieval-augmented generation, where an agent calls a retriever over a document index before answering, is among the most widely used patterns for reducing hallucinations. Benchmarks on web navigation and API-calling tasks show that explicit tool integration can substantially outperform purely prompt-based baselines on complex interactive tasks. That means, Yasser, the difference between a tool-equipped agent and a plain LLM is not marginal. On hard, multi-step tasks, it is often the difference between a correct answer and a confident hallucination.

Giving an agent hands also means it can break things. Error handling is crucial. Your agent needs resilience against malformed tool outputs, timeouts, and API errors, typically by retrying the call or re-prompting the model with the error. Beyond bugs, there are real security risks: prompt injection, data exfiltration, and unintended actions when models interact with external systems. One mitigation is strict input-output filtering, treating all external content as untrusted. Another is fine-grained permissioning: requiring user confirmation before the agent writes a file, makes a payment, or changes a system setting.
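The retry and confirmation ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not a production guard: `run_tool`, `NEEDS_CONFIRMATION`, the `confirm` callback, and the two stub tools are all hypothetical names, and a real system would add backoff, logging, and output filtering.

```python
class ToolError(Exception):
    """Raised when the model asks for a tool that is not registered."""

# Hypothetical set of state-changing tools that require user approval.
NEEDS_CONFIRMATION = {"write_file", "make_payment"}

def run_tool(name, args, registry, confirm, max_attempts=3):
    """Run one tool call with a permission check and bounded retries.

    Returns a dict that can be fed back to the model either way, so a
    failure becomes an observation instead of a crash.
    """
    if name not in registry:
        raise ToolError(f"unknown tool: {name}")
    if name in NEEDS_CONFIRMATION and not confirm(name, args):
        return {"ok": False, "error": "user declined"}
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = registry[name](**args)
            return {"ok": True, "result": result, "attempts": attempt}
        except (TimeoutError, ConnectionError) as exc:
            # Transient failure: remember it and try again.
            last_error = str(exc) or type(exc).__name__
    return {"ok": False, "error": last_error, "attempts": max_attempts}

# Stub tools for the demo: one fails once, one changes state.
calls = {"n": 0}

def flaky_lookup(key):
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("upstream timed out")
    return f"value for {key}"

def make_payment(amount):
    return f"paid {amount}"

registry = {"lookup": flaky_lookup, "make_payment": make_payment}

def deny(name, args):
    """Stand-in for a real confirmation prompt; always refuses."""
    return False

r1 = run_tool("lookup", {"key": "a"}, registry, deny)
r2 = run_tool("make_payment", {"amount": 10}, registry, deny)
print(r1)
print(r2)
```

The lookup succeeds on its second attempt, while the payment is never executed because the confirmation callback refused it.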
Experiments with web-based agents show that adversarial webpage content can manipulate a model into harmful actions even when the tools themselves are working correctly.

As you build, remember: tool definition is crucial. It fundamentally shapes your agent's capabilities and limitations. In multi-tool settings, agents must handle tool selection and sequencing, deciding not just which tool to call but how to chain multiple tools together: for example, search, then parse, then compute. Evaluation requires both task success metrics and process metrics such as number of tool calls, latency, and cost. Research also suggests that hybrid designs, which combine hard-coded workflows with LLM flexibility, can be more reliable in production than fully autonomous agents.

The takeaway is this, Yasser: master your tool definitions, guard your tool boundaries, and your agent stops being a reasoning engine in isolation. It becomes a system that acts.
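As a closing illustration, the search, parse, compute chain and the process metrics mentioned in this lecture can be sketched together. The three tools, the `Trace` helper, and the sample text are hypothetical stand-ins, and the sequence is hard-coded here, which corresponds to the workflow end of the hybrid spectrum; in a more autonomous agent the model would pick each next step.

```python
import time

def search(query):
    """Hypothetical stub for a search tool; returns fixed text."""
    return "Widget price: 19.99 USD"

def parse_price(text):
    """Extract the number that follows the colon."""
    return float(text.split(":")[1].split()[0])

def compute_total(price, quantity):
    """Multiply and round to cents."""
    return round(price * quantity, 2)

class Trace:
    """Records one (tool name, latency in seconds) entry per call."""

    def __init__(self):
        self.calls = []

    def call(self, func, *args):
        start = time.perf_counter()
        result = func(*args)
        self.calls.append((func.__name__, time.perf_counter() - start))
        return result

trace = Trace()
text = trace.call(search, "widget price")
price = trace.call(parse_price, text)
total = trace.call(compute_total, price, 3)

print("total:", total)
print("tool calls:", len(trace.calls))
print("sequence:", [name for name, _ in trace.calls])
```

Task success is whether `total` is right; the trace supplies the process side, such as call count and per-call latency. Cost tracking would need provider-specific pricing and is left out.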