
Tool Expressions: The Next Evolution in Agent Intelligence

Just as high-level programming languages freed us from assembly, tool expressions let AI agents think in compositions rather than individual tool calls, enabling declarative workflows and atomic execution.

The Human Drive Toward Abstraction

For millennia, humans have wielded tools. A hammer drives a nail. A saw cuts wood. A wrench turns a bolt. Each tool performs a single, concrete action. But as our ambitions grew more complex, something remarkable happened. We stopped thinking about individual tools and started thinking about processes. We built abstractions.

Think about building a cabinet. You don't think "first I'll use the saw, then the hammer, then the drill." You just think "I'll build a cabinet." The individual tools fade into the background, replaced by higher-level intentions. This shift from tools to workflows made us more capable than any single tool ever could.

From Assembly to Expressions

Look at computing. In the early days, we programmed in assembly language, writing direct instructions to the machine. Every operation was explicit. Every detail exposed. It worked, but it was exhausting.

Then we created high-level programming languages. Instead of instructing the computer step-by-step, we could write expressions like result = calculate_score(user_data) + apply_bonus(user_level). We stopped telling the computer how to do something and started telling it what we wanted.

This wasn't just about convenience. It was a fundamental shift in how we interfaced with computational power. Expressions let us think at the level of intention rather than implementation.

The Tool Trap in Agent Systems

Today's AI agents face the same limitation early programmers did. Modern frameworks like LangGraph provide sophisticated workflow orchestration, but they still require the agent to coordinate tool sequences through multiple conversation turns. Developers in the OpenAI community have been asking how to chain tools where "the input for the second tool is the output from the first," noting that currently "this often requires two separate calls to the LLM."

Give an agent dozens of tools and ask it to solve complex problems, and you'll see it forced to think one instruction at a time. Consider this seemingly simple request: "Find Python experts who have emails and star them."

The agent must call find_contributors("Python experts"), wait for the response, call find_contributors("has email"), compute the intersection manually, call filter_contributors with that intersection, then finally call star_contributor. Multiple round-trips. Multiple conversation turns. Error-prone manual coordination.
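The orchestration burden can be sketched concretely. Here the tool functions are hypothetical stand-ins with made-up return values, named after the article's examples; the point is the turn-by-turn coordination the agent must perform itself:

```python
# Hypothetical stand-ins for the real tools; data is illustrative only.
def find_contributors(query):
    data = {
        "Python experts": ["ada", "linus", "guido"],
        "has email": ["guido", "ada", "margaret"],
    }
    return data.get(query, [])

def filter_contributors(usernames):
    return usernames  # narrows the working set to these users

def star_contributor(usernames, star):
    return {"starred": usernames if star else []}

# Without expressions, the agent coordinates every step across turns:
experts = find_contributors("Python experts")      # turn 1
with_email = find_contributors("has email")        # turn 2
overlap = [u for u in experts if u in with_email]  # manual intersection
kept = filter_contributors(overlap)                # turn 3
result = star_contributor(kept, star=True)         # turn 4
```

Each intermediate value lives only in the conversation history, and any step can fail after earlier steps have already mutated state.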

The agent spends its energy on orchestration rather than intelligence. It's like we handed it assembly language when it needed a high-level programming language.

Introducing Tool Expressions

What if agents could think in expressions instead of instructions? What if instead of sequentially calling tools, they could compose them into declarative workflows?

That's exactly what we built. Tool expressions give agents building blocks for constructing complex operations declaratively. Instead of being limited to predefined tools, agents can now express sophisticated multi-step workflows as single, atomic operations.

Here's that same task with tool expressions:

find_contributors("Python experts and has email", then={
    "tool": "filter_contributors",
    "args": {"usernames": "$output"}
}).then({
    "tool": "star_contributor",
    "args": {"usernames": "$output", "star": True}
})

Or even more powerfully, using the chain expression for complex workflows:

chain(operations=[
    {"tool": "find_contributors", "args": {"query": "Python experts"}},
    {"tool": "filter_contributors", "args": {"usernames": "$output"}},
    {"tool": "find_contributors", "args": {"query": "has email"}},
    {"tool": "filter_contributors", "args": {"usernames": "$output"}},
    {"tool": "star_contributor", "args": {"usernames": "$output", "star": True}}
])

One expression. One tool call. Atomic execution.

The Power of Composition

Tool expressions unlock functional programming patterns for AI agents. We can transform outputs by finding contributors and immediately getting their full details. We can filter by searching for criteria and narrowing results in one atomic operation. We can trigger side effects by finding amazing contributors and starring them instantly.

The real power emerges in multi-step pipelines. Imagine adding a repository, finding its active maintainers, filtering to only those contributors, starring them all, and retrieving their complete profiles. All of this expressed as a single chain that executes atomically:

chain(operations=[
    {"tool": "add_repository", "args": {"repository_name": "pytorch/pytorch"}},
    {"tool": "find_contributors", "args": {"query": "active maintainers"}},
    {"tool": "filter_contributors", "args": {"usernames": "$output"}},
    {"tool": "star_contributor", "args": {"usernames": "$output", "star": True}},
    {"tool": "get_contributors", "args": {"usernames": "$data.starred_contributors"}}
])

The agent declares its intent once. The system handles the execution.

Declarative Data Flow

The key to tool expressions is placeholder resolution. It's a simple but powerful data flow language. When you write "usernames": "$output", the system automatically passes the previous operation's result. When you reference $data.starred_contributors, you're accessing the current state. Array indexing like $outputs[0] lets you reference specific previous results.

These placeholders are resolved at runtime, creating seamless data flow between composed operations. The agent doesn't manually pass data between steps. It declares what should flow where, and the system handles the rest.
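A minimal resolver for the three placeholder forms above might look like the following. This is a sketch under the assumption that $output, $outputs[i], and $data.field are the only forms; the real resolver is surely more involved:

```python
# Resolve one placeholder value against the run's accumulated results.
def resolve(value, output, outputs, data):
    if not isinstance(value, str):
        return value
    if value == "$output":
        return output                       # previous operation's result
    if value.startswith("$outputs["):
        index = int(value[len("$outputs["):-1])
        return outputs[index]               # a specific earlier result
    if value.startswith("$data."):
        return data[value[len("$data."):]]  # a field of current state
    return value

state = {"starred_contributors": ["ada", "guido"]}
history = [["ada", "linus"], ["ada", "guido"]]

latest = resolve("$output", history[-1], history, state)     # → ["ada", "guido"]
first = resolve("$outputs[0]", history[-1], history, state)  # → ["ada", "linus"]
starred = resolve("$data.starred_contributors", history[-1], history, state)
```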

Why This Matters

Tool expressions reduce cognitive load on agents. They spend less energy on coordination and more on reasoning, thinking at the level of intention rather than implementation. Composed operations execute as a single transaction. Either the entire workflow succeeds or it fails cleanly, with no partial states or orphaned operations.

The efficiency gains are immediate. One tool call instead of several. One round-trip instead of multiple. One message in the context window instead of a sprawling conversation. Common patterns become reusable: once we've defined a "find and star" workflow, we can apply it anywhere.
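Because a chain is just data, a reusable pattern is just a function that builds one. A small sketch, using the article's tool names (the helper itself is hypothetical):

```python
# Build the "find and star" chain for any query; the operations list can
# then be handed to chain(...) as a single atomic expression.
def find_and_star(query):
    return [
        {"tool": "find_contributors", "args": {"query": query}},
        {"tool": "filter_contributors", "args": {"usernames": "$output"}},
        {"tool": "star_contributor", "args": {"usernames": "$output", "star": True}},
    ]

ops = find_and_star("active maintainers")
```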

Perhaps most importantly, expressions are validated before execution. If a tool doesn't exist or arguments don't match, the agent knows immediately, before wasting API calls on a workflow that will fail halfway through.
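Pre-execution validation can be sketched as a pass over the chain against a tool registry. The registry shape and error strings here are illustrative, not the system's actual API:

```python
import inspect

# Check every step before running anything: unknown tools and
# arguments that don't match the tool's signature fail fast.
def validate_chain(operations, registry):
    errors = []
    for i, op in enumerate(operations):
        tool = registry.get(op["tool"])
        if tool is None:
            errors.append(f"step {i}: unknown tool {op['tool']!r}")
            continue
        expected = set(inspect.signature(tool).parameters)
        unexpected = set(op.get("args", {})) - expected
        if unexpected:
            errors.append(f"step {i}: unexpected args {sorted(unexpected)}")
    return errors

def star_contributor(usernames, star): ...

registry = {"star_contributor": star_contributor}
errors = validate_chain(
    [{"tool": "star_contributor", "args": {"usernames": "$output", "star": True}},
     {"tool": "no_such_tool", "args": {}}],
    registry,
)
# → ["step 1: unknown tool 'no_such_tool'"]
```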

The Architecture

Under the hood, we leverage LangGraph for stateful workflow orchestration, dependency injection for automatic state and config passing, and async execution for parallel operations where possible. When an agent constructs a tool expression, the system parses it into an execution plan, resolves placeholders in each step's arguments, executes operations sequentially or in parallel, accumulates state changes atomically, and returns the final result.

All of this happens transparently. The agent declares what it wants. The system handles how.
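The resolve-execute-accumulate loop can be sketched as a minimal sequential executor. Placeholder handling is simplified to $output only, and the registry entries are stand-ins; the real system adds parallelism, state accumulation, and rollback:

```python
# Run a chain: resolve each step's args, execute, and carry the
# result forward into the next step.
def run_chain(operations, registry):
    output, outputs = None, []
    for op in operations:
        args = {
            k: (output if v == "$output" else v)
            for k, v in op.get("args", {}).items()
        }
        output = registry[op["tool"]](**args)  # execute one step
        outputs.append(output)                 # accumulate results
    return output

registry = {
    "find_contributors": lambda query: ["ada", "guido"],
    "star_contributor": lambda usernames, star: {"starred": usernames},
}
result = run_chain(
    [{"tool": "find_contributors", "args": {"query": "Python experts"}},
     {"tool": "star_contributor", "args": {"usernames": "$output", "star": True}}],
    registry,
)
# → {"starred": ["ada", "guido"]}
```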

Real-World Impact

Let's look at a concrete example. The task: "Add the PyTorch repo, find active maintainers, keep only them, and star them." Without tool expressions, this requires four separate tool calls, four conversation turns, and careful manual state tracking.

With tool expressions, we express it as a single declaration:

chain(operations=[
    {"tool": "add_repository", "args": {"repository_name": "pytorch/pytorch"}},
    {"tool": "find_contributors", "args": {"query": "active maintainers"}},
    {"tool": "filter_contributors", "args": {"usernames": "$output"}},
    {"tool": "star_contributor", "args": {"usernames": "$output", "star": True}}
])

The result? One tool call, one conversation turn, atomic execution. Twelve maintainers identified, filtered, and starred. Clear, declarative intent that executes reliably.

The Future

We're just beginning. Tool expressions open the door to conditional logic with if_then_else for branching workflows, loops with for_each for bulk operations, parallel execution for concurrent operations, error handling with try_catch for graceful degradation, and custom operators for domain-specific workflows.
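As a purely speculative sketch of the direction, an if_then_else operator could slot into the same executor model; nothing below exists yet, and the names are invented for illustration:

```python
# Hypothetical branching operator: pick one sub-chain and run it
# with whatever executor the system provides.
def if_then_else(condition, then_ops, else_ops, run):
    return run(then_ops if condition else else_ops)

def run(ops):
    # Stand-in executor: records which tools the chosen branch would run.
    return [op["tool"] for op in ops]

branch = if_then_else(
    condition=True,
    then_ops=[{"tool": "star_contributor"}],
    else_ops=[{"tool": "get_contributors"}],
    run=run,
)
# → ["star_contributor"]
```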

Imagine agents that reason about workflows the way we reason about algorithms. At the level of composition, not instruction. That's where we're headed.