Code Agents

Agents that use code to perform tool calls

Code agents are a new kind of agent that writes code to perform actions instead of emitting one tool call at a time. They're changing how we think about AI tool usage.

Traditional agents make individual function calls, like asking “What’s 2+2?” then “What’s 4+4?” then “What’s 8+8?” Each question requires a round trip to the model.

Code agents work differently. They write a script: “Let me double each of these numbers in a loop.” One model call, multiple operations.
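
To make the difference concrete, here is a minimal sketch that counts model round trips for both styles. Everything in it is hypothetical: the `add` tool and the `model_calls` counter stand in for a real model and tool runtime, and no LLM is involved.

```python
# Toy comparison of round trips; no real model is called.
# `add` is a hypothetical tool, and each "model call" is simulated by a counter.

def add(a: int, b: int) -> int:
    """Hypothetical tool: add two integers."""
    return a + b


def traditional_agent():
    """One tool call per model turn: three questions, three round trips."""
    model_calls = 0
    results = []
    for value in (2, 4, 8):
        model_calls += 1  # the model is consulted again to choose the next tool call
        results.append(add(value, value))
    return results, model_calls


def code_agent():
    """The model writes one script; the runtime executes every tool call in it."""
    model_calls = 1  # a single turn produces the whole script
    # The generated script would amount to this loop, executed locally.
    results = [add(value, value) for value in (2, 4, 8)]
    return results, model_calls


print(traditional_agent())  # ([4, 8, 16], 3)
print(code_agent())         # ([4, 8, 16], 1)
```

Both styles produce the same answers; only the number of simulated model turns differs.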

Why Code Agents Matter

There are several key benefits to code agents:

  • Token Efficiency: Fewer tool-call round trips to the model, so fewer tokens are spent
  • Better Accuracy: More reliable for complicated, multi-step tasks
  • Speed: Several operations are batched into a single call

Here’s a graphic from the paper “Executable Code Actions Elicit Better LLM Agents” which demonstrates why code agents are so valuable:

[Figure: Code Agents]

Building an Election Data Agent

Let’s see code agents in action by building an agent that can analyze election data. We’ll use the Hugging Face smolagents library, which has built-in support for code agents.

Our agent will query a DuckDB database containing Florida election data to answer complex demographic questions.

Setting Up the Tools

In smolagents, defining tools is straightforward:

python
import duckdb
from smolagents import Tool


class DuckDBTool(Tool):
    name = "duckdb_query"
    description = """
    Execute SQL queries on a DuckDB in-memory database.
    This tool can create tables, insert data, and run SELECT queries.
    """

    inputs = {
        "query": {
            "type": "string",
            "description": "The SQL query to execute on the DuckDB database.",
        }
    }

    output_type = "string"

    def __init__(self):
        super().__init__()
        # One in-memory connection is shared by every call, so tables persist between queries.
        self.conn = duckdb.connect(":memory:")

    def forward(self, query: str):
        try:
            result = self.conn.execute(query).fetchall()
            if result:
                # Prefix the rows with the column names so the model can interpret them.
                columns = [desc[0] for desc in self.conn.description]
                output = f"Columns: {', '.join(columns)}\n"
                for row in result:
                    output += f"{row}\n"
                return output
            else:
                return "Query executed successfully (no results returned)."
        except Exception as e:
            # Return the error as text so the agent can read it and adjust its next query.
            return f"Error executing query: {e}"

This tool gives our agent direct access to a DuckDB in-memory database where it can create tables, insert data, and run complex queries, all through code.

Creating the Agent

python
from smolagents import CodeAgent, FinalAnswerTool, InferenceClientModel

agent = CodeAgent(
    tools=[
        DuckDBTool(),
        FinalAnswerTool(),
    ],
    model=InferenceClientModel(),
    max_steps=10,
    verbosity_level=2,
)

Unlike traditional agents that make individual tool calls, the CodeAgent writes and executes code blocks that can use multiple tools in sequence.

Simple Example: Hello World

Let’s start with a simple test:

python
agent.run("Echo hello world from the database.")

The agent thinks through the problem, then writes code to solve it:

Thought: To echo “hello world” from the database, I need to create a table, insert the string “hello world” into it, and then retrieve it using a SELECT query.

Code:

python
duckdb_query("CREATE TABLE greetings (message VARCHAR);")
duckdb_query("INSERT INTO greetings (message) VALUES ('hello world');")
result = duckdb_query("SELECT message FROM greetings;")
print(result)

Three database operations in one LLM call. A traditional tool-calling agent would typically need three separate calls to achieve the same result; the code agent batches the work into a single step.
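
A rough way to picture that single step: the runtime exposes each tool as a plain Python function and executes the generated snippet against it. The sketch below is an illustration only and not a description of smolagents internals. It swaps DuckDB for Python's built-in `sqlite3` so it runs without extra dependencies, and the bare `exec` call is an assumption made for brevity; a real agent runtime should sandbox generated code.

```python
import contextlib
import io
import sqlite3

# Stand-in for the DuckDB tool: same function name, but backed by sqlite3.
conn = sqlite3.connect(":memory:")
tool_calls = 0


def duckdb_query(query: str) -> str:
    """Run one SQL statement and return the rows as text."""
    global tool_calls
    tool_calls += 1
    rows = conn.execute(query).fetchall()
    return "\n".join(str(row) for row in rows) if rows else "(no rows)"


# The snippet the agent produced in a single model call (copied from above).
generated_code = """
duckdb_query("CREATE TABLE greetings (message VARCHAR);")
duckdb_query("INSERT INTO greetings (message) VALUES ('hello world');")
result = duckdb_query("SELECT message FROM greetings;")
print(result)
"""

# Execute the snippet with the tool in scope and capture what it prints.
# Bare exec is for illustration only; generated code should be sandboxed.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    exec(generated_code, {"duckdb_query": duckdb_query})

observation = captured.getvalue().strip()
print(observation)  # ('hello world',)
print(tool_calls)   # 3
```

The captured output plays the role of the observation that would be fed back to the model: one model turn, three tool calls.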

Complex Analysis: Voter Demographics

Now let’s tackle something more challenging. We’ll ask our agent to analyze voter demographics:

python
agent.run("Which age group has the most voters? Show me party registration percentages by age group.")

How the Agent Approaches the Problem

Watch how the code agent breaks down this complex question into manageable steps. Where a traditional agent would need a separate back-and-forth call for every query, the code agent works through the analysis in just a few steps, each of which can run several queries:

Step 1: Data Discovery. The agent starts by exploring the database schema to understand what’s available. It runs PRAGMA table_info('voters') and finds that the table has 40 columns, including birth_date and party_affiliation, which are exactly what it needs.
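
The discovery step can be reproduced on a toy table. The sketch below uses Python's built-in `sqlite3`, which accepts the same `PRAGMA table_info` statement, as a stand-in for DuckDB. The three-column `voters` table is a hypothetical miniature of the real 40-column one, and `voter_id` is an invented column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of the voters table (the real one has 40 columns).
conn.execute(
    "CREATE TABLE voters (voter_id INTEGER, birth_date DATE, party_affiliation VARCHAR)"
)

# Each row describes one column: (cid, name, type, notnull, dflt_value, pk).
schema = conn.execute("PRAGMA table_info('voters')").fetchall()
column_names = [row[1] for row in schema]
print(column_names)  # ['voter_id', 'birth_date', 'party_affiliation']

# The demographic question needs both of these columns.
needed = {"birth_date", "party_affiliation"}
print(needed.issubset(column_names))  # True
```

Because the schema comes back as ordinary Python data, the agent's next code block can branch on which columns exist before writing the real query.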

Step 2: Age Group Analysis. Next, it calculates voter counts by age group using SQL’s DATE_DIFF function to determine ages from birth dates:

python
age_groups_query = """
SELECT
    CASE
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 18 AND 24 THEN '18-24'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 25 AND 34 THEN '25-34'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 35 AND 44 THEN '35-44'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 45 AND 54 THEN '45-54'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 55 AND 64 THEN '55-64'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) >= 65 THEN '65+'
    END AS age_group,
    COUNT(*) as voter_count
FROM voters
WHERE birth_date IS NOT NULL
GROUP BY age_group
ORDER BY voter_count DESC
"""
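
One detail worth checking in a query like this: DuckDB documents `DATE_DIFF` as counting the partition boundaries between two dates, so with `'year'` it counts calendar years crossed rather than completed years. A voter whose birthday falls later in the current year is therefore counted as one year older than they are. The pure-Python sketch below contrasts the two calculations and applies the same brackets; the function names and sample dates are illustrative and not part of the agent's output.

```python
from datetime import date
from typing import Optional


def year_boundaries(birth: date, today: date) -> int:
    """Calendar-year boundaries crossed, which is what a boundary-counting diff returns."""
    return today.year - birth.year


def completed_years(birth: date, today: date) -> int:
    """Age in completed years: subtract one if the birthday has not happened yet this year."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


def age_group(age: int) -> Optional[str]:
    """Same brackets as the CASE expression; ages under 18 fall through to None (NULL in SQL)."""
    if age < 18:
        return None
    brackets = ((24, "18-24"), (34, "25-34"), (44, "35-44"), (54, "45-54"), (64, "55-64"))
    for upper, label in brackets:
        if age <= upper:
            return label
    return "65+"


birth, today = date(2000, 12, 31), date(2025, 1, 1)
print(year_boundaries(birth, today), age_group(year_boundaries(birth, today)))  # 25 25-34
print(completed_years(birth, today), age_group(completed_years(birth, today)))  # 24 18-24
```

The effect only matters for voters near a bracket edge. If exact ages are needed, DuckDB's `date_sub`, which is documented as counting complete partitions, may be a closer fit; that is a suggestion to verify, not something the agent did.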

Step 3: Complex Statistical Analysis. Finally, it builds a more sophisticated query that uses a window function to calculate party registration percentages within each age group. This is where the real power of SQL shines:

python
party_by_age_query = """
SELECT
    CASE
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 18 AND 24 THEN '18-24'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 25 AND 34 THEN '25-34'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 35 AND 44 THEN '35-44'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 45 AND 54 THEN '45-54'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 55 AND 64 THEN '55-64'
        WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) >= 65 THEN '65+'
    END AS age_group,
    party_affiliation,
    COUNT(*) as voter_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY
        CASE
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 18 AND 24 THEN '18-24'
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 25 AND 34 THEN '25-34'
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 35 AND 44 THEN '35-44'
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 45 AND 54 THEN '45-54'
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) BETWEEN 55 AND 64 THEN '55-64'
            WHEN DATE_DIFF('year', birth_date, CURRENT_DATE) >= 65 THEN '65+'
        END
    ), 1) as percentage
FROM voters
WHERE birth_date IS NOT NULL AND party_affiliation IS NOT NULL
GROUP BY age_group, party_affiliation
ORDER BY age_group, voter_count DESC
"""
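
The window expression `SUM(COUNT(*)) OVER (PARTITION BY ...)` divides each (age group, party) count by the total for that age group. The sketch below performs the same arithmetic in plain Python on a handful of made-up rows, which are illustrative only and unrelated to the Florida data.

```python
from collections import Counter

# Made-up (age_group, party_affiliation) rows, for illustration only.
rows = [
    ("18-24", "REP"), ("18-24", "NPA"), ("18-24", "NPA"), ("18-24", "DEM"),
    ("65+", "REP"), ("65+", "REP"), ("65+", "DEM"),
]

# COUNT(*) ... GROUP BY age_group, party_affiliation
pair_counts = Counter(rows)
# SUM(COUNT(*)) OVER (PARTITION BY age_group): the total per age group
group_totals = Counter(group for group, _ in rows)

# Percentage of each party within its own age group, rounded to one decimal.
percentages = {
    (group, party): round(count * 100.0 / group_totals[group], 1)
    for (group, party), count in pair_counts.items()
}

for (group, party), pct in sorted(percentages.items()):
    print(group, party, pct)
```

Each age group's percentages sum to roughly 100 (up to rounding), which is a quick sanity check to apply to the agent's real output as well.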

The Results

The 65+ age group has the most voters (27,965), followed by 55-64 (15,310) and 35-44 (12,750).

Key findings from the analysis:

  • Republican (REP) registration is highest across all age groups
  • Republican percentage increases with age: 54.7% (18-24) to 63.4% (55-64)
  • No Party Affiliation (NPA) is second highest in younger groups but decreases with age
  • Democratic (DEM) registration is consistently around 14-16% except for 65+ where it jumps to 22.1%

In this dataset, the pattern is clear: older voters tend to be more Republican-leaning, while younger voters have higher rates of unaffiliated registration.

The Power of Code Agents

This example demonstrates several key advantages of code agents:

Complex Analysis: The agent performed multi-step data analysis that would require several back-and-forth exchanges with traditional function-calling agents.

Efficiency: Instead of multiple API calls, the agent planned and executed the entire analysis in just a few steps.

Flexibility: Code agents can adapt their approach based on the data structure they discover, writing custom SQL queries as needed.

Rich Insights: The agent not only answered the direct question but provided deeper insights about voting patterns across demographics.

Code agents are a significant advancement in AI tooling: by writing code instead of issuing one function call at a time, they can carry out multi-step analytical workflows with fewer round trips to the model.