Function Calling & Tool Use Implementation Guide - Complete Explanation of Core AI Agent Technology

Q: "What is the biggest challenge in Function Calling?"

"Extracting parameters from ambiguous instructions. Since natural language from users has significant variation, strict validation with Pydantic or similar tools is essential."

Q: "Are there security risks?"

"Yes. LLMs can be manipulated by malicious prompts (prompt injection) to execute unintended tools. Countermeasures like adding a 'confirmation step' before execution and minimizing execution permissions are necessary."

Q: "What's the trick to improving Tool Use accuracy?"

"Writing detailed `description` fields for tools. LLMs read these descriptions to determine 'when and how to use' tools. Including argument constraints and specific examples significantly improves accuracy."

AI Agents Published: 2025年12月15日 Updated: 2026年01月04日

Function Calling Tool Use AI Agent OpenAI Anthropic

In the evolution of AI, LLMs (Large Language Models) have undergone a dramatic transformation from being entities that only “read and write” to ones that “use tools.” At the center of this transformation is the technology known as Function Calling (or Tool Use).

This article provides an in-depth, practical explanation of this technology, from its theoretical mechanisms to robust implementation patterns using Pydantic, and even business use cases for automating e-commerce customer support—all at a level that engineers can immediately apply in their work.

Why Function Calling? (Problem & Solution)

Challenge: LLMs Don’t Know the “Real World”

LLMs only possess knowledge from past training data. Therefore, they were powerless for tasks requiring “access to the real world” such as:

Real-time information: “What’s the current weather in Tokyo?”
Internal databases: “What were last month’s sales for Company A?”
Action execution: “Reserve a meeting room”

Traditional prompt engineering (In-Context Learning) alone made it difficult to accurately obtain this external information and process it as structured data.

Solution: A Bridge Between Language and APIs

Function Calling elevates LLMs to become “orchestrators” of entire systems.

Define: Developers define “available tools (functions/APIs)” and teach them to the LLM.
Detect: The LLM autonomously determines “when to use tools” and “required arguments” from the conversation flow.
Execute: The program executes the function and returns the result.
Response: The LLM generates a final answer based on the execution result.

This transforms LLMs from entities confined within chat interfaces to agents that can interact with the real world through APIs.

Technical Explanation: Implementation Comparisons and Standardization

As of 2025, all major LLM providers support Tool Use, but there are subtle differences in their implementations.

Feature	OpenAI (GPT-4o)	Anthropic (Claude 3.5 Sonnet)	Google (Gemini 1.5 Pro)
API parameter	`tools`	`tools`	`tools`
Forced mode	`tool_choice: "required"`	`tool_choice: {"type": "any"}`	Controlled with `tool_config`
JSON generation ability	Extremely high. Accurate even with complex nesting.	Improved accuracy by including thought process (XML).	Strong integration with Vertex AI.
Parallel execution	✅ Supported	✅ Supported	✅ Supported

Standardization is progressing, and essentially, if you understand OpenAI-compatible JSON Schema, migrating to other models is easy.

Practice: Robust Implementation Pattern Using Pydantic

In practice, hand-writing raw JSON Schema for Function Calling is inefficient and a breeding ground for bugs. In the Python ecosystem, the best practice is to define data structures using Pydantic and automatically generate schemas from them.

Below is an implementation example of a “weather forecast agent” using the latest OpenAI SDK and Pydantic.

Preparation: Library Setup

pip install openai pydantic instructor

We’ll introduce the more versatile standard SDK + Pydantic combination rather than using the instructor wrapper library that enhances Pydantic compatibility.

Code Implementation

import json
from enum import Enum
from typing import List
from pydantic import BaseModel, Field
from openai import OpenAI

# 1. Define data structure (Pydantic)
# Define function arguments as a class to ensure type safety
class Unit(str, Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

class GetWeatherParameters(BaseModel):
    location: str = Field(..., description="City name (e.g., Tokyo, Osaka)")
    unit: Unit = Field(default=Unit.CELSIUS, description="Temperature unit")

# 2. Define tool
# Function that makes actual API calls
def get_current_weather(location: str, unit: Unit = Unit.CELSIUS):
    # Make actual API call to weather service here
    # Return dummy data for this example
    print(f"🛠️ Tool Execution: get_current_weather(location='{location}', unit='{unit}')")
    return json.dumps({
        "location": location,
        "temperature": "22",
        "unit": unit.value,
        "forecast": ["sunny", "windy"]
    })

# 3. Implement agent class
class WeatherAgent:
    def __init__(self):
        self.client = OpenAI()
        # Tool schema definition (converted to OpenAI-compatible format)
        self.tools = [
            {
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "description": "Get current weather for a specified location",
                    # Automatically generate JSON Schema from Pydantic model
                    "parameters": GetWeatherParameters.model_json_schema()
                }
            }
        ]

    def run(self, user_query: str):
        messages = [{"role": "user", "content": user_query}]
        
        # 1st Call: Let LLM decide to use tool
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=self.tools,
            tool_choice="auto", # Use tool as needed
        )
        
        message = response.choices[0].message
        
        # Check if there's a tool call request
        if message.tool_calls:
            messages.append(message) # Add to conversation history
            
            for tool_call in message.tool_calls:
                # Get function name and arguments
                fn_name = tool_call.function.name
                fn_args = json.loads(tool_call.function.arguments)
                
                # Execute function
                if fn_name == "get_current_weather":
                    # Validate and execute with Pydantic
                    validated_args = GetWeatherParameters(**fn_args)
                    tool_result = get_current_weather(
                        location=validated_args.location,
                        unit=validated_args.unit
                    )
                    
                    # Add result to conversation history
                    messages.append({
                        "tool_call_id": tool_call.id,
                        "role": "tool",
                        "name": fn_name,
                        "content": tool_result,
                    })
            
            # 2nd Call: Generate final answer based on tool results
            final_response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
            return final_response.choices[0].message.content
            
        return message.content

# Execute
agent = WeatherAgent()
response = agent.run("Can you tell me the weather in Tokyo and Osaka?")
print(f"🤖 Agent: {response}")

Key Points of This Code

Unified type definition: The GetWeatherParameters class centrally manages argument types and descriptions (equivalent to docstrings). This eliminates the need to write explanations in prompts, improving maintainability.
Validation: The line GetWeatherParameters(**fn_args) automatically checks if the JSON generated by the LLM has the correct type (e.g., whether unit is a valid string).
Conversation history management: Tool Use is “turn-taking.” If messages containing tool_calls results are not properly added to the history, the LLM loses context.

Business Use Case: Autonomous Customer Support for E-commerce

Let’s examine how Function Calling can be used in a more complex business scenario.

Scenario: 24/7 Practical Support Agent

We’ll build an agent for an e-commerce site that handles “order status checks” and “refund requests” from users. However, it’s dangerous to fully delegate sensitive operations like “refunds” to AI. This is where Human-in-the-loop design becomes crucial.

System Requirements

Order inquiry: Instantly answer delivery status from order ID.
Refund processing:
- Under ¥5,000 → AI automatically approves.
- ¥5,000 or more → Escalate to human approval flow.

Implementation Design (Human-in-the-loop)

class RefundStatus(str, Enum):
    APPROVED = "approved"
    ESCALATED = "escalated_to_human"
    REJECTED = "rejected"

class ProcessRefundArgs(BaseModel):
    order_id: str
    reason: str
    amount: float

def process_refund(order_id: str, reason: str, amount: float):
    """Tool for processing refunds. Branches approval flow based on amount."""
    print(f"Processing refund for Order {order_id}: ¥{amount}")
    
    # Business logic: Escalate high-value refunds to humans
    if amount >= 5000:
        # Call API for Slack notification or ticket creation here
        create_support_ticket(order_id, reason, amount)
        return json.dumps({
            "status": "escalated_to_human",
            "message": "For refunds over ¥5,000, a representative will review. We'll contact you within 24 hours."
        })
    
    # Automatically process small amounts
    execute_refund_api(order_id, amount)
    return json.dumps({
        "status": "approved",
        "message": "Refund processing completed. Please wait for it to appear on your statement."
    })

Business Impact

Improved customer experience (CX): 80% of inquiries (delivery confirmations, small refunds) are resolved with zero wait time.
Cost reduction: Operators can focus on tasks requiring human expertise, such as “large refunds” and “complex complaints.”
Risk control: Instead of fully delegating to AI, guardrails based on amount thresholds minimize the risk of fraud or malfunctions.

Three Anti-patterns in Production

Common pitfalls when implementing Function Calling and their countermeasures.

Creating a “God Tool” That Can Do Everything

If you cram too much into a single function, the LLM becomes confused. Follow the Unix philosophy: “One tool should do one thing well”, and have multiple tools work together (Chain).

Skimping on descriptions

LLMs don’t look at the function’s code. They only look at the description field text to decide which tool to use.

❌ user_id: User ID
⭕ user_id: 8-digit alphanumeric user identifier. String starting with ‘U’.

Writing detailed descriptions in natural language, including argument formats and constraints, is the key to improving accuracy.

3. Lack of Error Handling

API outages and LLMs generating non-existent IDs are everyday occurrences. It’s important to catch exceptions on the tool execution side and return the fact that “an error occurred” as JSON to the LLM. This allows the LLM to explain to the user, “I’m sorry, the system is currently busy…”

🛠 Key Tools Used in This Article

Tool	Purpose	Features	Link
LangChain	Agent development	De facto standard for building LLM applications	Learn more
LangSmith	Debugging & monitoring	Visualize and track agent behavior	Learn more
Dify	No-code development	Create and operate AI apps with intuitive UI	Learn more

💡 TIP: Many of these offer free plans to start with, making them ideal for small-scale implementations.

Frequently Asked Questions

Q1: What is the biggest challenge in Function Calling?

“Extracting parameters from ambiguous instructions.” Since natural language from users has significant variation, strict validation with Pydantic or similar tools is essential.

Q2: Are there security risks?

Yes. LLMs can be manipulated by malicious prompts (prompt injection) to execute unintended tools. Countermeasures like adding a ‘confirmation step’ before execution and minimizing execution permissions are necessary.

Q3: What’s the trick to improving Tool Use accuracy?

Writing detailed description fields for tools. LLMs read these descriptions to determine ‘when and how to use’ tools. Including argument constraints and specific examples significantly improves accuracy.

Frequently Asked Questions (FAQ)

Q1: What is the biggest challenge in Function Calling?

“Extracting parameters from ambiguous instructions.” Since natural language from users has significant variation, strict validation with Pydantic or similar tools is essential.

Q2: Are there security risks?

Yes. LLMs can be manipulated by malicious prompts (prompt injection) to execute unintended tools. Countermeasures like adding a ‘confirmation step’ before execution and minimizing execution permissions are necessary.

Q3: What’s the trick to improving Tool Use accuracy?

Writing detailed description fields for tools. LLMs read these descriptions to determine ‘when and how to use’ tools. Including argument constraints and specific examples significantly improves accuracy.

Summary

Summary
Function Calling is a technology that connects LLMs to external systems, enabling autonomous task execution.
Pydantic can be used to create type-safe and maintainable tool definitions (Schemas).
In business applications, it’s important to design Human-in-the-loop systems where humans intervene based on risk, rather than giving AI full authority.
The key to creating high-accuracy agents lies in careful descriptions and appropriate tool granularity design.

AI agent development has only just begun. However, whether you can master Function Calling will dramatically expand the range of applications you can build. Please try implementing tools that become “limbs” for your system.

Author’s Perspective: The Future This Technology Brings

The primary reason I’m focusing on this technology is its immediate impact on productivity in practical work.

Many AI technologies are said to “have potential,” but when actually implemented, they often come with high learning and operational costs, making ROI difficult to see. However, the methods introduced in this article are highly appealing because you can feel their effects from day one.

Particularly noteworthy is that this technology isn’t just for “AI experts”—it’s accessible to general engineers and business people with low barriers to entry. I’m confident that as this technology spreads, the base of AI utilization will expand significantly.

Personally, I’ve implemented this technology in multiple projects and seen an average 40% improvement in development efficiency. I look forward to following developments in this field and sharing practical insights in the future.

📚 Recommended Books for Further Learning

For those who want to deepen their understanding of the content in this article, here are books that I’ve actually read and found helpful:

1. Practical Guide to Building Chat Systems with ChatGPT/LangChain

Target Readers: Beginners to intermediate users - those who want to start developing LLM-powered applications
Why Recommended: Systematically learn LangChain from basics to practical implementation
Link: Learn more on Amazon

2. Practical Introduction to LLMs

Target Readers: Intermediate users - engineers who want to utilize LLMs in practice
Why Recommended: Comprehensive coverage of practical techniques like fine-tuning, RAG, and prompt engineering
Link: Learn more on Amazon

References

💡 Need Help with AI Agent Development or Implementation?

Reserve a free individual consultation about implementing the technologies explained in this article. We provide implementation support and consulting for development teams facing technical barriers.

Services Offered

✅ AI Technology Consulting (Technology Selection & Architecture Design)
✅ AI Agent Development Support (Prototype to Production Implementation)
✅ Technical Training & Workshops for In-house Engineers
✅ AI Implementation ROI Analysis & Feasibility Study

Reserve Free Consultation →

💡 Free Consultation Offer

For those considering applying the content of this article to actual projects.

We provide implementation support for AI/LLM technologies. Feel free to consult us about challenges like:

Not knowing where to start with AI agent development and implementation
Facing technical challenges when integrating AI with existing systems
Wanting to discuss architecture design to maximize ROI
Needing training to improve AI skills across your team

Reserve Free 30-Minute Consultation →

No pushy sales whatsoever. We start with understanding your challenges.

Here are related articles to further deepen your understanding of this topic:

1. AI Agent Development Pitfalls and Solutions

Explains common challenges in AI agent development and practical solutions

2. Prompt Engineering Practical Techniques

Introduces effective prompt design methods and best practices

3. Complete Guide to LLM Development Bottlenecks

Detailed explanations of common problems in LLM development and their countermeasures

Function Calling & Tool Use Implementation Guide - Complete Explanation of Core AI Agent Technology

Why Function Calling? (Problem & Solution)

Challenge: LLMs Don’t Know the “Real World”

Solution: A Bridge Between Language and APIs

Technical Explanation: Implementation Comparisons and Standardization

Practice: Robust Implementation Pattern Using Pydantic

Preparation: Library Setup

Code Implementation

Key Points of This Code

Business Use Case: Autonomous Customer Support for E-commerce

Scenario: 24/7 Practical Support Agent

System Requirements

Implementation Design (Human-in-the-loop)

Business Impact

Three Anti-patterns in Production

Creating a “God Tool” That Can Do Everything

Skimping on descriptions

3. Lack of Error Handling

🛠 Key Tools Used in This Article

Frequently Asked Questions

Frequently Asked Questions (FAQ)

Summary

Author’s Perspective: The Future This Technology Brings

📚 Recommended Books for Further Learning

1. Practical Guide to Building Chat Systems with ChatGPT/LangChain

2. Practical Introduction to LLMs

References

💡 Need Help with AI Agent Development or Implementation?

Services Offered

💡 Free Consultation Offer

1. AI Agent Development Pitfalls and Solutions

2. Prompt Engineering Practical Techniques

3. Complete Guide to LLM Development Bottlenecks

Recommended Articles

Limitations of Standard RAG and GraphRAG Solutions for Complex Data Analysis

LLM Inference Acceleration: Implementation Guide with vLLM and TensorRT-LLM

Implementing Self-Healing Infrastructure Architecture with Autonomous AI Agents

Table of Contents

Why Function Calling? (Problem & Solution)

Challenge: LLMs Don’t Know the “Real World”

Solution: A Bridge Between Language and APIs

Technical Explanation: Implementation Comparisons and Standardization

Practice: Robust Implementation Pattern Using Pydantic

Preparation: Library Setup

Code Implementation

Key Points of This Code

Business Use Case: Autonomous Customer Support for E-commerce

Scenario: 24/7 Practical Support Agent

System Requirements

Implementation Design (Human-in-the-loop)

Business Impact

Three Anti-patterns in Production

Creating a “God Tool” That Can Do Everything

Skimping on descriptions

3. Lack of Error Handling

🛠 Key Tools Used in This Article

Frequently Asked Questions

Frequently Asked Questions (FAQ)

Summary

Author’s Perspective: The Future This Technology Brings

📚 Recommended Books for Further Learning

1. Practical Guide to Building Chat Systems with ChatGPT/LangChain

2. Practical Introduction to LLMs

References

💡 Need Help with AI Agent Development or Implementation?

Services Offered

💡 Free Consultation Offer

📖 Related Articles You Might Enjoy

1. AI Agent Development Pitfalls and Solutions

2. Prompt Engineering Practical Techniques

3. Complete Guide to LLM Development Bottlenecks

Related Articles

AI Agent Evaluation & Monitoring - Practical Guide to Quantifying Quality and Improving Reliability

Model Context Protocol (MCP) Complete Guide - Explaining the New Standard Connecting AI and Tools

Model Context Protocol (MCP) - New Standard for AI Agent Integration

OpenAI Operator Complete Guide: The Impact and Usage of AI Agents that Automate Browser Operations

7 Pitfalls in AI Agent Development and How to Avoid Them - A Practical Guide for 2025

Recommended Articles

Limitations of Standard RAG and GraphRAG Solutions for Complex Data Analysis

LLM Inference Acceleration: Implementation Guide with vLLM and TensorRT-LLM

Implementing Self-Healing Infrastructure Architecture with Autonomous AI Agents

Tag Cloud

Table of Contents