AI Function Calling & Tool Use Explained


Tool use (also called function calling) is the ability of an AI model to request that your application execute external functions on its behalf. The model doesn't run the tools directly — it generates a structured request, your code executes it, and the result is fed back for the model to use in its response.

How Tool Use Works

  1. You define tools — tell the model what’s available, what each tool does, and what parameters it accepts
  2. The model decides — based on the user’s request, it chooses which tool to call and with what arguments
  3. You execute and return — your code runs the function and sends results back to the model

```
User: "What's the weather in Tokyo, and do you have umbrellas?"

Model calls: get_weather(city="Tokyo")
Model calls: search_products(query="umbrella", max_results=3)

→ Results returned → Model composes a response using both
```

The model called both tools in parallel — it recognized two independent needs and handled them simultaneously.
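The three steps above can be sketched in code. This is a minimal illustration of the application's side of the loop, not any provider's real API: the tool implementations, the registry, and the request shape are all hypothetical stand-ins, and step 2 (the model's decision) is replayed from the example trace.

```python
import json

# Hypothetical tool implementations (illustrative stubs, not real services).
def get_weather(city):
    # A real app would call a weather API here; this returns canned data.
    return {"city": city, "forecast": "rain", "temp_c": 18}

def search_products(query, max_results=3):
    catalog = ["compact umbrella", "golf umbrella", "raincoat", "boots"]
    return [item for item in catalog if query in item][:max_results]

# Step 1: you define tools — a registry the model's requests resolve against.
TOOLS = {"get_weather": get_weather, "search_products": search_products}

def execute_tool_call(call):
    """Step 3: run the function the model requested and return its result.

    `call` mirrors the structured request a model emits, e.g.
    {"name": "get_weather", "arguments": {"city": "Tokyo"}}.
    """
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Step 2 happens inside the model; here we replay the calls from the trace.
requests = [
    {"name": "get_weather", "arguments": {"city": "Tokyo"}},
    {"name": "search_products", "arguments": {"query": "umbrella", "max_results": 3}},
]
results = [execute_tool_call(r) for r in requests]
print(json.dumps(results))
```

In a real loop, `results` would be appended to the conversation so the model can compose its final answer from both.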

Writing Good Tool Descriptions

The model selects tools based almost entirely on their descriptions. Vague descriptions lead to missed or incorrect calls.

Weak: search — Searches for things

Strong:

search_knowledge_base — Searches internal knowledge base for
policy documents, FAQs, and product guides. Use when the user
asks about company policies or product details.
  query: Search query (natural language, not keywords)
  category: Optional — "policy", "faq", or "product"

Tell the model what the tool does, when to use it, and how the parameters work.
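In practice, that description travels to the model as a structured schema. A sketch of how the strong example above might be declared — the field layout follows the common JSON Schema convention, but the exact wire format varies by provider:

```python
# Tool declaration for search_knowledge_base, JSON Schema style.
search_knowledge_base_tool = {
    "name": "search_knowledge_base",
    "description": (
        "Searches internal knowledge base for policy documents, FAQs, "
        "and product guides. Use when the user asks about company "
        "policies or product details."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query (natural language, not keywords)",
            },
            "category": {
                "type": "string",
                "enum": ["policy", "faq", "product"],
                "description": "Optional filter: policy, faq, or product",
            },
        },
        "required": ["query"],  # category stays optional
    },
}
```

Note how the "when to use it" guidance lives in the description and the allowed `category` values are enforced with an `enum` rather than left to prose.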

Tips

  • Fewer tools, better results — a focused set of well-described tools outperforms a sprawling toolkit
  • Fix over-triggering by adding “Only use this when you don’t already know the answer” to descriptions
  • Fix under-triggering by being explicit — “Use the search tool to find the answer”
  • Test tool selection — verify the model picks the right tool for diverse inputs, not just obvious ones
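The last tip can be automated. A minimal harness sketch: pair diverse inputs with the tool you expect the model to pick and collect mismatches. `choose_tool` here is a hypothetical stand-in for whatever function asks your model to select a tool; the keyword stub below exists only so the example runs.

```python
def evaluate_tool_selection(choose_tool, cases):
    """Return (prompt, expected, actual) triples where selection went wrong."""
    failures = []
    for prompt, expected in cases:
        actual = choose_tool(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    return failures

# Stand-in for a real model call: a naive keyword router.
def stub_choose_tool(prompt):
    return "get_weather" if "weather" in prompt else "search_products"

cases = [
    ("What's the weather in Tokyo?", "get_weather"),
    ("Do you have umbrellas?", "search_products"),
    ("Will it rain during my picnic?", "get_weather"),  # no obvious keyword
]
print(evaluate_tool_selection(stub_choose_tool, cases))
```

The stub fails the third case — a paraphrase with no "weather" keyword — which is exactly the kind of non-obvious input the tip says to test for.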

With all these techniques in your toolkit — structured output, templates, chaining, long context, RAG, and tool use — one critical question remains: how do you know your prompts actually work? That’s what evaluation is for.

Quick Quiz


When an AI model "calls" a tool, what actually happens?