OpenAI Assistants API: Agents with Function Calling and Retrieval, a Practical Guide

Have you ever tried to build an AI assistant that answers with up-to-date data or interacts with external systems? If so, you know how complex it is to manage context, tokens, and API calls. With OpenAI's Assistants API, you can delegate all of that to a managed architecture instead of reinventing the wheel every time.

At Meteora Web, we have put it into production for clients managing complex catalogs, dynamic FAQs, and back-office automations. We'll say it up front: the difference between a “homemade” assistant built on Chat Completions and a properly configured Assistant with retrieval and function calling is huge, not only in answer quality but also in development and maintenance time saved.

This guide is hands‑on. If you first want to understand where this fits in the AI agent landscape, check our complete guide to AI agents and advanced automation. Here we go straight to the code.

What Is an AI Assistant and Why It Changes Your Day

An AI Assistant is a pre‑configured entity on OpenAI, with a model, system instructions, enabled tools, and persistent memory via threads and runs. Compared to a plain chat.completions.create call, you get three huge advantages:

Automatic context management: each thread keeps the message history, so you no longer have to concatenate exchanges manually.
Built-in retrieval: you upload one or more files (PDF, CSV, TXT) and the assistant searches inside them to answer. No manual vectorization, no custom chunking.
Native function calling: you define functions in JSON Schema, the assistant decides when to call them, and you run them and feed the output back. All in a single run flow.

For a business that already has databases, ERPs, or internal documents, this means going from a “lab prototype” to an agent that works on real data in a few hours.

Practical Setup: Assistant with Python

We start with a minimal but complete configuration. You need an OpenAI API key (with access to gpt-4o or gpt-4-turbo) and an up-to-date openai library:

pip install openai --upgrade

Creating the Assistant

Let's define an assistant for a fictional clothing store – a topic we know well, since we managed the ERP of a real store for years.

from openai import OpenAI

client = OpenAI(api_key="sk-proj-...")

assistant = client.beta.assistants.create(
    name="Store Assistant",
    instructions="You are an assistant for a clothing e‑commerce.\nAnswer in a friendly tone. If you don't know something, use your tools.",
    model="gpt-4o",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_product_price",
                "description": "Get the price of a product given its ID",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "product_id": {"type": "string", "description": "Product ID"}
                    },
                    "required": ["product_id"]
                }
            }
        },
        {"type": "file_search"}  # retrieval tool (named "retrieval" in the v1 beta)
    ],
    # Files are attached in the next step
)
print(f"Assistant created with ID: {assistant.id}")

Notice we enabled both function calling (with the get_product_price function) and retrieval. Now let's upload a file for retrieval.
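As more functions are added, the nested dictionaries get repetitive. A minimal sketch of a helper that builds the same kind of tool definition; function_tool is a hypothetical name of ours, not part of the openai library:

```python
def function_tool(name, description, properties, required=None):
    """Build a function tool definition in the shape used above.

    Hypothetical helper, not part of the openai library.
    properties maps each parameter name to its JSON Schema fragment.
    """
    if required is None:
        # Default to treating every declared parameter as required
        required = list(properties)
    missing = [p for p in required if p not in properties]
    if missing:
        raise ValueError(f"required parameters not declared: {missing}")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(required),
            },
        },
    }


price_tool = function_tool(
    "get_product_price",
    "Get the price of a product given its ID",
    {"product_id": {"type": "string", "description": "Product ID"}},
)
```

The resulting price_tool dictionary can be placed in the tools list next to the retrieval tool.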

Uploading a File for Retrieval

Suppose we have a PDF with the store's frequently asked questions (faq_store.pdf). We upload it and attach it to the assistant:

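# Upload the file; the with block closes the handle afterwards
with open("faq_store.pdf", "rb") as f:
    file = client.files.create(file=f, purpose="assistants")

# Index the file in a vector store and attach it to the assistant.
# This assumes Assistants API v2; on older openai releases the vector
# store methods live under client.beta.vector_stores.
vector_store = client.vector_stores.create(
    name="Store FAQ",  # illustrative name
    file_ids=[file.id]
)
assistant = client.beta.assistants.update(
    assistant_id=assistant.id,
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}}
)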

Now the assistant can search inside the PDF to answer questions like “what is the return policy?” or “how do I track my order?”.

Creating a Thread and Starting a Conversation

The thread is the conversation between user and assistant. Each thread is independent and automatically maintains history.

thread = client.beta.threads.create()

# Add a user message
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is the price of product with ID 'TSH-123'?"
)

Now we start a run, which makes the assistant process the message and decide whether to call a function or answer directly.

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Poll until the run needs our input or reaches a terminal state
import time

def wait_for_run(client, thread_id, run_id):
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in ("completed", "requires_action"):
            return run
        # These states never resume, so stop polling instead of looping forever
        if run.status in ("failed", "cancelled", "expired", "incomplete"):
            raise RuntimeError(f"Run ended with status {run.status}: {run.last_error}")
        time.sleep(1)

run = wait_for_run(client, thread.id, run.id)
print(run.status)

If the status is requires_action, the assistant decided to call a function. We must execute the local function and return the result.
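A polling loop without a deadline can tie up a worker if a run stalls. Here is a sketch of a variant with a timeout and a growing delay. The names poll_run, fetch_run and the parameters are our own, and the fetch function is injected so the loop can be exercised without network access:

```python
import time

# Statuses in which the run is still being worked on by the API
WAITING = {"queued", "in_progress", "cancelling"}


def poll_run(fetch_run, timeout=120.0, first_delay=0.5, max_delay=5.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_run() until the run leaves the waiting states.

    fetch_run is any zero-argument callable returning an object with a
    .status attribute, for example a lambda wrapping runs.retrieve.
    The run is returned for every non-waiting status; the caller decides
    what to do with requires_action, completed, failed and so on.
    """
    deadline = clock() + timeout
    delay = first_delay
    while True:
        run = fetch_run()
        if run.status not in WAITING:
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run still {run.status} after {timeout}s")
        sleep(delay)
        # Back off gradually so long runs do not hammer the API
        delay = min(delay * 2, max_delay)
```

With the client from this guide it could be called as poll_run(lambda: client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run_id)), with run_id saved beforehand. Recent releases of the openai library also appear to ship polling helpers such as runs.create_and_poll, which may make a hand-written loop unnecessary.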

Function Calling in Action

In our example, the get_product_price function must be implemented by us. Here's a simulated function and how to handle the response:

import json

# Function implementation (simulates database access)
def get_product_price(product_id):
    prices = {
        "TSH-123": 19.99,
        "JEA-456": 59.90,
        "CAP-789": 12.50
    }
    price = prices.get(product_id)
    if price is None:
        return {"error": "Product not found"}
    return {"product": product_id, "price": price}

# If the run requires action
if run.status == "requires_action":
    tool_calls = run.required_action.submit_tool_outputs.tool_calls
    outputs = []
    for tool_call in tool_calls:
        if tool_call.function.name == "get_product_price":
            args = json.loads(tool_call.function.arguments)
            result = get_product_price(args["product_id"])
            outputs.append({
                "tool_call_id": tool_call.id,
                "output": json.dumps(result)
            })
    
    # Submit the results to the run
    run = client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread.id,
        run_id=run.id,
        tool_outputs=outputs
    )
    
    # Wait again for completion
    run = wait_for_run(client, thread.id, run.id)
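With more than one function, the if chain above grows quickly. One possible shape is a registry that maps tool names to Python callables and turns failures into error payloads the model can read. TOOL_REGISTRY and build_tool_outputs are hypothetical names; the tool call objects only need the id, function.name and function.arguments attributes used earlier:

```python
import json


def get_product_price(product_id):
    # Stand-in for the lookup defined above
    prices = {"TSH-123": 19.99}
    if product_id not in prices:
        return {"error": "Product not found"}
    return {"product": product_id, "price": prices[product_id]}


TOOL_REGISTRY = {"get_product_price": get_product_price}


def build_tool_outputs(tool_calls, registry=TOOL_REGISTRY):
    """Run every requested tool and return the tool_outputs list."""
    outputs = []
    for call in tool_calls:
        func = registry.get(call.function.name)
        try:
            if func is None:
                raise LookupError(f"unknown function {call.function.name}")
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        except (LookupError, TypeError, ValueError) as exc:
            # Report the problem instead of leaving the call unanswered
            result = {"error": str(exc)}
        outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})
    return outputs
```

Every tool call gets an entry in the returned list, so a single submit_tool_outputs request can answer all of them, including the ones that failed.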

Now we can read the final response from the assistant:

messages = client.beta.threads.messages.list(thread_id=thread.id)
for msg in messages:
    if msg.role == "assistant":
        print(msg.content[0].text.value)

The assistant will answer something like: “The product TSH-123 costs $19.99.”
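Message content is a list of parts, and not every part is text, so indexing content[0].text can fail on replies that also contain images. A defensive sketch follows; message_text and latest_assistant_text are hypothetical helpers, and the second one assumes the messages arrive newest first, which is our understanding of the default listing order:

```python
def message_text(message):
    """Join the text parts of one message, skipping non-text parts."""
    return "\n".join(
        part.text.value for part in message.content if part.type == "text"
    )


def latest_assistant_text(messages):
    """Return the text of the first assistant message in the iterable.

    Assumes newest-first ordering; returns None if there is no reply yet.
    """
    for msg in messages:
        if msg.role == "assistant":
            return message_text(msg)
    return None
```

Called as latest_assistant_text(messages), it returns only the most recent reply rather than printing every assistant message in the thread.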

Retrieval: Answers from Documents

If the assistant doesn't have a suitable function, it can draw from the uploaded files. Just ask a question like “what is the return policy?”. The assistant will search the PDF and answer with a quoted extract. You don't need to write extra code – it's all handled automatically by retrieval.
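Answers drawn from files usually carry annotations that mark where each citation sits in the text. The sketch below swaps those markers for numbered footnotes. It assumes each annotation exposes type, text and file_citation.file_id, which is how we understand the current API, and number_citations is our own name:

```python
def number_citations(text_part):
    """Replace citation markers with [n] and list the cited file IDs.

    text_part is the .text object of a message content part, with
    .value (the string) and .annotations (a list, possibly empty).
    """
    value = text_part.value
    cited_files = []
    for annotation in text_part.annotations:
        if annotation.type != "file_citation":
            continue
        file_id = annotation.file_citation.file_id
        if file_id not in cited_files:
            cited_files.append(file_id)
        number = cited_files.index(file_id) + 1
        # The marker text appears verbatim inside the message value
        value = value.replace(annotation.text, f" [{number}]")
    return value, cited_files
```

The returned file IDs can then be mapped to file names for display, so the user sees which document each statement came from.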

We used it to create a chatbot for a retail chain: we uploaded product manuals, shipping policies, and price lists. The result? Precise answers, no long emails to support.

Costs and Operational Considerations

Be careful with billing: the Assistants API charges for the input and output tokens each run consumes, plus storage for the files uploaded for retrieval (they are indexed once, then billed as storage). Here are some estimates for a small business use case:

Simple runs (no retrieval or functions): ~$0.01‑0.03 per conversation of 5‑6 exchanges (gpt‑4o model).
Runs with retrieval: cost goes up if the document is large, because each question triggers a search and the retrieved passages are added to the prompt. Keep files under 50 pages.
Function calling: adds only the tokens for definition and call; if the function returns small data, impact is minimal.

We recommend testing in a development environment with gpt-4o-mini (cheaper) and switching to more powerful models only in production. Always monitor costs in the OpenAI dashboard.
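Token counts make a rough per-conversation estimate easy to compute. In this sketch the prices are placeholders, not OpenAI's actual rates, and estimate_cost is a hypothetical helper; plug in the figures from the current pricing page:

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Return the estimated cost in dollars for one run."""
    return (
        prompt_tokens * input_price_per_million
        + completion_tokens * output_price_per_million
    ) / 1_000_000


# Placeholder prices in dollars per million tokens (assumptions)
INPUT_PRICE = 2.00
OUTPUT_PRICE = 8.00

cost = estimate_cost(3_000, 500, INPUT_PRICE, OUTPUT_PRICE)
print(f"Estimated cost: ${cost:.4f}")
```

To our knowledge, a finished run reports its token counts in run.usage (prompt_tokens and completion_tokens), which can supply the first two arguments.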

Common Mistakes and How to Avoid Them

Forgetting to handle requires_action: the run sits waiting for your tool outputs and eventually expires without answering. Always implement polling with status checks.
Not validating function inputs: the arguments are JSON generated by the model, so a user can steer them through the prompt. Check types and ranges before using them.
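Uploading files that are too large: retrieval enforces per-file size and file-count limits (512 MB per file at the time of writing; check OpenAI's documentation for the current values), but huge files degrade search quality well before you reach them.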
Using the same API key in production and development: use separate keys, and ideally separate projects or accounts, so costs and rate limits don't mix.
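For the input validation point, a check can sit between json.loads and the real function. A minimal sketch for the price lookup; the ID pattern (three capital letters, a dash, three digits) is only inferred from the sample IDs in this guide, and parse_price_args is a hypothetical name:

```python
import json
import re

# Assumed ID format, inferred from examples such as TSH-123
PRODUCT_ID_PATTERN = re.compile(r"[A-Z]{3}-\d{3}")


def parse_price_args(raw_arguments):
    """Parse and validate the JSON arguments for get_product_price.

    Returns the product ID, or raises ValueError with a readable reason.
    """
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc.msg}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    product_id = args.get("product_id")
    if not isinstance(product_id, str):
        raise ValueError("product_id must be a string")
    if not PRODUCT_ID_PATTERN.fullmatch(product_id):
        raise ValueError("product_id does not match the expected format")
    return product_id
```

When validation fails, the error message can be sent back as the tool output so the assistant can ask the user for a correct ID instead of the run stalling.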

What to Do Now

Here's an operational checklist to get started:

Get an API key and make sure you have a model supporting Assistants (gpt‑4o, gpt‑4‑turbo).
Install or update the openai library.
Create an assistant with clear instructions and at least one tool (retrieval or function).
Upload a sample file (e.g., a PDF with your company FAQ).
Write a real function (e.g., query a database or external API).
Simulate a conversation with polling and function submission.
Monitor costs in the OpenAI usage dashboard.

If you want to dive deeper into integration with automation tools and other providers, you're in the right place. Our pillar guide on AI agents and advanced automation is the starting point for seeing how OpenAI Assistants connects with tools like LangChain, n8n, and autonomous agents.

We at Meteora Web, with our dual technical and accounting background, know that every implementation must have a measurable return. If you have a concrete project, talk to us. In the meantime, try the code above: a working run is worth more than a thousand slides.


Ing. Calogero Bono
