AI Agent That Asks Before Acting in Python - Building Agents Part 3
Build a Clarifying Agent in Python with GPT-5 that asks follow-up questions until inputs are clear, providing a safer, smarter, and user-friendly experience.

Introduction: Why We Need Clarifying Agents
In Part 1, we built a Simple Decision-Making Agent that worked on structured inputs.
In Part 2, we added an LLM “perception layer” that turns natural language into structured JSON using `response_format` + `json_schema`.
But there’s still a problem:
User: “I’m kinda hungry and I don’t have much time.”
How hungry is “kinda”? Is “not much time” five minutes or fifteen?
Rather than guessing, a good agent asks clarifying questions until everything is clear and understood.
NOTE: This article builds directly on Part 2: LLM-Powered Agent, and we’ll be referring back to it often. So if you haven’t read it yet, we recommend starting there.
Enter the Clarifying Agent
Our agent will:
- Detect what’s missing or vague (ingredients, hunger 1–5, time in minutes).
- Ask one clarifying question at a time.
- Loop until all three fields are complete.
- Only then, extract structured JSON (with the same schema approach as Part 2).
How the Clarifying Agent Works

Environment Setup
Following the same baseline as Part 2, we’ll load the environment from a `.env` file for convenience.
python3 -m venv ~/.venvs/agents
source ~/.venvs/agents/bin/activate
pip install -U openai python-dotenv
Create a `.env` file:
OPENAI_API_KEY=sk-...
And load it in Python:
from dotenv import load_dotenv
load_dotenv() # loads OPENAI_API_KEY from .env
Code: Clarifying Agent with Schema-Constrained Extraction
Below is a safe, production-friendly implementation that:
- Uses a `<DONE/>` sentinel to avoid false “done” matches inside natural text.
- Caps clarifications with `max_rounds` to prevent infinite loops.
- Extracts JSON via `response_format={'type':'json_schema'}` with the same schema as Part 2.
- Validates types/ranges and normalizes inputs.
1) Shared JSON Schema (same as Part 2)
MEAL_SCHEMA = {
    "type": "object",
    "properties": {
        "ingredients": {"type": "array", "items": {"type": "string"}},
        "hunger": {"type": "integer", "minimum": 1, "maximum": 5},
        "time": {"type": "integer", "minimum": 0}
    },
    "required": ["ingredients", "hunger", "time"],
    "additionalProperties": False
}
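If you want to sanity-check payloads against these constraints locally (for example, in unit tests), a small hand-rolled validator is enough. This is just an illustration of what the schema enforces; in production you would more likely reach for a library such as `jsonschema`:

```python
# Minimal stdlib-only check mirroring MEAL_SCHEMA's constraints:
# required fields, types, ranges, and additionalProperties: False.
def validate_meal_inputs(data: dict) -> bool:
    ing = data.get("ingredients")
    if not isinstance(ing, list) or not all(isinstance(i, str) for i in ing):
        return False
    hunger = data.get("hunger")
    if not isinstance(hunger, int) or not 1 <= hunger <= 5:
        return False
    time = data.get("time")
    if not isinstance(time, int) or time < 0:
        return False
    # Exactly the three required keys, nothing extra
    return set(data) == {"ingredients", "hunger", "time"}

print(validate_meal_inputs({"ingredients": ["rice", "eggs"], "hunger": 3, "time": 15}))  # True
print(validate_meal_inputs({"ingredients": ["rice"], "hunger": 7, "time": 15}))          # False
```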
2) Rule Engine + Tiny Recipe Base
from dataclasses import dataclass, field
from typing import List

def normalize_items(items: List[str]) -> List[str]:
    return [i.strip().lower() for i in items]

@dataclass
class MealAgent:
    # Mutable defaults need default_factory in a dataclass
    ingredients: List[str] = field(default_factory=list)
    hunger: int = 0
    time: int = 0

    def __post_init__(self):
        # Use your recipe list
        self.recipes = [
            {
                "name": "Egg Fried Rice",
                "ingredients": ["rice", "eggs"],
                "time": 15,
                "filling": 4
            },
            {
                "name": "Cheese Omelette Sandwich",
                "ingredients": ["bread", "cheese", "eggs"],
                "time": 10,
                "filling": 5
            },
            {
                "name": "Bread and Jam",
                "ingredients": ["bread", "jam"],
                "time": 5,
                "filling": 2
            }
        ]

    def perceive(self, ingredients: List[str], hunger: int, minutes: int):
        self.ingredients = normalize_items(ingredients)
        self.hunger = int(hunger)
        self.time = int(minutes)

    def act(self) -> str:
        have = set(self.ingredients)
        # Only requirement: recipe "need" ⊆ user "have"
        # (extra ingredients like "vegetables" are ignored)
        feasible = []
        for recipe in self.recipes:
            need = set(map(str.lower, recipe["ingredients"]))
            if need.issubset(have) and recipe["time"] <= self.time:
                feasible.append(recipe)
        if feasible:
            # If very hungry, prefer higher filling; otherwise prefer quicker
            feasible.sort(key=lambda r: (-(r["filling"] if self.hunger >= 4 else 0), r["time"]))
            top = feasible[0]
            tag = " (quick & filling!)" if self.hunger >= 4 and top["filling"] >= 4 else ""
            return f'{top["name"]}{tag}'
        return "No suitable recipe found."
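The ranking rule inside `act()` is worth seeing in isolation: the sort key demotes by `filling` only when the user is very hungry, and always breaks ties by cooking time. Here is a standalone sketch of just that rule (the two-recipe list is trimmed for illustration):

```python
# Standalone sketch of the ranking rule from MealAgent.act():
# when hunger >= 4, prefer higher "filling"; otherwise prefer shorter time.
recipes = [
    {"name": "Egg Fried Rice", "time": 15, "filling": 4},
    {"name": "Bread and Jam", "time": 5, "filling": 2},
]

def rank(feasible, hunger):
    # Negate filling so higher-filling recipes sort first when it matters
    return sorted(feasible, key=lambda r: (-(r["filling"] if hunger >= 4 else 0), r["time"]))

print(rank(recipes, hunger=5)[0]["name"])  # Egg Fried Rice (most filling wins)
print(rank(recipes, hunger=2)[0]["name"])  # Bread and Jam (quickest wins)
```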
3) Clarifying + Extraction
import json
from typing import List
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

class ClarifyingMealAgent:
    def __init__(self, ask_user=input, max_rounds=8):
        self.ask_user = ask_user
        self.max_rounds = max_rounds
        self.ingredients: List[str] = []
        self.hunger: int = 0
        self.time: int = 0

    def clarify_until_clear(self, user_input: str) -> str:
        convo = user_input.strip()
        rounds = 0
        while True:
            rounds += 1
            if rounds > self.max_rounds:
                print("⚠️ Too many clarification rounds. Proceeding with best effort.")
                return convo
            prompt = f'''
You are helping extract three fields from a user's cooking context:
1) ingredients (list[str]): canonical ingredient names only, pluralized where natural.
2) hunger (integer 1-5)
3) time (integer minutes)
• Remove numbers (e.g., "2 eggs" → "eggs")
• Remove adjectives (e.g., "leftover rice" → "rice", "boiled egg" → "eggs")
• Use simple grocery-style labels that match common recipes.
Do NOT infer missing values; if anything is vague or missing, ask a direct question to get that exact value.
Conversation so far:
--- START ---
{convo}
--- END ---
If ALL THREE are present and clear, reply with:
<DONE/>
If NOT complete, ask ONE clarifying question about the most missing/unclear field. No extra commentary.
'''
            reply = self.ask_gpt(prompt).strip()
            if "<DONE/>" in reply:
                return convo
            print("🤖 Clarifying:", reply)
            user_reply = self.ask_user("👤 Your answer: ").strip()
            convo += "\n" + reply + "\n" + user_reply

    def perceive(self, user_input: str) -> None:
        full = self.clarify_until_clear(user_input)
        resp = client.chat.completions.create(
            model="gpt-5",
            messages=[
                {"role": "system",
                 "content": (
                     "Extract meal inputs as JSON matching the schema. "
                     "Normalize 'ingredients' to these allowed labels ONLY: "
                     "rice, eggs, bread, cheese, jam, vegetables. "
                     "Remove quantities (e.g., '2 eggs' -> 'eggs') and adjectives "
                     "(e.g., 'leftover rice' -> 'rice', 'boiled egg' -> 'eggs'). "
                     "Map synonyms like 'veg'/'veggies' -> 'vegetables'. "
                     "Return only canonical grocery-style labels."
                 )},
                {"role": "user", "content": full},
            ],
            response_format={
                "type": "json_schema",
                "json_schema": {"name": "meal_inputs", "schema": MEAL_SCHEMA},
            },
        )
        raw = resp.choices[0].message.content
        data = json.loads(raw)

        ingredients = data.get("ingredients", [])
        hunger = int(data.get("hunger", 0))
        minutes = int(data.get("time", 0))

        if not isinstance(ingredients, list) or not ingredients:
            raise ValueError("ingredients must be a non-empty list of strings.")
        if hunger < 1 or hunger > 5:
            raise ValueError("hunger must be an integer 1..5.")
        if minutes <= 0 or minutes > 24 * 60:
            raise ValueError("time must be a positive integer (minutes).")

        self.ingredients = [str(i).strip().lower() for i in ingredients]
        self.hunger = hunger
        self.time = minutes

    def ask_gpt(self, prompt: str) -> str:
        r = client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": prompt}],
        )
        return r.choices[0].message.content
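Because `ask_user` is injected, the clarify loop is easy to dry-run without any API calls: stub the LLM with a deterministic function and script the user's answers. The names below (`fake_llm`, the standalone `clarify_until_clear`) are illustrative stand-ins for testing, not part of the agent above:

```python
# Offline dry-run of the clarify-loop pattern: a stubbed "LLM" and scripted
# user answers replace the API and the real user, so the loop logic can be
# unit-tested deterministically.
def fake_llm(convo: str) -> str:
    # Declare done once the transcript contains both scripted answers.
    if "15" in convo and "4" in convo:
        return "<DONE/>"
    if "15" not in convo:
        return "How many minutes do you have to cook?"
    return "On a scale of 1-5, how hungry are you?"

def clarify_until_clear(user_input, ask_llm, ask_user, max_rounds=8):
    convo = user_input.strip()
    for _ in range(max_rounds):
        reply = ask_llm(convo).strip()
        if "<DONE/>" in reply:
            return convo
        answer = ask_user(reply)
        convo += "\n" + reply + "\n" + answer
    return convo  # best effort after max_rounds

answers = iter(["15", "4"])
final = clarify_until_clear(
    "I have rice and eggs. Kinda hungry, in a rush.",
    ask_llm=fake_llm,
    ask_user=lambda q: next(answers),
)
print("15" in final and "4" in final)  # True
```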
Why “Do NOT infer” (vs Part 2):
In Part 2, we let the model guess vague details (e.g., treating “in a rush” as 10 minutes). That’s convenient for demos, but risky. Guesses can be wrong, inconsistent across runs, and hard to explain to users. In this part, the clarify loop flips this: ask, don’t guess. If the user says “kinda hungry” or “not much time,” the agent asks a single, direct question to pin down the exact number.
What this buys you:
You receive reliable, user-confirmed inputs before any decision logic runs, which builds trust and reduces silent errors. It also keeps behavior stable for testing and telemetry, and preserves the integrity of schema extraction: no backfilling, no hallucinated values. In short: clearer inputs, fewer surprises, better UX.
4) Wire it up
if __name__ == "__main__":
    clarifier = ClarifyingMealAgent()
    meal = MealAgent()
    user_text = "I have leftover rice, two eggs, and veggies. I am kinda hungry and in a rush."
    clarifier.perceive(user_text)
    meal.perceive(clarifier.ingredients, clarifier.hunger, clarifier.time)
    suggestion = meal.act()
    print("🍽️ Suggestion:", suggestion)
Sample Run
Input: I have leftover rice and two eggs, and veggies. I’m kinda hungry and in a rush.
Output:
🤖 Clarifying: How many minutes do you have to cook?
👤 Your answer: 15
🤖 Clarifying: On a scale of 1–5, how hungry are you?
👤 Your answer: 4
🍽️ Suggestion: Egg Fried Rice (quick & filling!)
Why `<DONE/>` and `max_rounds`?
- `<DONE/>` sentinel: the plain word “done” can appear in conversation (“I’m done cooking”). A unique tag reduces false positives, so the loop only exits on an explicit signal from the LLM.
- `max_rounds`: guards against infinite loops if a user goes silent or keeps replying vaguely. After N tries, the agent proceeds with best effort or shows a graceful message.
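The false-positive risk is easy to demonstrate with a quick check: a plain substring match on “done” fires on ordinary conversation, while the sentinel does not:

```python
# Why a unique sentinel beats matching the plain word "done"
reply = "Great, you're done gathering info about the user's time!"

print("done" in reply.lower())  # True  -> would falsely end the loop
print("<DONE/>" in reply)       # False -> sentinel avoids the false exit
```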
Why This Is Better
- Matches Part 2’s schema-constrained output pattern for reliable JSON.
- Adaptive: asks only what’s missing; feels conversational.
- Safe: bounded loop + validated inputs.
- Practical: plugs straight into the Part 1 rule engine + recipes for instant results.
Wrap-Up
This finishes the 3-part series: rules → LLM perception → clarifying loop. From here, you can add memory (favorite meals), tools (web recipe fetchers), or a UI layer.
Read: Part 1 - Simple Decision-Making Agent.
Read: Part 2 - LLM Powered Agent.