Zero-Shot vs Few-Shot Prompting: When to Use Each Approach
Understand the key differences between zero-shot, one-shot, and few-shot prompting. Learn when each technique works best and how to select high-quality examples for reliable LLM outputs.
The Spectrum of Example-Based Prompting
When you ask an LLM to perform a task, you can provide zero, one, or several examples of the desired input-output behavior. This choice — how many examples to include — is one of the most impactful decisions in prompt engineering. Each approach has distinct strengths, and understanding when to use which can mean the difference between a 60% and a 95% success rate.
Zero-Shot Prompting
Zero-shot prompting means giving the model a task description with no examples. You rely entirely on the model's pre-trained knowledge to understand what you want.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Classify the sentiment of customer reviews as positive, neutral, or negative. Return only the label."
        },
        {
            "role": "user",
            "content": "The delivery was fast but the packaging was damaged."
        }
    ]
)
print(response.choices[0].message.content)  # "neutral"
Zero-shot works well for tasks the model has seen extensively during training: sentiment analysis, translation, summarization, and simple classification. It is fast to implement and keeps token costs low.
When to use zero-shot: The task is common, the output format is simple, and you need quick iteration without curating examples.
One-Shot Prompting
One-shot prompting provides a single example to anchor the model's understanding. This is often enough to clarify ambiguous formatting or establish a pattern.
messages = [
    {
        "role": "system",
        "content": "Extract structured data from product descriptions."
    },
    {
        "role": "user",
        "content": "Nike Air Max 90, men's running shoe, $129.99, available in black and white"
    },
    {
        "role": "assistant",
        "content": '{"brand": "Nike", "model": "Air Max 90", "category": "running", "price": 129.99, "colors": ["black", "white"]}'
    },
    {
        "role": "user",
        "content": "Adidas Ultraboost 22, women's training shoe, $189.00, available in pink, grey, and navy"
    }
]
The single example communicates the JSON schema, field naming conventions, and how to handle multi-value fields — all without verbose instructions.
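Before shipping a one-shot prompt like this, it is worth confirming that the example output itself is well-formed: a malformed example teaches the model a broken format. A minimal sketch using the standard json module (the string is the assistant turn from the prompt above):

```python
import json

# The assistant turn from the one-shot prompt above.
example_output = (
    '{"brand": "Nike", "model": "Air Max 90", "category": "running", '
    '"price": 129.99, "colors": ["black", "white"]}'
)

# json.loads raises an error on malformed JSON, catching typos in the
# example before they ever reach the model.
parsed = json.loads(example_output)
print(sorted(parsed.keys()))
```

The same check can run over every example in a few-shot prompt as part of a test suite.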
Few-Shot Prompting
Few-shot prompting provides 2-8 examples that collectively cover the range of expected inputs and edge cases. This is the most powerful technique for custom or domain-specific tasks.
def build_few_shot_classifier(reviews: list[str]) -> list[dict]:
    examples = [
        ("Absolutely love this product, works perfectly!", "positive"),
        ("It's okay, nothing special but does the job.", "neutral"),
        ("Broke after two days. Complete waste of money.", "negative"),
        ("Good quality but overpriced for what you get.", "neutral"),
        ("Best purchase I've made this year, highly recommend!", "positive"),
    ]
    messages = [
        {
            "role": "system",
            "content": "Classify customer reviews as positive, neutral, or negative."
        }
    ]
    # Each example becomes a user/assistant pair, teaching the model the
    # input-to-label mapping by demonstration.
    for text, label in examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    # Add the actual reviews to classify. For reliable labeling, send one
    # review per API call rather than batching several into one turn.
    for review in reviews:
        messages.append({"role": "user", "content": review})
    return messages
Selecting Good Examples
The quality of your examples matters more than the quantity. Follow these guidelines:
Cover the output space. If you have three classes, include at least one example of each. If outputs vary in length or structure, show that range.
Include edge cases. The mixed-sentiment review ("Good quality but overpriced") is more valuable than another clearly positive example.
Keep examples realistic. Use actual data from your domain, not synthetic toy examples. Models pick up on subtle patterns in real data.
Order matters. Place the most representative examples first and the edge cases last. The model pays more attention to recent examples.
# Bad: all examples are clearly positive or negative
examples = [
    ("Amazing!", "positive"),
    ("Terrible!", "negative"),
    ("Wonderful!", "positive"),
]

# Good: covers the full spectrum including ambiguity
examples = [
    ("Delivery was fast, product matches the description.", "positive"),
    ("Arrived late but the quality is decent.", "neutral"),
    ("Completely broken on arrival, no response from support.", "negative"),
    ("The color is slightly different than pictured but I still like it.", "neutral"),
]
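The "cover the output space" rule can also be enforced programmatically before a prompt ships. A small sketch using collections.Counter to verify every class appears at least once:

```python
from collections import Counter

examples = [
    ("Delivery was fast, product matches the description.", "positive"),
    ("Arrived late but the quality is decent.", "neutral"),
    ("Completely broken on arrival, no response from support.", "negative"),
    ("The color is slightly different than pictured but I still like it.", "neutral"),
]

label_counts = Counter(label for _, label in examples)

# Every class should appear at least once before the prompt ships.
missing = {"positive", "neutral", "negative"} - set(label_counts)
assert not missing, f"missing labels: {missing}"
print(dict(label_counts))
```

Running this as a unit test catches the common failure mode of adding new examples that all share one label.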
Decision Framework
Use this practical guide:
| Approach | Best For | Token Cost | Setup Time |
|---|---|---|---|
| Zero-shot | Common tasks, simple outputs | Low | Minutes |
| One-shot | Format clarification, schema definition | Low | Minutes |
| Few-shot | Custom classification, domain-specific tasks | Medium | Hours |
Start with zero-shot. If the output is inconsistent or wrong, add one example. If edge cases are mishandled, add more examples targeting those specific failure modes. This incremental approach avoids over-engineering your prompts.
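This incremental loop can be made systematic by measuring accuracy on a small labeled validation set and only adding examples when the score falls short. A hypothetical sketch, where evaluate is a stub standing in for a real evaluation run over labeled data:

```python
def evaluate(examples: list[tuple[str, str]]) -> float:
    # Stub: in practice, run the prompt (with these examples) over a
    # labeled validation set and return the fraction classified correctly.
    return 0.72 if not examples else 0.94

examples: list[tuple[str, str]] = []

# Start zero-shot; escalate only if measured accuracy is too low.
if evaluate(examples) < 0.90:
    # Add an example targeting an observed failure mode, then re-measure.
    examples.append(("Arrived late but the quality is decent.", "neutral"))

print(len(examples), round(evaluate(examples), 2))
```

The key point is that each added example should be justified by a measured failure, not added speculatively.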
FAQ
How many examples should I use for few-shot prompting?
Three to five examples is the sweet spot for most tasks. Beyond 8 examples, you hit diminishing returns and increasing token costs. If you need more than 8 examples to get reliable results, consider fine-tuning instead.
Can few-shot examples hurt performance?
Yes. Poor-quality examples — ambiguous labels, unrepresentative data, or formatting inconsistencies — actively confuse the model. One bad example can negate three good ones. Always validate that each example unambiguously demonstrates the pattern you want.
Should I randomize the order of few-shot examples?
For classification tasks, vary the label order so the model does not develop a recency bias. If your last three examples are all "positive," the model may lean toward "positive" for the next input. Interleave labels to prevent this.
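One deterministic way to interleave labels is a round-robin pass over per-label buckets. A sketch, where interleave_by_label is a hypothetical helper rather than part of any library:

```python
from collections import defaultdict, deque

def interleave_by_label(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Round-robin across labels so no label appears in a long run."""
    buckets: dict[str, deque] = defaultdict(deque)
    for text, label in pairs:
        buckets[label].append((text, label))
    ordered = []
    # Take one example from each label bucket per pass until all are empty.
    while any(buckets.values()):
        for label in list(buckets):
            if buckets[label]:
                ordered.append(buckets[label].popleft())
    return ordered

examples = [
    ("Amazing!", "positive"),
    ("Wonderful!", "positive"),
    ("Terrible!", "negative"),
    ("It's fine.", "neutral"),
]
print([label for _, label in interleave_by_label(examples)])
```

Unlike random shuffling, this guarantees consecutive examples differ in label whenever more than one bucket still has items.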
CallSphere Team
Expert insights on AI voice agents and customer communication automation.