Zero-shot learning
Definition
Zero-shot learning is a model's ability to perform a task it was never explicitly trained on and was given no examples of, relying on its general knowledge and reasoning to handle a novel request. You simply describe the task in the prompt, and the model attempts it without any demonstrations. This reflects the broad, transferable capability of modern large language models.
How it works
In zero-shot learning, you give the model only an instruction, with no worked examples. The model draws on the broad knowledge and patterns absorbed during pretraining to interpret the request and produce an answer. Asking a model to classify a sentiment or summarize a passage with just a clear instruction is zero-shot prompting.
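As a minimal sketch of what "only an instruction" looks like in practice, the snippet below assembles a zero-shot sentiment prompt as a plain string. The function name `build_zero_shot_prompt` and the `Text:` / `Answer:` layout are illustrative assumptions, not a required format; the resulting string would be sent to whichever model you use.

```python
def build_zero_shot_prompt(instruction: str, text: str) -> str:
    """Combine a task instruction with the input text; no examples are included."""
    return f"{instruction.strip()}\n\nText: {text.strip()}\nAnswer:"


# Hypothetical task and input, chosen only for illustration.
prompt = build_zero_shot_prompt(
    "Classify the sentiment of the text as positive, negative, or neutral.",
    "The battery died after two days.",
)
print(prompt)
```

The prompt contains the task description and the input, and nothing else: there are no demonstrations for the model to imitate.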
This capability emerges because large models are trained on such diverse data that they have implicitly seen countless related tasks and formats. Instruction tuning further sharpens their ability to follow a plain description of a task they were never specifically trained to perform.
Zero-shot performance depends heavily on how clearly the task is described. A precise, well-scoped instruction often yields strong results, while an ambiguous one can lead the model astray, which is where few-shot examples can help.
Why it matters
Zero-shot learning is what makes modern AI feel general. The same model can answer questions, write code, translate, and reason about novel problems without any task-specific setup, dramatically lowering the effort to apply AI to new situations.
For builders, it means a capable starting point for almost any language task with nothing more than a good instruction. Many production features rely on zero-shot prompting, escalating to few-shot examples or fine-tuning only when accuracy or format demands more. This flexibility is central to how quickly AI features can be shipped.
Frequently asked questions
What is the difference between zero-shot and few-shot learning?
Zero-shot provides only an instruction with no examples, while few-shot includes a handful of demonstrations in the prompt. Few-shot usually improves accuracy on tricky or format-sensitive tasks, while zero-shot is simpler and uses fewer tokens when a clear instruction suffices.
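The difference is easiest to see side by side. The sketch below builds both prompt styles from the same instruction; `build_prompt`, the demonstration pairs, and the `Text:` / `Answer:` layout are all hypothetical choices made for illustration. Whitespace-separated word count is used as a crude stand-in for token count, since real tokenizers vary by model.

```python
from typing import Sequence, Tuple


def build_prompt(
    instruction: str,
    text: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Build a prompt; with no examples it is zero-shot, otherwise few-shot."""
    parts = [instruction.strip()]
    # Each demonstration is an (input, expected output) pair shown to the model.
    for example_input, example_output in examples:
        parts.append(f"Text: {example_input}\nAnswer: {example_output}")
    # The real query always comes last, with the answer left blank.
    parts.append(f"Text: {text}\nAnswer:")
    return "\n\n".join(parts)


instruction = "Classify the sentiment of the text as positive, negative, or neutral."
demos = [
    ("I love this phone.", "positive"),
    ("The screen cracked in a week.", "negative"),
]

zero_shot = build_prompt(instruction, "Delivery was on time.")
few_shot = build_prompt(instruction, "Delivery was on time.", demos)

# Rough size comparison: word count as a proxy for tokens (an approximation).
print(len(zero_shot.split()), len(few_shot.split()))
```

The few-shot prompt is strictly longer because every demonstration is resent with each request, which is the token cost the answer above refers to.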
How can a model do a task it was never trained on?
Pretraining on vast, diverse data exposes the model to many related tasks and patterns. Combined with instruction tuning, this lets the model generalize from what it has learned to follow a plain description of a new task at inference time.
When does zero-shot learning struggle?
It struggles with ambiguous instructions, highly specialized formats, niche domains outside the training data, and tasks needing precise edge-case handling. In those cases, adding few-shot examples or fine-tuning typically improves reliability.
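One way to notice when zero-shot output is drifting off format is to validate responses against the labels you asked for. The following is a rough sketch under stated assumptions: the allowed label set, the helper names `parse_label` and `off_format_rate`, and the sample outputs are all made up for illustration and are not real model results.

```python
from typing import Optional, Sequence

# Hypothetical label set for a sentiment task.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def parse_label(raw_output: str) -> Optional[str]:
    """Return the normalized label, or None if the output is off format."""
    cleaned = raw_output.strip().strip(".").lower()
    return cleaned if cleaned in ALLOWED_LABELS else None


def off_format_rate(outputs: Sequence[str]) -> float:
    """Fraction of outputs that could not be parsed into an allowed label."""
    if not outputs:
        return 0.0
    misses = sum(1 for output in outputs if parse_label(output) is None)
    return misses / len(outputs)


# Invented example outputs; the third one ignores the requested format.
sample_outputs = ["Positive", "negative.", "I think it's mostly positive", " neutral "]
print(off_format_rate(sample_outputs))
```

A persistently high off-format rate is one possible signal that the task needs few-shot examples or fine-tuning; what counts as "high" depends on the application.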
How do I get better zero-shot results?
Write clear, specific instructions that state the task, the desired format, and any constraints. Techniques like chain-of-thought prompting can also improve zero-shot performance on reasoning-heavy tasks without adding examples.
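The advice above can be sketched as a small template that states the task, the output format, and any constraints, with an optional step-by-step cue for reasoning-heavy tasks. The function `build_instruction` and the refund scenario are hypothetical; "Let's think step by step." is one commonly used zero-shot chain-of-thought phrasing, not the only one.

```python
from typing import Sequence

# A commonly used zero-shot chain-of-thought cue; other phrasings also work.
COT_TRIGGER = "Let's think step by step."


def build_instruction(
    task: str,
    output_format: str,
    constraints: Sequence[str] = (),
    step_by_step: bool = False,
) -> str:
    """Assemble an instruction that states task, format, and constraints."""
    lines = [f"Task: {task}", f"Output format: {output_format}"]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {constraint}" for constraint in constraints)
    if step_by_step:
        # Asks for intermediate reasoning without adding any worked examples.
        lines.append(COT_TRIGGER)
    return "\n".join(lines)


# Hypothetical task used only to show the template filled in.
prompt = build_instruction(
    task="Decide whether the customer is owed a refund under the policy text provided.",
    output_format="A final line reading 'Decision: yes' or 'Decision: no'.",
    constraints=[
        "Use only the policy text provided.",
        "If the policy is silent, answer no.",
    ],
    step_by_step=True,
)
print(prompt)
```

The prompt stays zero-shot because it contains no demonstrations; the step-by-step cue only changes how the model is asked to work through the problem.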
Few-shot learning
Few-shot learning is the ability of a model to learn a new task from just a handful of examples, typically two to ten, provided directly in the prompt rather than through retraining. By showing the model a few input-output pairs, you steer it toward the desired format and behavior. It is a core technique in prompt engineering with modern language models.
Prompt engineering
Prompt engineering is the practice of designing and refining the inputs given to an AI model to produce precise, high-quality, and reliable outputs. It covers wording, structure, examples, context, and constraints, all of which shape how a model interprets a request without changing the model itself. Effective prompting is often the cheapest and fastest way to improve results.
Large language model (LLM)
A large language model is an AI system trained on vast amounts of text to understand and generate human language. Built on the transformer architecture and typically containing billions of parameters, LLMs predict the next token in a sequence, enabling them to answer questions, write, summarize, and reason. They power modern chat assistants, AI search, and autonomous agents.
Chain of thought (CoT)
Chain of thought is a prompting technique that improves a model's reasoning by encouraging it to work through a problem step by step before giving a final answer. Making intermediate reasoning explicit helps models handle multi-step math, logic, and planning tasks more reliably. Once a hand-written prompting trick, chain-of-thought reasoning is now built directly into reasoning models that think before they respond.
AI inference
AI inference is the runtime step where a trained AI model takes a prompt and produces an output, such as the tokens you see streaming back from ChatGPT, Claude, Gemini, or Perplexity. Inference is what costs money in production: every prompt and every generated token consumes GPU time, and the economics of any AI product live in this loop.
Reasoning models
Reasoning models are language models trained to solve complex problems by thinking step by step before answering, spending extra computation at inference to work through a problem rather than responding immediately. Examples include OpenAI's o-series, DeepSeek-R1, and reasoning-tier Gemini and Claude modes. The approach trades latency and cost for stronger performance on math, coding, science, and multi-step planning.