Prompt Engineering Guide

Practical patterns for getting better results from AI models. Not theory — tested techniques with examples that work across GPT-4, Claude, and Gemini.


What prompt engineering actually is

Prompt engineering is writing instructions that produce reliable, high-quality outputs from language models. It’s not magic words or secret incantations — it’s clear communication with a system that follows patterns.

The core principle: the model matches the distribution of text that looks like your prompt. A vague prompt gets a vague answer (because the internet is full of vague answers to vague questions). A specific, structured prompt gets a specific, structured answer (because high-quality writing follows patterns that models have learned).

The six patterns that work everywhere

1. Role + Task + Format

Tell the model who it is, what to do, and how to present the answer.

You are a senior backend engineer reviewing a pull request.
Review this code for security vulnerabilities, performance issues, and maintainability.
Format: bullet list, one issue per bullet, severity (high/medium/low) prefix.

Why it works: constrains the model’s output distribution to the intersection of “expert writing” + “this specific task” + “this specific format.”
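A minimal sketch of this pattern as chat messages. The message-dict shape follows the common chat-completions convention; the helper name is ours, and the actual API client is omitted so you can pair it with whichever SDK you use:

```python
# Sketch: assembling a Role + Task + Format prompt as chat messages.
# The role goes in the system message; task and format go in the user message.

def build_review_prompt(code: str) -> list[dict]:
    """Build chat messages for a code-review prompt: role, task, and format."""
    system = "You are a senior backend engineer reviewing a pull request."
    user = (
        "Review this code for security vulnerabilities, performance issues, "
        "and maintainability.\n"
        "Format: bullet list, one issue per bullet, severity (high/medium/low) prefix.\n\n"
        f"```\n{code}\n```"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_review_prompt('def login(u, p): return db.query(f"SELECT ... {u}")')
```

Keeping the role in the system message and the task plus format in the user message means the role persists across turns while the task can change per request.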

2. Few-shot examples

Show the model 2-3 examples of the input→output mapping you want before giving it the actual input.

Convert these natural language descriptions to SQL queries.

Description: "Show me all users who signed up last month"
SQL: SELECT * FROM users WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', CURRENT_DATE);

Description: "Count orders by country this year"
SQL: SELECT country, COUNT(*) as order_count FROM orders WHERE created_at >= DATE_TRUNC('year', CURRENT_DATE) GROUP BY country ORDER BY order_count DESC;

Description: "Find products with no sales in 90 days"
SQL:

Why it works: examples define the pattern more precisely than instructions ever can. The model extrapolates the pattern to new inputs.
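The SQL prompt above can be assembled programmatically from example pairs; a minimal sketch (the helper name and structure are ours, the examples are taken verbatim from the text):

```python
# Sketch: building a few-shot prompt from (description, sql) example pairs,
# ending with the real input and a bare "SQL:" for the model to complete.

EXAMPLES = [
    ("Show me all users who signed up last month",
     "SELECT * FROM users WHERE created_at >= DATE_TRUNC('month', "
     "CURRENT_DATE - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', CURRENT_DATE);"),
    ("Count orders by country this year",
     "SELECT country, COUNT(*) as order_count FROM orders WHERE created_at >= "
     "DATE_TRUNC('year', CURRENT_DATE) GROUP BY country ORDER BY order_count DESC;"),
]

def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Interleave examples as Description/SQL pairs, then append the real query."""
    parts = [task, ""]
    for desc, sql in examples:
        parts += [f'Description: "{desc}"', f"SQL: {sql}", ""]
    parts += [f'Description: "{query}"', "SQL:"]
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    "Convert these natural language descriptions to SQL queries.",
    EXAMPLES,
    "Find products with no sales in 90 days",
)
```

Keeping examples in a list like this also makes iteration cheap: when the model mishandles an input, add that input as a new example pair rather than rewriting the instructions.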

3. Chain of thought

Ask the model to think step-by-step before answering. This is not mere politeness — it forces intermediate reasoning that improves accuracy on logic, math, and multi-step problems.

Determine if this argument is logically valid. Think through each step before concluding.

Why it works: without chain-of-thought, the model must commit to an answer using only the fixed computation of a single forward pass per token. With it, the “thinking” tokens serve as working memory, letting the model break a complex problem into simpler subproblems and carry intermediate results forward.

4. Constraints and boundaries

Explicitly state what the model should NOT do. Models have strong tendencies (being helpful, being verbose, hedging with caveats). Override them directly.

Answer in 2 sentences maximum.
Do not include disclaimers or caveats.
If you don't know, say "I don't know" — do not guess.
Use only information from the provided context.
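These boundaries can be appended to any task prompt rather than rewritten each time; a minimal sketch (the helper name is illustrative, the constraint lines are the ones above):

```python
# Sketch: appending explicit boundaries to a task prompt as a reusable block.

CONSTRAINTS = [
    "Answer in 2 sentences maximum.",
    "Do not include disclaimers or caveats.",
    "If you don't know, say \"I don't know\" — do not guess.",
    "Use only information from the provided context.",
]

def with_constraints(task: str, constraints: list[str] = CONSTRAINTS) -> str:
    """Return the task followed by a bulleted Constraints block."""
    return task + "\n\nConstraints:\n" + "\n".join(f"- {c}" for c in constraints)

prompt = with_constraints("Summarize the attached incident report.")
```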

5. Structured output

Request specific formats. JSON, markdown tables, numbered lists, YAML — any format the model has seen frequently in training data.

Return a JSON object with these fields:
- "sentiment": one of "positive", "negative", "neutral"
- "confidence": float 0-1
- "key_phrases": array of strings

Why it works: structured formats reduce ambiguity. The model doesn’t have to decide how to organize the output — you’ve already decided.
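On the consuming side, validate the model's reply rather than trusting it; a minimal sketch against the schema above (the validator and its error handling are ours):

```python
import json

# Sketch: parse and validate a structured-output reply against the
# sentiment schema described above; raise ValueError on any mismatch.

def parse_sentiment(raw: str) -> dict:
    """Parse the model's JSON reply and check every field against the schema."""
    obj = json.loads(raw)
    if obj.get("sentiment") not in {"positive", "negative", "neutral"}:
        raise ValueError("sentiment must be positive, negative, or neutral")
    conf = obj.get("confidence")
    if not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a float in [0, 1]")
    if not isinstance(obj.get("key_phrases"), list):
        raise ValueError("key_phrases must be an array of strings")
    return obj

reply = '{"sentiment": "positive", "confidence": 0.92, "key_phrases": ["fast shipping"]}'
parse_sentiment(reply)["sentiment"]  # → "positive"
```

A failed parse is also a useful signal: it can trigger a retry with the error message appended to the prompt, which usually fixes the format on the second attempt.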

6. Iterative refinement

Don’t try to write the perfect prompt. Write a decent prompt, see the output, then refine. Most production prompts go through 5-10 iterations.

Common refinement moves:

  • Output too long → add word/sentence limit
  • Output misses edge cases → add examples of those cases
  • Output format inconsistent → add explicit format template
  • Output too generic → add domain context or persona

Token economics matter

Every token in your prompt costs money and consumes context window. A 2,000-token system prompt that could be 200 tokens is wasting 90% of its budget.

Rules:

  • Don’t repeat instructions. Say it once, clearly.
  • Don’t explain why you want something unless it changes the output.
  • Remove filler phrases (“I would like you to…”, “Could you please…”). Models don’t care about politeness — direct instructions produce identical or better results.
  • Put the most important instruction first and last (primacy and recency effects exist in LLMs).
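A quick way to keep yourself honest about prompt size; a minimal sketch using a rough 4-characters-per-token approximation for English text (exact counts require a real tokenizer, such as tiktoken for OpenAI models, which this sketch deliberately avoids as a dependency):

```python
# Sketch: a cheap prompt-budget check. The ~4 chars/token ratio is a
# common rough approximation, not an exact count.

def approx_tokens(text: str) -> int:
    """Approximate token count at ~4 characters per token (minimum 1)."""
    return max(1, len(text) // 4)

def check_budget(prompt: str, budget: int = 200) -> tuple[int, bool]:
    """Return (approximate token count, whether it fits the budget)."""
    n = approx_tokens(prompt)
    return n, n <= budget

tokens, fits = check_budget("You are a helpful assistant. " * 50, budget=200)
```

Running a check like this in CI on your system prompts catches the slow creep from 200 tokens to 2,000 before it hits production.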

What doesn’t work

  • Threatening the model (“If you get this wrong, I’ll be fired”) — produces anxious, over-hedged outputs.
  • Excessive detail on trivial aspects — dilutes attention from what matters.
  • Asking the model to “be creative” — creativity emerges from constraints, not from asking for it. Constrain the format, free the content.
  • Expecting consistency across sessions — models are stateless. Every conversation starts from zero. If you need consistent behavior, put everything in the system prompt every time.
