Richard Batt
Chain of Thought Prompting: Get AI to Show Its Working
Tags: prompt engineering, chain of thought, AI decision making, business AI
A client asked ChatGPT whether they should expand into the German market. It said yes. No reasoning. No consideration of their cash position, team capacity, or the fact that their product had not been localised. Just a confident "yes" with some generic points about Germany's strong economy.
That is the default behaviour of AI: skip the reasoning, jump to the answer. And for business decisions with real money attached, that is dangerous.
Key Takeaways
- Chain of thought (CoT) prompting forces AI to reason through problems step by step before giving an answer, dramatically improving accuracy on complex tasks.
- Research shows CoT improves accuracy by 20-40% on reasoning-heavy tasks compared to direct prompting.
- For business use, the reasoning trail matters as much as the answer: it lets you spot where the AI made wrong assumptions about your situation.
- Three practical CoT variations: basic ("think step by step"), structured (numbered reasoning stages), and tree-of-thought (explore multiple paths before deciding).
What Chain of Thought Prompting Actually Does
Chain of thought prompting is a technique where you instruct the AI to work through a problem step by step before providing its final answer. Instead of pattern-matching to a likely response, the model breaks the problem into intermediate reasoning steps, each building on the last.
The concept emerged from research published in 2022 showing that large language models perform significantly better on complex tasks when forced to generate intermediate reasoning. It is the difference between asking a new analyst "what should we do?" and asking them "walk me through your analysis, show your assumptions, then give me your recommendation."
The simplest version is adding five words to your prompt: "Let's think step by step." Research across arithmetic, logic, and multi-step reasoning tasks has shown this alone improves accuracy by 20-40% compared to direct prompting.
Why This Matters for Business
For trivial tasks (writing a subject line, reformatting a table), you do not need chain of thought. But for anything involving analysis, comparison, or recommendation, the reasoning matters as much as the conclusion.
Here is why: when AI shows its working, you can audit it. You can see where it made assumptions about your business that are wrong. You can catch the moment it confused your gross margin with your net margin, or assumed you have a 50-person team when you have 8.
Across my client implementations, I have found three categories where chain of thought prompting makes a measurable difference:
Financial analysis: Budgets, forecasts, ROI calculations. Any prompt where the AI needs to work with numbers and arrive at a conclusion. Without CoT, AI frequently skips steps and produces plausible-looking but incorrect figures.
Strategic decisions: Market entry, hiring, product prioritisation. The AI needs to weigh multiple factors against each other. Without CoT, it defaults to generic advice instead of reasoning about your specific situation.
Process design: Building workflows, mapping dependencies, identifying bottlenecks. These tasks require sequential logic: step A must happen before step B. Without CoT, the AI often produces workflows with circular dependencies or missing steps.
Three Variations You Can Use Today
1. Basic Chain of Thought
Add "Think through this step by step before giving your final answer" to any analytical prompt. This is the minimum effective dose.
Before: "Should we raise our prices by 15%?"
After: "We are a B2B SaaS company with 200 customers, $15K average contract value, and 4.2% monthly churn. We are considering a 15% price increase. Think through this step by step, consider the impact on churn, revenue, customer acquisition cost, and competitive positioning, then give your recommendation."
The second prompt produces an analysis that walks through each factor, quantifies the trade-offs, and arrives at a reasoned conclusion you can actually evaluate.
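If you are sending prompts through code rather than a chat window, the basic technique is just string composition. Here is a minimal sketch; the helper name `with_cot` and the exact instruction wording are illustrative choices, not a standard, and you would pass the result to whichever LLM SDK you use.

```python
# Append a basic chain-of-thought instruction to any analytical prompt.
# The suffix wording is one common phrasing; adjust to taste.
COT_SUFFIX = "Think through this step by step before giving your final answer."

def with_cot(prompt: str) -> str:
    """Return the prompt with a basic CoT instruction appended."""
    return f"{prompt.rstrip()}\n\n{COT_SUFFIX}"

question = (
    "We are a B2B SaaS company with 200 customers, $15K average contract "
    "value, and 4.2% monthly churn. Should we raise our prices by 15%?"
)
print(with_cot(question))
```

The point is that the CoT instruction lives in one place, so every analytical prompt in your workflow gets it consistently instead of depending on whoever wrote the prompt that day.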
2. Structured Chain of Thought
For more complex decisions, define the reasoning stages explicitly. This is particularly useful when you need the AI to consider factors in a specific order.
Example: "Analyse whether we should build or buy a customer onboarding automation system. Work through these stages in order:
- Define our requirements based on the process description below
- Estimate the build cost (time, team, tools) based on similar projects
- Research three buy options and their pricing
- Compare build vs buy on cost, time to value, flexibility, and maintenance burden
- Make a recommendation with your confidence level and the key assumptions it depends on"
This approach works because you are not just asking for step-by-step reasoning; you are defining what the steps should be. The AI follows your analytical framework instead of inventing its own.
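In code, structured CoT is a template that numbers your stages so the model cannot skip one silently. A sketch, reusing the build-vs-buy stages from the example above (the function name `structured_cot` is illustrative):

```python
def structured_cot(task: str, stages: list[str]) -> str:
    """Compose a prompt that walks the model through numbered stages in order."""
    numbered = "\n".join(f"{i}. {stage}" for i, stage in enumerate(stages, 1))
    return f"{task}\n\nWork through these stages in order:\n{numbered}"

prompt = structured_cot(
    "Analyse whether we should build or buy a customer onboarding automation system.",
    [
        "Define our requirements based on the process description below",
        "Estimate the build cost (time, team, tools) based on similar projects",
        "Research three buy options and their pricing",
        "Compare build vs buy on cost, time to value, flexibility, and maintenance burden",
        "Make a recommendation with your confidence level and the key assumptions it depends on",
    ],
)
print(prompt)
```

Because the stages are a plain list, you can keep one vetted framework per decision type (pricing, hiring, vendor selection) and reuse it across prompts.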
3. Tree of Thought: Explore Multiple Paths
Tree of thought is a more advanced variation where the AI explores multiple reasoning paths before settling on the best one. Think of it as brainstorming three different approaches, evaluating each, then picking the winner.
Example: "We need to reduce our customer support response time from 4 hours to under 1 hour. Generate three different approaches to solving this problem. For each approach, outline the steps required, estimated cost, timeline, and risks. Then compare all three and recommend which to pursue first, explaining your reasoning."
This is especially powerful for problems where the right answer is not obvious and you want to avoid the AI anchoring on its first idea.
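The same templating idea extends to tree of thought: parameterise how many branches you want the model to explore before it commits. A sketch under the same assumptions as the helpers above (`tree_of_thought` is an illustrative name, not a library function):

```python
def tree_of_thought(problem: str, n_approaches: int = 3) -> str:
    """Prompt the model to branch into several approaches, evaluate each, then pick one."""
    return (
        f"{problem}\n\n"
        f"Generate {n_approaches} different approaches to solving this problem. "
        "For each approach, outline the steps required, estimated cost, "
        "timeline, and risks. Then compare all approaches and recommend "
        "which to pursue first, explaining your reasoning."
    )

print(tree_of_thought(
    "We need to reduce our customer support response time from 4 hours to under 1 hour."
))
```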
When Not to Use Chain of Thought
Not every prompt needs CoT. Here is a quick guide:
| Task Type | Use CoT? | Why |
|---|---|---|
| Simple retrieval ("What is the capital of France?") | No | No reasoning required |
| Formatting/rewriting | No | Style task, not logic task |
| Multi-step calculations | Yes | Intermediate steps prevent arithmetic errors |
| Strategic recommendations | Yes | You need to audit the reasoning |
| Process design | Yes | Sequential logic requires step-by-step work |
| Comparing options | Yes | Multiple factors need weighing against each other |
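The table above can be encoded as a small lookup if you are routing prompts programmatically. The task-type labels are this article's categories, not a standard taxonomy; defaulting to CoT for unknown types is a deliberately conservative assumption.

```python
# Whether a task type benefits from chain-of-thought, per the table above.
USE_COT = {
    "simple retrieval": False,
    "formatting/rewriting": False,
    "multi-step calculations": True,
    "strategic recommendations": True,
    "process design": True,
    "comparing options": True,
}

def should_use_cot(task_type: str) -> bool:
    """Default to CoT when the task type is unrecognised: the cost is a longer answer, not a wrong one."""
    return USE_COT.get(task_type.lower(), True)
```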
A Real Example from Client Work
A logistics company asked me to help them evaluate three AI tools for route optimisation. Without CoT, the AI picked one and wrote a convincing case for it, ignoring that it did not integrate with their existing fleet management system.
With structured CoT, I prompted the AI to first list their integration requirements, then evaluate each tool against those requirements, then compare on cost and implementation time, and finally make a recommendation. The result was a clear comparison matrix that showed two of the three tools would require custom API work, a $30K cost that would have been invisible without the step-by-step analysis.
The reasoning trail caught what the quick answer missed. That is the entire point.
Frequently Asked Questions
What is chain of thought prompting?
Chain of thought (CoT) prompting is a technique where you instruct the AI to work through a problem step by step before giving its final answer. Instead of jumping directly to a conclusion, the AI generates intermediate reasoning steps, making the output more accurate and auditable. Research shows it improves accuracy by 20-40% on complex reasoning tasks.
When should I use chain of thought prompting?
Use it for any task that involves analysis, comparison, calculation, or recommendation: situations where the reasoning matters as much as the answer. Skip it for simple retrieval, formatting, or creative writing tasks where step-by-step logic does not apply.
Does adding "think step by step" really make a difference?
Yes. Even the basic five-word addition measurably improves output quality on reasoning tasks. For business-critical decisions, structured chain of thought (where you define the reasoning stages) performs even better because the AI follows your analytical framework rather than inventing its own.
Does chain of thought work with all AI models?
It works with all major large language models: GPT-4, Claude, Gemini, and open-source models like Llama. The effect is strongest on larger models. Smaller models sometimes produce reasoning steps that are plausible-sounding but logically flawed, so always audit the chain of reasoning regardless of which model you use.
How do I audit AI reasoning for errors?
Read each step of the reasoning chain and check two things: are the facts correct, and does each step logically follow from the previous one? Pay special attention to numerical claims, assumptions about your business context, and any step where the AI says "typically" or "usually"; those are often where it substitutes generic knowledge for your specific situation.
Richard Batt has delivered 120+ AI and automation projects across 15+ industries. He helps businesses deploy AI that actually works, with battle-tested tools, templates, and implementation roadmaps. Featured in InfoWorld and WSJ.
What Should You Do Next?
If you are not sure where AI fits in your business, start with a roadmap. I will assess your operations, identify the highest-ROI automation opportunities, and give you a step-by-step plan you can act on immediately. No jargon. No fluff. Just a clear path forward built from 120+ real implementations.
Book Your AI Roadmap: 60 minutes that will save you months of guessing.
Already know what you need to build? The AI Ops Vault has the templates, prompts, and workflows to get it done this week.