Claude AI: The Complete Developer Guide — Features, Prompting & Real-World Workflows

D. Rout

May 16, 2026 12 min read

GitHub Repo: All code examples in this guide are available on GitHub (https://github.com/deepakrout/claude-ai-complete-guide). Clone it, install dependencies, and run any example as you follow along.

Introduction

If you've been building with LLMs, you've probably hit the same wall: impressive demos that fall apart in production. Wrong tone, hallucinated data, no reliable structure in the output, and brittle workflows that break the moment your prompt changes slightly.

Claude — Anthropic's flagship AI model — was built with a specific design philosophy to address these problems: Constitutional AI for safer, more predictable behaviour, and an extended context window (up to 200k tokens) that lets it work with entire codebases, long documents, and complex multi-step tasks in a single call.

In this guide, we'll go from zero to production-ready. You'll set up the SDK, master the prompting techniques that actually matter, build a multi-turn chat, hook up tools so Claude can call external APIs, analyse documents, process images, and understand the patterns that separate toy projects from real deployments.


Prerequisites

Before we start:

  • Node.js 18+ (examples use ES modules)
  • An Anthropic API key — get one at console.anthropic.com
  • Basic familiarity with async/await and REST APIs
  • The companion repo cloned locally:
git clone https://github.com/deepakrout/claude-ai-complete-guide.git
cd claude-ai-complete-guide
npm install
cp .env.example .env   # add your ANTHROPIC_API_KEY

1. Setting Up the Anthropic SDK

Install the official SDK:

npm install @anthropic-ai/sdk dotenv

Your first completion:

// 01-basic-api/hello-claude.js
import Anthropic from '@anthropic-ai/sdk';
import 'dotenv/config';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env

const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});

console.log(response.content[0].text); // "The capital of France is Paris."

Key fields in the response object:

Field | Description
--- | ---
response.id | Unique message ID for logging/debugging
response.model | Which model handled the request
response.content | Array of content blocks (text, tool_use)
response.stop_reason | Why generation stopped: end_turn, max_tokens, tool_use
response.usage.input_tokens | Tokens consumed by your prompt
response.usage.output_tokens | Tokens in Claude's response

Run it:

node 01-basic-api/hello-claude.js
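
Because response.content is an array of blocks rather than a single string, indexing content[0] only works while the first block happens to be text. The sketch below shows one way to read a response more defensively. The helper names extractText and summarise are hypothetical (they are not part of the SDK), and the functions rely only on the response fields listed above.

```javascript
// Join the text of every `text` block; other block types (e.g. tool_use) are skipped.
function extractText(response) {
  return response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

// One-line summary for logs, built only from the fields listed in the table above.
function summarise(response) {
  const { input_tokens, output_tokens } = response.usage;
  return `${response.id} ${response.model} stop=${response.stop_reason} in=${input_tokens} out=${output_tokens}`;
}
```

Logging the summary line next to each request makes it easier to spot truncated replies (stop_reason of max_tokens) and unexpectedly large prompts.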

2. Choosing the Right Claude Model

Anthropic offers several models in the Claude family. Picking the right one for your use case significantly affects both quality and cost.

Model | Best For | Context Window
--- | --- | ---
claude-opus-4-1 | Complex reasoning, deep analysis, long documents | 200k tokens
claude-sonnet-4-5 | Balanced: fast, capable, cost-effective for most tasks | 200k tokens
claude-haiku-4-5 | High-throughput, low-latency, simple completions | 200k tokens

A good rule of thumb: start with Sonnet, move to Opus for tasks requiring multi-step reasoning, and reach for Haiku when you're building anything that needs sub-second response times at scale — like search autocomplete or real-time chat suggestions.


3. Prompt Engineering That Actually Works

Prompting isn't a magic incantation; it's structured communication. These four techniques move the needle most.

3.1 System Prompts

The system prompt defines Claude's role, constraints, and output format for the entire conversation. Think of it as your onboarding document for Claude.

// 02-prompt-engineering/system-prompt.js
const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  system: `You are a senior TypeScript engineer reviewing code for a production SaaS.
Focus on: type safety, performance, and maintainability.
Always suggest concrete improvements with before/after code examples.
Respond concisely — no filler text.`,
  messages: [
    {
      role: 'user',
      content: `Review this:\n\nfunction getUser(id) {\n  return db.find(id);\n}`
    }
  ]
});

What to put in a system prompt:

  • Persona and expertise level
  • Output format constraints (JSON, markdown, bullet points)
  • What Claude should not do (negative constraints are powerful)
  • Domain-specific terminology it should use
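
One way to keep those four ingredients explicit is to assemble the system prompt from named parts instead of hand-editing one long string. This is only a sketch: buildSystemPrompt and its option names are hypothetical, and the labels it emits ("Output format:", "Do not:") are an arbitrary choice rather than anything Claude requires.

```javascript
// Assemble a system prompt from the four ingredients listed above.
// Empty sections are simply left out.
function buildSystemPrompt({ persona, outputFormat, avoid = [], terminology = [] }) {
  const lines = [persona];
  if (outputFormat) lines.push(`Output format: ${outputFormat}`);
  if (avoid.length > 0) lines.push(`Do not: ${avoid.join('; ')}`);
  if (terminology.length > 0) lines.push(`Use these terms: ${terminology.join(', ')}`);
  return lines.join('\n');
}

const system = buildSystemPrompt({
  persona: 'You are a senior TypeScript engineer reviewing code for a production SaaS.',
  outputFormat: 'markdown with before/after code examples',
  avoid: ['filler text', 'rewriting code that is already correct']
});
console.log(system);
```

The resulting string can be passed as the system field of client.messages.create, exactly like the hand-written prompt in the example above.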

3.2 Few-Shot Examples

When you need a specific, consistent output format, show Claude examples instead of describing the format. This is more reliable than detailed instructions.

// 02-prompt-engineering/few-shot.js
const messages = [
  { role: 'user', content: 'Convert to slug: "Hello World"' },
  { role: 'assistant', content: 'hello-world' },         // example 1
  { role: 'user', content: 'Convert to slug: "Getting Started with Angular Signals"' },
  { role: 'assistant', content: 'getting-started-with-angular-signals' }, // example 2
  { role: 'user', content: 'Convert to slug: "Claude AI: The Complete Developer Guide"' }
  // Claude infers the pattern and follows it
];
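
The array above is written by hand; when the examples live in a data file or a test fixture, it may be easier to generate the alternating turns. A minimal sketch, assuming each example is an object with input and output strings (both field names, and the helper name buildFewShotMessages, are hypothetical):

```javascript
// Turn { input, output } pairs into alternating user/assistant turns,
// then append the real query as the final user turn.
function buildFewShotMessages(examples, query) {
  const messages = [];
  for (const { input, output } of examples) {
    messages.push({ role: 'user', content: input });
    messages.push({ role: 'assistant', content: output });
  }
  messages.push({ role: 'user', content: query });
  return messages;
}

const fewShot = buildFewShotMessages(
  [
    { input: 'Convert to slug: "Hello World"', output: 'hello-world' },
    { input: 'Convert to slug: "Getting Started with Angular Signals"', output: 'getting-started-with-angular-signals' }
  ],
  'Convert to slug: "Claude AI: The Complete Developer Guide"'
);
console.log(fewShot.length); // 5
```

The returned array goes straight into the messages field of client.messages.create.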

3.3 Chain-of-Thought Reasoning

For tasks that require multi-step reasoning (maths, analysis, debugging), explicitly ask Claude to think before answering. This tends to reduce errors on complex problems, and the written-out steps make a wrong answer easier to diagnose.

// 02-prompt-engineering/chain-of-thought.js
const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: `Think step by step, then answer:

A SaaS has 3,000 MAU. 5% upgrade to Pro at $29/mo.
10% of Pro users upgrade to Enterprise at $99/mo.
What is the monthly recurring revenue?

Work through each tier inside <thinking> tags, then give the final figure inside <answer> tags.`
  }]
});
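
When you ask a model to do arithmetic, it helps to have the expected figure on hand so you can check the answer. The sketch below works the example through under one stated assumption: users who move to Enterprise stop paying for Pro. The prompt itself does not say this, so a model could reasonably count them in both tiers; spelling the assumption out in the prompt removes the ambiguity. The function name is hypothetical.

```javascript
// Worked version of the MRR question, using integer percentages to avoid
// floating-point rounding surprises.
function monthlyRecurringRevenue({ mau, proPct, enterprisePct, proPrice, enterprisePrice }) {
  const upgradedToPro = (mau * proPct) / 100;                 // 3000 * 5%  = 150
  const enterprise = (upgradedToPro * enterprisePct) / 100;   // 150 * 10%  = 15
  const pro = upgradedToPro - enterprise;                     // 135 stay on Pro (assumption)
  return { pro, enterprise, mrr: pro * proPrice + enterprise * enterprisePrice };
}

console.log(monthlyRecurringRevenue({
  mau: 3000, proPct: 5, enterprisePct: 10, proPrice: 29, enterprisePrice: 99
}));
// { pro: 135, enterprise: 15, mrr: 5400 }
```

That is 135 × $29 = $3,915 plus 15 × $99 = $1,485, for $5,400 per month.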

3.4 XML Tags for Structured Context

When you're passing large, mixed-content prompts (documents + instructions + data), XML tags help Claude parse and prioritise different sections.

const prompt = `
<task>Summarise the key risks in this legal contract.</task>

<document>
${contractText}
</document>

<constraints>
- Bullet points only
- Focus on liability, indemnification, and termination clauses
- Max 200 words
</constraints>
`;
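
The same tagging idea can be applied in both directions: wrap each input section in a tag, and ask Claude to put its answer inside a tag you can extract afterwards. The sketch below is illustrative only. The helpers tag and extractTag are hypothetical, and asking for the reply inside a summary tag is an assumption layered on top of the prompt above, not something the API enforces.

```javascript
// Wrap a section of the prompt in a named tag.
function tag(name, body) {
  return `<${name}>\n${body.trim()}\n</${name}>`;
}

// Pull the contents of the first <name>...</name> pair out of a reply,
// or return null when the tag is missing.
function extractTag(text, name) {
  const match = text.match(new RegExp(`<${name}>([\\s\\S]*?)</${name}>`));
  return match ? match[1].trim() : null;
}

const taggedPrompt = [
  tag('task', 'Summarise the key risks in this legal contract. Reply inside <summary> tags.'),
  tag('document', 'Example contract text goes here.'),
  tag('constraints', '- Bullet points only\n- Max 200 words')
].join('\n\n');

console.log(taggedPrompt);
console.log(extractTag('Sure. <summary>- Liability is uncapped</summary>', 'summary'));
// "- Liability is uncapped"
```

Returning null for a missing tag lets the caller decide whether to retry or fall back to the raw text.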

4. Multi-Turn Conversations

Claude has no built-in memory between API calls. You maintain conversation state by sending the full message history on every request.

// 03-multi-turn/conversation.js
const client = new Anthropic();
const history = []; // Grows with each turn

async function chat(userMessage) {
  // Append user turn
  history.push({ role: 'user', content: userMessage });

  const response = await client.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    system: 'You are a helpful coding assistant.',
    messages: history  // Send full history every time
  });

  const reply = response.content[0].text;

  // Append assistant turn — so Claude "remembers" it said this
  history.push({ role: 'assistant', content: reply });

  return reply;
}

// Usage
await chat('What is a closure in JavaScript?');
await chat('Can you show me a practical example?'); // Claude remembers the topic

Context window management tips:

Strategy | When to use
--- | ---
Summarise old turns | Conversations > 50 turns
Sliding window (keep last N) | Real-time chat applications
Semantic retrieval | Long-term memory across sessions
Prompt caching | Repeated system prompts at scale
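
As an illustration of the sliding-window strategy, the sketch below keeps only the most recent messages before each request. The function name slidingWindow is hypothetical, and it assumes plain text turns: if the history contains tool_use and tool_result blocks, a cut can separate a result from the call it answers, so those conversations need a more careful trim.

```javascript
// Keep at most `maxMessages` of the most recent turns.
function slidingWindow(history, maxMessages) {
  if (maxMessages <= 0) return [];
  if (history.length <= maxMessages) return history.slice();
  let recent = history.slice(-maxMessages);
  // The conversation sent to the API should start with a user turn,
  // so drop any assistant turn left at the front by the cut.
  while (recent.length > 0 && recent[0].role !== 'user') {
    recent = recent.slice(1);
  }
  return recent;
}
```

In the chat function above, passing slidingWindow(history, 20) as messages instead of history would cap the prompt size while leaving the full transcript available for logging.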

5. Tool Use — Giving Claude Hands

Tool use (also called function calling) lets Claude interact with the outside world: call APIs, run queries, read files, trigger webhooks. This is the foundation of agentic AI.

5.1 Defining a Tool

const tools = [
  {
    name: 'get_weather',
    description: 'Get current weather for a city. Returns temperature and conditions.',
    input_schema: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name, e.g. "Tokyo"' },
        units: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['city']
    }
  }
];
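
The schema above only describes the tool to Claude; the function that does the work lives in your own code. The sketch below shows one possible shape for that side: a registry keyed by tool name and a runTool helper that turns a tool_use block into a tool_result block. Everything here is hypothetical, including the stubbed weather reading, which stands in for a real weather API call.

```javascript
// Hypothetical local implementations, keyed by the tool name Claude sends back.
const toolHandlers = new Map([
  ['get_weather', ({ city, units = 'celsius' }) => {
    const celsius = 21; // stubbed reading; a real handler would call a weather API
    const temperature = units === 'fahrenheit' ? Math.round((celsius * 9) / 5 + 32) : celsius;
    return { city, temperature, units, conditions: 'clear' };
  }]
]);

// Run one tool_use block and shape the outcome as a tool_result block.
// Failures are reported back to Claude with is_error instead of being thrown.
function runTool(toolUse) {
  const base = { type: 'tool_result', tool_use_id: toolUse.id };
  const handler = toolHandlers.get(toolUse.name);
  if (!handler) {
    return { ...base, content: `Unknown tool: ${toolUse.name}`, is_error: true };
  }
  try {
    return { ...base, content: JSON.stringify(handler(toolUse.input)) };
  } catch (err) {
    return { ...base, content: String(err.message), is_error: true };
  }
}
```

Serialising the handler's return value with JSON.stringify keeps the tool_result content a plain string, and adding a new tool becomes a matter of adding one entry to the map.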

5.2 The Agentic Loop

Claude doesn't call your tool — it asks you to call it by returning a tool_use content block. You execute the tool and feed the result back. This loop continues until Claude returns stop_reason: 'end_turn'.

// 04-tool-use/weather-tool.js
async function runWithTools(userMessage) {
  const messages = [{ role: 'user', content: userMessage }];

  let response = await client.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    tools,
    messages
  });

  // Loop until Claude is done using tools
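  while (response.stop_reason === 'tool_use') {
    // A single response can contain several tool_use blocks; answer every one
    const toolResults = response.content
      .filter(b => b.type === 'tool_use')
      .map(toolUse => ({
        type: 'tool_result',
        tool_use_id: toolUse.id, // must match the id of the tool_use block
        // get_weather is your own local function (not shown here);
        // tool_result content must be a string or an array of content blocks
        content: JSON.stringify(get_weather(toolUse.input))
      }));

    // Feed the assistant turn and all results back to Claude
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: toolResults });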

    response = await client.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      tools,
      messages
    });
  }

  return response.content.find(b => b.type === 'text')?.text;
}

await runWithTools('What is the weather in Tokyo in Fahrenheit?');

5.3 Tool Use Reference

stop_reason value | Meaning
--- | ---
tool_use | Claude wants to call a tool; run the loop
end_turn | Claude is done; read content[].text
max_tokens | Response was cut off; increase max_tokens
stop_sequence | A custom stop sequence was hit
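
In application code, the table above usually turns into a single branch point after every request. A minimal sketch follows; the function name nextAction and the action labels it returns are hypothetical, and any stop_reason not listed in the table falls through to 'unknown' so that new values are surfaced rather than silently treated as success.

```javascript
// Map a response's stop_reason to what the caller should do next.
function nextAction(response) {
  switch (response.stop_reason) {
    case 'tool_use':
      return 'run_tools';   // execute the requested tools and call the API again
    case 'end_turn':
    case 'stop_sequence':
      return 'done';        // the reply is complete
    case 'max_tokens':
      return 'truncated';   // the reply was cut off; retry with a higher max_tokens
    default:
      return 'unknown';     // log it and decide explicitly
  }
}
```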

6. Document and File Analysis

Claude can analyse text documents, PDFs, and structured data directly in the prompt. With a 200k token context window, you can fit entire books, codebases, or contract bundles.

6.1 Sending Text Documents

import fs from 'fs';

const contractText = fs.readFileSync('./contract.txt', 'utf-8');

const response = await client.messages.create({
  model: 'claude-opus-4-1',
  max_tokens: 2048,
  messages: [{
    role: 'user',
    content: `<document>${contractText}</document>

List all clauses related to data privacy and GDPR compliance.
Format as a numbered list with the clause reference and a one-sentence summary.`
  }]
});

6.2 Sending PDFs (Base64)

import { readFileSync } from 'fs';

const pdfData = readFileSync('./report.pdf').toString('base64');

const response = await client.messages.create({
  model: 'claude-opus-4-1',
  max_tokens: 2048,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'document',
        source: {
          type: 'base64',
          media_type: 'application/pdf',
          data: pdfData
        }
      },
      { type: 'text', text: 'Summarise the executive summary section.' }
    ]
  }]
});

6.3 Extracting Structured Data

const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  system: 'You are a data extraction engine. Respond ONLY with valid JSON. No prose.',
  messages: [{
    role: 'user',
    content: `Extract all invoice line items from this text and return as JSON:
    
${invoiceText}

Return format: { "invoice_id": "...", "items": [{ "description": "...", "qty": 0, "unit_price": 0, "total": 0 }] }`
  }]
});

const data = JSON.parse(response.content[0].text);
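
Even with a strict system prompt, a reply occasionally arrives wrapped in a Markdown code fence or with a sentence of preamble, and a bare JSON.parse then throws. The sketch below is one tolerant way to parse such replies. The helper name parseJsonReply is hypothetical, and the approach (strip a fence, then take the outermost braces) is a heuristic for object-shaped output, not a guarantee.

```javascript
// Three backticks, built at runtime so this snippet can itself sit inside a fenced block.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Parse a JSON object out of a model reply that may include a code fence or stray prose.
function parseJsonReply(text) {
  const fenced = text.match(FENCED_BLOCK);
  const candidate = fenced ? fenced[1] : text;
  const start = candidate.indexOf('{');
  const end = candidate.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in reply');
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

A call such as parseJsonReply(response.content[0].text) can replace the bare JSON.parse above. It still throws on malformed JSON, so wrap it in try/catch where a retry makes sense.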

7. Vision — Image Understanding

Claude can process images alongside text, enabling powerful multi-modal workflows.

import { readFileSync } from 'fs';

const imageData = readFileSync('./screenshot.png').toString('base64');

const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: imageData
        }
      },
      {
        type: 'text',
        text: 'Describe any UI issues or accessibility problems visible in this screenshot.'
      }
    ]
  }]
});

Supported image formats: JPEG, PNG, GIF, WebP. Max 5MB per image. The API accepts up to 100 images per request; the 20-image limit applies to the claude.ai app.
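
Checking format and size before the request saves a round trip that would only come back as an error. The sketch below builds an image content block from a filename and a Buffer. The helper name toImageBlock is hypothetical, the media type is inferred from the file extension only (the bytes are not inspected), and the 5MB limit is assumed to mean 5 × 1024 × 1024 bytes.

```javascript
const MEDIA_TYPES = new Map([
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp']
]);
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumed interpretation of "5MB"

// Build a base64 image block, rejecting unsupported formats and oversized files.
function toImageBlock(filename, bytes) {
  const ext = filename.split('.').pop().toLowerCase();
  const mediaType = MEDIA_TYPES.get(ext);
  if (!mediaType) throw new Error(`Unsupported image format: ${filename}`);
  if (bytes.length > MAX_IMAGE_BYTES) throw new Error(`Image exceeds 5MB: ${filename}`);
  return {
    type: 'image',
    source: { type: 'base64', media_type: mediaType, data: bytes.toString('base64') }
  };
}
```

With fs imported, toImageBlock('./screenshot.png', readFileSync('./screenshot.png')) produces the same block as the hand-written version above.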

Practical vision use cases:

Use Case | Prompt Pattern
--- | ---
Screenshot-to-code | "Convert this UI mockup to a React component with Tailwind CSS"
Chart analysis | "Extract the data from this bar chart as a JSON array"
Document OCR | "Transcribe all text from this scanned page"
UI review | "List any accessibility issues in this UI screenshot"
Error debugging | "What is the error in this terminal screenshot?"

8. Advanced Patterns

8.1 Streaming

For long responses, stream tokens as they arrive instead of waiting for the full completion. Essential for any user-facing application.

// 07-advanced-patterns/streaming.js
const stream = await client.messages.stream({
  model: 'claude-sonnet-4-5',
  max_tokens: 512,
  messages: [{ role: 'user', content: 'Explain async/await in 3 paragraphs.' }]
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text); // tokens arrive in real time
  }
}
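
Most applications want both things at once: show tokens as they arrive and keep the full text for storage once the stream ends. A minimal sketch of that pattern follows. The helper name collectText is hypothetical; it accepts any async iterable of events shaped like the ones in the loop above, which also makes it easy to test with a fake stream.

```javascript
// Forward each text delta to `onText` and return the complete text at the end.
async function collectText(events, onText = () => {}) {
  let fullText = '';
  for await (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      fullText += event.delta.text;
      onText(event.delta.text);
    }
  }
  return fullText;
}
```

With the stream from the example above, collectText(stream, (t) => process.stdout.write(t)) prints tokens live and resolves to the whole reply.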

8.2 Prompt Caching

If you're sending the same large system prompt (or document) across many requests, prompt caching can reduce costs by up to 90% and latency by up to 85% on cached content.

const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: largeSystemPrompt,
      cache_control: { type: 'ephemeral' }  // Cache this prefix
    }
  ],
  messages: [{ role: 'user', content: userQuery }]
});
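
To confirm that caching is actually taking effect, you can inspect the usage object on each response. The sketch below assumes the cache-related usage fields cache_creation_input_tokens and cache_read_input_tokens are present alongside input_tokens, with input_tokens counting only the uncached part of the prompt. The helper name cacheStats is hypothetical.

```javascript
// Summarise how much of a request's input was written to or read from the cache.
function cacheStats(usage) {
  const written = usage.cache_creation_input_tokens ?? 0; // tokens stored in the cache by this call
  const read = usage.cache_read_input_tokens ?? 0;        // tokens served from the cache
  const uncached = usage.input_tokens ?? 0;               // tokens processed normally
  const total = written + read + uncached;
  return { written, read, uncached, hitRatio: total === 0 ? 0 : read / total };
}
```

On the first request you would expect written to be large and read to be zero; on later requests within the cache lifetime the two should swap. If read stays at zero, the cached prefix is probably changing between calls or is too short to be cached.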

8.3 Batch Processing

For high-volume, non-time-sensitive tasks (data transformation, content generation at scale), the Message Batches API processes requests asynchronously at a 50% discount. A single batch can hold up to 100,000 requests or 256 MB, whichever limit is reached first.

const batch = await client.messages.batches.create({
  requests: items.map((item, i) => ({
    custom_id: `item-${i}`,
    params: {
      model: 'claude-haiku-4-5',
      max_tokens: 256,
      messages: [{ role: 'user', content: `Summarise: ${item.text}` }]
    }
  }))
});

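// Poll until the batch has finished processing, then read the results
let status = await client.messages.batches.retrieve(batch.id);
while (status.processing_status !== 'ended') {
  await new Promise(resolve => setTimeout(resolve, 60_000)); // check once a minute
  status = await client.messages.batches.retrieve(batch.id);
}

for await (const entry of await client.messages.batches.results(batch.id)) {
  if (entry.result.type === 'succeeded') {
    console.log(entry.custom_id, entry.result.message.content[0].text);
  }
}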

9. Common Errors & Troubleshooting

Error / Issue | Cause | Fix
--- | --- | ---
401 Unauthorized | Invalid or missing API key | Check ANTHROPIC_API_KEY in .env
429 Rate Limit | Too many requests per minute | Implement exponential backoff
400 Bad Request | Invalid messages array (wrong roles) | Ensure alternating user/assistant turns
Hallucinated output | Prompt too vague or ambiguous | Add system prompt + output format constraints
JSON parse errors | Claude added prose to JSON output | Add "Respond ONLY with valid JSON" to system prompt
Truncated responses | max_tokens too low | Increase max_tokens; use streaming for long outputs
Tool loop not terminating | Tool result not fed back correctly | Ensure tool_use_id matches the block id
High latency | Large prompt, no caching | Enable prompt caching for repeated system prompts
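
For the 429 row, the sketch below shows one way to implement exponential backoff with jitter around any async call. The helper name withBackoff and its option names are hypothetical, and it assumes failed requests throw an error carrying a numeric status property. The SDK client also has a maxRetries option that retries some failures on its own, so a wrapper like this is mainly useful when you want explicit control over the delays.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` on rate-limit (429) and server (5xx) errors, doubling the delay
// ceiling on each attempt and picking a random wait below it ("full jitter").
async function withBackoff(fn, { retries = 4, baseMs = 500, maxMs = 8000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = err.status === 429 || err.status >= 500;
      if (!retryable || attempt >= retries) throw err;
      const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
      await sleep(Math.random() * ceiling);
    }
  }
}
```

Usage would look like withBackoff(() => client.messages.create({ ... })). Errors such as 400 or 401 are rethrown immediately, since retrying them cannot succeed.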

What's Next

  1. Build an MCP server — Connect Claude to your own internal tools using the Model Context Protocol, letting it interact with your databases, APIs, and filesystems natively. Read our MCP server tutorial for a step-by-step walkthrough.

  2. Structured output with Zod — Combine Claude with zod schema validation to guarantee type-safe, structured JSON from every LLM call in your TypeScript backend.

  3. RAG pipelines — Pair Claude's large context window with vector search (Pinecone, pgvector, Qdrant) to build retrieval-augmented generation systems over your own documents.

  4. Agentic workflows with multiple tools — Define 5+ tools and build a Claude agent that can autonomously plan, research, write, and publish content end-to-end.


Closing

You now have a complete picture of what Claude can do and how to use it effectively in production. The biggest wins come from combining these features: a sharp system prompt, few-shot examples to constrain output format, tool use for external data, and prompt caching to keep costs manageable at scale.

Every code example in this guide lives in the companion repository on GitHub, with a working sample you can run immediately. Clone it, experiment, and break things. That's the fastest way to internalize this.

Happy building.
