👤 Your code ──HTTPS──▶ ⚙️ Claude API ──▶ 🧠 Models
│
├──▶ 💬 Chat
├──▶ 🌊 Streaming
├──▶ 🛠️ Tool use
├──▶ 👁️ Vision
├──▶ 📄 PDFs
└──▶ 💾 Cached prompts
# Add to your shell profile (.zshrc, .bashrc)
export ANTHROPIC_API_KEY="sk-ant-..."
> [!WARNING]
> Never commit API keys to git. Use `.env` files or a secret manager. Bots scan public repos for leaked keys within minutes.
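A fail-fast check at startup beats a confusing auth error deep inside a request. This is a hypothetical helper, not part of the SDK:

```python
import os

def require_api_key() -> str:
    """Fail fast if the key is missing or malformed (hypothetical helper)."""
    key = os.environ.get("ANTHROPIC_API_KEY", "")
    if not key.startswith("sk-ant-"):
        raise RuntimeError(
            "ANTHROPIC_API_KEY is missing or malformed. Set it in your shell "
            "profile or load it from a .env file; never hardcode it in source."
        )
    return key
```

Call it once at program start so misconfiguration surfaces immediately, with a message that tells you what to fix.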
The API charges per token (~¾ of a word). Pricing varies by model:
INPUT (per million tokens) OUTPUT (per million tokens)
────────────────────────── ─────────────────────────────
💎 Opus ████████████████████ $$$ ████████████████████████ $$$$
⚡ Sonnet ████████ $$ ████████████ $$
🚀 Haiku ██ $ ███ $
(illustrative — check pricing page for current rates)
📌 Prices change. Always check docs.claude.com/en/docs/about-claude/pricing before estimating costs.
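A back-of-the-envelope estimator makes the tiers concrete. The rates below are illustrative placeholders, not current prices; always plug in numbers from the pricing page:

```python
# Illustrative placeholder rates in USD per million tokens -- NOT current prices.
RATES = {
    "opus":   {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00,  "output": 15.00},
    "haiku":  {"input": 1.00,  "output": 5.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at the placeholder rates above."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply on "sonnet":
# estimate_cost("sonnet", 2_000, 500) -> 0.0135
```

Note that output tokens typically cost several times more than input tokens, so verbose responses dominate the bill.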
pip install anthropic
# hello_claude.py
from anthropic import Anthropic
client = Anthropic() # reads ANTHROPIC_API_KEY from env
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[
{"role": "user", "content": "In 3 bullets, what is the Anthropic Claude API?"}
],
)
print(message.content[0].text)
Run it:
python hello_claude.py
npm install @anthropic-ai/sdk
// hello-claude.ts
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env
const message = await client.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [
{ role: "user", content: "In 3 bullets, what is the Anthropic Claude API?" },
],
});
console.log(message.content[0].type === "text" ? message.content[0].text : "");
That’s it — you’ve used the API.
client.messages.create(
model="claude-sonnet-4-6", # which model
max_tokens=1024, # cap on response length
system="You are a witty haiku-only assistant.", # system prompt
messages=[ # conversation history
{"role": "user", "content": "Tell me about the ocean."},
{"role": "assistant", "content": "Vast salt cathedral / ..."},
{"role": "user", "content": "Now about mountains."},
],
temperature=0.7, # 0 = deterministic, 1 = creative
)
Key concepts:
- **system** — the persistent instruction (think of it as Claude's "role")
- **messages** — the conversation. Alternates `user` and `assistant`. You maintain history; the API is stateless.
- **max_tokens** — required. Hard cap on output length.
- **temperature** — randomness. Use `0` for extraction/classification, `0.7`+ for creative writing.

For chat UIs, you don't want to wait for the full response. Stream it.
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a 200-word story about a lonely lighthouse."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
const stream = await client.messages.stream({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Write a 200-word story about a lonely lighthouse." }],
});
for await (const event of stream) {
if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
}
Claude is stateless — you must keep and resend the conversation history yourself.
history = []

def chat(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful coding tutor.",
        messages=history,
    )
    reply = resp.content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("What's a Python decorator?"))
print(chat("Show me a simple example."))
print(chat("Now apply that to a function that logs how long it takes to run."))
For long conversations, you'll eventually hit the context window limit. Strategies:

- Truncate the oldest turns, keeping the system prompt and recent messages.
- Summarize earlier turns into a single message and continue from there.
- Cache the static parts of the context (see prompt caching below).
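The truncation approach can be sketched with a hypothetical helper. The one subtlety is that the kept slice must start with a `user` message, since the API expects strict user/assistant alternation:

```python
def trim_history(history: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep only the most recent turns, preserving the user/assistant
    alternation the API expects (hypothetical helper)."""
    if len(history) <= max_messages:
        return history
    trimmed = history[-max_messages:]
    # Drop a leading assistant message left dangling by the cut.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed
```

Pass the result as `messages=` in place of the full history; the system prompt lives in `system=` and is unaffected.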
Tools let Claude call your code. The flow:
1. 👤 User ──▶ 🧠 Claude "Should I bring a jacket to Paris?"
2. 🧠 Claude ──▶ 🧠 Claude decides: needs weather
3. 🧠 Claude ──▶ 💻 Your code tool_use: get_weather(city="Paris")
4. 💻 Your code ──▶ 💻 Your code calls real weather API
5. 💻 Your code ──▶ 🧠 Claude tool_result: {temp: 14, drizzle}
6. 🧠 Claude ──▶ 👤 User "Yes — 14°C and drizzly. Bring a light jacket."
import json
from anthropic import Anthropic
client = Anthropic()
tools = [
{
"name": "get_weather",
"description": "Get current weather for a city.",
"input_schema": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["c", "f"], "default": "c"},
},
"required": ["city"],
},
}
]
def get_weather(city: str, unit: str = "c") -> dict:
    # In reality, call a weather API. We'll fake it.
    return {"city": city, "temp": 22, "unit": unit, "conditions": "sunny"}

messages = [{"role": "user", "content": "Should I bring a jacket to Paris today?"}]

while True:
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if resp.stop_reason == "tool_use":
        # Claude wants to call a tool
        tool_use = next(b for b in resp.content if b.type == "tool_use")
        result = get_weather(**tool_use.input)
        messages.append({"role": "assistant", "content": resp.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(result),
            }],
        })
        continue  # let Claude see the result and respond
    # Final answer
    print(next(b.text for b in resp.content if b.type == "text"))
    break
This pattern is the foundation of every “agent” — Claude in a loop, calling tools until the task is done.
Claude can see images. Send them as base64 or a URL.
import base64
with open("chart.png", "rb") as f:
    img_b64 = base64.standard_b64encode(f.read()).decode()
resp = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/png",
"data": img_b64,
},
},
{"type": "text", "text": "What's the most surprising thing in this chart?"},
],
}],
)
print(resp.content[0].text)
Use cases: chart analysis, OCR, accessibility (alt text generation), receipt parsing, screenshot debugging.
Recent Claude models can read PDFs natively — no parsing layer needed.
import base64
with open("contract.pdf", "rb") as f:
    pdf_b64 = base64.standard_b64encode(f.read()).decode()
resp = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
messages=[{
"role": "user",
"content": [
{"type": "document", "source": {
"type": "base64", "media_type": "application/pdf", "data": pdf_b64,
}},
{"type": "text", "text": "List every party's obligations as a markdown table."},
],
}],
)
print(resp.content[0].text)
If you keep sending the same long context (a system prompt, a knowledge base), cache it. Reading cached input tokens costs ~10% of the normal input rate.
┌──────────────────────────────────────────────────────────────┐
│ │
│ FIRST CALL (writes cache) │
│ ┌───────────────────────────────────────┐ │
│ │ System: "You're support for Acme..." │ $$ premium │
│ │ Knowledge: [50 pages of docs] ──┐ │ to write cache │
│ │ User: "How do I reset?" │ │ │
│ └───────────────────────────────────│───┘ │
│ │ │
│ ▼ stored ~5 min │
│ SUBSEQUENT CALLS (read cache) │
│ ┌───────────────────────────────────────┐ │
│ │ System: same │ ✅ ~10% cost │
│ │ Knowledge: same ─────────────────┐ │ │
│ │ User: "What's warranty?" │ │ │
│ └───────────────────────────────────│───┘ │
│ │ │
│ └─► uses cached ✨ │
│ │
└──────────────────────────────────────────────────────────────┘
resp = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
system=[
{
"type": "text",
"text": "You are a customer support agent for AcmeCorp...",
"cache_control": {"type": "ephemeral"}, # cache this block
},
{
"type": "text",
"text": LONG_KNOWLEDGE_BASE,
"cache_control": {"type": "ephemeral"},
},
],
messages=[{"role": "user", "content": "How do I reset my password?"}],
)
First call writes the cache (small premium). Every subsequent call within ~5 minutes pays the discounted rate. For chatbots and RAG, this can cut costs by 80%+.
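To reason about the savings, here is a rough input-cost model. The multipliers are the commonly documented ballpark (cache writes around 1.25x the base input rate, cache reads around 0.1x); verify against current pricing before relying on them:

```python
def cached_input_cost(base_per_mtok: float, fresh: int,
                      cache_write: int, cache_read: int) -> float:
    """Approximate input cost in USD with prompt caching.
    fresh: uncached input tokens; cache_write: tokens written to the cache;
    cache_read: tokens served from the cache.
    Multipliers are illustrative: writes ~1.25x base, reads ~0.10x base."""
    weighted = fresh + 1.25 * cache_write + 0.10 * cache_read
    return weighted * base_per_mtok / 1_000_000

# 100 calls reusing a 50k-token context: one cache write plus 99 cache reads
# versus paying the full rate for 50k fresh tokens on every call.
```

The break-even point comes quickly: the write premium is paid back after a handful of reads, and everything after that is savings.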
For jobs you don’t need answered immediately (overnight summarization, batch classification), use the Batch API, which processes large sets of requests asynchronously at a discounted rate.
Great for: dataset labeling, bulk content generation, periodic reports.
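A sketch of the request shape for a labeling job. Building the list is plain data; submitting it goes through `client.messages.batches.create(requests=...)`, and results are matched back up by `custom_id`. Check the Batch API docs for the exact submission and result-polling calls:

```python
reviews = ["Loved it!", "Terrible support.", "It's fine, I guess."]

# One request per item; custom_id is your key for matching results later.
requests = [
    {
        "custom_id": f"review-{i}",
        "params": {
            "model": "claude-sonnet-4-6",
            "max_tokens": 16,
            "messages": [{
                "role": "user",
                "content": f"One word (positive/negative/neutral): {text}",
            }],
        },
    }
    for i, text in enumerate(reviews)
]
# Then: batch = client.messages.batches.create(requests=requests)
```

Because results can arrive out of order, stable `custom_id` values are the only reliable way to join outputs back to your input rows.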
Before you ship anything API-powered, walk this checklist top to bottom:
| # | Step | Why |
|---|---|---|
| 1 | 🔒 Secrets in vault | No keys in code, repo, or logs |
| 2 | ⏱️ Rate-limit endpoint | Don’t expose raw API to users |
| 3 | 💰 Cost caps + alerts | Anomaly alerts on token spend |
| 4 | 🔁 Retries with backoff | Handle 429s and 5xxs gracefully |
| 5 | 📝 Logging + PII redaction | Audit trail without leaks |
| 6 | 🧪 Eval set | Regression tests on every prompt change |
| 7 | 🛡️ Fallback model | Survive model outages |
| 8 | 💾 Prompt caching | If static context > 1024 tokens, cache it |
| 9 | 🚀 Ship it! | 🎉 |
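Row 4 in code: the official SDK already retries with backoff for you (the client accepts a `max_retries` argument), but the pattern is worth having for calls it doesn't cover. A generic sketch:

```python
import random
import time

def with_backoff(fn, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry fn() on any exception with jittered exponential backoff,
    the standard treatment for 429s and transient 5xxs."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the real error
            # Exponential delay with jitter, capped at 30s.
            time.sleep(min(base_delay * 2 ** attempt * (1 + random.random()), 30.0))
```

In production, narrow the `except` to the transient error types you actually want to retry; blindly retrying validation errors just wastes tokens.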
Working examples for everything in this module live in examples/:
- `01-hello-world.py`
- `02-streaming.py`
- `03-multi-turn.py`
- `04-tool-use.py`
- `05-vision.py`
- `06-pdf.py`
- `07-prompt-caching.py`
- `08-streaming.ts`
- `09-tool-use.ts`

You should now be able to:
👉 Next up: Module 07 — Advanced Techniques — extended thinking, agents, MCP, and computer use.
| ← Previous | 🏠 Home | Next → |
|---|---|---|
| Module 05 — Claude Code | Course README | Module 07 — Advanced Techniques |