43 posts tagged “ai-agents”
2026
7 Mental Models for Building Agent Skills (From Anthropic’s Internal Playbook)
Anthropic just published their internal playbook for Claude Code Skills — based on hundreds of skills in active use. Buried inside the practical advice are deep mental models for building better agents. Here’s what they’re really telling you.
[... 1,275 words]
From Prompt Engineering to Harness Engineering: Building Infrastructure for Autonomous Agents
2025 was the year of agents. 2026 is the year of harnesses — the persistent infrastructure that gives a foundation model hands, feet, and senses. The shift is fundamental: from prompt engineering (optimizing single interactions) to harness engineering (building the systems that control long-running, autonomous agents).
[... 1,174 words]
The Agent Loop Iceberg — 10 Hard Problems Hiding Beneath the Simple Loop
The basic agent loop — LLM call, tool execution, observe result, repeat — is maybe 10% of a production agent’s code. The other 90% is making it reliable, resumable, extensible, and production-grade. After tracing through real agent source code, here are the ten hard problems hiding beneath the surface that nobody shows you in tutorials.
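That 10% can be sketched in a few lines. Below is a minimal, self-contained version of the loop; call_llm and the read_file tool are stand-in stubs for illustration, not any real SDK:

```python
def call_llm(messages):
    # Stub standing in for a real model API call. It "decides" to make one
    # tool call, then finish once a tool result is in the transcript.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "read_file", "args": {"path": "notes.txt"}}
    return {"final": "done: summarized notes.txt"}

TOOLS = {"read_file": lambda path: f"(stub) contents of {path}"}

def agent_loop(user_goal, max_steps=10):
    messages = [{"role": "user", "content": user_goal}]
    for _ in range(max_steps):                 # guardrail: bounded steps
        reply = call_llm(messages)
        if "final" in reply:                   # the model chose to stop
            return reply["final"]
        result = TOOLS[reply["tool"]](**reply["args"])   # execute the tool
        messages.append({"role": "tool", "content": result})  # observe result
    raise RuntimeError("step budget exhausted")

print(agent_loop("summarize notes.txt"))
```

Everything else a production agent needs (retries, persistence, cancellation, streaming) wraps around this core.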
[... 1,695 words]
Autoresearch and Context Rot — How a Stateless Agent Loop Avoids Memory Problems (And Where It Breaks)
The autoresearch pattern — where a coding agent runs hundreds of autonomous experiments to optimize code — produced a 53% speedup on Shopify’s 20-year-old Liquid codebase and a 69x speedup on a demo text processor. But there’s a fundamental flaw nobody talks about: the agent has no memory of failed experiments. Here’s exactly how the pattern works, where it breaks, and how Tobi Lütke’s team quietly fixed it.
[... 2,392 words]
How Skills Work in AI Agents — From Lazy-Loading Instructions to LLM Attention Weights
When you hear “skills” in AI agents, it sounds like a new concept. It’s not. Skills are a lazy-loading pattern for instructions — delivered through the same tool-calling mechanism the LLM already uses. But the details of how they load, where they land in the message hierarchy, and why they break at scale reveal deep truths about how LLMs actually work.
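The lazy-loading idea can be shown with a toy sketch: only names and one-line descriptions enter the prompt up front, and the full instructions arrive later through an ordinary tool call. SKILLS, skill_index, and load_skill are hypothetical names, not any real skills API:

```python
# Sketch of lazy-loading skills. The model sees skill_index() in its system
# prompt; the full body enters the context only when load_skill is invoked
# via the same tool-calling mechanism as any other tool.
SKILLS = {
    "git-release": {
        "description": "Cut a tagged release and draft notes.",
        "instructions": "1. Bump the version. 2. Tag. 3. Generate a changelog.",
    },
}

def skill_index():
    """What actually enters the context window up front: names + summaries."""
    return "\n".join(f"- {name}: {s['description']}" for name, s in SKILLS.items())

def load_skill(name):
    """Tool the model calls on demand; instructions load only now."""
    return SKILLS[name]["instructions"]

print(skill_index())
```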
[... 2,801 words]
Coding in the AI Agent Age — Why Typing Code Is Dying But Engineering Is Thriving
If you think coding is just putting human-defined processes into structures, loops, functions, rules, packages, and web pages — you’re not wrong about the past. But that definition is dying. AI is automating the typing. What remains is the thinking.
[... 1,387 words]
Mental Models in the AI Agent Age
Mental models are compressed knowledge of human experience — patterns discovered over centuries by many thinkers across physics, biology, economics, mathematics, and systems theory. In the age of AI agents, these same patterns don’t just help you think better. They help you build better systems, debug reality faster, and make decisions that compound over decades.
[... 1,757 words]
I Ran 100 Parallel Tool Calls on AgentCore — The microVM Didn’t Break, But the LLM Did
What happens when you fire 100 tool calls in parallel inside a single AgentCore microVM? Does the microVM crash? Does it run out of memory? Does the thread pool explode? I deployed an agent with 100 tools to Amazon Bedrock AgentCore Runtime and ran a scaling test from 5 to 100 parallel tool calls. Here’s exactly what happened.
[... 2,597 words]
The 95% Rule: Why Your Agent Is Slow and How to Prove It
Your agent takes 5 seconds to respond. Where did those 5 seconds go? AgentCore gives you 6 observability layers, 30 hidden metrics, and a debugging decision tree — but you have to know where to look. Here’s everything you can’t see by just reading the code.
[... 2,829 words]
What Actually Happens When You Call invoke_agent_runtime()
You call invoke_agent_runtime(). Your agent responds 3 seconds later. But what actually happened in those 3 seconds? There’s an entire orchestration layer — sidecars, health checks, microVM boot sequences — that you never see. Here’s the full picture.
Inside an AgentCore microVM — Ports, Cold Starts, and the Sidecar Pattern
When you deploy an agent on Amazon Bedrock AgentCore Runtime, your Docker container runs inside a Firecracker microVM. But what actually happens inside that microVM? Here’s the complete picture — what boots, what listens on which port, why there’s a non-root user, and exactly what determines a cold start vs a warm start.
[... 1,416 words]
AgentCore Runtime vs Lambda — Scaling, Warm Pools, and Why Fixed 8 GB Boxes Exist
Amazon Bedrock AgentCore Runtime uses Firecracker microVMs to run AI agent tools in isolated environments. But if you’ve used Lambda, it sounds familiar — serverless, auto-scaling, pay-per-use. So why does AgentCore exist? Here’s the complete picture: how AgentCore actually scales, what it can and can’t do, and when you’d pick it over Lambda or ECS.
[... 1,534 words]
How Firecracker MicroVMs Power AgentCore Runtime — From 125ms Boot to Auto-Scaling AI Agents
When AWS needed to run Lambda functions — millions of them, simultaneously, for strangers on the internet — containers weren’t isolated enough and full VMs were too slow. So they built Firecracker: a microVM that boots in ~125 milliseconds with ~5 MB of memory overhead, gives you hardware-level isolation, and lets you pack thousands of them onto a single server. Now Amazon Bedrock AgentCore Runtime uses the same technology to run AI agent tools. Here’s exactly how it all works.
[... 2,548 words]
The Complete Beginner’s Guide to Robotics — From Your First Camera to Foundation Models
This is everything you need to go from zero to building, sensing, and coding robots at home. We’ll start with what hardware to buy, walk through how Tesla and Waymo actually see the road, write real depth estimation and PID control code you can run with just a camera, and end with how foundation models and RAG are transforming what robots can do.
[... 4,183 words]
From Webcam to Robot Brain: How Vision-Language Models and Vision-Language-Action Models Actually Work
I built a webcam app that sends live frames to Claude and GPT-4o for real-time scene understanding. Along the way, I discovered how fundamentally different this is from what robots like OpenVLA do with the same camera input. Here’s the full pipeline — from photons hitting your webcam sensor to tokens coming back from the cloud.
[... 1,756 words]
10 Hidden Concepts in Strands SDK Hooks That You Won’t See by Reading the Code
The Strands SDK hook system looks simple on the surface — register a callback, receive an event. But there are 10 hidden concepts buried in the design that you’ll never see by just reading the code. Here’s what’s actually happening under the hood.
[... 1,390 words]
Top Claude Code Skills: What 20 YouTube Videos and 2.3M Views Agree On
I researched 20 YouTube videos on Claude Code skills, fed them all into Google NotebookLM, and asked it to synthesize the top skills across every source. Here’s what came back — ranked by how often they were mentioned and how impactful creators found them.
[... 848 words]
Build a Research Engine in Claude Code: YouTube Search → NotebookLM → Synthesized Insights in 5 Minutes
I built a research engine inside Claude Code that searches YouTube, feeds results into Google NotebookLM, and lets me query across all the sources — all without leaving the terminal. Here’s exactly how it works and how to set it up yourself.
[... 1,151 words]
playwright-cli: How to Give Your AI Coding Assistant a Real Browser
If you use Claude Code (or any AI coding assistant), there’s a tool that makes browser automation trivially easy: playwright-cli. It’s a command-line wrapper around Microsoft’s Playwright that lets you control a real browser from your terminal — navigate pages, click buttons, fill forms, take screenshots, and scrape content. Here’s how to set it up and why it’s genuinely useful.
[... 1,104 words]
Context Engineering for AI Agents: 6 Techniques from Claude Code, Manus, and Devin
After studying how production AI agents like Claude Code, Manus, and Devin actually work under the hood, the single most important concept isn’t prompt engineering — it’s context engineering. The art of controlling exactly what goes into the model’s context window, and what stays out.
[... 2,082 words]
Multi-Agent Is Two Problems: Why TypeScript and Python Each Win Half
Will multi-agent be done better in JavaScript than Python? No. They solve different types of multi-agent. This is the most important distinction nobody is making clearly enough.
[... 1,364 words]
Why Every Coding Agent Is TypeScript (And Every ML Framework Is Python)
Every major coding agent — Claude Code, OpenCode, Pi, Amp — is built in TypeScript. Not Python. This isn’t a coincidence. It’s architecture. And to understand why, you need to go back to the origin stories.
[... 2,983 words]
Why .md Files Are the Agent: How ClawdBot Turns Markdown Into a Living Assistant
You’d expect an AI agent to be built from code — Python classes, orchestration frameworks, complex pipelines. But ClawdBot (OpenClaw) does something different. Its agent is mostly .md files sitting in a folder. Plain text. Markdown. The kind of thing you’d write documentation in.
[... 1,496 words]
OpenClaw vs Claude Code Agent Teams — Architecture Comparison
OpenClaw and Claude Code Agent Teams share DNA — both use .md files as personality/rules injected into system prompts, both support multiple sessions with separate context windows, and both execute tools. But they’re fundamentally different in scope and execution model.
[... 1,183 words]
OpenClaw / ClawdBot Architecture — How a Telegram AI Bot Actually Works
OpenClaw (ClawdBot) is a Telegram bot powered by Claude Opus that delivers daily AI news digests, runs scheduled jobs, and maintains persistent conversations. Here’s the full architecture — how every piece connects.
[... 1,200 words]
Agentic Engineering Patterns — Key Takeaways from Simon Willison’s Guide
Simon Willison recently published an excellent guide on agentic engineering patterns — practical lessons from building with AI coding agents. Here are the key takeaways that matter most for AI agent developers, distilled from his full guide.
[... 567 words]
How AWS Strands Hooks Work
Hooks are an event-driven extensibility system — a way to inject custom logic at specific points in the agent lifecycle without modifying core code. Think of them as middleware/interceptors.
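As an illustration of that middleware framing (a generic sketch, not the actual Strands SDK API), a hook registry can be as small as this:

```python
# Generic event-driven hook registry: register a callback for a named
# lifecycle event, then fire all callbacks for that event without
# modifying any core agent code.
from collections import defaultdict

class HookRegistry:
    def __init__(self):
        self._hooks = defaultdict(list)

    def on(self, event, callback):
        self._hooks[event].append(callback)

    def emit(self, event, payload):
        for cb in self._hooks[event]:
            cb(payload)

hooks = HookRegistry()
seen = []
hooks.on("before_tool_call", lambda p: seen.append(p["tool"]))
hooks.emit("before_tool_call", {"tool": "read_file"})
```

The agent core only ever calls emit() at fixed lifecycle points; everything else plugs in from outside.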
[... 713 words]
How AWS Strands Agent Loop Works
Strands uses the ReAct pattern — a recursive loop where the LLM reasons, optionally calls tools, observes results, then reasons again. It is NOT prompt chaining (where you have a fixed sequence of prompts). The loop is open-ended and driven by the model’s decisions.
[... 650 words]
How Claude Team Agents ACTUALLY Connect — No Fluff
Agents are separate processes. They don’t share memory. They don’t call each other’s functions. They communicate through files on disk and CLI commands that Claude Code provides as tools.
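The file-on-disk half of that handoff can be sketched in a few lines; the paths and message schema here are illustrative, not what Claude Code actually writes:

```python
# One agent process drops a JSON message in a shared folder; another
# process polls the folder and reads it. No shared memory, no RPC.
import json
import tempfile
from pathlib import Path

inbox = Path(tempfile.mkdtemp()) / "agent-b-inbox"
inbox.mkdir()

# "Agent A": write a task file to disk.
(inbox / "task-001.json").write_text(json.dumps({"task": "review PR"}))

# "Agent B": pick up whatever arrived, in filename order.
messages = [json.loads(p.read_text()) for p in sorted(inbox.glob("*.json"))]
print(messages)
```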
[... 2,382 words]
How Pi Builds Its System Prompt at Runtime — And the Innovations That Make It Stand Out
A deep dive into the open-source coding agent that assembles its brain on the fly.
[... 2,918 words]