Akshay Parkhi's Weblog

All my curiosities — notes on GenAI, agents, physical AI, and quantum.

April 24, 2026

AgentCore Harness, Inside Out

What’s actually running when AWS says “declarative agents” — and when it’s the right tool.

[... 2,598 words]

April 23, 2026

MCP Apps Explained: How an AI Agent Shows Live Widgets Inside the Chat

I built a greeting card generator and got confused. The AI agent showed a real card with buttons inside the chat, and I couldn’t figure out why. Here’s what I learned — explained the way I wish someone had explained it to me.

[... 2,166 words]

April 14, 2026

AgentCore Registry: The Missing Yellow Pages for AI Agents

How we stopped hardcoding ARNs, what we learned publishing an MCP server and an A2A agent, and the VPC-endpoint footgun that shipped into every team’s first demo.

[... 2,439 words]

April 9, 2026

Beyond Tool Calling: A Practical Tour of Advanced MCP Concepts

If you’ve used MCP for a few weeks, you already know the basics: a server exposes tools, resources, and prompts, and a client (usually an LLM-driven agent) calls them. That mental model gets you surprisingly far. But it also flattens MCP into “just tool calling,” and you start to wonder what makes the protocol interesting compared to a plain JSON-RPC schema.

[... 2,874 words]

I Built an Agent in 5 Minutes: Anthropic Managed Agents vs AWS AgentCore + Strands

A side-by-side look at two very different bets on what “agent infrastructure” should mean.

[... 1,669 words]

April 5, 2026

AgentCore Auth from First Principles: How JWT Flows from Browser to Agent Container

When you deploy a React frontend on S3+CloudFront that talks directly to AWS AgentCore Runtime — no API Gateway, no Lambda proxy — is that secure? We traced every byte from browser to agent container to find out.

[... 1,417 words]

HTTP vs AG-UI: What Actually Changes in Your React Code

A question that comes up once you understand how AG-UI works: isn’t this just HTTP streaming with a defined event format? Could you achieve the same thing with the HTTP protocol if you defined the same output structure?

[... 1,489 words]

April 4, 2026

All Four AgentCore Protocols Are Just HTTP: What AG-UI, MCP, and A2A Actually Do

A question that comes up once you understand how AG-UI works: isn’t this just HTTP streaming with a defined event format? Could you achieve the same thing with the HTTP protocol if you defined the same output structure?

[... 1,489 words]

HTTP vs MCP vs A2A vs AG-UI: The Four Protocols of AgentCore Runtime

When you deploy an agent to AWS AgentCore Runtime, you pick a protocol: HTTP, MCP, A2A, or AGUI. This choice determines how your agent talks to the outside world — what it receives, what it sends back, and who it talks to. All four run on identical infrastructure. The differences live entirely in the framing and application layers.

[... 2,362 words]

AG-UI Protocol: A Layer-by-Layer Deep Dive with Real Network Captures

There’s a common misconception about AG-UI: people treat it as a transport protocol. It isn’t. AG-UI rides on top of HTTP and WebSocket — it doesn’t replace them. Understanding where each layer starts and stops is the key to debugging, optimizing, and building correctly with it.

[... 2,076 words]

AG-UI Protocol: The Missing Standard for AI Agent Interfaces

If you’ve built applications with AI agents, you’ve hit this wall: every framework has its own way of streaming responses to the UI. LangChain uses callbacks and streaming iterators. CrewAI returns completed results. AutoGen has its own message protocol. Amazon Bedrock Agents uses a proprietary streaming format. OpenAI Assistants has yet another event structure.

[... 1,959 words]

March 31, 2026

Does Claude Code Test Itself? Yes — Here’s What’s Actually in the Source

Anthropic published a blog post on demystifying evals for AI agents. It recommends three grader types, eight setup steps, and a feedback loop from production back into improvement decisions. What makes this interesting is what the Claude Code source code reveals: the product doesn’t just follow the philosophy — it IS the eval system.

[... 1,462 words]

Claude Code’s Design Philosophy: 10 Patterns to Use for Your Agent Systems

A deep dive into Claude Code’s engineering decisions — the prompt architecture, tool philosophy, concurrency model, permission system, and memory design that make it work. Each section includes what you can apply to your own agent systems.

[... 2,415 words]

Multiple MCP Servers Through Amazon Bedrock AgentCore Gateway

As AI agents scale in enterprises, teams build dozens of specialized MCP (Model Context Protocol) servers — one for order management, another for product catalog, yet another for promotions. Each server has its own endpoint, its own auth, its own tool definitions. The agent that consumes these tools suddenly becomes an integration nightmare.

[... 1,675 words]

March 28, 2026

OpenUSD: Advanced Patterns and Common Gotchas

Deeper OpenUSD concepts — schemas, rendering rules, performance patterns, and the gotchas that catch people off guard.

[... 1,122 words]

March 25, 2026

OpenUSD Mastery: From Composition to Pipeline — An SO-101 Arm Journey

OpenUSD (Universal Scene Description) is not just a 3D modeling format — it’s a universal language for describing complex scenes, their relationships, and their properties. Think of it as JSON for 3D worlds, but infinitely more powerful.

[... 1,260 words]

March 19, 2026

Learning OpenUSD — From Curious Questions to Real Understanding

Written as I explored OpenUSD before my exam. These are real questions I asked, and the answers that actually made things click for me.

[... 1,135 words]

March 18, 2026

7 Mental Models for Building Agent Skills (From Anthropic’s Internal Playbook)

Anthropic just published their internal playbook for Claude Code Skills — based on hundreds of skills in active use. Buried inside the practical advice are deep mental models for building better agents. Here’s what they’re really telling you.

[... 1,275 words]

From Prompt Engineering to Harness Engineering: Building Infrastructure for Autonomous Agents

2025 was the year of agents. 2026 is the year of harnesses — the persistent infrastructure that gives a foundation model hands, feet, and senses. The shift is fundamental: from prompt engineering (optimizing single interactions) to harness engineering (building the systems that control long-running, autonomous agents).

[... 1,174 words]

March 15, 2026

The Agent Loop Iceberg — 10 Hard Problems Hiding Beneath the Simple Loop

The basic agent loop — LLM call, tool execution, observe result, repeat — is maybe 10% of a production agent’s code. The other 90% is making it reliable, resumable, extensible, and production-grade. After tracing through real agent source code, here are the ten hard problems hiding beneath the surface that nobody shows you in tutorials.

[... 1,695 words]

March 13, 2026

Autoresearch and Context Rot — How a Stateless Agent Loop Avoids Memory Problems (And Where It Breaks)

The autoresearch pattern — where a coding agent runs hundreds of autonomous experiments to optimize code — produced a 53% speedup on Shopify’s 20-year-old Liquid codebase and a 69x speedup on a demo text processor. But there’s a fundamental flaw nobody talks about: the agent has no memory of failed experiments. Here’s exactly how the pattern works, where it breaks, and how Tobi Lütke’s team quietly fixed it.

[... 2,392 words]

How Skills Work in AI Agents — From Lazy-Loading Instructions to LLM Attention Weights

When you hear “skills” in AI agents, it sounds like a new concept. It’s not. Skills are a lazy-loading pattern for instructions — delivered through the same tool-calling mechanism the LLM already uses. But the details of how they load, where they land in the message hierarchy, and why they break at scale reveal deep truths about how LLMs actually work.

[... 2,801 words]

Coding in the AI Agent Age — Why Typing Code Is Dying But Engineering Is Thriving

If you think coding is just putting human-defined processes into structures, loops, functions, rules, packages, and web pages — you’re not wrong about the past. But that definition is dying. AI is automating the typing. What remains is the thinking.

[... 1,387 words]

Mental Models in the AI Agent Age

Mental models are compressed knowledge of human experience — patterns discovered over centuries by many thinkers across physics, biology, economics, mathematics, and systems theory. In the age of AI agents, these same patterns don’t just help you think better. They help you build better systems, debug reality faster, and make decisions that compound over decades.

[... 1,757 words]

March 12, 2026

I Ran 100 Parallel Tool Calls on AgentCore — The microVM Didn’t Break, But the LLM Did

What happens when you fire 100 tool calls in parallel inside a single AgentCore microVM? Does the microVM crash? Does it run out of memory? Does the thread pool explode? I deployed an agent with 100 tools to Amazon Bedrock AgentCore Runtime and ran a scaling test from 5 to 100 parallel tool calls. Here’s exactly what happened.

[... 2,597 words]

The 95% Rule: Why Your Agent Is Slow and How to Prove It

Your agent takes 5 seconds to respond. Where did those 5 seconds go? AgentCore gives you 6 observability layers, 30 hidden metrics, and a debugging decision tree — but you have to know where to look. Here’s everything you can’t see by just reading the code.

[... 2,829 words]

What Actually Happens When You Call invoke_agent_runtime()

You call invoke_agent_runtime(). Your agent responds 3 seconds later. But what actually happened in those 3 seconds? There’s an entire orchestration layer — sidecars, health checks, microVM boot sequences — that you never see. Here’s the full picture.

[... 1,146 words]

Inside an AgentCore microVM — Ports, Cold Starts, and the Sidecar Pattern

When you deploy an agent on Amazon Bedrock AgentCore Runtime, your Docker container runs inside a Firecracker microVM. But what actually happens inside that microVM? Here’s the complete picture — what boots, what listens on which port, why there’s a non-root user, and exactly what determines a cold start vs a warm start.

[... 1,416 words]

March 11, 2026

AgentCore Runtime vs Lambda — Scaling, Warm Pools, and Why Fixed 8 GB Boxes Exist

Amazon Bedrock AgentCore Runtime uses Firecracker microVMs to run AI agent tools in isolated environments. But if you’ve used Lambda, it sounds familiar — serverless, auto-scaling, pay-per-use. So why does AgentCore exist? Here’s the complete picture: how AgentCore actually scales, what it can and can’t do, and when you’d pick it over Lambda or ECS.

[... 1,534 words]

How Firecracker MicroVMs Power AgentCore Runtime — From 125ms Boot to Auto-Scaling AI Agents

When AWS needed to run Lambda functions — millions of them, simultaneously, for strangers on the internet — containers weren’t isolated enough and full VMs were too slow. So they built Firecracker: a microVM that boots in ~125 milliseconds with ~5 MB of memory overhead, gives you hardware-level isolation, and lets you pack thousands of them onto a single server. Now Amazon Bedrock AgentCore Runtime uses the same technology to run AI agent tools. Here’s exactly how it all works.

[... 2,548 words]