Paolo Perrone on Google ADK and Code-First Agents
A deeper look at Paolo Perrone's take on Google ADK and what code-first, event-driven agents mean for teams building AI systems.
Paolo Perrone recently shared something that caught my attention: "Google's new agent framework makes LangChain look like a toy. It's called ADK. It treats agent development like software development. Not prompt engineering." That framing is provocative, but it also points to a real shift many teams are feeling right now.
We have spent the last year shipping agent demos by chaining prompts, patching tool calls, and debugging behavior by reading transcripts. It works until it does not: once an agent becomes a product feature with uptime, cost, safety, and change management requirements, the old habits start to creak.
In his post, Paolo argues that Google ADK (Agent Development Kit) pushes agents toward standard engineering practices: code-first development, an event-driven runtime, and a tool ecosystem that looks more like a platform than a prompt library. I want to expand on why that matters, what it changes day-to-day, and where the tradeoffs show up.
ADK is an opinionated bet: agents are software
Paolo's core claim is not just that ADK has more features. It is that it treats agent development like software development.
If your agent cannot be versioned, tested, reviewed, and deployed safely, it is not really production software yet.
In practice, that means ADK encourages you to define agents and workflows in code (Paolo highlights Python, Java, TypeScript, and Go), then run them in a runtime that is built for orchestration and long-running tasks. This is a different posture from the request-response mindset that dominates typical chat-based integrations.
When you adopt a code-first approach, you unlock familiar workflows:
- Code review and diffable changes, instead of editing prompts in a dashboard with unclear provenance
- Unit and integration testing for deterministic parts of the system
- CI/CD pipelines that promote agent updates like any other service
- Clear ownership boundaries between product logic, orchestration logic, and model behavior
This does not eliminate prompt work. It just puts prompts in the same place as the rest of your system: in code, under change control.
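To make that concrete, here is a framework-agnostic sketch of what "prompts under change control" can look like. AgentSpec, render_prompt, and the model id are illustrative names I am inventing for this post, not ADK's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentSpec:
    """An agent definition that lives in the repo, under change control."""
    name: str
    model: str
    system_prompt: str
    version: str

def render_prompt(spec: AgentSpec, user_input: str) -> str:
    """Deterministic prompt assembly: diffable in review, trivial to unit test."""
    return f"[{spec.name} v{spec.version}]\n{spec.system_prompt}\n\nUser: {user_input}"

triage = AgentSpec(
    name="support-triage",
    model="gemini-flash",  # hypothetical model id, for illustration only
    system_prompt="Classify the ticket as billing, bug, or question.",
    version="1.2.0",
)
```

Because the prompt is assembled by a pure function, a change to the system prompt shows up as a one-line diff in review, and a unit test can pin the exact string a given version produces.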
Event-driven runtime: why request-response is limiting
Paolo calls out an "event-driven runtime, not request-response." That matters because many real agent tasks do not fit neatly into a single synchronous call:
- A research agent that runs multiple searches, waits on rate limits, and revisits results
- A data agent that triggers BigQuery jobs and polls for completion
- A coding agent that executes code, captures outputs, and iterates
- A support agent that needs a human confirmation step before taking an irreversible action
An event-driven model is a better match for these realities. Instead of pretending everything happens inside one LLM call, you model the workflow as a series of events, tool invocations, and state transitions. The result is often more observable, more recoverable, and easier to scale.
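The shape of that loop can be sketched in plain Python. This is not ADK's runtime, just the pattern: the agent is driven by a stream of typed events, and every transition is visible in a log you can replay:

```python
import queue

def run_workflow(events: "queue.Queue[dict]") -> list:
    """Drive the agent as a stream of events and state transitions,
    not a single synchronous call."""
    log = []
    done = False
    while not done:
        event = events.get()
        if event["type"] == "tool_result":
            log.append(("record", event["payload"]))
        elif event["type"] == "needs_retry":
            log.append(("retry", event["payload"]))
        elif event["type"] == "finished":
            done = True
    return log

# Example run: a search completes, a rate limit forces a retry, then we finish.
q = queue.Queue()
for e in [{"type": "tool_result", "payload": "search results"},
          {"type": "needs_retry", "payload": "rate limited"},
          {"type": "finished"}]:
    q.put(e)
log = run_workflow(q)
```

The point is recoverability: because every step is an event, a crashed run can be resumed from the log instead of restarting the whole conversation.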
Agent primitives: predictable workflows plus dynamic routing
One detail from Paolo's list that I think is easy to gloss over is the separation between reasoning agents and pipeline agents. He mentions primitives like:
- LlmAgent for reasoning
- SequentialAgent, ParallelAgent, LoopAgent for deterministic pipelines
- AgentTool to use agents as tools inside other agents
- LLM-driven routing for dynamic task delegation
This combination is powerful because it lets you mix two styles:
Deterministic pipelines where you need reliability
If a step must happen in a strict order (validate input, fetch customer record, compute policy decision, draft response), a SequentialAgent style pipeline gives you a structure that is testable and debuggable. Parallel and loop patterns matter when you need to fan out (for example, fetch from multiple sources) or iterate until a stopping condition is met.
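As a sketch of the pattern (plain Python rather than ADK's actual SequentialAgent; validate, fetch_record, and decide are hypothetical steps):

```python
def sequential_pipeline(steps, payload):
    """Run steps in strict order; each step is separately testable."""
    for step in steps:
        payload = step(payload)
    return payload

def validate(p):
    if "customer_id" not in p:
        raise ValueError("missing customer_id")
    return p

def fetch_record(p):
    # stand-in for a real customer lookup
    return {**p, "record": {"tier": "gold"}}

def decide(p):
    return {**p, "decision": "approve" if p["record"]["tier"] == "gold" else "review"}

result = sequential_pipeline([validate, fetch_record, decide], {"customer_id": 42})
```

Each step is an ordinary function, so you can test `decide` against edge-case records without involving a model at all.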
Dynamic delegation where you need flexibility
LLM-driven routing is the opposite: you allow the model to decide which specialist agent or tool to call next. That is great for open-ended tasks like research, triage, or multi-topic customer requests.
The trick in production is not choosing one style. It is deciding where each applies. My rule of thumb is:
- Deterministic structure for safety-critical or cost-sensitive steps
- Dynamic routing for exploration, summarization, and ambiguous intent
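The two styles can be combined behind one delegation function. In this sketch, keyword_router stands in for the LLM's routing decision, and the allowed-set check is the deterministic guard around it:

```python
def keyword_router(task: str) -> str:
    """Stub for the LLM's routing decision; a real system would ask the model."""
    if "invoice" in task:
        return "billing"
    if "crash" in task:
        return "bug"
    return "unknown"

def delegate(task, specialists, choose):
    """Let the router pick a specialist, but only from an allowed set."""
    name = choose(task)
    if name not in specialists:
        name = "fallback"  # never let the router escape the known agents
    return specialists[name](task)

specialists = {
    "billing":  lambda t: f"billing handled: {t}",
    "bug":      lambda t: f"bug filed: {t}",
    "fallback": lambda t: f"escalated: {t}",
}
```

The design choice worth copying is the fallback: dynamic routing decides *which* specialist runs, but deterministic code decides *whether* that choice is permitted.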
The tool ecosystem is the real differentiator
Paolo says, "The tool ecosystem is where it gets ridiculous," then lists built-ins and integrations that hint at a platform play:
- Google Search and Code Execution built in
- Full MCP support (for example BigQuery and Google Maps MCP servers)
- Point it at any OpenAPI spec and it auto-generates tools
- Direct integration paths for LangChain, LlamaIndex, CrewAI, and LangGraph
- Human-in-the-loop confirmation before tool execution
- Long-running async tools
Let me unpack why each of these matters when you are beyond demos.
OpenAPI to tools reduces glue code and mismatches
In many teams, the slow part is not the model. It is writing and maintaining the adapter layer between the agent and business systems. Auto-generating tools from OpenAPI specs can reduce manual boilerplate and keep tool definitions aligned with the source of truth.
The caution: generated tools are only as good as the API design. You still need guardrails (scopes, auth, rate limits, safe defaults) and you still need to curate which endpoints are actually safe for an agent.
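A minimal sketch of that curation step, assuming nothing about ADK's generator (tools_from_openapi and the allowlist are my own names): walk the spec's paths, but only emit tools for operations you have explicitly approved:

```python
def tools_from_openapi(spec: dict, allowlist: set) -> dict:
    """Turn OpenAPI operations into tool descriptors,
    keeping only explicitly allowlisted operations."""
    tools = {}
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            op_id = op.get("operationId", f"{method}_{path}")
            if op_id not in allowlist:
                continue  # curate: not every endpoint is agent-safe
            tools[op_id] = {
                "method": method.upper(),
                "path": path,
                "description": op.get("summary", ""),
            }
    return tools

spec = {
    "paths": {
        "/orders": {
            "get":  {"operationId": "listOrders", "summary": "List orders"},
            "post": {"operationId": "createOrder", "summary": "Create an order"},
        }
    }
}
tools = tools_from_openapi(spec, allowlist={"listOrders"})
```

Here the read-only endpoint becomes a tool while the write endpoint is silently excluded, which is usually the right default for a first deployment.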
MCP support signals a shared, modular tooling future
Model Context Protocol (MCP) is emerging as a standard way to expose tools and data sources. If ADK supports MCP servers for common services, it makes tools more portable across agent stacks.
The big win is not one framework winning. The win is teams being able to swap runtimes without rewriting every integration.
Human-in-the-loop is not optional for many workloads
The ability to require confirmation before tool execution is a practical safety feature. If an agent can send an email, modify a database record, or place an order, you will want approval gates, at least until you have strong trust and monitoring.
Long-running async tools are similarly practical: many enterprise operations take seconds or minutes, not milliseconds.
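An approval gate is easy to express as a wrapper, independent of any framework. In this sketch, with_confirmation and ask_human are hypothetical names; ADK's actual confirmation mechanism will differ:

```python
def with_confirmation(tool, irreversible, ask_human):
    """Wrap a tool so that irreversible actions need an explicit human yes."""
    def guarded(*args, **kwargs):
        if irreversible and not ask_human(tool.__name__, args, kwargs):
            return {"status": "blocked", "reason": "human declined"}
        return tool(*args, **kwargs)
    return guarded

def send_email(to, body):
    # stand-in for the real side effect
    return {"status": "sent", "to": to}

declined = with_confirmation(send_email, True, lambda *a: False)("a@example.com", "hi")
approved = with_confirmation(send_email, True, lambda *a: True)("a@example.com", "hi")
```

The useful property is that the gate lives outside the tool: you can tighten or relax it per environment (always ask in production, auto-approve in staging) without touching the tool itself.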
Model support: optimized for Gemini, but not trapped
Paolo notes ADK is optimized for Gemini but model-agnostic, with LiteLLM integration and access to the Vertex AI Model Garden.
This is important for two reasons:
- You can standardize your orchestration while still choosing models per task (cheap model for routing, stronger model for synthesis, specialized model for code).
- You can keep leverage. If performance, cost, or policy changes, you are not locked into a single provider.
The operational reality is that multi-model setups are becoming normal. A framework that makes swapping models routine can reduce both vendor risk and performance risk.
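One way to make that routine is to centralize model choice in a policy table rather than scattering model ids through the code. Everything here is hypothetical (the model names and cost ceilings are placeholders, not real products):

```python
MODEL_POLICY = {
    # task kind -> which model to use and a cost ceiling per call (USD)
    "routing":   {"model": "small-fast",   "max_cost": 0.001},
    "synthesis": {"model": "large-strong", "max_cost": 0.05},
    "code":      {"model": "code-tuned",   "max_cost": 0.02},
}

def pick_model(task_kind: str) -> str:
    """Resolve a task kind to a model; unknown kinds fall back to the cheap tier."""
    policy = MODEL_POLICY.get(task_kind, MODEL_POLICY["routing"])
    return policy["model"]
```

Swapping providers then becomes a one-table change plus a re-run of your evaluation suite, which is exactly the leverage the post describes.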
Multimodal streaming: audio and video change the agent surface
Paolo also mentions bidirectional audio and video streaming out of the box. That deserves attention because it expands what an "agent" means. You are no longer limited to text chat:
- Voice assistants that can interrupt, clarify, and confirm actions in real time
- Video-aware agents for demos, troubleshooting, or guided workflows
- Real-time support experiences that feel closer to a call than a ticket
Streaming changes your architecture needs: latency budgets, partial results, and ongoing sessions become first-class concerns.
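To illustrate why partial results become first-class, here is a small asyncio sketch: model_stream stands in for a real model stream, and the consumer enforces a per-chunk latency budget instead of waiting for one final answer:

```python
import asyncio

async def model_stream():
    """Stand-in for a real streaming response from the model."""
    for chunk in ["Sure, ", "restarting ", "the router now."]:
        await asyncio.sleep(0)  # simulate network/model latency
        yield chunk

async def consume(stream, per_chunk_budget: float) -> str:
    """Surface each partial chunk under a latency budget; a slow chunk
    raises TimeoutError instead of silently stalling the session."""
    parts = []
    it = stream.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(it.__anext__(), timeout=per_chunk_budget)
        except StopAsyncIteration:
            break
        parts.append(chunk)  # in a real UI, render this chunk immediately
    return "".join(parts)

transcript = asyncio.run(consume(model_stream(), per_chunk_budget=1.0))
```

In a request-response design there is nowhere to put that budget; in a streaming design it is just a parameter.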
So, does ADK make LangChain look like a toy?
Paolo's phrasing is intentionally spicy, and I would translate it more gently: ADK appears to target a more complete software lifecycle for agents.
LangChain (and similar frameworks) can absolutely be used in serious systems, especially when paired with good engineering discipline. But ADK is signaling a different default: treat agents like services, not like prompt experiments.
If you are debating frameworks, a more useful question is:
- Do we need a workflow and tooling platform, or a flexible composition library?
Practical next steps if you want to evaluate ADK
Paolo ends with a simple call: pip install google-adk. If you are curious, here is a pragmatic evaluation path:
- Rebuild a small internal agent you already have (a research assistant or support triage bot) using ADK primitives.
- Identify which steps should be deterministic (pipeline) vs dynamic (router).
- Integrate one real tool via OpenAPI or MCP and add a human confirmation gate.
- Add basic tests for the deterministic pieces and wire it into CI.
- Measure what improved: debugging time, iteration speed, tool reliability, and deployment confidence.
That kind of comparison will tell you more than any feature checklist.
This blog post expands on a viral LinkedIn post by Paolo Perrone.