Not all agents are deployed equal

A performance comparison between Python and GoLang runtimes for serverless AI agents on Google Cloud Run, highlighting a 40x difference in startup latency.

The Python vs GoLang performance gap for scale-to-zero serverless AI

If you are currently deploying AI agents on Google Cloud, you are likely considering Cloud Run to host them (and for good reason). As a modern and fully-managed serverless platform, it checks the important boxes:

  • Scale-to-zero: Your agent “sleeps” while inactive, saving you money.
  • Container runtime: Total flexibility to bring the binaries, frameworks, or programming language of your choice.

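For reference, scale-to-zero behavior is governed by the revision's minimum instance count. A minimal Cloud Run service manifest might look like the sketch below (the service and image names are placeholders, not from the experiment):

```yaml
# Hypothetical Cloud Run service manifest; names are placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-agent
spec:
  template:
    metadata:
      annotations:
        # minScale "0" lets the revision scale to zero while idle.
        autoscaling.knative.dev/minScale: "0"
    spec:
      containers:
        - image: gcr.io/my-project/my-agent:latest
```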
That being said, not all agents are deployed equal.

To illustrate this point, I ran an experiment a couple of weeks ago, building two functionally identical agents using Google’s Agent Development Kit (ADK). The only difference between them was that one was built with the Python ADK and the other with the GoLang ADK (released just over a month ago). Both were deployed to Cloud Run.

The difference in container startup latency was significant (see screenshot below):

  • Python: ~9.5s
  • GoLang: ~250ms

That is roughly a 40x difference!

Fig. Comparison of Cloud Run container startup latency. Python ADK runtime is shown at the top and GoLang ADK runtime shown at the bottom.

In hindsight, this result was perhaps not too surprising given Python’s heaviness as a runtime (starting the Python interpreter, importing large libraries, etc.) compared to a pre-compiled GoLang binary. But does this difference really matter?

From a user’s perspective, I’d argue that this difference could be very significant. In the simplest scenario, imagine sending “hi” to an agent and having it begin processing almost immediately. The agent feels snappy and reactive. Contrast that with staring at a loading spinner for 10 seconds before the agent even starts processing the request…

To highlight another example where this may be significant, consider a multi-agent system involving a chain of agents either collaborating or delegating to one another. These interactions could be several layers deep. Assuming each agent is independently hosted (and potentially cold), time-to-first-response from the user’s perspective compounds with each new agent-to-agent interaction.
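To put rough numbers on the compounding effect, the measured cold starts above can be multiplied across a chain of independently hosted agents. A back-of-the-envelope sketch (the chain depth is an assumption, and model inference time is ignored):

```go
package main

import "fmt"

// timeToFirstResponse sums per-agent cold-start latency over a chain
// of `depth` sequential agent-to-agent hops, ignoring inference time.
func timeToFirstResponse(coldStartMs float64, depth int) float64 {
	return coldStartMs * float64(depth)
}

func main() {
	const depth = 3 // hypothetical chain: user -> agent -> agent -> agent
	pythonMs := timeToFirstResponse(9500, depth) // ~9.5s per cold start
	goMs := timeToFirstResponse(250, depth)      // ~250ms per cold start

	// With every agent in the chain cold, pure startup overhead is
	// ~28.5s for the Python chain versus ~0.75s for the Go chain.
	fmt.Printf("Python chain: %.1fs, Go chain: %.2fs\n",
		pythonMs/1000, goMs/1000)
}
```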

The key takeaway here is that infrastructure and runtime decisions cannot necessarily be decoupled from user experience. The engineering details at every level matter when building fit-for-purpose agent applications.

If you’re looking for more deep dives into building AI agents on Google Cloud, stay tuned for more in the coming weeks. In the meantime, check out the GoLang ADK to get started building your own agents.
