Enterprise organizations can successfully scale agentic AI by implementing a centralized MCP Gateway and Registry that provides standardized tool discovery, security governance, and quality control, enabling thousands of engineers to safely and efficiently use AI agents across complex service landscapes.
Deep Dive
How Uber Runs 60K AI Agents
Good morning, everyone. It's great to be here. I'm Meghna and this is Raj. We lead the agentic AI platform and initiatives at Uber. Today, we're going to share our journey of taking MCPs from a promising protocol to operating them at massive scale across thousands of engineers, services, and agents. We're also going to cover some of our challenges, similar to what James was mentioning, what our solutions were, some of the lessons we learned along the way, and a quick look at what we're planning for the near future.
Okay, let's get started by grounding this in scale. At Uber, we have more than 5,000 engineers with more than 90% of them using AI every single month for agentic workflows.
And this does not yet include all of the non-engineering folks, thousands of them who are also now trying agentic workflows. So, this is no longer a pilot program at Uber. It's the new standard for how we work.
The reality of a company this big is that we have more than 10,000 services, and our knowledge is spread across this very complex landscape. Without MCPs, every agent has to rediscover how to interact with each service. On top of this, we are now seeing more than 1,500 monthly active agents just internally and over 60,000 executions per week. This is very high demand and, as you can see, very high velocity, but without standardization it becomes chaotic very quickly.
And so, MCPs are not just important, they really are what make AI usable at Uber.
And at this scale, the standard MCP challenges become a lot more amplified.
The first class of problems that we saw was around the development life cycle.
Without a central framework or guidance, there was no standard way to develop and deploy these MCP servers at Uber. So, we had a lot of teams building out these custom integrations independently, and most of this was non-reusable.
We also had everybody trying to figure out how to solve the very same problems in silos of their own.
The simple truth was, if you can't manage the development life cycle, you just can't trust it in production.
And at Uber, security is non-negotiable.
But with this many bespoke ways of doing things, governance started becoming a very immediate concern for us. We needed complete visibility into the call patterns and who was accessing what data. In reality, it takes us humans a lot more effort to break things, but with agents, as you know, it's a lot faster, and the blast radius is a lot larger.
So, we had to make sure that there was no unauthorized access to any data or any of our critical endpoints and services, even unknowingly. We also had to account for the data-handling risks of some of the third-party MCPs, because Uber does use a lot of external systems.
Now, the last class of problems that we saw was around discovery and quality.
How does an agent or even an engineer find the right MCP? And not just any MCP, something that is reliable, has high performance, and is safe.
Bad tools don't just fail, they also degrade agent performance at the end of the day.
Our solution was to build the MCP Gateway and Registry.
Think of this as the control plane for all MCP interactions at Uber.
With this, we moved to a config-driven approach. We now translate all Uber service endpoints into MCP tools automatically.
Service owners, who are basically the experts, stay in control of which tools actually get exposed, and they also fine-tune the descriptions for the LLMs.
This removes a lot of duplication and also enforces consistency for us across the board.
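To make the config-driven idea concrete, here is a minimal sketch of translating service endpoints into MCP-style tool definitions while a service owner controls exposure and descriptions. The `Endpoint` and `OwnerConfig` shapes and the tool naming scheme are assumptions for illustration, not Uber's actual schema; only the `name` / `description` / `inputSchema` layout follows the MCP tool format.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """One service endpoint (hypothetical shape, for illustration only)."""
    service: str
    method: str
    comment: str
    params: dict[str, str]  # parameter name -> JSON Schema type


@dataclass
class OwnerConfig:
    """Service-owner settings: which methods to expose and how to describe them."""
    exposed: set[str] = field(default_factory=set)
    descriptions: dict[str, str] = field(default_factory=dict)


def to_mcp_tools(endpoints: list[Endpoint], config: OwnerConfig) -> list[dict]:
    """Translate the exposed endpoints into MCP-style tool definitions."""
    tools = []
    for ep in endpoints:
        if ep.method not in config.exposed:
            continue  # owners opt endpoints in; everything else stays hidden
        tools.append({
            "name": f"{ep.service}_{ep.method}",
            # An owner-tuned description wins over the raw endpoint comment.
            "description": config.descriptions.get(ep.method, ep.comment),
            "inputSchema": {
                "type": "object",
                "properties": {n: {"type": t} for n, t in ep.params.items()},
                "required": sorted(ep.params),
            },
        })
    return tools


endpoints = [
    Endpoint("trips", "GetTrip", "Fetch a trip by ID.", {"trip_id": "string"}),
    Endpoint("trips", "DeleteTrip", "Delete a trip.", {"trip_id": "string"}),
]
config = OwnerConfig(
    exposed={"GetTrip"},
    descriptions={"GetTrip": "Look up one trip by its ID. Read-only."},
)
tools = to_mcp_tools(endpoints, config)
```

Because exposure is opt-in, an endpoint that nobody lists in the owner config never becomes a tool, which is one simple way to keep the generated catalog consistent.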
The other thing we did was follow a different strategy for how we treat and enable third-party MCPs versus our in-house MCPs. We introduced many more levels of gating, scanning, and rigorous checks for external systems than for our own trusted internal systems.
We also deprecated all the one-off standard playground environments that people had started spawning off, and everything is now centrally committed and managed in code.
We also introduced a central registry.
This is the single source of truth for discovering all the MCPs at Uber and their versions.
And we did all of this with security and privacy built into every single layer.
We integrated with our authorization service centrally, so there is no access to any data that a caller is not supposed to have.
We also integrated with our PII redaction service, which automatically redacts any of our sensitive data.
We also do code scanning, both at diff commit time and periodically across our code, to make sure we detect any bad patterns, any endpoints exposed unknowingly, or any risky tool metadata.
We also have full observability and guardrails, both to block any mutable endpoints that could bring down critical services for us and to provide extensive logging, metrics, and tracing for all operations.
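The layering described here (authorization, blocking mutating endpoints, redaction, logging) can be sketched as a single wrapper around a tool call. Everything below is an assumption for illustration: the `permissions` map, the `mutating` flag, the `backend` callable, and the two regexes stand in for the real authorization and PII redaction services, which the talk does not detail.

```python
import re

# Deliberately simple patterns; a real redaction service would be far more thorough.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


class ToolCallBlocked(Exception):
    """Raised when the gateway refuses to forward a tool call."""


def redact(text: str) -> str:
    """Mask email addresses and phone-like digit runs in a tool result."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return PHONE.sub("[REDACTED_PHONE]", text)


def guarded_call(caller, tool, args, *, permissions, backend, audit_log):
    """Authorize, apply guardrails, log, then call the backend and redact the result."""
    if tool["name"] not in permissions.get(caller, set()):
        reason = "unauthorized"
    elif tool.get("mutating", False):
        reason = "mutating endpoint"
    else:
        reason = None
    # Every attempt is logged, whether or not it is allowed through.
    audit_log.append({"caller": caller, "tool": tool["name"], "blocked": reason})
    if reason is not None:
        raise ToolCallBlocked(f"{tool['name']}: {reason}")
    return redact(backend(tool["name"], args))
```

Putting the checks in one place means an agent cannot reach a service without passing through authorization and logging, which is the point of routing all MCP traffic through a gateway.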
And let's take a quick look at what this gateway architecture looks like.
There are two critical components to this gateway.
The first one is the orchestrator, which is responsible for generating the MCP definitions from the 10,000-plus service IDLs at Uber.
And then we have the gateway service, which serves these MCP servers and also allows the service owners to update the MCP definitions.
Let's quickly walk through how this actually works.
The gateway orchestrator crawls all of these IDLs, which are proto and thrift files. Then it calls an LLM to generate MCP tool descriptions. This is based on the message names and comments, and then it stores this in our object storage.
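As a rough illustration of the crawl step, the sketch below pulls RPC names and their leading comments out of a proto file and builds the input that a description-generating LLM would receive. The regex handles only simple unary `rpc` declarations, the sample proto is invented, and `draft_description` is a hypothetical stand-in for the LLM call, which the talk does not specify.

```python
import re

PROTO = """
service TripService {
  // Returns the trip with the given ID.
  rpc GetTrip(GetTripRequest) returns (Trip);
  // Lists recent trips for a rider.
  rpc ListTrips(ListTripsRequest) returns (ListTripsResponse);
}
"""

RPC = re.compile(
    r"((?:[ \t]*//[^\n]*\n)*)"  # zero or more leading // comment lines
    r"[ \t]*rpc\s+(\w+)\s*\((\w+)\)\s*returns\s*\((\w+)\)"
)


def extract_rpcs(proto_text: str) -> list[dict]:
    """Collect method, request, response, and comment for each simple rpc."""
    rpcs = []
    for comments, method, request, response in RPC.findall(proto_text):
        comment = " ".join(
            line.strip().lstrip("/").strip()
            for line in comments.splitlines()
            if line.strip()
        )
        rpcs.append({"method": method, "request": request,
                     "response": response, "comment": comment})
    return rpcs


def draft_description(rpc: dict) -> str:
    """Hypothetical stand-in for the LLM call: fall back to the IDL comment."""
    return rpc["comment"] or f"Calls {rpc['method']} with a {rpc['request']}."


rpcs = extract_rpcs(PROTO)
definitions = {r["method"]: draft_description(r) for r in rpcs}
```

A production crawler would use a real proto or thrift parser rather than a regex; the point here is only that message names and comments are the raw material for the generated descriptions.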
The gateway service then has a config provider that picks up these definitions and serves these MCP servers to the different consumers we have at Uber, whether it's a no-code agent platform, SDKs, or coding agents.
And as I mentioned earlier, service owners can update all of these definitions, which then triggers creating a diff, which is basically a pull request. The diff is then scanned by our engineering security team's unified scanning APIs. If there are no issues, the scan report is attached to the diff, and the diff is committed and deployed to our object storage.
And this is then again available to be picked up by the gateway service, which exposes it to the consumers.
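The update flow (owner edits a definition, the diff is scanned, a clean diff is committed to storage) might look roughly like the sketch below. The three checks, the `params` field, and the in-memory `store` dictionary are assumptions chosen for illustration; the actual scanning APIs and their rules are not described in the talk.

```python
import re

# Illustrative rules only: instruction-like text in descriptions and
# parameter names that suggest secrets or sensitive identifiers.
INSTRUCTION_LIKE = re.compile(
    r"ignore (all |any )?(previous|prior) instructions|system prompt", re.I
)
SENSITIVE_PARAMS = {"password", "secret", "token", "ssn"}


def scan_definition(tool: dict) -> list[str]:
    """Return a list of findings for one changed tool definition."""
    findings = []
    description = tool.get("description", "")
    if not description.strip():
        findings.append("missing description")
    if INSTRUCTION_LIKE.search(description):
        findings.append("description contains instruction-like text")
    if any(p.lower() in SENSITIVE_PARAMS for p in tool.get("params", [])):
        findings.append("exposes a sensitive parameter")
    return findings


def review_diff(changed_tools: list[dict], store: dict) -> dict:
    """Scan every changed definition; commit to the store only if all are clean."""
    report = {t["name"]: scan_definition(t) for t in changed_tools}
    if any(report.values()):
        return {"committed": False, "report": report}
    for t in changed_tools:
        store[t["name"]] = t
    return {"committed": True, "report": report}
```

The report is returned in both cases, mirroring the idea that the scan result travels with the diff whether or not it lands.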
Now, I'm going to hand it off to Rush to talk more about how we use MCPs and consumption in more detail.
Thanks, Megha.
So, let's talk about how MCPs are used at Uber. There are actually three main surfaces at Uber that use MCPs. One is our Uber Agent Builder, which is a no-code solution for building agents at Uber. These agents are usually internal agents.
They're used for productivity, for team workflow automation, and so on. There are thousands of these active on a monthly basis at Uber right now, and they're growing very rapidly across the board.
The next surface is our Uber Agent SDK.
The Uber Agent SDK, along with all the Uber Agent platform functionality like managed memory, managed chat history, and orchestration, is our code-first solution for building agents at Uber. Some of our top use cases use this SDK.
Those top use cases include our grocery assistant agent, our care coordination agent, and our customer support agent. If you interact with our customer support, you'll see this in action.
And then finally, we have our coding agents. The coding agents, as you know, are Claude Code and Cursor; they're companions to our developers for building software at Uber.
On top of that, we have Minions, which is our background agent that's also built on the Claude harness. It's actually producing about 1,800 code changes a week right now.
All of these are being used by 95% of our engineers across Uber.
Okay, so let's dive into a bit more detail about how the MCPs are actually incorporated into each one of these surfaces.
I'll start with Agent Builder. In Agent Builder, if you want to use an MCP, you can mention the MCP server name as an at-mention inside the system instructions. So, you can actually scope the MCP within the system instructions.
For example, if I want to search for something, I can say: if a user asks for this, use the @mcp server for Usearch, which is our internal search tool, to return information.
Now, as all of you know, and it's been brought up in previous presentations, these things can hallucinate and maybe not pick the right tool. So, what we've done is allow you to pick the specific tools from the MCP server so that the LLM doesn't have to make a decision there.
This makes the agent more reliable. And then, to make it even more reliable, we have the capability to do parameter overrides. What that means is the LLM doesn't even have to make a decision about passing in a parameter anymore. We can scope the parameter to something static instead. Again, this is through a no-code UI, so making it easy for these users to do this is highly important. And again, it makes the agent more reliable.
Okay. So, that was Uber Agent Builder and how it's used. In the Uber Agent SDK, it's somewhat similar.
We have a config.yaml, a YAML config file that we use there, and you can put in the MCP name and identifier field, and then you can also pick the tools you want and put them in the config.
And on top of that, you can also override parameters in the same way.
You put all of this in the configuration, and the SDK automatically loads these tools and makes them available to the agent with those specific configurations.
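A minimal sketch of what loading tools from such a configuration could look like follows. The config is shown as a Python dictionary mirroring what a YAML file might contain; the key names (`mcp_servers`, `tools`, `parameter_overrides`), the `registry` mapping, and the `usearch` example values are all hypothetical, since the talk does not show the real schema.

```python
# Hypothetical config, equivalent to what a config.yaml might hold.
CONFIG = {
    "mcp_servers": [
        {
            "name": "usearch",
            "tools": ["search"],  # only these tools are exposed to the agent
            "parameter_overrides": {"search": {"corpus": "engineering-docs"}},
        }
    ]
}


def _bind(fn, fixed):
    """Wrap a tool so statically configured parameters always win."""
    def call(**llm_args):
        return fn(**{**llm_args, **fixed})
    return call


def load_tools(config: dict, registry: dict) -> dict:
    """Return only the selected tools, each with its overrides applied."""
    loaded = {}
    for server in config["mcp_servers"]:
        available = registry[server["name"]]
        overrides = server.get("parameter_overrides", {})
        for tool_name in server["tools"]:
            fixed = overrides.get(tool_name, {})
            loaded[f"{server['name']}.{tool_name}"] = _bind(available[tool_name], fixed)
    return loaded


# Stand-in for the tools an MCP server would actually provide.
registry = {
    "usearch": {
        "search": lambda query, corpus: f"{corpus}:{query}",
        "index": lambda doc: "indexed",
    }
}
agent_tools = load_tools(CONFIG, registry)
```

In this sketch the override is applied after the model's arguments are merged, so even if the model supplies its own `corpus`, the configured value is what reaches the tool.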
On the coding agent side, we have our AI FX CLI tool.
Basically, AI FX lets you add your MCP by running the MCP add command, and then this MCP server, whether it's remote or local, is available to both Claude Code and Cursor, or any other IDE-based agent that we have available at Uber.
Okay, so that's what we've done so far and how we're using MCPs. I want to talk about our roadmap and how, in the near future, we're focused on improving the quality of these MCP servers and then simplifying discovery.
We want to extend our MCP registry to include more evaluation information. We want to surface the highest quality MCP servers to our users.
By exposing the evaluation metrics and by including the service SLAs for these MCPs, which cover the reliability and availability of the service, we can surface which is the right MCP: this is a higher-tier or lower-tier MCP, and this is the most reliable one you can use.
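One way to picture tiering in the registry is a small function that assigns a tier from availability and evaluation results, then ranks servers by it. The field names, the thresholds, and the tier labels below are assumptions made up for this sketch; the talk only says that evaluation metrics and SLAs will be surfaced.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    name: str
    availability: float    # fraction of successful health checks, 0..1
    eval_pass_rate: float  # fraction of evaluation cases passed, 0..1


def tier(entry: RegistryEntry) -> str:
    """Assign a tier from illustrative thresholds (not real SLA values)."""
    if entry.availability >= 0.999 and entry.eval_pass_rate >= 0.95:
        return "tier-1"
    if entry.availability >= 0.99 and entry.eval_pass_rate >= 0.80:
        return "tier-2"
    return "tier-3"


def rank(entries: list[RegistryEntry]) -> list[RegistryEntry]:
    """Best tier first; within a tier, higher pass rate, then higher availability."""
    return sorted(
        entries,
        key=lambda e: (tier(e), -e.eval_pass_rate, -e.availability),
    )
```

Requiring both numbers to clear a threshold means a highly available server with poor evaluation results cannot land in the top tier, and vice versa.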
We are also working on adding a tool search tool, which was actually mentioned earlier.
Again, that helps us improve the accuracy of tool discovery by making it so that tools are discovered automatically and loaded on demand.
That also helps us reduce context bloat. So, that's again something we're focused on introducing to our registry and our MCP tools as kind of an omni MCP tool.
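The idea of a tool search tool can be sketched as an index the agent queries instead of receiving every tool definition up front. The sketch below uses plain keyword overlap for scoring, which is an assumption made to keep it self-contained; a real implementation would more likely use embeddings or a search service, and the tool names are invented.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; underscores split names into words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToolIndex:
    """Keyword index over tool names and descriptions."""

    def __init__(self, tools: dict[str, str]):
        self._tools = tools
        self._tokens = {name: _tokens(f"{name} {desc}") for name, desc in tools.items()}

    def search(self, query: str, k: int = 3) -> list[dict]:
        """Return up to k tools that share at least one token with the query."""
        q = _tokens(query)
        scored = [(len(q & toks), name) for name, toks in self._tokens.items()]
        hits = sorted(((s, n) for s, n in scored if s > 0), key=lambda x: (-x[0], x[1]))
        return [{"name": n, "description": self._tools[n]} for _, n in hits[:k]]


index = ToolIndex({
    "usearch_search": "Search internal documents and wikis",
    "trips_get_trip": "Fetch a trip by ID",
    "oncall_lookup": "Find the on call engineer for a service",
})
matches = index.search("who is on call for this service", k=1)
```

Only the matching definitions enter the agent's context, so the prompt grows with the number of relevant tools rather than with the size of the whole catalog.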
And then, obviously, evaluations are something we're building into the registry, and our overall agent platform has evaluations built for agents as well.
The other thing, as everybody knows, is skills. Skills are becoming more and more important at Uber. You can think of them as recipes for using these MCPs. So, we want to make them shareable, not just across Uber, but across different teams at Uber.
We want to be able to have processes and conventions that can be shared across Uber through these skills.
And then we want to introduce evaluations to these skills. That means we want to be able to evaluate the output quality. We want to be able to evaluate the correctness of skill invocation.
And we want to be able to A/B test these skills. So if I have a different version of the same skill, which one performs better? We want to have that information.
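To illustrate what comparing two versions of a skill could involve, here is a small sketch that treats each run as a pass or fail and applies a standard two-proportion z-test. The choice of test and the notion of a binary "success" per run are assumptions for illustration; the talk does not say how skills will be scored.

```python
import math


def compare_skill_versions(successes_a: int, runs_a: int,
                           successes_b: int, runs_b: int) -> dict:
    """Two-proportion z-test on the success rates of skill versions A and B."""
    rate_a = successes_a / runs_a
    rate_b = successes_b / runs_b
    pooled = (successes_a + successes_b) / (runs_a + runs_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / runs_a + 1 / runs_b))
    z = (rate_b - rate_a) / se if se > 0 else 0.0
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"rate_a": rate_a, "rate_b": rate_b, "z": z, "p_value": p_value}


result = compare_skill_versions(70, 100, 85, 100)
```

A positive `z` with a small `p_value` suggests version B's higher success rate is unlikely to be noise; with few runs per version, the test will rightly stay inconclusive.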
So that's a bit about where we are today and where we're going to go in the near future.
If you want to connect with us, please find us at this event, and if we don't get a chance to meet here, please connect with us on LinkedIn. Thank you.
>> [applause]