Google DeepMind on Gemma, AI Studio & the Future of AI | Omar Sanseviero & Paige Bailey
Added to this site: 2026-06-03

122 views · 2 · 26:51 · Superintel-daily · Original video published: 2026-06-02

Google DeepMind's Gemma models, released under the permissive Apache 2.0 license, represent a significant advancement in making AI development accessible to non-technical users. The models come in various sizes optimized for different deployment scenarios—from 2B and 4B parameter models designed for efficient on-device execution on phones and laptops, to larger 31B parameter models for production environments. This approach enables developers to build applications without extensive technical expertise, with tools like Google AI Studio providing features such as Firebase integration, workspace support, and mobile deployment capabilities. The open model strategy, combined with tools that allow non-engineers to create functional AI applications, democratizes AI development and expands the potential for innovation beyond traditional software engineering backgrounds.

[00:00:00]We were just talking actually outside with one of the folks that we know from one of our early access programs who had built a mobile app powered by Gemma. Three years ago, this would have been impossible, let alone a profitable business.

[00:00:12]>> The models can finally do 95% of the things I want them to do in the day-to-day. >> And it might look correct. Many of the people are coming from outside of this world of software engineering and they don't know how to kind of double-click and make sure that all of those connections that they asked for are actually robust. I mean, if you want production-grade systems, you need to be careful with some of these things.

[00:00:31]We want to build the best models we can build, right? And in that sense, our benchmark is ourselves rather than the competitors. We know what the community wants.

[00:00:38]>> If you do give everyone the ability to create any sort of app for $0, super fast, completely on device from a mobile phone, there's really no limit to the potential of this globally, not just in the United States, not just in Europe, but everywhere.

[00:01:01]Hi, I'm Kim Eenbeck, Superintelligence editor-in-chief. And today I'm joined by Omar and Paige from Google DeepMind.

[00:01:09]Omar leads developer experience. He's the person making sure models like Gemini and Gemma actually reach developers, and before DeepMind he worked on platform and community at Hugging Face. And Paige is the technical lead for creator and developer experiences, driving the product strategy behind Gemini, and she previously worked on GitHub Copilot, right? Omar, Paige, great to have you both here. >> Great to be here. Thank you so much for coming to I/O this year. >> Yeah, thank you for having us here. >> Thank you so much for inviting me. So it's been big days of announcements. Before we get into specifics, what's the thing you showed at I/O that you're personally most excited about?

[00:01:49]>> So, I feel like the experiences that we just added in AI Studio are pretty compelling. We talked a little bit earlier about managed agents, which our colleagues have been shipping out. They give you the ability to have these agents that are running 24/7 on managed cloud VMs, which is pretty magical, and you can invoke them just by asking in natural language for things that you would like to have created. And then also I'm really jazzed about being able to vibe code Android apps in AI Studio.

[00:02:21]>> Yeah.

[00:02:22]>> Yeah. For me, something that has been quite exciting to see over the last few months is that the models are finally there. Like, the models can finally do 95% of the things I want them to do in the day-to-day. The next layer there is the agentic harness, right, which is pretty much what Paige is talking about: in this case Antigravity as an agentic harness, not just as the coding platform, but really as an agentic harness that is enabling all of these different experiences, from the managed agents in AI Studio to vibe coding in AI Studio to a bunch of different products across Google. And I think that's very, very exciting. >> And I love that you called out that it's a singular agent harness, whereas in the before times I think there were many different ones that you could choose from. And now everything from Gemini Spark to AI Studio to Antigravity is running the same tool. >> Right. Makes total sense. Omar, you spent years at Hugging Face, basically the home of open-source AI. Right now you're at Google DeepMind building the models that get distributed through platforms like Hugging Face. How did that switch change how you think about what developers actually need?

[00:03:26]>> Tough question, right? No, I mean, it's an interesting moment in the industry. I do have a bias toward open models, and I've been working in the open models space for many, many years. Going back to my previous answer as well, the models can finally do things, right? In the context of Gemini, Gemini can do 95% of the things I want it to do in my day-to-day. In the context of models such as Gemma, they are on device, much smaller, with very different use cases, right, like privacy-heavy areas, but these are very powerful models. Something that has been very exciting about working at Google is how the models get distributed, not just through Hugging Face, but also how they are integrated into Chrome, into Android. So, for example, Gemini Nano is built on top of Gemma, and of course even via AI Studio, right, you can build with Gemma. So yeah, I think that part is quite exciting. Google is a very good place to distribute those models to millions of people, directly embedded in the experiences that they use in their day-to-day, which is a very different experience than going to Hugging Face, maybe, to download and fine-tune them.

[00:04:36]You already brought up Gemma. So let's stick with Gemma for a moment. Right.

[00:04:40]Gemma now ships under the Apache 2.0 license, which is genuinely a permissive open source license. That's a significant decision. Why did Google make the choice, and what does it mean for developers in practice?

[00:04:53]>> Yeah. So we have been doing things in a very community-centric way for quite some time, and this is not just Gemma, right? Like, Paige can also share about Gemini. We have been talking with developers, talking with the startups, and we keep hearing this kind of feedback, right: what are the capabilities that they want, what's the license that they want, which are the things that they want. And we're iterating quickly, we're working as a startup, incorporating that feedback into the roadmap and really reacting based on that. >> Yeah, I agree. I think that the Gemma early access program has been a really strong channel and conduit for getting feedback from the community, either people who are building with Gemma in the academic world or in the sciences world, or, you know, partners like Unsloth or Ollama. >> Yeah. >> And Apache 2.0 is such a nice gift to the community, to be able to give them the confidence that they can build their businesses around Gemma.

[00:05:43]>> It's really really powerful to see.

[00:05:45]>> Yeah, a big part of Gemma is building a foundation for the ecosystem to build on top of, right? And having a license they understand and feel comfortable with is critical for sovereign use cases, for AI for good, and for a bunch of day-to-day use cases as well.

[00:06:01]>> We were just talking actually outside with one of the folks that we know from one of our early access programs who had built a mobile app powered by Gemma that could do kind of tool calling and retrieval, all completely on his device. And we were discussing, like, you know, three years ago this would have been impossible, let alone a profitable business. And he was even, you know, thinking about commercializing it, and now it's great to know that, based on these permissive licenses, he would have the option to do that. >> Exactly. >> I mean, this is a good point to talk about, because the landscape is very competitive right now, right? Gemma, Qwen, Mistral, and more. If I'm a developer starting a project today, why should I pick Gemma? What's the honest case here?

[00:06:50]>> I mean, I can say nice things about Gemma.

[00:06:54]>> I feel like I'm slightly less biased, because Omar's part of the Gemma team and I'm just a Gemma super fan.

[00:07:05]>> Part of the Gemma fan.

[00:07:07]>> Well, so, Gemma has many of the same capabilities that are magical about the Gemini model family.

[00:07:14]So, it's able to understand video now, audio, images, >> multilingual, >> multilingual. It's really, really good at tool calling.

[00:07:22]>> It comes in sizes that make sense if you're trying to productionize AI. So, the 2B, the 4B, and the two larger sizes. One of them is a dense model, one of them is a mixture-of-experts model. And it's been really, really nice to just kind of see the creative ways that people in the community are experimenting with it. It also has a really nice context window, 256K. >> And then it also has support for over 140 different languages. >> And even more. It's just that they haven't been officially tested or kind of blessed.

[00:08:01]>> Definitely. >> Like, you know, you might get it to be able to say some Klingon or some Elvish, and, you know, who knows. >> Yeah, in the development of Gemma, what is important for us is to make models that are not just the best models out there, but models that are actually usable by the community, right? That's why we care about developer-friendly sizes. So we care a lot about building the most capability or intelligence per parameter, the most intelligence per watt, and making the models in such a size that they can run on devices that people have in their day-to-day. So the smallest models, I have here my Pixel phone from years ago, and I can run the model here, right? Or on a laptop or on gaming computers. You should not need to have a huge number of GPUs to be able to run some of these models at home.

[00:08:50]>> And Xenova from the Hugging Face team just recently got the 2 billion parameter version, or was it the 4 billion parameter version? >> Four, I think. >> The 4 billion parameter version running with Transformers.js in the browser, controlling a robot, which is wild.

[00:09:04]>> Yeah. Yeah. Yeah. You can do things like a Raspberry Pi. You can put the model in extremely hardware-constrained setups, and it's not a slow experience. It can actually work quite... >> I mean, yeah, but this is precisely my next question. It's designed to run on a single consumer GPU, exactly, right, but it can also run on a phone, for example. Could you explain why the size range matters and what the difference is between running a model in the cloud versus running it on your own device locally? Where is the advantage here? >> Yeah, so there are three different architectures or sizes. So the E2B and E4B, those are the two smallest models. They were architecturally designed to be very efficient to run on a phone. So they use an architecture called PLE, per-layer embeddings, which is specifically done to be able to run very efficiently on phones. And that was done in partnership with different OEMs and so on. And then, as Paige mentioned before, there's a dense model, the 31B. It has the most raw intelligence; it's the largest model. If you want the best open model from us, you would go and use that one. But for many deployment setups, you want very, very fast inference. And that's where a mixture of experts comes in. So the MoE is a 26B parameter model. So it's quite large, but only 4 billion parameters actually activate, and that means that you have almost the same intelligence as a 27B model but you get the speed of a 4B model. I'm simplifying things a bit, but that's more or less how it works. So we're trying to serve different kinds of use cases, right? From the people that want to run the models in their pockets, to people that want to deploy in production setups, to people that want to build with the most powerful model without worrying too much about the latency.
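
A rough back-of-the-envelope sketch of the dense versus mixture-of-experts trade-off described above. The parameter counts come from the conversation; the bytes-per-weight figures and the idea that per-token compute scales with active parameters are simplifying assumptions, and the function names are hypothetical.

```python
# Parameter counts as stated in the conversation (in billions).
# "active" is how many parameters take part in generating each token.
MODELS = {
    "dense 31B": {"total": 31.0, "active": 31.0},
    "MoE 26B": {"total": 26.0, "active": 4.0},
}

# Assumption: bytes per weight for two common storage precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int4": 0.5}


def weight_memory_gb(total_billions: float, precision: str) -> float:
    """Approximate memory for the weights only (ignores KV cache and activations)."""
    # 1 billion parameters at 1 byte each is roughly 1 GB.
    return total_billions * BYTES_PER_PARAM[precision]


def relative_compute(active_billions: float, baseline_billions: float) -> float:
    """Per-token compute relative to a baseline, assuming cost scales with active parameters."""
    return active_billions / baseline_billions


if __name__ == "__main__":
    for name, spec in MODELS.items():
        memory = weight_memory_gb(spec["total"], "bf16")
        compute = relative_compute(spec["active"], MODELS["dense 31B"]["active"])
        print(f"{name}: ~{memory:.0f} GB of bf16 weights, {compute:.2f}x dense per-token compute")
```

The point of the sketch is that a mixture-of-experts model still has to hold all of its weights in memory, while only the active subset contributes to the work done per token.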

[00:10:55]>> Yeah. And also the energy consumption of the smaller models. If you're operating on a cell phone, you want something that doesn't drain your battery really quickly. And then also from an energy efficiency perspective, I know people are starting to care really deeply about that. As well as, I think distributed inference can still be a challenge for many folks.

[00:11:17]And so being able to fit on a single commodity GPU is is a lot easier to architect than than multiple GPUs.

[00:11:24]>> This thing that you mentioned, the electricity consumption, I think is super important because in a phone setup like you don't want the battery to >> Exactly.

[00:11:31]>> To be drained, right? So one thing that we don't talk that much about, that probably we should talk more about, is how many tokens Gemma generates, right? Because one part is, okay, the model is very intelligent, but another question, which is very important in on-device setups, is how many tokens the model needs to generate to get to the right answer, right? And that's something that we benchmark specifically, to make sure that the density of reasoning when it's doing reasoning is very strong, so the model can do very good reasoning without generating tens of thousands of tokens.

[00:12:07]>> Yep. And you can see that in the efficiency of the tool calls. So I really hope that Andrew open sources or at least shares his app with the world, but it was relatively speedy in terms of the tool calls needed to invoke different applications. And people can also test out the Google AI Edge Gallery app.

[00:12:27]>> Yeah. So if you use iPhone or you use Android, there is the AI Edge Gallery. It's in both app stores. You can go download the app and you can use it in an airplane with this.

[00:12:38]>> I already tested it out, as I said. >> I love having a model running locally, like I said, when I'm on a plane, right? Having a model, it's perfect. It's amazing. >> First thing I did was download it.
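
To make the token-count point concrete, here is a minimal sketch of how the number of generated tokens translates into waiting time and energy on a device. The decode speed and power draw are hypothetical placeholders, not measurements of any Gemma model or phone.

```python
def generation_seconds(tokens_generated: int, tokens_per_second: float) -> float:
    """Time spent decoding, assuming a constant decode speed."""
    return tokens_generated / tokens_per_second


def energy_joules(seconds: float, watts: float) -> float:
    """Energy drawn while decoding, assuming a constant power draw."""
    return seconds * watts


# Hypothetical figures for illustration only, not measurements.
DECODE_TOKENS_PER_SECOND = 20.0
DEVICE_WATTS = 5.0

if __name__ == "__main__":
    for label, tokens in [("concise reasoning", 500), ("verbose reasoning", 20_000)]:
        seconds = generation_seconds(tokens, DECODE_TOKENS_PER_SECOND)
        joules = energy_joules(seconds, DEVICE_WATTS)
        print(f"{label}: {tokens} tokens, {seconds:.0f} s, {joules:.0f} J")
```

Under these assumptions, both time and energy grow linearly with the number of tokens the model has to emit before it reaches an answer.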

[00:12:50]>> It feels magical. One of one of our colleagues, Olivier, was also showing a demo yesterday of uh being able to have kind of the video input feed for a phone working on a hike. So, uh like you can ask about signs, even signs in different languages.

[00:13:06]>> Um uh like what they mean, which direction to go, all of these things.

[00:13:10]>> Yeah, that's magic. It's like the future is here, right?

[00:13:13]>> Yeah. Exactly. But um to continue um are we moving toward a world where instead of one big general purpose model developers build with a toolkit of smaller specialized models? Is that the future you're designing for? So more specialized model instead of general purpose models.

[00:13:32]>> So I really love the sort of behavior that we see right now with agents that are figuring out what needs to be the model generating a whole bunch of tokens versus what should be a tool call. Like, what tools, what sort of options do I have available that don't require me to generate a lot of tokens and run through all of this reasoning by myself? Because I think that leads to a much more efficient AI future. If you do have this space where, you know, you have a powerful reasoning model like Gemini, it's able to break down a complex task into, you know, 50 different steps, and then it's able to decide, all right, for 30 of these steps I can use Gemma, for 10 of them I can use these tools that aren't even relying on AI at all, and then these other 10 might need to be calls to Gemini. That dramatically reduces your energy consumption, the time that it might take to achieve an outcome, and the cost profile that you might have for a problem. And I really hope that that's the future. We're already starting to see it a little bit, but I think that DeepMind is also really great at building these models that are capable of doing so many things.

[00:14:51]>> Yeah, that's a very interesting part, right? Like, how much do you think things will go in a more unified direction? >> And Omni is a great step in that direction. Gemini is pretty powerful in the sense that it can understand video and images and audio and text and code, and it can also output multiple modalities, so it can output audio, output images, output text, output code, and now video, with this new Omni exploration that we're doing. So I think that, you know, there's space for both. But speaking as a person who has probably too many hobby projects and a very large personal cloud bill, I really love this idea that we could dramatically reduce Paige's cloud bill and use a constellation of models as opposed to just one singular model.
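
The step-splitting idea above can be sketched as a small router. This is an illustration only: the `Step` fields, the `route` rule, and the relative cost numbers are all hypothetical, and the 30/10/10 split simply mirrors the example given in the conversation.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical relative cost per step for each executor (not real pricing).
COST = {"tool": 0.0, "gemma": 1.0, "gemini": 20.0}


@dataclass(frozen=True)
class Step:
    name: str
    needs_model: bool     # False when a plain, non-AI tool can do the step
    needs_frontier: bool  # True when the step needs the large reasoning model


def route(step: Step) -> str:
    """Pick the cheapest executor that can handle the step."""
    if not step.needs_model:
        return "tool"
    return "gemini" if step.needs_frontier else "gemma"


def plan_cost(steps: list[Step]) -> float:
    """Total relative cost of a plan under the routing rule above."""
    return sum(COST[route(step)] for step in steps)


def example_plan() -> list[Step]:
    """A 50-step plan split 30 / 10 / 10, as in the example from the conversation."""
    steps = [Step(f"local-{i}", True, False) for i in range(30)]
    steps += [Step(f"tool-{i}", False, False) for i in range(10)]
    steps += [Step(f"hard-{i}", True, True) for i in range(10)]
    return steps


if __name__ == "__main__":
    plan = example_plan()
    print(Counter(route(step) for step in plan))
    print("routed cost:", plan_cost(plan))
    print("all-frontier cost:", COST["gemini"] * len(plan))
```

With these made-up cost weights, routing most steps to a small model or a plain tool costs a fraction of sending every step to the largest model, which is the efficiency argument being made.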

[00:15:39]>> Yeah.

[00:15:41]>> Yeah. For us, it's been interesting to see that the base model is just so strong that we see less of a need for people to fine-tune the model. So with Gemma 3, we had tons of partners that were doing fine-tuning on top of it. With Gemma 4, people were getting state-of-the-art results for open models in their benchmarks without any additional fine-tuning. >> And it was the same in the early days of the Gemma model or the Gemini models as well. At the very beginning, people would fine-tune Gemini to create Medp or to create coding-specific versions of Gemini, and now the benchmarks for those tasks are just exceeded by base Gemini.

[00:16:32]>> Yeah. I'm just curious, how was the feedback so far from the community, and are there some wishes where they said, okay, this needs to be there in Gemma, from developers or from the community?

[00:16:45]>> Yeah. So Sundar announced, I think, 20 million downloads the week after the launch. By now we have over 100 million downloads in six weeks. So the community reception has been very positive. And when we released Gemma 3, we had lots of feedback from the community. People were telling us function calling doesn't work well, system instructions don't work well, all of these things do not work well. With Gemma 4 we really incorporated all of that, so in general the reception has been extremely positive. We know on the agentic side of things we still want to keep pushing the frontiers, but always constraining within the same size, right? I think that's something I'm very excited about. If you compare similarly sized Gemma models up to Gemma 4, so we had Gemma 2 27B, Gemma 3 27B, Gemma 4 31B, we kept seeing an improvement across all of the LMArena verticals. So we think that we can keep pushing the agentic capabilities of Gemma significantly without having to keep scaling up the model size. >> And one of the things that's pretty magical too is that so many of the Gemma team members are on social media. They're in the EAPs. They're answering feedback all the time, listening to feedback, >> building with open tools.

[00:18:03]>> Exactly.

[00:18:03]>> Yeah. I think that's something very important, right? People need to build with what developers are building with as well, right? And I think that's quite critical for the success. >> Cool. Paige, you've built developer tools at GitHub, with Copilot, and now at Google for AI developers.

[00:18:23]What's the biggest gap you see right now between what frontier AI models can do and what a typical developer can actually build with them?

[00:18:32]>> Interesting. So I think one of the most interesting gaps that I've seen, and I guess for context, I started contributing to open source projects a long time ago. We were just talking, like, almost two decades ago. First in the scientific computing space, like NumPy and SciPy and scikit-learn, and then later on things like TensorFlow and some of the other machine learning frameworks that we have at Google. But there's always been kind of this gap between understanding the complexity of a system and being able to understand where the gaps in this system might be. And I think today, one of the things where I see developers continuously get frustrated, and I would love to hear Omar's perspective on this as well, is that people might ask in a tool like Antigravity or AI Studio or Cursor or whatever it might be, please do this task for me, and it might require connecting to a database or a system, or creating a specific kind of database, incorporating something like OAuth, or pulling in data via search or some other tool call. And the IDE or the tool will give a response back and it might look correct.

[00:19:59]But many of the people are coming from outside of this world of software engineering and they don't know how to kind of double-click and make sure that all of those connections that they asked for are actually robust: that it is a connection to a database as opposed to just synthetic data that was created behind the scenes, that it is actually doing a connection to a workspace as opposed to just hallucinating or fabricating some of the workspace, as sort of examples. And I think we're working really hard in AI Studio to make sure that those connections exist and that they're invoked explicitly. But it's really, really hard for earlier-career devs to be able to double-check that some of these things actually are correct. So I think that people are right now expecting the models to do a lot, and those expectations keep increasing over time.

[00:20:57]>> Um and the models really can do a lot.

[00:21:00]It's just that right now we're in this strange interstitial state where we also need to have, you know, a healthy amount of skepticism that it does all of the things that you ask for. And that's a gap that I see people continuously getting snagged by, that hopefully will get better with time and that we're trying to close in AI Studio.

[00:21:23]>> Yeah. Yeah. It's an interesting point, right? Because we are at a stage in which it's never been easier to build with AI, and at the same time, if you want production-grade systems, you need to be careful with some of these things. And I think, as you mentioned, there are non-technical people or tech-adjacent people that are starting to build with AI, but also students, right? Like, what is happening now with computer science students? Once they are out there, how they are going to build and what their development workflows will look like will be something very interesting to follow, and to build the right tooling for them as well, right? >> And to make sure that they're taken care of. So if they do accidentally, you know, expose an API key or expose a password in plain text, those things are caught for them, as opposed to being foot guns that they find out about afterwards.
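
As an illustration of the kind of guardrail mentioned above, here is a minimal sketch of a check that flags an exposed key or a plain-text password in source text. It is not how AI Studio or any other product implements this; the two patterns and the `find_secrets` name are assumptions for the example, and real secret scanners use far larger rule sets.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
PATTERNS = {
    # Assumed shape of a Google-style API key: "AIza" followed by 35 key characters.
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    # A quoted literal assigned to something named "password".
    "plaintext_password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
}


def find_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_label) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    # A fake key built from filler characters, so nothing real is embedded here.
    sample = 'API_KEY = "AIza' + "x" * 35 + '"\npassword = "hunter2"\nname = "ok"\n'
    for lineno, label in find_secrets(sample):
        print(f"line {lineno}: possible {label}")
```

A check like this could run before code is saved or deployed, so the mistake is caught up front instead of being discovered afterwards.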

[00:22:06]>> Yeah.

[00:22:06]>> Yeah. This is actually exactly my follow-up question. So for someone who isn't a machine learning engineer but wants to build something with AI, like a journalist or business owner, designer, whatnot, how far can they actually get with Google AI Studio today without writing code?

[00:22:23]>> Oh god. >> Very far. Super far, and much further since yesterday. >> Like, we've added Firebase support, so they can have databases and OAuth. We've added Workspace support, so you can connect Gmail and Calendar and things like that. >> Mobile development. >> Mobile development: you can deploy to an Android app. You can export to Antigravity if you need to. We've also got grounding with Google Search, grounding with Google Maps. >> Yeah.

[00:22:51]>> Um, URL context, which is effectively like retrieval for free, so you don't have to stand up a vector database.

[00:22:58]>> Yeah.

[00:22:59]>> Um, >> and then you have the I mean, we we didn't launch this yesterday, but we launched this in the last two months.

[00:23:03]You have this editor and annotator mode in which you can just move things around, and it's leveraging all of these multimodal capabilities of Gemini to keep iterating on the product without writing a single line of code. >> And also design features. So once you describe an app that you want to create, it gives you five different design options. So if you're like me and you can't design a front end at all, it gives you, you know, the ability to say this one looks beautiful and then use it in your app.

[00:23:33]>> I love the answer. I love it. I love it.

[00:23:35]So there's, as you know, a global race in open models, right? Chinese labs, European companies, and, for example, I know Google are all releasing very competitive open models. How does the DeepMind team think about this competition? Is it a race you need to win, or does more competition help everyone?

[00:23:56]>> The way I see it is that we want to build the best models we can build, right? And in that sense, our benchmark is ourselves rather than the competitors. We know what the community wants and we're iterating based on that. So in the context of Gemma, so in the context of open models, again, we want to build the strongest models that are developer friendly, consumer friendly, that can run on consumer devices. In the context of Gemini and Omni, Veo, Lyria, Nano Banana, like the whole Gemini family, again the goal is to build the best models we can build, based on all of the feedback we get from the community.

[00:24:28]>> Yeah. And I and I do feel like the the world will be a better place the more open models that exist.

[00:24:35]>> Couldn't agree more. >> So it's kind of exciting to see how all of these are getting deployed out into the world and fine-tuned.

[00:24:44]>> Mhm.

[00:24:46]>> Okay. So, if we sit here at Google I/O next year, what will have changed the most about how people build with AI?

[00:24:58]>> Tough question again.

[00:24:58]>> Ah, it's a fun one. It's a fun one. I mean, the models are getting very very good.

[00:25:04]The agentic harnesses are also getting to a very, very good place.

[00:25:09]What I'm most excited for in the next 10 months, six months, is to see more and more of these non-traditional developer audiences starting to build with AI. And I think that will help us shape how we want the roadmap to look, right? So I don't know what we're going to ship a year from now, but I'm sure seeing how people that don't come from a developer background deal with AI will help inform this significantly.

[00:25:33]>> I love that. And I was going to say something really similar. Like, we just announced an AI Studio mobile app at I/O this year, which is, you know, a massive, massive market. There are many people who have cell phones, or who are very devoted to their cell phones, but don't necessarily have a laptop at home. And given how much there is still left to build, and how much creativity and passion folks have around the world, especially outside of the engineering space, I can't wait to see what they create. And also to see more people rely on on-device models, or working with Antigravity or with mobile devices. I think that would be really interesting.

[00:26:19]>> This is interesting for sure.

[00:26:20]>> Yeah. Because if you do give everyone the ability to create any sort of app for zero, super fast, completely on device from a mobile phone, there's really no limit to the potential of this. Globally, like, not just in the United States, not just in Europe, but everywhere.

[00:26:41]>> Yeah. So >> cool.

[00:26:42]>> Yeah.

[00:26:42]>> Omar, Paige, thank you so much for the conversation. I really, really appreciate it. Had fun. Thank you so much. So super fun your
