
AI 2030: Beyond LLMs | What Comes After Generative AI
Added: 2026-04-29

119 views | hitechadvisors | Original Release: 2026-04-28

The future of AI lies in transitioning from generative language models to action-oriented agents that can reason, interact with the real world, and execute tasks autonomously. This evolution requires developing hardware-aware AI models for edge deployment, building robust semantic layers and data schemas for enterprise applications, and creating systems that can handle compliance, security, and explainability challenges. The industry is shifting from innovation to diffusion, where the focus moves from developing new models to implementing them reliably at scale across diverse domains like healthcare, autonomous vehicles, and customer service.

[00:00:08]Hi everyone. Once again, welcome to you all.

[00:00:11]If you have not met me yet, I haven't gotten a chance to shake hands with you.

[00:00:16]I'm Ruchi. I lead recruitment here at Hi-Tech Advisors, and I'd like to invite our COO, Lan Wang, so he can give you more information about Hi-Tech Advisors and we can get rolling with the agenda.

[00:00:34]>> Thank you, Ruchi. Thanks to the team who put this event together, and thank you all for coming. This is the latest episode of the AI series that Hi-Tech Advisors puts on with support from JP Morgan Private Bankers.

[00:00:58]For those I have not had the pleasure of meeting yet, my name is Lang Wong, and I joined Haresh and the Hi-Tech Advisors team last year to further advance the core practices and the international presence of Hi-Tech Advisors. Many of you know that Haresh started Hi-Tech Advisors about nine years ago, and the team and Haresh have done a tremendous job of gaining the marquee names in the US that you see on some of the slides. One of my goals, and part of my charter in joining Hi-Tech Advisors, is to strengthen our AI and customer experience practice, as well as to take the company's expertise and excellent service international.

[00:02:00]So it has been a great episode so far, right? Over the last six months or so, we have expanded our customer base to include cross-border commerce companies. We are doing strategy consulting and some management consulting, and we're about to sign our latest client, which is a neocloud for AI workloads. We're advising them on expanding globally, into Europe and into the Middle East.

[00:02:40]At Hi-Tech Advisors, we pride ourselves on a simple approach: we lead every engagement with empathy and competency, and hopefully, through those two and through repeated engagement, we get to build trust over a long time. Many of you in the audience have been associated with Hi-Tech Advisors in multiple capacities for many years.

[00:03:10]So, so glad to see many familiar faces here. And for those who are new to Hi-Tech Advisors, we hope to start our business relationship and get to know each other on a more personal level through events like these. We are the experts at engaging experts, and we pride ourselves on finding meaningful business problems for those who find them meaningful to solve. So, without further ado, let me bring up David.

[00:03:54]Hey everyone, I'm David Kerbin with JP Morgan. I just want to thank Haresh.

[00:03:59]This has been a great event that we've hosted for quite some time. It kind of feels like I know a lot more people in the room now than I did three years ago, and it's lovely to have a nice event, nice food. As a firm, we are in the ecosystem of the innovation economy, and I just want to be a resource to everyone in the room, whether that's for a startup or helping you out with your personal things. I'm always here as someone you can share an idea with. But I really just love this event, and it's so fun to be growing relationships. Hosting a little Paulo over there earlier was amazing.

[00:04:44]This is like my buddies. So it's great to see everyone. And then, thank you, great vision.

[00:04:53]Oh, thank you, David. Now I'm emceeing for the rest of the night.

[00:05:02]We kind of started this on a whim about three years ago. So now it's been a long whim.

[00:05:09]I think we had a former colleague, Dina, who was in the Seattle office of JP Morgan, and at another event we thought that people would be interested in AI, so maybe we'd have an event or two. And now here we are; I think this is our 11th or 12th event. So thank you, David. JP Morgan has been a very nice partner and very supportive of this effort. But before we get into the main panels, we have a little bit of a bonus for you all. Where's my bonus?

[00:05:55]So, Ron Slinsky is with a company called IQ Rush. They've been a client of ours for a while. I'm not going to steal all the thunder, but it's a surprise AI company. Let me see if I can find you a PowerPoint, so feel free to introduce yourself a little bit.

[00:06:28]Okay, we're good to go.

[00:06:31]All right. So yeah, I'm Ron Slinsky. I've been a longtime technology veteran. I started at Microsoft back in 1999 and spent about 22 years there, which is a very long time for anyone. I spent some time at Oracle as well. Most of my career has been spent in the analytics, data science, and AI space, which is one of the reasons I'm here tonight.

[00:06:59]Now, um, do we have a remote? Yeah.

[00:07:04]>> Okay. So, yeah.

[00:07:08]>> Great. All right. So, where do we begin? I think the topic of conversation tonight is largely about what's next. Certainly a few years ago, when ChatGPT was first introduced, people were asking themselves that same question: what's next? AI seemed like it could be the next big thing, but people weren't really sure. Is it real, or is it maybe some sort of next-generation ELIZA? Certainly by now we know the answer to that question. The challenging thing is that it's moving so fast that a lot of companies are left wondering how they should respond. More specifically, a lot of them are wondering: how do I control the narrative when an AI is doing most of the talking? That's the challenge IQ Rush is trying to address. We're trying to help brands answer that question and help them optimize and monetize AI search.

[00:08:19]All right. So AI search truly is a strategic concern for a lot of companies. Netrush, as an example, is a retail agency. They help consumer brands manage their presence in online marketplaces like Amazon, Walmart, TikTok, etc. And here you have the head of strategy expressing frustration over the differences between answer engine optimization (AEO), what we have today, and historic search engine optimization (SEO). Back in the day, when Google was the only game in town, there were some very clear-cut strategies for increasing your visibility. You could increase the number of backlinks, or you could enhance your metadata tags. There were a variety of strategies. Worst case, you could simply buy the right keywords and elevate yourself to the top of Google search results.

[00:09:26]Answer engines don't work that way.

[00:09:28]They're much more complicated beasts. The same strategies, as you see here, simply do not work. Moreover, the stakes are much higher. With traditional search, you had this near-infinite list of links that you could pursue. At some point, your company would show up. With answer engines,

[00:09:48]that's not the case anymore. At most, a response is probably going to contain about a dozen citations, a dozen references to the grounding sources that it's using. And very probably it's going to mention even fewer brands. So how do you become one of those citation sources?

[00:10:07]How do you become one of those brands that are mentioned? That's fundamentally the problem we're solving.

[00:10:15]And that's why Netrush and IQ Rush just yesterday announced a strategic partnership. We've been working with them for several months now. They know what our platform is capable of doing, and they believe it's the best platform to help their clientele measure, manage, and monetize their AI search and their AI visibility.

[00:10:40]Now, certainly Netrush had a lot of companies they could have chosen to work with. No doubt there are probably over a hundred companies now trying to address problems like this. Most of them, however, are simply measurement tools. The way that works is they generally send a handful of questions to the answer engines, in some cases as few as 25, and however many times your brand shows up or your URLs are cited, that's your score. The problem is that this doesn't take into consideration the inherent variability of answer engines. The way they work, you could ask the same question twice and get two different answers. So with a measurement platform that works like that, you could get a score, ask it to refresh that score, and get two different numbers.

[00:11:39]For me, that is very reminiscent of Segal's law, which says that if a man has a watch, he knows what time it is; if he has two watches, he's never quite sure. That's the problem with platforms that aren't taking that inherent variability into consideration. So of course we don't go about it that way.

[00:12:00]We go about it very differently. We start with the same seed: we send a series of questions to the answer engines, but it's not going to be 25. It's going to be hundreds, perhaps thousands, of questions. Then we do the necessary math to separate the signal from the noise so that we can actually quantify the uncertainty in the score we present, so brands can know exactly how reliable that number is. More importantly, we're not just looking back on the results that you just received.
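
As a rough illustration of why the number of questions matters (this is not IQ Rush's actual method, which the talk does not describe), the sketch below puts a percentile bootstrap interval around a brand's mention rate. The data, the function names, and the choice of the bootstrap are all assumptions made for this example.

```python
import random


def mention_rate(hits):
    """Fraction of responses in which the brand was mentioned."""
    return sum(hits) / len(hits)


def bootstrap_interval(hits, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mention rate."""
    rng = random.Random(seed)
    n = len(hits)
    # Resample the observed responses with replacement and recompute the rate.
    rates = sorted(
        sum(rng.choices(hits, k=n)) / n for _ in range(n_resamples)
    )
    low_index = int((1 - level) / 2 * n_resamples)
    high_index = int((1 + level) / 2 * n_resamples) - 1
    return rates[low_index], rates[high_index]


# Hypothetical data: 1 = brand mentioned in the response, 0 = not mentioned.
small_sample = [1] * 5 + [0] * 20      # 25 questions, 20% mention rate
large_sample = [1] * 200 + [0] * 800   # 1,000 questions, 20% mention rate

small_low, small_high = bootstrap_interval(small_sample)
large_low, large_high = bootstrap_interval(large_sample)

print(f"25 questions:    rate {mention_rate(small_sample):.2f}, "
      f"interval {small_low:.2f} to {small_high:.2f}")
print(f"1,000 questions: rate {mention_rate(large_sample):.2f}, "
      f"interval {large_low:.2f} to {large_high:.2f}")
```

Both samples have the same underlying mention rate, but the interval from 25 questions is much wider than the one from 1,000, which is the instability being described.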

[00:12:32]We're also able to look forward, right?

[00:12:34]As the slide suggests here, we're now able to predict with 92% accuracy which URLs, which sources, these answer engines will cite.

[00:12:47]And one of the byproducts of our research is a variety of interesting insights, the sorts of things that can actually help inform brand strategy.

[00:12:57]So here's an example. One of the things we've observed and quantified is the extent to which these answer engines behave differently from each other. If you take a look at the URLs, the sources that the engines are citing, you'll find that there's only a 15% overlap between SearchGPT and Perplexity. That means that 85% of the sources are different.

[00:13:28]Right. And there's a 3% overlap between Perplexity and Gemini. Again, that's a 97% difference.
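
The talk does not say how overlap is defined. One common choice, assumed here purely for illustration, is the Jaccard index: shared sources divided by all distinct sources cited by either engine. The engine names and URLs below are hypothetical.

```python
def jaccard_overlap(sources_a, sources_b):
    """Shared sources divided by all distinct sources cited by either engine."""
    a, b = set(sources_a), set(sources_b)
    if not a and not b:
        return 0.0  # No citations at all: treat overlap as zero.
    return len(a & b) / len(a | b)


# Hypothetical citation lists collected for the same set of questions.
engine_a_sources = [
    "https://example.com/reviews",
    "https://example.com/buying-guide",
    "https://example.org/forum/thread-1",
    "https://example.net/brand-page",
]
engine_b_sources = [
    "https://example.com/reviews",
    "https://example.org/blog/top-picks",
    "https://example.net/comparison",
    "https://example.edu/study",
]

overlap = jaccard_overlap(engine_a_sources, engine_b_sources)
print(f"Overlap: {overlap:.0%}, different: {1 - overlap:.0%}")
```

Here the two engines share one source out of seven distinct ones, so a brand cited by one engine has little guarantee of being cited by the other.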

[00:13:37]That's huge. So the key takeaway for brands is that if you want to be a top-cited brand across all of these answer engines, it's going to be very, very difficult. That's going to force you to make some prioritization decisions. Now, the underlying reason for these differences, of course, is that these answer engines are very complicated systems. They're not just LLMs. Each of them has its own architecture, and that leads to these profound differences in behavior. The good news is we don't have to reverse engineer these systems to understand how they work. We're taking a data-science-driven approach to figuring out how they work. Everything that we do is based upon the scenarios that I've described. We'll get into that a little bit more in a moment.

[00:14:31]Actually, why don't I pop ahead in the interest of time. This slide illustrates the basic process we go through with each of our brands. We start with the brand showing up. All they really need to give us is the URL. It's helpful if they give us more information: the markets they're focused on, who their competitors are. From that, we build the set of questions that we're going to send to the answer engines. Again, that might be hundreds of questions, very probably thousands. We'll send those to each of the answer engines and collect the responses. We'll see which sites are being referenced and which brands are being mentioned, and then we'll go a step further and scrape the URLs for every site that was referenced by those engines. We'll go down to the sentence level to understand what those sites are saying and how it relates to what the answer engines are saying. That's one of the things that makes all of this very powerful.

[00:15:38]And from that, we're able to create a set of recommendations for brands, suggesting how they can increase their own AI visibility.

[00:15:50]And here's where that predictive capability comes into play. Based upon the content they're considering, we allow them to experiment with it, and we can tell them ahead of time how likely it is that this will increase their AI visibility. Will it make a measurable difference? Then, once they choose to go ahead and publish their enhanced content, we can show them in practical terms whether it made a difference. And further downstream, we can connect that to more practical terms for them: did that result in more traffic and more revenue?

[00:16:33]>> Yes.

[00:16:39]>> Sure.

[00:16:44]So, um, >> sorry, just like how can we help you?

[00:16:52]>> Right.

[00:16:52]>> Yeah. Yeah.

[00:16:53]>> If you can get there, uh, because I want to make sure we have enough time.

[00:16:56]>> Oh, absolutely. So, go ahead.

[00:16:57]>> We're We're almost >> Okay, perfect.

[00:17:00]>> But the the the key answer that is money.

[00:17:05]>> All right. So I will say, just quickly, that with search engine optimization, traditional search, it is proven that it's important for brands to curate their online presence. And with AEO, that's going to be even more true.

[00:17:22]This market has gone from a zero-dollar industry to an eight-billion-dollar industry just in the past couple of years. And as you can see, the expectation is that it's going to increase exponentially in the next few years. All right. So everything that I've described kind of puts us where we are today. Right now, IQ Rush as a company generates about $450,000 in annual recurring revenue. And we've largely done that on the backs of our own wallets, and a few investors. So the ask right now is that we're trying to raise $2 million in seed money to give us an 18-month runway to expand everything that we've done. Most of what we do today is built on the earliest of prototypes that we've developed. So we need to get beyond that so that we can onboard more brands and help them be successful. Here's a quick look at the leadership team. I won't linger here, but I'll pause with that and open it up to questions or concerns, if... Yes, please.

[00:18:43]>> Are you signing up customers right now?

[00:18:46]>> So the question is, are we signing up customers right now? And yes, we are. To be clear, the goal with the seed money we're trying to raise is to make it possible for us to scale out so that it's much more of a self-service platform. Right now, we're working with customers in much more of a white-glove scenario. They'll work directly with us; we'll generate the results and share them on a one-on-one basis.

[00:19:17]Okay.

[00:19:17]>> Thank you, Ron.

[00:19:19]>> There you go. Appreciate everyone's time, and thanks again for the opportunity.

[00:19:22]>> Um, so if anybody you'll be here for a while here. Yeah.

[00:19:26]>> Yeah. So if anybody's interested to follow up with any questions or interest.

[00:19:30]>> Appreciate it.

[00:19:31]>> Is your contact somewhere?

[00:19:33]>> Um, well, iqrush.ai.

[00:19:36]I'm Ron at iqrush.ai.

[00:19:38]>> Perfect. Thank you. All right. Cheers.

[00:19:55]So now I'll ask the speakers to come out for the main panel.

[00:20:08]Thanks, Ron, by the way, for standing in. Ron didn't know he was going to do this until very shortly before now.

[00:20:20]All right. So we'll get into the main part of the panel in a second. I have a question for everybody.

[00:20:44]LLMs. How many of you had heard of LLMs before 2022?

[00:20:56]And how much time do you spend now hearing about LLMs?

[00:21:01]I'm sorry, that's too much.

[00:21:06]So because of that, before the rest of the panel discussion, we're going to get a little bit of a tutorial from a very capable up-and-coming professional in the AI domain, Mayan.

[00:21:27]He was kind enough to put a few slides together to give us some context on what LLMs are and what's there in AI besides LLMs. Hopefully that'll serve as context for the panel discussion a little bit later. It'll only take a few minutes. So if you want to come up, Mayan, feel free. By the way, this is Mayan Sudashnan. Maybe take a minute to introduce yourself while I set up the PowerPoint for you.

[00:22:04]>> Yeah, sure thing. So I am a machine learning engineer at a startup called HyperB AI. In May of last year I graduated with my master's in computer engineering from Washington University in St. Louis. I actually work out of the Hi-Tech office. My work is remote, but Haresh was kind enough to offer me space. The goal of the presentation today is, yes, a tutorial, but also to share some thoughts from an engineer's perspective on what's coming next in AI and how it should be tackled.

[00:22:49]I'll talk a little bit about the current generative AI solutions, in both the language context and the vision context, and then where the future of AI is going in terms of types of models. I'll also talk about the overall landscape: what the challenges are, where the industry focus lies, and where I think the industry focus should lie. Okay, so I want to start by explaining my fundamental principle of AI: AI should serve as an augmentation of human capabilities. Whether it's simulating a human behavior or extending a human task, I think there are three core elements of AI and of a good generative AI model: perception, idealization, and replication. With those three elements in mind, we'll start by talking about something that we want to tune out: LLMs.

[00:24:00]This is going to be a very high-level overview, but I do want to go over the inner workings a little bit here. As the name suggests, alphanumeric characters are processed and divided into subwords, which are known as tokens.

[00:24:19]Those subwords can often be keywords or phrases, and essentially those tokens serve as part of the retrieval process. After the user provides an input, there's a retrieval process, which searches the cache of the model and finds an appropriate response to send to the attention mechanism, which is where generation happens. After generation, the answer that's output is validated based on accuracy metrics and on its relevance to the original prompt. Through that, there is a feedback loop: if the answer doesn't meet those criteria, the prompt is reprompted in a slightly different way, with slightly different wording, and re-queried. This is an iterative feedback loop. Finally, once the criteria for accuracy and relevance are met, there's an LLM response. Now, the vision context is kind of similar, except now we have an image encoder. The image is tokenized not into subwords but into patches, tiny squares of an individual image, and essentially that's an entirely different encoder on top of the image encoder. But on top of that, there is multimodal fusion, which is the fusing of textual and visual elements together. This is really critical for matching the semantics of the text with the actual meaning of the image. And because of these additional elements, these vision models are, yes, more capable and highly applicable.

[00:26:17]The problem is that they have a large memory footprint as a result of these additional elements. I'll talk a little bit more later about the challenges with that.
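
To make the "patches instead of subwords" idea concrete, here is a minimal sketch that cuts a tiny grayscale image into square patches. The 4x4 image, the patch size, and the function name are assumptions for illustration; a real vision encoder would go on to flatten and embed each patch, which is not shown.

```python
def split_into_patches(image, patch_size):
    """Cut a 2D image (list of rows) into non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    # Walk the image top-to-bottom, left-to-right, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Hypothetical 4x4 image whose pixel values are 0..15 in reading order.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)

print(len(patches))   # number of patch tokens for this image
print(patches[0])     # the top-left patch

# The token count grows with resolution: a 224x224 image cut into
# 16x16 patches yields (224 // 16) ** 2 patch tokens.
print((224 // 16) ** 2)
```

Each patch plays the role a subword token plays for text, and the growth in patch count with resolution is one reason these models carry a larger memory footprint.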

[00:26:28]Okay. So now, where are we going in the future of these generative AI models? First, there's the action model, which has the capability to understand an image, perceive the image within the context of where that image is, and generate an action accordingly. But it can't predict the future. In other words, it doesn't jump out of the context of the image. Then there's the world model, which can perceive the image, perceive an action, and actually predict the future, but it can't generate the action. But the next big thing, which has been in development by many firms, is world action models, or action world models, which combine the capabilities of both the action models and the world models, so you can act within a virtual entity and have actionable items within it.

[00:27:35]Now, as I alluded to earlier, there are some challenges with the existing innovations and the existing focus of model builders. Big data, in terms of the images and the text associated with them, and extensive training time are big hurdles for other players stepping into the model-building space, and you need vast amounts of computing resources for that. Many of the production models we utilize are deployed in the cloud. That raises significant security concerns, depending on what type of information is being shared; latency issues, depending on how real-time the use case needs to be; and expensive maintenance, which is the cost of accessing a cloud model. State-of-the-art models rarely run on affordable edge hardware. They're not designed to be hosted on that edge hardware.

[00:28:49]That's where the industry focus is: making highly capable but also highly resource-intensive models. They really cater to the deep-pocketed entities that make the models and, as a result, also have access to the data and the resources. So I ask the question: how can capable models be accessed cheaply and securely? The only way is developing capable, hardware-aware AI models for the edge, which is what I'm doing, but also where I really want the industry to go: developing on-site solutions, developing models that are hardware-aware and really powerful but have a smaller memory footprint and can fit on more affordable hardware. I guess we'll take questions now. Yeah? >> Do you have an example of the last one?

[00:29:54]>> Okay. So for example, say you have a model that's just as powerful as ChatGPT, and you want to fit that on a drone, right?

[00:30:07]That's currently not possible without a very expensive GPU. Similarly, in Teslas right now, the latest update of Tesla self-driving comes with really expensive GPUs, and it's not feasible for other manufacturers to compete with that, or for overall competition in the field. It's just not very feasible. >> So the solution is in which direction?

[00:30:42]>> Right now, the current solutions are tending towards that highly capable yet still resource-intensive route.

[00:30:55]However, if we want to spread the models to devices like drones, and more affordably to devices in cars, for example, the industry needs to pivot towards building models that are hardware-aware, that can fit on smaller devices at a more affordable price point. >> [partly inaudible] So we limit capabilities? >> Not necessarily. If you design a hardware-aware model from the ground up, understanding the limitations of what you're working on, and understanding that the existing architectures are vast, and you start from an architectural standpoint on the model side, there are certain advantages for certain model architectures over others. Those smaller-footprint architectures should be the focus. >> Sorry, can we come back? I'm sorry. Speakers, please come back up. Mayan, thank you. So hopefully we'll have some context to discuss.
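
A back-of-the-envelope sketch of the memory-footprint constraint discussed above: weight memory is roughly parameter count times bytes per parameter. The 7-billion-parameter model, the 8 GiB device, and the use of lower numeric precision as the lever are assumptions for illustration; the speaker's own emphasis is on architecture choice rather than precision, and the estimate ignores activations and other runtime memory.

```python
# Bytes needed to store one parameter at each numeric precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


def fits_on_device(num_params, precision, device_memory_gib):
    """True if the weights alone fit in the device's memory."""
    return weight_memory_gib(num_params, precision) <= device_memory_gib


# Hypothetical model and edge device, chosen only for this example.
model_params = 7_000_000_000
edge_device_gib = 8

for precision in BYTES_PER_PARAM:
    size = weight_memory_gib(model_params, precision)
    verdict = "fits" if fits_on_device(model_params, precision, edge_device_gib) else "does not fit"
    print(f"{precision}: {size:.2f} GiB of weights, {verdict} in {edge_device_gib} GiB")
```

The same arithmetic applies to any design lever that reduces parameter count or bytes per parameter, which is why footprint has to be considered from the start when targeting a specific device.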

[00:32:30]>> All right.

[00:32:38]>> The answer is, going back to that action and world model and the parent.

[00:32:45]Yeah. The answer probably lies in that action and world model you were talking about earlier, right? You could have much lighter-weight action models that can do the understanding pieces, right?

[00:32:54]It's the prediction pieces for which you'd require a world model or an additional, larger model, and that way you could limit the surface area for which your model would do much nicer, right? I think Linux is a good analogy here: the Linux on a sensor is not the full Linux that runs on, say, a server in the cloud, right? Yes, it has the same core capabilities, and models need to evolve towards that. There are certain aspects of that in the industry today. But is it mature? No. And is it universal, so that we can all access it? Probably not.

[00:33:26]And I think that's what Mayan was referring to.

[00:33:27]>> Right. Yeah. And there's not a widely accepted architecture, from a model-building perspective, that's been proven over time. So I saw talk to Lyn.

[00:33:47]I would like to add one thought on this. There's a limitation on hosting the model on the device's compute or sensors, but right now the architecture most device manufacturers are adopting is capture on the device, process in the cloud. A Ring doorbell from Amazon can detect that there's a package at the front door, but to recognize that through vision, they process in the cloud for now. But this is a valid question: how do we bring those models onto the devices themselves in a cheaper way?

[00:34:26]>> So while you have the microphone, Shahed, go ahead. But let's do quick intros. Shahed, just tell us a little bit about your journey and what you're doing now, and then we'll follow up with some more questions.

[00:34:40]>> Yeah, definitely. First of all, thank you for inviting me here. I really like the panel and everyone; I was having a chat with folks while having some snacks. Thank you to Hi-Tech as well for organizing this event. I'm Shahed. I've been in the computer industry for 25 years: at Microsoft for 15 years, and in multiple places. The last thing I did with Microsoft was Microsoft Teams. I am one of the founding members of Teams, and after that I got the opportunity to work at AWS.

[00:35:21]I was with AWS for 6 years, where I led Amazon Connect, a contact center in the cloud. There I also got the opportunity to get my hands on AI in the customer support scenario, like building an IVR and building conversational AI for voice agents and for digital channels. And a few months back I rejoined Microsoft as a boomerang; I came back from this side of the bridge after doing that side of the bridge. Now I am part of Agent 365 and the agentic data platform.

[00:36:03]Um so that has been my journey.

[00:36:07]>> Awesome. Thank you. Uh very quick question. Which one's cushier?

[00:36:13]Which one is cushier between Microsoft and Amazon?

[00:36:18]You don't have to. You can take the fifth if you want.

[00:36:23]go ahead.

[00:36:24]>> Yep. Hey, thank you again, Haresh and Hi-Tech, for inviting us here. I'm... It's interesting, actually: in this forum I have folks who have worked with me in different aspects of my career. I've had about 20-plus years, 25 years now and counting, in AI. I started as an applied researcher, grew up through the ranks in applied research, and had my own startup before joining Microsoft. I spent 12 years at Microsoft and did my rounds at Amazon for 4 years; I treat that as a college education, where I learned what the real world is. Then, for the last six years, I was at Oracle. I am on a brief sabbatical, and I'll be starting a new gig next Monday, so to some extent I feel like a free bird, and I'm open to sharing my opinions. They're my opinions and no company's. And a shout-out to Ron, my ex-Oracle colleague: great presentation. I loved how you presented a very complex area, simplified it, and connected it back to SEO. It reminds me of my days at Microsoft Bing and Bing Ads, where we were doing R&R, revenue and relevance, with very, very similar challenges.

[00:37:28]Thank you. And what's cushier between Oracle and Microsoft?

[00:37:32]>> It's okay.

[00:37:33]>> And Microsoft. Oh, that's hard.

[00:37:35]>> No, I think Microsoft gets the count on that one.

[00:37:38]>> You can also take the fifth. Tiger, >> let me answer a question first.

[00:37:42]Microsoft.

[00:37:44]>> Right. Uh, I'm Tiger F. I'm the president of the Seattle Technical Forum. Uh, I'm running all kinds of technical events across Seattle area with our 50,000 members. uh at the spare time I work for MATA as their PM.

[00:37:59]My team is working on foundational models, not only large language models but other foundational models that support the whole company, including recommendation, search, ads and other areas. Before that I worked at Microsoft a few times. I started in 1999 in what was called the natural language processing group; now they call it natural language models, but at that time it was natural language processing. I also joined Google for about seven years, worked in other parts of Microsoft, and also worked at Huawei. About 8 years ago, talking about edge AI, my team created and took to mass production the industry's first edge devices, on the AI chip side. So yeah, nice to meet you guys. >> We'll start with a lighter question: what is your favorite consumer-facing LLM out there, each of you, and why? Tiger, go ahead.

[00:39:03]>> I'm from meta. No comment.

[00:39:10]>> You're free to speak. right?

[00:39:12]>> Hey, it's the flavor of the day, right? It depends on which LLM did a good job. Today it's all Gemini. I actually tried the vision model. By the way, my thesis was in interpretive vision 25 years back, and I love to see how the field has evolved. I was playing with bananas, and I was going bananas with that a little bit with Gemini. It's an amazingly good model. It's interesting how things change and shift: 25 years back we used to measure AI evolution in four-to-five-year cycles, and right now it is changing almost every week, with a new model coming in or new capabilities coming in within a model. No offense to Tiger, but at one point we thought Llama was doing really well on code generation. Then Claude came out, and then Gemini came out, and now Llama is there, but a distant third or fourth.

[00:40:04]So... >> No comment.

[00:40:07]>> Yeah. But hey, hey, I have a feel today.

[00:40:09]Anyway, I just wanted to say my favorite changes by the week or the day, and the key theme I want everyone to take away is this: I truly believe that for any real application you have to think of a world where you are model agnostic. If you are tied to a single model, you're going to be obsolete very, very soon. >> So yeah, for me it's obvious: it's Copilot.

[00:40:34]On a serious note, yes, Copilot: the way it connects your work and your whole M365 together, you can actually live there. For example, when you're working on a document you don't have to scroll or search through your emails, meeting notes, transcriptions and whatnot. It is one single place where you can search the internal corpus of information you have generated by interacting with various applications, and, if you want, it can bring in research from outside on a topic as well. Other providers, without naming names, are very specialized for something. What I found is that Copilot is really a game changer, especially for information workers and students; it provides value right there. You can find a very specialized capability in one model or one application, but when it connects everything together in the context you are working in, that's the true value. I'm going back to what was just said about the application having to be model agnostic. The LLM right now is a commodity. It's very common. The industry is moving, in my opinion, from innovation to diffusion.

[00:42:15]Innovation is when you learn something for the first time. Diffusion is when the same thing happens millions of times in a secure, scalable and safe way. The LLM has proved that it can happen multiple times. Now the industry, or its leaders, should think from the LLM perspective: is the foundation correct, is the semantic layer correct, is the ontology correct, and do they have the right level of evals to make sure the capabilities they are building keep working whenever they want to switch the model? The semantic layer is so important. And before that, the data schema is very important.

[00:43:05]Are we building the schema in a way that is ready for agents to access on a frequent basis? We know that the way agents interact with your database or your data is a different pattern from the way applications access it. The second thing is the semantic layer on top of it: your business language, the translation. For example, in the airline industry your ticket, or whatever travel you buy, is recognized by a PNR. A model will not understand PNR. That belongs to the semantic layer. Similarly, in the financial industry, for an invoice the key is the invoice ID.

[00:43:50]Similarly, what are my top opportunities, how do I calculate the top revenues, and whatnot: it varies business to business, domain to domain.

[00:44:00]So that has to be built into the semantic layer, because the semantic layer is proprietary to you, whereas the LLM is generally available everywhere.

[00:44:14]Leaders should not think that the LLM will understand their business semantics automatically and give them the right outcome.

[00:44:24]Rather, we have to make the LLM work in our favor by providing all the semantics at the right layer. Models will come and models will go, but what does not change is your foundation and your semantics.
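
The semantic-layer idea described above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and was not shown by the panel: the `Term` class, the `SEMANTIC_LAYER` glossary, the `build_prompt` helper and the column names are all hypothetical. The two glossary entries mirror the airline (PNR, passenger name record) and invoice examples. The point is only that the business definitions live outside any particular model and are attached to the question before it is sent to whichever model is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str       # business term as users say it
    meaning: str    # plain-language definition handed to the model
    key_field: str  # hypothetical column that identifies the record


# Hypothetical glossary owned by the business, independent of any model.
SEMANTIC_LAYER = {
    "pnr": Term("PNR", "passenger name record, the identifier of a booked trip", "bookings.pnr"),
    "invoice": Term("invoice", "a bill issued to a customer", "invoices.invoice_id"),
}


def build_prompt(question: str) -> str:
    """Prefix the question with definitions of any glossary terms it mentions."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    hits = [term for key, term in SEMANTIC_LAYER.items() if key in words]
    lines = [f"- {t.name}: {t.meaning} (key: {t.key_field})" for t in hits]
    context = "\n".join(lines) if lines else "- (no business terms matched)"
    return f"Business definitions:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Where is my PNR?"))
```

Because the glossary is plain data, the same `build_prompt` output could be passed to any model, which is the sense in which the semantic layer stays fixed while models come and go.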

[00:44:41]>> Okay. We'll probably come back to some of that. Mayan, two questions for you. We'll keep the original one: what's your favorite LLM as a consumer? The second one: it may be a mix of younger and older professionals, but hopefully the next set of younger professionals will come up with a new killer app. What will that be, the thing that replaces the LLM?

[00:45:08]The easy question for you.

[00:45:10]>> Yeah, thanks. To answer the first question, I'm also model agnostic. I actually use a server, or an app, called Ollama.

[00:45:25]Now, Ollama has a collection of several different models. You've got Gemini, you've got Microsoft, you've got Llama. Basically what I do is cycle between models month by month, use case by use case. But instead of using the base models as they are, I fine-tune each model that I experiment with. You can think of fine-tuning as training on your own data, your own custom code, your own writing style, without prompting the model.

[00:46:02]If you compartmentalize your own coding style or your own writing style into data, you can actually retrain the model, and that's what I end up doing. I cycle between different models every month. Right now I'm using a fine-tuned Microsoft Phi-3, and that's been working well. >> Awesome. May I add a few more points on top of my answer? Mayan mentioned one important thing, which is fine-tuning. Before I come back to that: my team is in charge of the whole evaluation of all the 1P, 2P and 3P models for Meta. So my team did a lot of extensive and comprehensive evaluation of all the models. One thing we noticed is that there is no significant or outstanding winner.

[00:46:57]Some models are good at some dimensions, while others are better at other dimensions. So there's no single winner. That's one thing. Another thing, as Mayan mentioned, is that you have to fine-tune any model to use it for your use case. There's no off-the-shelf model that fits all. And fine-tuning requires a lot of data, capacity and expertise.

[00:47:23]So you need expert engineers to fine-tune it smartly. So thank you for pointing that out. Yeah.

[00:47:30]>> All right. So I think it was touched on by Sharat and a few others: you have to be model agnostic, right? I can think of two audiences that care a lot about the future. If you're a large enterprise, you're continuously making huge investments in all types of infrastructure. Ever since I started in technology in the 90s, there's been this phrase, future proof. We've all heard it. Is it possible? And if you were to do it, how do you do that as AI moves well beyond LLMs? I think that's something you touched on a little bit.

[00:48:14]>> Yeah. I'm sure there are many IT professionals in this room who have seen enough in the industry. In my opinion, when you get your data schema correct, your database architecture correct and your semantic layer correct, you can replace the application on top, you can point a mobile application at it, and you can build an API on top of it quickly. Whereas if you get your foundation wrong, you are left behind by multiple years. As I see the industry evolving, we need to continuously make sure that layer is solid; at least that is future proof. Models will come and different patterns will come, because of the way the industry is evolving. Every other week Claude Code is producing some applications and whatnot, and agents are evolving very fast. But for enterprises that is going to be the key. For 2030, the number one priority is to make sure your database architecture and semantic layer are solid and that you have the right level of evals built to define the behavior of your agents or functionality. Then you are open to experimenting with the new models or new paradigms that are evolving.
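
One way to picture "evals that define the behavior" plus a swappable model is the short Python sketch below. It is an illustration under stated assumptions and not anything the panelists presented: `score`, `pick_model`, the two stub backends and the substring-match check are all hypothetical, and a real eval suite would call actual model clients and use richer checks. The eval cases stay fixed, and any backend that maps a prompt to text can be scored against them.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]   # any backend: prompt in, text out
EvalCase = Tuple[str, str]     # (prompt, substring the answer must contain)


def score(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of eval cases whose expected substring appears in the answer."""
    passed = sum(
        1 for prompt, expected in cases if expected.lower() in model(prompt).lower()
    )
    return passed / len(cases)


def pick_model(models: Dict[str, Model], cases: List[EvalCase]) -> str:
    """Return the name of the highest-scoring model; ties go to the first registered."""
    return max(models, key=lambda name: score(models[name], cases))


# Stub backends standing in for real model clients (hypothetical).
def model_a(prompt: str) -> str:
    return "A PNR is a passenger name record." if "PNR" in prompt else "I don't know."


def model_b(prompt: str) -> str:
    return "I don't know."


# The eval suite belongs to the business and does not change when models do.
EVALS: List[EvalCase] = [
    ("What is a PNR?", "passenger name record"),
    ("What is the key for an invoice?", "invoice id"),
]
MODELS: Dict[str, Model] = {"model_a": model_a, "model_b": model_b}

if __name__ == "__main__":
    for name, model in MODELS.items():
        print(name, score(model, EVALS))
    print("selected:", pick_model(MODELS, EVALS))
```

Adding a new model is then a matter of registering one more callable in `MODELS`; the evals decide whether it replaces the current one.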

[00:49:59]Okay, go ahead. Thank you.

[00:50:00]>> Yep. No, I think Sharat covered it very well. You have to go back to the basics. You need to know where your data model is. You need to understand your business use cases, business logic and semantics, and those need to be fixed, while we fully expect that the actual agent doing the work will change or evolve over time. The LLM used by the agent will for sure be changing. You heard from multiple people that it's weekly right now. I won't be surprised if it gets to a very dynamic place where it changes based on the use case, or where, even within the current semantic context, you run multiple models and pick the best one. So that is definitely a way to start thinking about a little bit of future proofing. The other thing I want to talk about: I was talking to a couple of students, especially CS students, and all of them are facing a big dilemma, because a lot of code is being generated by AI. In fact, there are a couple of startups that I've been engaged with or mentoring where one thing we are trying is: can we have a solopreneur, a single person who comes with an idea and is able to deploy things without hiring any additional engineers? The short answer is it can be done, but it is very hard to do. And it goes back to this: unless you get the basics right, unless you have the right person thinking about how to structure it correctly, it becomes untenable, unmaintainable and undebuggable very, very quickly. So going back to basics is very, very important. Even for the students I was talking to: focus on the fundamentals. Fundamentals don't go away. If anything, the need for them becomes stronger as you progress, as you get into a world that is more dynamic and where things are less predictable than they were before.

>> Yeah, so to add on to that, that's a very good point about the fundamentals. I have an undergraduate degree in mathematics, theoretical mathematics. I obtained that before LLMs boomed in popularity, and at the time I was thinking, oh gosh, what am I going to use this for? But then, when you start to get exposed to deep learning and machine learning, those fundamentals of real analysis and complex analysis all start to make sense. And basic computer science principles will never change. That is the most future-proof part of LLM development as a whole: the principles. So I think focusing on the fundamentals matters, maybe not necessarily the job or the tool of coding. That's a part of the labor that can be reduced. But the perception and understanding of what's being generated, what the expected output is, and how it is supposed to look all come from the theoretical knowledge.

>> Couldn't agree more with that. Talking about future proofing, I also look at this problem from a people aspect. No matter which domain or enterprise area you are working in, it will change quickly in the next few years. So technology-wise, budget-wise, capital-wise, you cannot really be fully prepared for the future.

[00:53:32]You don't know what will change or how it will change. What will not change is that people will be driving the business. But you may need new people, different types of people, or at least a different kind of mindset in the people who drive your business. That's why I would suggest that old companies hire some people with a CS background, especially fresh graduates. They are the new generation, AI native; they have the AI mindset. They are not just learning something new; they have an AI mindset that can help you understand problems from different angles. So people really are the key.

[00:54:14]>> Okay, so actually that was going to be the second question, which is future proofing one's career, right? We have a couple of young people, maybe more than a couple, who are embarking on the professional journey or even the college journey.

[00:54:29]So um you know what fields do you think one shouldn't pursue that we were encouraging kids to pursue 5 years ago?

[00:54:40]Obviously sounds like fundamental math is good right?

[00:54:46]What's going to be redundant, or what already is redundant? >> Yeah, I think one thing that is becoming very obvious is that knowing the fundamentals is a given, that is always very important, but just knowing the coding part is not going to be enough.

[00:55:10]What the upcoming generation needs to bring to the table from day one is systematic thinking that they can translate to the agent to produce the desired outcome.

[00:55:24]Coding, the implementation, is a given now. Claude Code can churn out code left and right. It can look at a legacy system and tell you within five minutes that there are five memory leaks, which would take you days to identify. It can write the PR just like that. But you need to guide it, and right now that guidance comes from experience, whereas when new blood enters the industry they need to bring it from day one. So understanding how the architecture works, and being able to translate that into precise, crisp instructions to the agent, is going to be the key.

[00:56:14]So that's my take. >> Plus one to what you heard from both Mayan and Sharat. The only additional thing I want folks who are in college or just getting out to think about is this. In the past we would always talk about people who were experts in a vertical domain; they knew everything about a domain, say healthcare or automotive or aerospace. Those were the vertical domains. Then you had professionals who were horizontal: hey, I'm a database expert, I'm an expert at core sciences and biology. That is the world we used to live in. Now the most successful folks are going to be those who can cross that, what we will call a deep professional: you have deep expertise in one domain, where you can go vertical, but at the same time you have enough cross skills to do multiple functions together, so that you're more effective. A simple example, since we're talking about code generation: a solopreneur who can be really successful, or even a new employee in a big company who will be very successful, is someone who understands the basics of solid software architecture, understands use cases and customer requirements and is able to define them, and is able to tie it all back to business metrics. Being able to do all three is the key I'm talking about, even if your vertical expertise is in only one of them. So, something to think about. I do think we are no longer in a world where you can have just a horizontal or a vertical capability and be super successful.

[00:57:53]>> I totally agree with that. In fact, because I'm with a big company as well,

[00:57:58]I'm asked by many parents what major their kids should go into. At our age, a lot of us have kids in college or going to college, right? So that's a tough question. Nowadays not only can fresh graduates hardly find a good job, but FTEs, the people currently working, may lose their jobs pretty soon. My prediction is that in the next two to three years there will be mass layoffs across the board, not only at IT companies but everywhere, and society will be in chaos. So I'm fully prepared for that, not in a big way, but in different ways. Talking about the major to select, I totally agree: in the future the people needed will not be from one single vertical domain. They need to work across domains, have a mixed background, and may want to work on different things combined together. But one key skill they need is knowing how to better use AI to help them. Everything in the future will be AI assisted, so no matter where you are, who you are or what you do, you will use AI. Knowing how to use AI better will make them stand out from other people. So that's my answer.

[00:59:08]>> Yeah, I completely echo all those thoughts. As someone who's been in education for the last six years, and is thankfully out of it for now, I can say one thing: a lot of the education you get in school can give you very good theoretical knowledge if you focus on the theory rather than the coding side. I think focusing on the theory is very valuable. But one thing that any person in college should be actively learning on their own, because, trust me, school is not going to provide it, is an understanding of the application of AI. That's essentially dotting your i's and crossing your t's. You can build a fantastic AI model that exceeds all the other models on benchmarks and is highly performant on an analytical scale, but you've got to ask the questions: what can this be used for? Where can it be used? If you don't have a purpose for what you're creating, however powerful, I don't think that's a good use of your time.

[01:00:27]And understanding where hey, you know, maybe this is a new application of AI.

[01:00:31]You know, this is an area where this new technology has not been present, I can capitalize on that, I can customize an AI model for that use case: I think that's super valuable. >> That actually connects with another question we had, which also goes back to the original theme: what comes after LLMs, or what else is there or is needed? Are there specific application areas or verticals that look more exciting, more open, or more ready for leveraging future models, if that question makes sense? Healthcare, autonomous vehicles, manufacturing, airlines, what have you. >> Yeah, I've been working quite a bit in the autonomous driving domain. That is certainly a very widely applicable field for world models and world action models. I think that's a perfect application. But there is still the question that you can apply to any field: hey, this model is too big, and I need to fit it on something cheaper that I can put in every car. That's the real question, and to be honest it's more of a theoretical question than an application-based one. But yeah, that's a highly relevant field beyond LLMs, I would say. >> Do you think this would enable vehicles or drones or things that we today perceive to be impossible or hard to do, if you had these much more efficient models that could be deployed in devices? >> I think it can fill in the gaps, the deficiencies, of existing solutions.

[01:02:34]Now, it's hard to say that it can fully solve a problem. I think that's a little difficult to project. Um, but certainly fill deficiencies of existing solutions.

[01:02:46]>> Same question, Sharat. What applications are you excited about?

[01:02:52]>> So, I've been around AI and working with real customers for almost seven years now. Before we think about the future, the new areas AI can enter, I don't think the existing common areas around us have truly realized the potential of AI yet. For example, at AWS I was part of Amazon Connect, the contact center in the cloud, where most customers in different industries, be it travel or financial, were looking at how to apply AI so that customer service could be more effective, faster for the customer and more personalized.

[01:03:50]Then a wave came where a lot of IVR and conversational AI came into the picture. Then we hit another roadblock: these AIs were very robotic. They couldn't be equal to a human, because one thing a human brings to a conversation is sentiment. So the customer support industry started evolving: we need IVR, but it should be sentiment aware in the conversation. If I am frustrated, the response should not be robotic, like "yes, I can do it" or "no, I cannot do it." It should add some sentiment: "sorry to hear that," "thank you for the appreciation," and whatnot. Still, those industries haven't truly realized the value. If I look at the data in customer support itself, no one has yet achieved more than maybe a 20% or 30% success rate with AI across the board. So there is huge potential there where value can be realized. Going forward, I truly believe there are many industries that are untouched, and the reason is that there are great models sitting in the labs right now and nobody is thinking about bringing them to industry. The term I use for that is model overhang: the models are sitting there, and they are very powerful, maybe for manufacturing, maybe for healthcare, but a push is required to bring them into industry so that they can be used more broadly. The other problem in the industry is that many enterprises are stuck in pilot. They are not able to graduate out of the pilot. So there are problems on both sides. Once leaders recognize that the right way to get out of the pilot is to build the right guardrails, fix the platform and data schema, have the right semantics, and have confidence in pushing the models or AI agents to production, they will quickly move to the next phase. Similarly, bringing in the new models will create more space in the industry so that customers and enterprises can think of experimenting with the next set of things.

[01:06:57]>> Got it. Got it. So you're saying there's a lot of opportunity still, even without any new technology, just applying what we already have. I see. So Gio, we talked about this a couple of days back, right? It's related to what Sharat mentioned about going from PoC into production, where, I don't know, 90% of people are kind of stuck, right? Because IT compliance and risk and all that go through the roof. So, your thoughts on future application areas, but also on that. >> Yeah, I'll touch on this first and then go to the future applications. At least from my vantage point, what I've seen is that more than 95%, maybe even as high as 98 or 99%, of use cases show huge results or very great gains when you do a PoC, and then when you try to put them in production, things just fall on their face. Very, very few use cases go into production. This goes back to a challenge I want to introduce first, which will also tie into how the future will develop for AI, at least in my view. I think the economics of AI today is completely lopsided. Let's take one of the, say, Llama models. You start training, and it takes months. It takes a crazy amount of GPUs. As a very rough estimate, somewhere in the hundreds of millions of dollars is spent on training a new model. The shelf life of that model as the SOTA, or state of the art, is in the best case a few weeks. That doesn't give any company enough opportunity to capitalize on it. That is definitely one big challenge. The other one I think Mayan already addressed: to access these models today you have to hit a cloud, and there are multiple hops, especially in networking, where it may not be feasible. The power or compute required to do an inference is almost equivalent to what is required for training on a single topper. That economics has to change too. So I do want to talk about those. But going back to why so many projects are stuck in the pilot phase or never make it to production: it truly comes down to a few fundamentals. One is, when you are doing your pilot, are you putting it in a real-life scenario rather than a constrained environment? The second is, do you even have the data to pull it off at large scale? That's a fundamental question many, many companies don't start out with.

[01:09:21]They look at it and say, hey, if I can summarize a resume, that's great.

[01:09:25]Now I have a better-summarized resume for all my candidates. Great. But what does it fit into? How does it play in the overall ecosystem? That's very, very critical, and I think that is a fundamental thing that needs to happen. I think AI needs to mimic human behavior and not the other way around. We have put the cart in front of the horse to some extent, where we talk about LLMs and we talk about tokenization. Why should a business care about that? It shouldn't. That is where I think the fundamental shift has to happen before we see mass adoption. >> That's true.

[01:09:59]>> Yeah. I would like to add a few things. This is a very critical point. We are all leaders sitting here, and there are many more leaders out there. I believe we will lose the right to consume all the energy and all the resources behind LLM training if we are not able to produce the right utility from it.

[01:10:29]So I always believe trust in LLMs is not limited to security. The other aspect of trust is utility.

[01:10:41]So we need to continuously think about how we can bring the right outcomes and utility out of LLMs. Otherwise, all these rights we have right now to use GPUs, energy and so many resources, we will lose soon. That's how I see 2030. >> Yep. No pressure. >> Yes. And now switching back to the question of how I see the future: I think the future is bright. For 2030, I'll go back to historical patterns. Like what the computer did and what the internet did, AI is going to be no different. It's going to be personal. It's going to be small.

[01:11:23]Right now it requires massive clouds and servers to run. It will fit in your palm. It will fit in your mobile devices, and it will be very, very useful.

[01:11:30]We will go through hard phases, as Tiger was mentioning. I also expect that within the next few years humanity will have to adjust to how AI is evolving, and vice versa. But at the same time I do believe that, given the course of time, 5 years or so, it's actually going to be a better world. It's no different from any other technology. When computers came in, a lot of people who didn't know typing or didn't know how to use computers well had to retool, and we will go through a similar transformation as humanity in the next few years, for sure. Is it going to be smooth? No. It's going to have its bumps, but I do see the light at the end of the tunnel, and it's a very positive one. >> Excellent. Tiger, your thoughts?

[01:12:10]>> In fact, I look at the LLM a little bit differently. A lot of people think the LLM is a pretty new thing.

[01:12:17]It's going to change everything. In fact, if you look back over about 30 years, you see there have been multiple waves of innovation, and the LLM may be the smallest wave among them. We have experienced the mobile change. We have experienced cloud computing and the big data era. So there have been even bigger changes. If you look at the LLM realistically, it did not create any new business opportunities or new user scenarios. All the current applications, I would say, were already there. So nothing new has been created. That's the problem. As I mentioned, the mass layoff: why is there a mass layoff? This wave of AI innovation is so fast and so impactful, and the efficiency gain is so great, that nothing like it has happened in human history. At the same time, it did not create any new opportunity to absorb the efficiency gain from other domains. So I'm pretty sure there will be mass layoffs across the board. But at the same time you can see that natural language models really drive up revenue and are really beneficial for some domains, though not for all use cases. For example, for the chip business, it is driving up the chip hardware business a lot across the industry. And for search, ads and recommendation-related areas, it is driving up business by billions or tens of billions of dollars more every year, for example for Meta, for Google, for TikTok and for other companies. That's why I think businesses working on search AI really have potential. So the three major domains I have seen benefit from the LLM in the past two years are search, ads and recommendation.

[01:14:15]>> Thank you. We're probably going to run out of questions, and I want the audience to have time. I have one question. We've kind of touched on it, which is that training is very expensive. Large model training is super expensive, right? And then inference is everywhere, needs to be everywhere, obviously.

[01:14:38]So there's a little bit of a dichotomy.

[01:14:40]Sometimes it's mentioned or thought about as training versus inference, right? But it seems, especially with world models, that we'll need devices or solutions that can do both. Because when I'm walking around in the real world, that should all be going to training of some sort, right? And then when I need it to infer something, it should infer something for me based on what it's learned from whatever large content sets, as well as my own visual world, I suppose. So when do we get to the state where this dichotomy between inference and training goes away and it's constantly happening all the time? And this goes to your point, which is that it shouldn't matter to the end users whether it's LLMs or what GPU, whatever. So when do you get to that higher level of abstraction where most people, maybe even in this room, would be abstracted from some of these details? >> I think there are two questions there. In terms of capability, let's take RAG as an example, where you have context which is kept, and it improves or enriches the results based on that. Some of that exists today. The architecture hasn't evolved to where it can be super dynamic; it still requires a batch upload for certain aspects, and I think that needs to change. I do think there also needs to be hardware innovation. We plugged AI training onto computer graphics hardware, GPUs. GPUs were not designed for AI training from the start, right? They were designed for accessing large visual content, right?

[01:16:31]And creating that in a way where it can be processed very quickly. A simple example in mathematics: it is all about matrix multiplication, right? Do you require matrix multiplication for AI? To some extent, yes, but it's a very sparse matrix, right? So there has to be innovation on that front, right?
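To make the sparse-matrix remark concrete, here is a minimal sketch in plain Python. It is an illustration only, not a description of how any GPU or AI accelerator works: the `weights` matrix, the row-dictionary storage, and the function names are all assumptions made for this example. It counts how many multiplications a dense matrix-vector product performs compared with one that skips the zeros.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
# Sparse form (an assumption for this sketch): one dict per row,
# mapping column index -> nonzero value.
SparseRows = List[Dict[int, float]]


def to_sparse(dense: Matrix) -> SparseRows:
    """Keep only the nonzero entries of each row."""
    return [{j: v for j, v in enumerate(row) if v != 0.0} for row in dense]


def dense_matvec(dense: Matrix, x: List[float]) -> Tuple[List[float], int]:
    """Multiply every entry, zeros included; return result and multiply count."""
    result, multiplies = [], 0
    for row in dense:
        total = 0.0
        for value, xj in zip(row, x):
            total += value * xj
            multiplies += 1
        result.append(total)
    return result, multiplies


def sparse_matvec(rows: SparseRows, x: List[float]) -> Tuple[List[float], int]:
    """Multiply only the stored nonzeros; return result and multiply count."""
    result, multiplies = [], 0
    for row in rows:
        total = 0.0
        for j, value in row.items():
            total += value * x[j]
            multiplies += 1
        result.append(total)
    return result, multiplies


# Hypothetical, mostly-zero weight matrix and input vector.
weights = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.5],
    [3.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_out, dense_ops = dense_matvec(weights, x)
sparse_out, sparse_ops = sparse_matvec(to_sparse(weights), x)
print(dense_out, dense_ops)    # same result, 12 multiplications
print(sparse_out, sparse_ops)  # same result, 3 multiplications
```

Both paths give the same answer, but the dense path spends most of its work multiplying by zero. Hardware built around dense multiplication pays that cost regardless, which is one way to read the speaker's call for innovation here.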

[01:16:48]There has to be innovation in fundamentally how you process a single tuple for training and inference. There are a couple of ways I can see this evolving. One is they may completely diverge, in a way where inference is so fast that it becomes very, very quick for you, versus training becomes slower. The other thing you bring up is: can we be in a world where it's continuously training and learning? I do think yes, but it might be a hybrid world for us, at least in the five-year horizon, before we can get to this ubiquitous world of continuous training. What I mean by a hybrid world is that your context is captured and your context supersedes the results that are presented to you, even more than the world model. This goes back to some of the things that Ron was also mentioning: the results that you get back, how can you infer them? How can you actually interject and change them on the fly? What can you do so that there are elements of it which will surface? Let's say you and I had this conversation around AI a few days back, and yesterday you were looking around AI. Could those results be relevant, saying, hey, why is this not going beyond prototypes or pilots, and what can be done better for that? AI hasn't evolved there. I think search has evolved a little bit, but I think similar innovation needs to happen, and I think the bigger bottleneck is going to be the hardware innovation, where you need to fundamentally change how the hardware is designed for the compute that is required, rather than force-fitting the compute onto the existing hardware. >> So I have a different flavor of understanding of the way we should solve this. There are two parts when it comes to the LLM. One is the model producer or provider, and the other is the model users.

[01:18:33]In the enterprise world, all the large companies provide the models and services in the cloud, and all the enterprises access those models to deliver functionality using AI agents or search or whatever.

[01:18:55]So building a general-purpose model that works for all enterprises is impossible. You cannot have a model that works for the airline industry, the financial industry, the healthcare industry, and everywhere else for their domain-specific capabilities, and that is where multiple models came into the picture. This model is good for coding, if you want to tune it to write code. This model is very good for computer vision, and whatnot.

[01:19:30]But even within a similar industry, for example banking, the way one bank or financial institution works is not necessarily the way another one works. So while I totally agree with what was said about the evolution that is needed, that is needed more from the companies that are providers. The one evolution needed at the enterprise level is the understanding and awareness that the model I got is not going to solve my problem. There is a responsibility I also have to tell the model what my business is through the semantic layer, and I need to have guardrails and observability that tell me how this model is serving my employees when they query it for any specific thing.

[01:20:30]So you need those guardrails and that transparency, and the companies also have to provide reinforcement learning, so the model gets better and better as we use it more and more in a specific way.

[01:20:45]So that is another thing that has to happen. The companies that are the providers of the models need to continuously evolve in that field, but at the same time those that are the consumers also need to evolve their thinking and their awareness. Right now the heavy lifting is done by external companies or IT consultants who come in: can you make this model work for us? At the same time, the companies have to provide the tools and the observability so that the enterprises can be sold.

[01:21:19]>> We don't mind by the way.

[01:21:25]>> So, Tiger, your thoughts on... >> Sorry, what's the question again? >> Yeah, there's a dichotomy between training and inference, right? Just to summarize the question: can we get to a paradigm of compute or technology where that goes away and there's continuous training, maybe on edge devices or whatever people have with them, or whatever enterprises have in their factories and offices? >> In fact, continuous training is not a new thing, right? For the past 30 years, for older AI models, you have had to retrain the model to catch up with your new data or your new needs. So there's nothing new here. But the new thing for large language model retraining is that the cost is so high, and it also needs special expertise to efficiently retrain or fine-tune the model. It also requires a lot of well-labeled golden data to train it, and the labor cost is huge because most of them are multimodal, not just text-based. In fact, companies like Meta or Microsoft are spending billions of dollars on just labeling the data. So think about that scale. It's not easy for a small company. So really your company has some business there to help the customers do that. But I really want to have a word on compliance. Both of them, I think, mentioned that compliance is very important for AI models, not only for large language models but for AI models in general. But the large language model nowadays is not so explainable in its results. So the challenge of compliance and the legal considerations is much higher than before. You cannot explain it. They give you an answer, and you do not know how or why they gave you that answer.
So that really is horrible in some sense, and it's also meaningful in some sense. Of course, different local governments and federal governments have different laws on that, but no material laws really solve this problem systematically. So if anybody has kids who are going to law school, I think that's a good direction.

[01:23:33]>> The last one that you mentioned, right? I just wanted to talk about... >> Yes. Yeah. Yeah. So I was helping out Belgium city council on something, and one topic that came up is they wanted to introduce their own AI laws, and I was like, please don't do that. There are just too many of them.

[01:23:49]Can we take an ISO standard or one of the existing standards on this and just use that? You don't want a specific law for every jurisdiction you cross. That just adds to the burden of AI, to AI cost, and to the hurdle for adoption, for sure. >> So since you haven't gone, very quick, and then we're going to go to the audience. >> Yeah. So I think there needs to be some form of synergy between the hardware providers, the hardware manufacturers, and the model builders. There needs to be some understanding from the model builders: hey, this specific part of a piece of hardware I can dedicate to training in real time, and then there's another part of the hardware that is the inference segment, if you will. I think there needs to be very close collaboration, because if we're working on developing models, building models for hardware that we know is going to remain constant and doesn't actually have our best interest in mind, which is inference and training,

[01:25:04]we're not going to be fully utilizing the potential of models moving forward. >> Thank you. So, just a little bit of logistics. The food, I think there's still some food. The bar is still open.

[01:25:20]We should have it for at least a good half an hour more. We'll do some Q&A now, but hopefully everybody will be able to stay around after that for any individual follow-ups. So, I'm going to try to keep it fast. Wailing always goes first. Okay, that's a tradition.

[01:25:39]>> Thank you. Um, I enjoy your talk. Um you guys mentioned about fundamentals future booth.

[01:25:48]Just from my understanding, LLMs provide answers based on pre-trained data, or maybe even fine-tuning with a specific data set, but LLMs don't really understand the fundamentals.

[01:26:06]They provide answers based on pattern recognition, right? If some problem has already been solved, then the answer... >> Yeah, Wailing, and I would add one more challenge, right? >> An LLM doesn't know when not to answer. It will always tell you what the next character in the sequence is, irrespective of what confidence interval it has, right? So that makes it even harder. It's not just that it can only tell you what it has seen in the past; sometimes it may not have seen something, or it's not confident, and it still makes that assertion, right? >> Right, right. So it provides pattern-recognition types of answers but not logical answers, since it doesn't have an understanding of fundamentals. So what would be the future beyond LLMs? How can you solve currently unsolved problems?
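The point that a model always emits a next token, however unsure it is, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the candidate scores are invented, the 0.7 threshold is arbitrary, and real next-token probabilities are not necessarily well-calibrated measures of confidence. The sketch only shows the idea of abstaining when the top candidate does not clearly win.

```python
import math
from typing import Dict, Optional


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def answer_or_abstain(logits: Dict[str, float],
                      threshold: float = 0.7) -> Optional[str]:
    """Return the top candidate only if its probability clears the threshold."""
    probs = softmax(logits)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


# Hypothetical scores: one clear winner, one near three-way tie.
confident = {"Paris": 5.0, "Lyon": 1.0, "Rome": 0.5}
uncertain = {"Paris": 1.1, "Lyon": 1.0, "Rome": 0.9}

print(answer_or_abstain(confident))  # the clear winner is returned
print(answer_or_abstain(uncertain))  # None: the sketch declines to answer
```

A plain greedy decoder would return "Paris" in both cases. The second case is the one the speaker is worried about, where the assertion is made anyway.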

[01:27:04]I'll go. Mic? Oh yeah. The one thing that I think I'll repeat: there are two aspects. The model providers are definitely always looking at whether this model is for a specific domain or industry, and the model graduates and becomes available to the consumer after a lot of training in that field. One more thing is pertaining to data, and data is in two categories: synthetic data and actual data. Both areas are a very big industry right now. You should think about this business. Synthetic data production is also not easy. You cannot just say, randomly generate these records, train my model on them, run evals, and see what the success rate is. The second is the actual data.

[01:28:01]There are two ways to use actual data. One is that people are donating their data right now. People are saying, I'm willing to redact my data and donate it; use that to train your model.

[01:28:13]The other way is the reinforcement learning that we talked about. And the final thing that I will repeat is that it is the responsibility of the consumer to say: I understand I am going to use this model for my particular domain, for a particular task. To translate the business logic, the rules and regulations, and the guardrails I have, I will build my semantic layer, and I will put that semantic layer on top of the model, before a query enters or a result comes out of it, so that the model understands in which context someone is asking the question. The other way is that you say, I'm going to host my own model. If you can afford it, you can train a model for yourself. So a per-customer model is another thing that is happening. But if we expect that a model should automatically understand a wide variety of business semantics, that is going to be tough, because there are many enterprises that built their systems in very old days, with that schema and that information all sitting there for many years.
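One way to picture a semantic layer that sits "before a query enters or a result comes out" is a thin wrapper around the model call. The sketch below is a hypothetical illustration, not any vendor's product or API: the glossary entries, the blocked terms, and the `echo_model` stand-in are all invented for the example.

```python
from typing import Callable, Dict, List

# Hypothetical business glossary: in-house terms mapped to explicit definitions.
GLOSSARY: Dict[str, str] = {
    "churn": "customers who closed all accounts in the last 90 days",
    "AUM": "assets under management, in USD",
}
# Hypothetical guardrail: terms that must never appear in a returned answer.
BLOCKED_TERMS: List[str] = ["account_number", "ssn"]


def enrich_query(question: str) -> str:
    """Prepend definitions for any glossary term the question mentions."""
    notes = [f"{term} means {meaning}" for term, meaning in GLOSSARY.items()
             if term.lower() in question.lower()]
    if not notes:
        return question
    return "Context: " + "; ".join(notes) + "\nQuestion: " + question


def check_answer(answer: str) -> str:
    """Withhold any answer that mentions a blocked term."""
    if any(term in answer.lower() for term in BLOCKED_TERMS):
        return "[withheld by guardrail]"
    return answer


def ask(question: str, model: Callable[[str], str]) -> str:
    """Semantic layer: enrich before the model, check after it."""
    return check_answer(model(enrich_query(question)))


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call; it just echoes the prompt it received."""
    return "MODEL SAW: " + prompt


print(ask("What was churn last quarter?", echo_model))
```

The design choice worth noting is that the model itself stays generic. The business meaning and the guardrails live in the layer the enterprise owns, which matches the speaker's point about where the responsibility sits.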

[01:29:35]Other questions? Go ahead.

[01:30:25]For sure, I have seen a couple of examples of what you're talking about with digital twins. Nvidia is one platform.

[01:30:31]There were a couple of others we would partner with. It was very, very useful in industrial and manufacturing settings, where you could predict what the defect rate would be, or where the failure or choke points are. I've seen very good applications of it. Again, going back to Wailing's question about inferring: these are very good in certain specific domains, but they don't generalize, and knowing what domains they will work in is going to be very critical. That's something you have to continuously test and evolve. At the same time, things like physics are not truly understood by digital twins today. Once that comes in, there will be a lot more applications where you could start applying AI in a digital twin setup, where it will make sense.

[01:31:19]So, >> uh I guess more questions. Oh, there's Ash.

[01:31:29]>> What? All right.

[01:32:38]But he's putting the model on the device, right? The entire large language model. That's the on handheld device, right? Like old device not be on the hill device. Yeah, because there's no really mobile chip support that by Yeah. And also memory is all about send on the other lucky right there is >> maybe this is a good offline one.

[01:33:16]>> Yeah. No, I think, just to quickly address it, there are ways today where you can take a very large model and distill it into much smaller models, and you can run the distilled version. It gives you approximately the same result. It is not exactly the same model per se, so that's something to think about. But is distillation real? Yes. Are there applications that it has enabled today?

[01:33:40]For sure. Right. Uh and there is a lot more work to be done. Right. Again the same issue you have with That's awesome. Yeah, by the way, if you can do it, we all love it, right?

[01:33:53]>> Well, take running a large language model on edge devices: it is not new. In fact, it has been there for about two years. But basically it means different things. One is you can distill the large language model and still call it a large language model. You can do quantization, or other ways to reduce the size of the model. In fact, from a SMTB model, you can easily make it a few hundred megabytes and run it on a mobile phone easily. But the performance, or the accuracy for example, may be quite different. It may be similar for some use cases, but generally speaking, if you talk about a general-usage large language model, it will sacrifice a lot if you reduce the size. And so far, based on my understanding, there's no hardware on, for example, a cell phone that could really support that. Even Tesla cars do not have that kind of mobile AI chip to support that. Nvidia has tried hard to make it happen, but not yet.
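The trade-off described here, a smaller model at some cost in accuracy, can be seen in a toy version of quantization. This is a minimal sketch of symmetric int8 quantization on a handful of made-up weights; it is an assumption chosen for illustration, and production schemes (per-channel scales, 4-bit formats, calibration data) are more involved.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric int8 quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, invented for this example.
weights = [0.82, -1.27, 0.003, 0.5, -0.41]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is at most about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
# float32 uses 4 bytes per weight, int8 uses 1 (ignoring the single scale value).
bytes_fp32, bytes_int8 = 4 * len(weights), 1 * len(weights)

print(quantized)
print(bytes_fp32, bytes_int8, max_error)
```

Storage drops by a factor of four, but the small weight 0.003 is rounded to zero. Across billions of weights, losses of that kind are one source of the accuracy gap the speaker mentions for heavily reduced models.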

[01:34:59]>> Sorry.

[01:35:04]Yeah, DeepSeek is still a large model. You can do distillation or quantization to make it smaller. The raw one, no. >> Even bigger. German termin is smaller but still big, still big. You can run it; I even tried it, the benchmark, even ran it on the center devices. The QPS is horrible. You have to do something to... >> Yeah, yeah. The short answer is you can dump it, right? It will work, but it may give you a result three hours after you ask a question, right? And imagine requiring... right? And that is where, is it practical? It is not practical today, at least not that I have seen. So I just want to call that out. >> Go ahead, mind look, you've been looking at this kind of stuff anyway, right? >> Yeah. So one area of exploration that's been really hot has been quantization, and there are different encoding formats that are optimal on different GPUs. The short answer is that, from the benchmarks we've seen, and again, with benchmarks I don't think you can practically apply the results to the real world, but to get an idea of the performance: it doesn't scale very well to lower precision. Quantized models don't scale to the full version of the models.

[01:36:33]Other questions?

[01:36:36]>> Any any questions on applications?

[01:36:39]Talk a little bit about applications.

[01:36:42]>> No.

[01:36:44]Any thoughts? Even if you don't have a question. >> Let's ask a question to the audience. What impact do you think the large language model will have on your domain or on your business? Any answers?

[01:36:54]Go ahead.

[01:37:21]medical area is really a very hard problem for the AI. Yeah, >> she saw you do this.

[01:37:30]>> Yes.

[01:37:37]>> So, thank you everyone. I've been using large language models. I work in the training and development area, and some talent acquisition. Some of our large clients are using us in unique ways for talent acquisition, and what we have seen was amazing. It's transformative.

[01:38:03]Talent selection at the executive and management level used to be a black box. We had 80% accuracy in predicting who would be selected. We could never have achieved that without large language models. So I think beyond chat there are so many good ways of using models. Our imagination is the only barrier. There are so many good things you can do with them. Even the current models are so powerful. I think at the application layer every industry can see big transformation, just from my own experience. Yeah.

[01:38:53]Sean.

[01:39:00]>> So yeah, I work in healthcare, and we support social care, including in rural areas, so we're an applied system.

[01:39:08]We have an AI orchestration layer over a bunch of models, and we actually have an offline use in rural healthcare.

[01:39:16]A lot of rural areas don't have internet. So we essentially have care workers or case managers that go out to rural areas, and we equip the device with very small LLMs, but for targeted, very specific things. It's helping them navigate 23,000 very specific care pathways that they could never remember or navigate as a human being. That is 100% a great use of their time and allows them to serve four to eight times as many people more effectively.

[01:39:46]So yes, it's a small model, but it's a huge leap for those individuals and our accuracy on that is 98% tested against experts. So it is completely possible to game change certain areas for targeted approaches. And I look for the day when we have on-site hardware, but we have a lot of progress we can make really fast.

[01:40:08]>> Yep. Plus one to what you said, right?

[01:40:09]My own experience has been similar. If you can train expert models for specific tasks or domains, right?

[01:40:16]They work much better than going with generic models, right? It's like, even for us as humans, there are smart people you'll want to ask any question, but if you are really sick, you want to talk to a doctor, and not just the smartest guy in the room, right? So that's definitely worth calling out, for sure. >> Just a quick thing adding on to that: the success of generative AI in your domain is highly dependent on the quality of the data that you have, and on how that quality data can be accessed. We discussed earlier physics and the laws of physics, and how difficult it is to represent that in data form. So naturally, for physics-based applications in real time, LLMs are quite poor at that. So it's highly dependent on the field, on where you can actually access quality data, and if you can't access quality data, then your first step is how you can make it meaningful. >> I'll just add one point. Tiger, thank you for that question. One area that I have personally experienced, just over the last year or so: I think the LLM, especially the pro version with the very competent reasoning capability, has transformed how we engage in consulting projects. From summarizing a workshop to even a statement of work or a project summary, it is actually super efficient these days. You all probably read an article saying they're going after the McKinseys and the BCGs of the world, right? ChatGPT made a statement a few months ago, and after I saw that article I subscribed to the pro model, and it has actually transformed the way that we engage with clients.

[01:42:18]Yeah, one last point I will add: I think if we look across the board, we will find that a lot of research has been done on handling unstructured data: taking documents, taking raw data, making sense of it, creating a summary of it, summarization, transcripts. There is a lot to be done on structured data, and a lot to be done on how we can convert unstructured data to structured. How can we make sense of data that is stored in SQL Server in an agentic fashion, where I'm asking a natural language question but the answers are coming from the database? That is another field that is evolving very fast, where the models have to understand, with the help of the semantic layer of course, how these questions can be answered well out of the database.

[01:43:29]>> Is that all? I guess that does it. Thank you so much. So thank you.

[01:43:35]>> Yeah, everybody's very busy.

[01:43:42]How about let's take a picture together.

[01:43:43]>> Yeah, we'll take a picture.

[01:43:46]>> Thank you. Thanks to the audience. By the way, just very quickly, we do these about once a quarter, so there should be another one in a few months.

[01:43:54]And thanks to all of you for making it tonight. And then again, thanks to our panel.
