
How I trained a computer to predict Dota better than Valve
Added: 2026-05-27

1,189 views · 043:29 · thewondercow · Original Release: 2026-05-26

This video demonstrates how machine learning models can predict competitive game outcomes by analyzing game state metrics like net worth difference and objective control, achieving 76.1% accuracy in Dota 2 predictions by using gradient boosted trees with carefully selected features, while also explaining that simpler models with fewer features can sometimes outperform complex ones due to reduced overfitting and better generalization.

[00:00:00]Since 2018, Valve has been predicting every game of Dota in real time. That's something like 38 trillion predictions made. But in classic Valve fashion, they never told us what these predictions are based on or even how good they are. I wanted to know, but I told myself this little side quest would be a quick check. Maybe it would make a good YouTube short. Now, a couple months later, I'm ready to share the secrets of Valve's prediction system. How good is it, and how can we beat it without looking at draft, item builds, talents, or even game time? Most of you already understand Dota, but in case you don't, you're still welcome here. Here's a 15-second summary. In Dota, two teams of five compete in a race to destroy each other's base. The overall objectives are similar to League of Legends. Players level up unique characters called heroes and gain gold for achieving feats like killing units. I like to make videos using statistics and machine learning to better understand this game or use this game to better understand machine learning and statistics. Up to now, I've been using my own regressions to show how an item influences win expectancy in a game of Dota. And those models have been very effective for the limited jobs I gave them. While I've been working on my next few ideas, I keep seeing Valve's predictions sitting there taunting me, teasing me, tempting me. Why should I do all this work building and testing and validating models if Valve already has a prediction in the game? Surely, if Valve has access to trillions of data points, some of which are hidden to me, I should just be using their predictions.

[00:01:39]There's a downside to using models you don't understand, though. For one thing, just because Valve has predictions in the game doesn't mean they're any good.

[00:01:48]I tried looking it up, but I couldn't find anybody who had ever validated Valve's predictions, so it looked like I was going to have to do it.

[00:02:00]Here's where I say I don't usually showcase the whole process in my videos.

[00:02:04]The more boring stuff just gets summarized as "I did this thing." But that "I did this thing" statement hides an immense amount of work and an equally large amount of me just [ __ ] up. I don't always feel like you, the audience, really want to sit through an extra 5 minutes of explanations about processes that go nowhere. But you should know that a complex task is never a straight line. These days, it's easy to see a video online and mistakenly believe that stuff just comes easily to other people. It doesn't. I'm an idiot.

[00:02:37]I'm just a persistent idiot. So, I will mercifully spare you most of the details about everything I screwed up along the way because this video would be way too long. The short version is we need to get a bunch of games, take Valve's predictions, and compare them against the actual outcomes. That way, we can see if Valve actually knows what they're talking about. The problem is none of the Dota 2 stat sites show a win probability chart on their match pages.

[00:03:03]Dota doesn't even show it in the postgame summary. Win probability graphs seem like an obvious feature to include, so their absence implies that the probabilities are not in replay files. I checked on a couple replays that I had on hand, but those are all heavily reduced to save space. That said, I didn't see anything in there.

[00:03:22]So, I figured when you watch a replay or a live game, the Dota client is probably producing the predictions in real time.

[00:03:30]Then while working on a completely different analysis, I was looking through some uncompressed replays. And while I was in there, you'll never guess what I stumbled upon. The win probability. It's in this spectator object that parsers skip by default.

[00:03:46]Nobody ever reads this object because it's the most boring and least useful possible thing you can imagine. There must be over a thousand essentially useless pieces of information that the game just uses to correctly render the replay. And then way down at the bottom, it ends with radiant win probability just tucked in there with all the goddamn trees. I threw out everything I had used up to that point. All the replays plus all the code. I rewrote my parser to preserve the win probability and I got 45,270 new replays over the course of about two weeks. The really nice thing is we know Valve's predictions down to a 60th of a second. Then we put all of that into a little database that we can cross reference. That's how we ended up with this.

[00:04:40]Here's Valve's accuracy at different game times.

[00:04:44]The typical game of Dota is 41 minutes long in the current patch, but there's no fixed timer. Some games take hours and some are over in just 10 or 15 minutes. Valve is hitting 62% accuracy before 5 minutes in the game and steadily improving until hitting about 80% accuracy from 20 to 50 minutes.

[00:05:03]Basically, from the midgame until most games have ended. Valve's predictions get slightly worse as we get toward an hour of game time. But this is for three clear reasons. First, games that go long can swing extremely easily. Second, a game that stretches is either very close or the advantaged team is making a lot of mistakes. Either of those situations makes a prediction very difficult to achieve. Third, our sample size is much smaller for long games. So, if we go past an hour, we can't reliably measure Valve's predictions anymore. And when we look at how Valve performs across different ranks and heroes, we see some incredibly interesting trends. First off, Valve is essentially equally good at predicting players in every rank, but each rank follows a different path to that level of accuracy.

[00:05:54]Higher skill games are harder to predict from the beginning, while lower skill games are harder to predict in the middle and end. This probably means that low-skill lobbies are more likely to be decided by the draft, even though the opposite is often held as true. It also probably means that lower skill players are less likely to mount a comeback. So, an early advantage makes a bigger difference. Now, these are not huge differences, but the differences in how the predictions perform between brackets at different times in the game are statistically significant.

[00:06:29]I don't have evidence for this specific conjecture, but I would speculate that lower skill players are just more likely to fall into despair if they get off to a bad start, leading to a lower chance of a comeback. If you have an alternate theory for these differences between skill brackets, let me know in the comments. I'm interested to read them. I already removed as many clear Smurf games as I could from this data, so I don't think the result is due to something like that. Let's take a look at how Valve models heroes. Here you'll have the accuracy as well as for each individual hero you'll see whether they overperformed or underperformed the predictions on average. These early heroes are predicted more accurately which just means that they tend to follow patterns the model understands.

[00:07:16]All of these heroes are in my experience a bit momentum dependent. They mostly do single-target damage and have clear but limited win conditions. The vast majority of heroes are all about equally predictable, around 74%. And most heroes make comebacks about as often as they lose leads. But as we dip below 73% accuracy, we start to see two types of hero as harder to predict. First, a few heroes who seem to lose a lot of leads, probably because they struggle with late-game lockdown: Pugna, Doom, and Muerta. Naga Siren is also here, but that might be because her net worth to game impact ratio isn't very similar to most other heroes, which might be tripping up the model. The other type of difficult to predict heroes are the ones that can easily handle mega creeps or split push while behind: Luna, Sven, Gyrocopter, Meepo, Lycan, Terrorblade, and Broodmother. Their ability to drag games out and hang in makes them all slightly better at beating the model's predictions. The heroes most likely to overperform their predictions are Visage, Brewmaster, Death Prophet, Gyrocopter, and Hoodwink. The heroes most likely to underperform their predictions are Ancient Apparition, Medusa, Dark Willow, Troll, and Centaur Warrunner.

[00:08:37]So, overall, Valve is 74% accurate with a very low error rate. But that includes predictions before the game has even started. You might think that 74% accurate doesn't sound that good, but that is probably not too far from the theoretical best possible model on this problem. This is where we need to define some terms. I use accuracy because it's intuitive for pretty much any audience, but it doesn't mean exactly what you think it means. Anyway, this is easy to see in action. Here's a hypothetical series of forecasts. Each dot is a prediction. On the left, we have a high confidence that Radiant will lose. And on the right, we have a high confidence that Radiant will win. Dots in the middle, we're not sure. Now, here's the actual outcomes if this model's predictions were the best possible guesses given the available information.

[00:09:32]So, things it said would happen 25% of the time do happen exactly 25% of the time. This is essentially the most perfect possible model. Those red dots are the expected misses in accuracy. If we gather them all up and count them, we'll see that this perfectly calibrated model only has a 73.7% accuracy. So, you'll see any non-deterministic prediction has a maximum accuracy somewhere below 100%.
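The accuracy ceiling described here can be sketched in a few lines. This is a minimal illustration, not the code from the video: the function name `expected_accuracy` and the list of forecasts are made up for the example. The only idea it relies on is that if a forecast p is perfectly calibrated, backing the favored side is correct with probability max(p, 1 - p).

```python
def expected_accuracy(probabilities):
    """Expected accuracy of a perfectly calibrated forecaster.

    For a calibrated forecast p, picking the favored side is right
    with probability max(p, 1 - p), so the ceiling is the average of that.
    """
    return sum(max(p, 1.0 - p) for p in probabilities) / len(probabilities)


# Hypothetical Radiant win forecasts (illustrative, not data from the video).
forecasts = [0.10, 0.25, 0.40, 0.50, 0.60, 0.75, 0.90]
ceiling = expected_accuracy(forecasts)
print(f"Best achievable accuracy for these forecasts: {ceiling:.1%}")
```

The more forecasts sit near 50%, the lower this ceiling gets, which is why even a flawless model of a close game cannot approach 100% accuracy.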

[00:10:04]In things like sports predictions, this is usually somewhere between 70 and 85% accuracy. Valve's model is very effective, and it seems to be pushing the boundary of what is even possible.

[00:10:16]Their calibration is almost perfect, and predictions do tend towards having higher confidence.

[00:10:22]If you take Valve's predictions at just 15 minutes into the game, they're beating the best published halftime NBA prediction models that I could find.

[00:10:32]In more than three of every four games you play, Valve is correctly predicting the winner at least 20 minutes before the game ends. I think it's fair to say Valve's model is very good. Between the aggregate scores that we've seen and the hero scores data, it's clear that Valve isn't just looking at your game when they're predicting whether you'll win.

[00:10:53]They are probably also looking at drafts, items, hero trends. They are probably updating this model in real time or at the very least very regularly. And because the overall results are the same at every player rank, I'm absolutely confident that Valve is including rank data that I don't even have access to. A great future inquiry would be to see how this model performs in pro games, but we obviously don't have time for that right now. I genuinely had recorded the script that said I didn't have pro data, but then decided to go do a test on pro games at the very last second. So, fortunately, pro games are much easier to source thanks to OpenDota. I went back and got the 495 most recent pro games as of May 19th, 2026. That's about 7 days of matches. That is more than enough to generally assess the predictions, at least in games of typical length. You'll better understand all these metrics by the end of the video, but for now, the headline number to understand is that accuracy when predicting pro Dota is a lot lower than in pubs. The pro games also have a bit of a larger error rate, and this is in part due to the smaller sample size and in part due to the fact that the model is just not well calibrated for pro games. Before we dive into more detail, it's time for me to beg for validation.

[00:12:10]You're 12 minutes into this video. I hope that means you're enjoying it.

[00:12:14]Please consider liking it and subscribing to the channel. It helps me please the YouTube gods, but also it just makes my day so much better. If you have some pep in your step, you could also leave a comment. So far, I've read every single one. And as of this video, you can support this channel by buying a membership here on YouTube or over on Patreon. I want to thank Fan of the Cow Tier supporter, Saint Catfish, who somehow managed to find my Patreon before I announced it and has been hanging out with me on Discord for about a week now. That's enough of that.

[00:12:50]Thanks for sticking with me. The way Dota is played by highly coordinated teams of highly skilled players is so different from how it is played in regular games, it may as well be a different game entirely. There's also orders of magnitude fewer pro Dota games on any given patch, and pro Dota matches are usually not between evenly skilled teams. Plus, pro games can be ended early by a forfeit, which is not true of regular games. The result of all this sludge is that Valve's model is just much worse at predicting pro Dota.

[00:13:23]Remember that in pubs, the calibration is almost perfect. Here in pro games, the model is regularly overconfident in close games and super underconfident as games become one-sided. Pro players are much better at keeping games close, but also much better at holding a significant lead once they have it. For pro games, the model doesn't have the same bulges in confident predictions that pub matches have. This is partly because in pro Dota teams are permitted to end the game early, but it's also because usually in pro games, one team is just much better than the other team.

[00:13:58]In a ranked game between two teams of similar skill, the team that gets the early advantage is more likely to win because they got that early advantage.

[00:14:07]In most pro games, the early advantage goes to the team that's already more likely to win because they are the better team. Noxville over at datdota has trained some prediction models on pro games specifically, including the super cool one about map control. Link in the description. He didn't share the accuracy or error rates of that model, but I would be shocked if it isn't a better system for predicting pro games compared to using Valve's general model.

[00:14:34]Here's a comparison between Valve's predictions of pro games against their predictions of pub games. Pub game predictions are much better until around 20 minutes and then it's a tossup. But that's a little complicated because pro games end so much faster than pub games on average. Let's instead show Valve's probability of predicting a game at a given percentage through each game instead of at a specific timestamp. That will make it easier to compare pro games to pub games. And now the picture is crystal clear. Pretty much anytime in the game, Valve's predictions are much better for pubs than pro games. So, this is a tool that doesn't make a lot of sense in pro Dota broadcasts. I've sometimes heard people say that Valve's predictions have gotten worse, but I don't think that that's true. It's just that Valve has no prediction model intended to work specifically on pro games. And for the data nerds, here's Valve's error rates. For pro games, the predictions are really bad until about 10 minutes into the game. In fact, you're probably better off flipping a coin than even looking at the predictions if the game is close. To reiterate, Valve's model is on the edge of perfect for pub games and kind of okay for pro games. For any normal person, this would be the video. We learned which heroes beat the odds most often and which fail more than expected.

[00:15:57]We also learned how players at different ranks handle advantage differently, how good Valve's prediction model is at different times of the game, how good it is overall, and even when you should be trying to mount your comebacks. It's between 20 and 25 minutes, because that's when advantage tends to solidify into certainty.

[00:16:15]We could pad out the duration with a few more graphs, hit 20 minutes, do one last reminder to like the video, and ship it.

[00:16:22]But here's my problem. I am not a normal person. Valve's overall accuracy is an incredibly respectable 74.14%.

[00:16:32]But that's not even 3% better than the basic regression I used in the divine rapier analysis I posted last month. And when I saw how close those were, my new mission became very clear. And I understand intellectually that a 3% gap is a huge improvement to make when you're so close to the peak possible model already, but it just doesn't feel like that big of a number after using so much time just getting Valve's probabilities just to find out that all along I was a mere 3% off.

[00:17:09]Well, for practical purposes, like analyzing item impact, 3% does not really matter.

[00:17:17]Like being a few percentage points behind Valve doesn't invalidate the net worth models we've used in past videos.

[00:17:24]If anything, it just proves that we're getting good enough results to have meaningful analysis.

[00:17:29]But in real terms, a 3% increase is a huge improvement. With a 3% accuracy edge, you could be profitable at any mainstream sports book despite the house advantage. Valve is absolutely eating my lunch here. And we have another problem.

[00:17:48]The better a model gets, the harder it is to make further improvements. So before we get started, let's first see how close to perfect Valve's model is.

[00:17:58]There's a test called a Cover-Hart bracket. This lets us make a very stupid model and use it to know a definite range for what the best possible model could do with the same inputs. I dumped a bunch of metrics into this test. The best possible prediction with any combination of these inputs would reach at most about 77% accuracy, and the actual optimum may be even lower. Valve is already predicting with over 74% accuracy. So it is entirely possible they are at or beyond what can be achieved with these inputs.
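Here is a rough sketch of how such a bracket can be computed, assuming the test refers to the Cover-Hart nearest-neighbor bound. The video doesn't show the calculation, so the function name and the 35% nearest-neighbor error below are hypothetical. For a two-class problem the asymptotic result is R* <= R_nn <= 2 R* (1 - R*), where R_nn is the error of a 1-nearest-neighbor classifier (the "very stupid model") and R* is the error of the best possible model on the same inputs.

```python
import math


def cover_hart_bracket(nn_error):
    """Bracket the best possible (Bayes) error from a 1-nearest-neighbor error.

    Uses the asymptotic two-class Cover-Hart inequality
        R* <= R_nn <= 2 * R* * (1 - R*)
    and solves the right-hand side for R* to get the lower bound.
    """
    if not 0.0 <= nn_error <= 0.5:
        raise ValueError("two-class asymptotic 1-NN error must lie in [0, 0.5]")
    lower = (1.0 - math.sqrt(1.0 - 2.0 * nn_error)) / 2.0
    return lower, nn_error


# Hypothetical 1-NN error rate, chosen only for illustration.
nn_error = 0.35
bayes_low, bayes_high = cover_hart_bracket(nn_error)
best_acc_low = 1.0 - bayes_high   # the best model is at least this accurate
best_acc_high = 1.0 - bayes_low   # and at most this accurate
print(f"Best possible accuracy lies between {best_acc_low:.1%} and {best_acc_high:.1%}")
```

The inequality is an asymptotic statement, so on a finite sample the bracket is an estimate rather than a guarantee.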

[00:18:36]That is another pretty strong signal that they are probably using hero drafts, item data, talent data, or hidden information like individual player rank, player histories, communication scores or behavior scores all in their predictions.

[00:18:52]And of course, I have no way to know what information they are using or are not using. But I do not want to make a model that needs any of that information. For one thing, I can't access all of that data. It's not available. For another, a model that uses draft information would immediately break if the meta changed or if a new patch that altered relative hero strengths came out. Unlike Valve, I don't have easy access to a fire hose of matches or matches that were played during beta testing a patch. Ideally, I want to predict better than Valve using just game fundamentals, things like net worth or game state. I'd like a model that doesn't need to be regularly updated in the future so that I can use it whenever I want to. And I do beat Valve by the end of this video. Before we dive deeper into this absurd gambit, I should define what I mean by a better prediction.

[00:19:48]In short, I want higher confidence, better predictions, and a lower or comparable error rate. So, imagine you have two friends. One always says that there's a 51% chance for a coin to land on heads, and the other always confidently says there's a 100% chance for a coin to land on heads. Those two friends have the exact same accuracy in a statistical sense. But your friend with the lower confidence understands the problem much better. We call that being better calibrated. We use something called calibration error or expected calibration error to tell us whether the model's predictions are accurate relative to their confidence. Do events predicted at 10% actually happen 10% of the time? This goes back to the example we had earlier. There's another category of error that I'm going to simplify to prediction error. Prediction error combines calibration error and accuracy.

[00:20:42]So it heavily penalizes a model for being overconfident or underconfident.

[00:20:47]Low predictive error means a model is usually as confident as it can get without misleading you. Finally, we have what I'm going to simplify to ranking ability. This is called area under the curve or AUC. If you are predicting an event that isn't 50/50, then accuracy becomes less meaningful. If you have a 90% accuracy predicting whether I'm going to beat Dendi in a 1v1 mid, then you actually suck at predicting me versus Dendi 1v1 mid because I will never beat him. To avoid this problem, we use that ranking ability or area under the curve. It determines if your model can correctly rank games by how likely they are to be won, regardless of whether the model is ultimately right or wrong on its prediction. If you have a great ranking ability but bad accuracy, what that means is the model understands the question but you don't understand what the model is telling you. Between those four metrics, we know: Does the model understand the question? Do you understand the model? Is the model making reasonable predictions? And is the model as confident as it reasonably can be? So those are the four metrics we generally use to test predictions. What matters most to you will depend on what your model is already doing well and how badly your thesis advisor needs your study to be published. Usually accuracy is the most intuitive measurement and that's what we've used for most of this video. But ranking ability in combination with predictive error is the most versatile and reliable test of quality. And there's no silver linings here. Valve is beating us on every single metric.
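To make the four metrics concrete, here is a small sketch that scores the two coin-flipping friends from the example above. The function names are made up for this illustration, and using the Brier score as the "prediction error" is an assumption: the video doesn't name the exact formula, and log loss would be another common choice.

```python
def accuracy(probs, outcomes):
    """Share of predictions where the favored side actually won."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def brier_score(probs, outcomes):
    """Mean squared gap between forecast and outcome (assumed 'prediction error')."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between stated confidence and observed win rate per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            win_rate = sum(y for _, y in members) / len(members)
            ece += len(members) / len(probs) * abs(mean_p - win_rate)
    return ece


def auc(probs, outcomes):
    """Chance that a random win was scored above a random loss (ties count half)."""
    wins = [p for p, y in zip(probs, outcomes) if y == 1]
    losses = [p for p, y in zip(probs, outcomes) if y == 0]
    score = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w in wins for l in losses)
    return score / (len(wins) * len(losses))


coin = [1, 0, 1, 0]        # a fair coin: heads half the time
humble = [0.51] * 4        # the friend who always says 51% heads
cocky = [1.0] * 4          # the friend who always says 100% heads

for name, probs in (("humble", humble), ("cocky", cocky)):
    print(name, accuracy(probs, coin), auc(probs, coin),
          round(brier_score(probs, coin), 4),
          round(expected_calibration_error(probs, coin), 4))
```

Both friends tie on accuracy and ranking ability, but the 51% friend has far lower calibration error and prediction error, which is the distinction the two-friends example is drawing.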

[00:22:26]So we start with a basic logistic regression, which is a simple predictor, and then I modified that regression in order to include a little more information about how the state of the game was changing over time. That change radically lowered our error rates near the beginning and the end of the game, but overall it only increased our accuracy by about 1%. Another thing that I could do is reduce overlapping signals by removing net worth entirely and instead just using the net worth difference. The reason I thought this might help is because a large net worth is partially a measurement of game advantage, but mostly it's a measurement of game time. Where in the game are we?

[00:23:04]And that overlaps with the game clock that I've already included. And it did improve accuracy by another 2%, with similarly underwhelming gains across other metrics. And at this point, you're probably having flashbacks to when I made 102 logistic regression models for the Divine Rapier video, only to prove that things like player rank and tier one tower status didn't actually improve the model by any substantial amount. I did not want to repeat that process. If we want to beat Valve, it's clear that we need to go beyond simple regressions.

[00:23:39]It's time to bring out the medium-sized guns, gradient boosted trees. A logistic regression is kind of like just drawing the best straight line to separate your data into two sections and then using that to make a prediction. The problem with that is it assumes that you can make good predictions with a straight line. But Dota is not a linear game. A 10% net worth advantage at 15 minutes is enormous. A 10% net worth advantage at 100 minutes is basically a tie game.

[00:24:08]There's just a limit on how strong of a prediction we can make assuming linear relationships. Gradient boosted trees are a much better tool here as long as we're careful. You don't need to know any math to get a basic idea of how they work. A gradient boosted tree is really a series of flowcharts. You dump all your games into the first flowchart and it categorizes them into buckets with similar games grouped together. You label each game based on the bucket it ended up in. Then dump all of the games into a new flowchart that asks a different series of questions and suggests an adjustment to the first tree's conclusions. And you repeat this however many times you want, often hundreds or thousands of times. Each flowchart makes an adjustment based on what happened before it. This kind of imitates the old adage, wisdom of the crowd. It's similar to asking thousands of people to predict the outcome with different information. If you ask enough people, they can often outperform one super expert who has all the information. The danger with gradient boosted trees is that they can get really overconfident and very overfitted if you aren't careful. You might have outputs that claim 99% accuracy, but then you test them and you do worse than flipping a coin. A gradient boosted tree is much less efficient to train as well. But I've already put in weeks of committed time here. So I'm not going to shy away from something dumb like a 100x reduction of efficiency.
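The "series of flowcharts" idea can be sketched without any libraries. This is a toy illustration, not the model from the video: each flowchart here asks only one question (a depth-1 "stump"), every function name is made up, and the eight games are invented. Each round fits a stump to whatever the current prediction still gets wrong, then nudges the running score by a small step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Pick the single question (feature <= threshold) that best splits the residuals."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [res for r, res in zip(rows, residuals) if r[f] <= t]
            right = [res for r, res in zip(rows, residuals) if r[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((x - lm) ** 2 for x in left) + sum((x - rm) ** 2 for x in right)
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    return best[1:]


def fit_boosted_stumps(rows, labels, n_rounds=50, learning_rate=0.3):
    """Each round fits one stump to the current errors and adds a small correction."""
    scores = [0.0] * len(rows)  # running log-odds; 0.0 means a 50/50 guess
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        f, t, lm, rm = fit_stump(rows, residuals)
        step = (f, t, learning_rate * lm, learning_rate * rm)
        stumps.append(step)
        scores = [s + (step[2] if r[f] <= t else step[3]) for s, r in zip(scores, rows)]
    return stumps


def predict(stumps, row):
    """Add up every flowchart's adjustment and convert the total to a probability."""
    return sigmoid(sum(lo if row[f] <= t else hi for f, t, lo, hi in stumps))


# Invented games: (Radiant net worth lead in %, minute), label 1 = Radiant won.
rows = [(-20, 10), (-10, 30), (-5, 45), (-2, 20), (3, 25), (8, 40), (15, 15), (25, 35)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
stumps = fit_boosted_stumps(rows, labels)
print(round(predict(stumps, (10, 30)), 3), round(predict(stumps, (-10, 30)), 3))
```

With only eight games the stumps memorize the data almost immediately, which is a small-scale version of the overfitting risk described above. Real libraries add deeper trees, regularization, and held-out validation on top of this loop.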

[00:25:39]With this new architecture in place, we could expand our inputs because gradient boosted trees are much less susceptible to errors from overlapping data. So what we'll do is make a series of increasingly convoluted models with every piece of information we can think of and then we'll scale the model back once we get to the point where we're beating Valve. And just preparing the data for this took days. The first thing I wanted to try was splitting all the net worth information for the team up by which role and which lane had an advantage. I was very optimistic about this approach because my conception of winning a game of Dota is really based around who is having the good game. I play support and I've seen a lot of games lost where I was just balling out of control but I couldn't get my cores to join the game. But the flip side of that is when my carry is having a great game, it often feels like I could be cracked out of my mind on laudanum with 99% packet loss and we're still going to win. With that in mind, for our first tree, we added in a bunch of information broken down by player's role and lane for 40 total features. It boosted our accuracy by about half a percent. Then I added in a ton more data all broken down by role and lane with change rates over time. At this point, we have 373 features and a 72.4% accuracy. Just as an experiment, I wanted to try putting in hero data. So that's hero IDs as well as a unique ID for every pair of heroes in the game.

[00:27:08]This would help the model identify that Mirana and Bane have good synergy on a team, whereas Anti-Mage versus Medusa is a good opposition match. Plus, we added a bunch of non-hero data for this model. Overall, it's more than 500 inputs. And when we test it, it's actually worse. This is that problem I mentioned earlier where this type of model can overfit very easily and basically become useless. We'd need hundreds of thousands of games for hero pairs to work, and we only have about 40,000. So, let's run it again, but without any hero data. It ended up pretty close to the same as the prior best model. Now, we're going to skirt the line of hero data by telling the model the expected win rate of every hero in each role. So, if Lion has a 50% win rate in the current meta, then we tell the model the safe lane support expects a 50% win rate. And now we're getting somewhere. We are actually beating Valve in a few spots now. We need a small edge. I tried different assemblies of data, adding or removing to see if I could nudge up a little more or avoid overfitting somewhere. I tried replacing hero IDs and telling the model what each hero's expected win rate is instead.

[00:28:16]I even tried downloading 20,000 more replays to increase the sample size. All the models stayed around the same peak performance. It seems like basically we have tied Valve at essentially the peak of predictive power. So we don't beat Valve overall, but at a few key points in the game, we do. I fulfilled that promise that I made earlier in this video, even if it is on a technicality. And that's what I would be saying if I hadn't then continued to add player skill. Now, Valve does not give me unfettered access to every player's rank. If they have their account set to private, which most players do, the replay data does not have their individual rank available.

[00:29:01]But Valve does always tell me the average rank of a game lobby. And when we add that in, we jump to 76.1% accuracy.

[00:29:13]Now, we're beating Valve not just there, we're beating them there by a lot, and we're beating them on every metric except calibration error. Now, the nice thing about calibration error is we can do a few things to massage the output.

[00:29:28]We can reinterpret the model using a thing called an isotonic calibration, which basically just uses a layer on top of the model's predictions to adjust them as needed. And the results of that do make the model's outputs better overall, but a lot less smooth. They become sharp and jittery. We still don't have our calibration error quite as low as Valve's, but we are really pinching farthings here. In a real-world environment, any calibration error below 5% is pretty much considered the gold standard. And we are already well below that while beating Valve in every other metric. And I'm incredibly pleased to state that we have reached 76.1% accuracy or 85.1% ranking ability, which is about 2% better than Valve in either case. This includes predictions starting from the first minute of a game. And those early game predictions are the majority of our edge on Valve. By the time we reach the typical game end, our models are basically the same. But in the first 10 minutes, we have a nearly 10% accuracy edge. By the way, I'm only showing the chart up to 60 minutes, but our average does include factors from the entire game. Here is a list of heroes by order of Valve's accuracy with our prediction added on top. Our final model does not even know what heroes are being played. We tell the model the strength of each hero in the current meta as a flat expected win rate. So instead of saying this team is Spectre, Sniper, and Spirit Breaker, it says this team has expected win rates of 55%, 51%, and 51%. The way our model interprets heroes is it gives a flat bump of win expectancy to each character based on the meta strength that we've assigned it. If the game is close, that edge goes to the team with the best heroes. As the game goes on, the edge melts away to basically nothing as the model looks primarily at net worth and related factors to determine its winner. I want to quickly flag Alchemist's accuracy in our model here. It is great, over 75% and beating Valve by 1.8%.
Alchemist is a hero who earns gold at a different rate than other heroes, but our model has no ability to adjust its expectations based on that ability, because our model never knows if Alchemist is being played in the first place. The fact that Alchemist has great prediction accuracy in our model should end the myth that he is somehow different in the win prediction rules compared to other heroes. Net worth is more than enough to predict Alchemist's expectancy. We do not need to overthink it. So when we look at the trends from our own hero predictions, we did have to use most of our games, about 30,000, to train the model. So all of our hero analysis has to come from just the remaining games in our test set. That means there's a bit more noise at the hero level for our predictions compared to Valve's predictions. Some of the heroes have really small sample sizes, like Chen sitting well under 100.
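The isotonic calibration layer mentioned earlier can be sketched with the classic pool-adjacent-violators procedure. This is a minimal stand-in, not the code used in the video: the function names and the eight sample predictions are hypothetical. The idea is to sort games by the model's raw score, then merge neighboring groups until the observed win rate never decreases as the score rises.

```python
def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit.

    Returns a list of (highest raw score in the step, calibrated probability),
    sorted by score, with calibrated values that never decrease.
    """
    # Average outcomes that share the exact same raw score first.
    totals = {}
    for s, y in zip(scores, outcomes):
        entry = totals.setdefault(s, [0.0, 0])
        entry[0] += y
        entry[1] += 1

    blocks = []  # each block: [sum of outcomes, number of games, highest score]
    for s in sorted(totals):
        blocks.append([totals[s][0], totals[s][1], s])
        # Merge backwards while a lower-scored block has a higher win rate.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]


def calibrate(steps, score):
    """Look up the calibrated probability for a raw model score."""
    for top, value in steps:
        if score <= top:
            return value
    return steps[-1][1]


# Hypothetical raw predictions and results (1 = predicted side won).
raw = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
won = [0, 0, 1, 0, 1, 0, 1, 1]
steps = fit_isotonic(raw, won)
print(steps)
print(calibrate(steps, 0.35))
```

The output is a staircase rather than a smooth curve, which is consistent with the "sharp and jittery" calibrated predictions described above. In practice the steps would be fit on held-out games, not on the games used to train the model.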

[00:32:31]For our predictive model, Chen, Naga Siren, Medusa, Visage, and Wraith King all overperformed their predictions by the most. I found it very interesting that Visage was at the top of Valve's list as well.

[00:32:45]Something about this hero seems to make it hard to predict. And again, if you have theories about this, please leave me a comment. I would love to know. But some of this is from noise because, as I mentioned, we have a smaller sample size on some of the heroes like Chen. Here's that same list, but if we only include heroes we have data on for at least 300 games. And now Medusa is jumping to the most overperforming hero in our model.

[00:33:09]That's very interesting because she was the second worst underperformer on Valve's model. Valve overestimates her strength by about 2% and we underestimate her by about 4%.
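
The per-hero comparison above can be sketched as follows. This is an illustration under stated assumptions, not the video's code: it assumes each test-set record is a `(hero, predicted_win_probability, won)` tuple, and it treats "overperforming" as actual win rate minus mean predicted win rate. The function name `hero_overperformance` is hypothetical; the 300-game cutoff comes from the transcript.

```python
from collections import defaultdict


def hero_overperformance(records, min_games=300):
    """Return {hero: actual_win_rate - mean_predicted_win_rate}
    for heroes with at least `min_games` records."""
    # Per hero: [sum_of_predictions, wins, games]
    sums = defaultdict(lambda: [0.0, 0, 0])
    for hero, predicted, won in records:
        entry = sums[hero]
        entry[0] += predicted
        entry[1] += int(won)
        entry[2] += 1

    return {
        hero: wins / games - predicted_sum / games
        for hero, (predicted_sum, wins, games) in sums.items()
        if games >= min_games  # drop small, noisy samples such as Chen
    }
```

A positive value means the hero won more often than the model expected (the model underestimates it); a negative value means the model overestimates it.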

[00:33:22]I suspect this comes down to how our model doesn't know which heroes are in play, while Valve's model almost certainly does. Medusa is a phenomenal hero for comebacks and extremely weak against a few hyper-specific counterpick opponents. That means losing a game with Medusa doesn't look exactly like losing a game with most other heroes. And that probably trips up our model because our model does not identify heroes. We obviously do not look at hero talents or skill builds and we also never look at items. I did try putting in an Aegis status, but the model literally ignored it as irrelevant.

[00:34:00]It reduced itself down to 0.00% impact. That's presumably because the team that gets Aegis is usually the team that has control of the game which is already predicted to win based on that lead.
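
The hero encoding described earlier, where the model sees expected win rates rather than hero identities, might look like this. It is a sketch only: the lookup table reuses the three example values quoted in the transcript, and the 50% fallback for heroes missing from the table is an assumption of this example. `META_WIN_RATE` and `encode_team` are hypothetical names.

```python
# Illustrative meta strengths taken from the example in the transcript.
META_WIN_RATE = {
    "Spectre": 0.55,
    "Sniper": 0.51,
    "Spirit Breaker": 0.51,
}


def encode_team(heroes, meta=META_WIN_RATE, default=0.50):
    """Replace each hero name with its expected win rate.

    The model only ever receives these numbers, so it cannot tell
    Spectre from any other hero with the same meta strength.
    `default` is an assumed fallback for heroes not in the table.
    """
    return [meta.get(hero, default) for hero in heroes]
```

For example, `encode_team(["Spectre", "Sniper", "Spirit Breaker"])` gives the three win rates from the transcript and nothing that identifies the heroes themselves.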

[00:34:15]I have implied and maybe explicitly said in this video that we will outpredict Valve without any hero or item data, but we do still kind of have hero data in our model. So, isn't that cheating?

[00:34:29]Didn't we fail? Well, we aren't quite done yet. Right now, our model has 482 input features broken into about 26 groups. Gradient boosted trees are not trying to explain why a prediction is right. What they want is the single most efficient way to predict the outcome.

[00:34:51]According to the model, 24% of our prediction is just from the total number of sentry wards placed during the game.

[00:35:00]Now, I believe vision is important, but there's just no way it is single-handedly responsible for a quarter of the game. Sentry wards carry a lot of signals. The team that places more of them probably has more gold.

[00:35:13]They probably have more map control.

[00:35:15]They might have better team coordination, and a larger number of sentries placed indicates we're deeper into the game. What the model has learned to do is use sentry wards to detect all of that information instead of using the categories intended to showcase that information more precisely.

[00:35:32]To explore this, we can do a greedy backward elimination. All this does is it removes inputs one at a time and forces the model to try and replace them by looking at other information from the remaining inputs. We start with the most redundant information in the set and keep going until we have the simplest effective model possible. If we remove sentry wards from the model, nearly all of its predictive power shifts to total observer wards placed for similar reasons. And when we get rid of observer wards, that prediction moves to rank information, which does include OpenDota's estimates of each player's ranks.

[00:36:09]Presumably, a lot of information about who wards and when is also something the model can estimate based on how good the player is. And when we remove the rank information, all that predictive strength shifts to the status of towers and barracks. The remaining features, experience and hero level, both co-encode with net worth. So we can easily remove those and we end up with just 19 features looking exclusively at building status and net worths. Implicit in this model is also the role and lane of each player and the time of game when the prediction is being made, but those are not being handed to the model explicitly. This is about half a percent less accurate than our best model that used hundreds of features, including the expected win rate of every player based on their draft. All you really need is a picture of the net worth and a sense of what's happening with the objectives. If you can capture that, well, you can beat Valve at predicting Dota.

[00:37:07]If we try to go any further, we drop right back down to around 71%, which is almost exactly what we started this video with using a three-input logistic regression based entirely on net worth.

[00:37:19]But within those two categories, we definitely don't need all 19 features.

[00:37:24]We can break those categories out into the 19 features that compose them and do this exact same algorithm again.

[00:37:32]Obviously, we probably don't need net worth signals from every single player because those patterns overlap a lot with team net worth. And in fact, we have the net worth difference between teams. That means keeping just one team's net worth will give us the exact net worth of both teams. And we can actually just get rid of all of the information about tower states because towers give net worth information in two ways. You get money for killing them and they give you enough safety to farm. So the majority of that tower signal already exists in the net worth. What we end up with is just net worth difference, a small signal from Dire's net worth, and a signal from Dire's barracks. These last two could have just as easily been Radiant. Basically, the model just needs a stand-in to derive which team is ahead and whether the game is very close to ending. Notice that we're not even telling the model the game time anymore. Net worth is already enough for it to estimate that. We really can beat Valve with just these three numbers. Remember that our model is trained entirely on ranked public matchmaking. The following predictions are made from that exact model with the same inputs. I never trained a model with pro games. Yet, it is still very good and definitely better than Valve at predicting pro games. Our minimum three-input model is actually the best performer when it comes to pro Dota. The one small problem we have is calibration, which means the model's outputs are good at predicting pro games but we don't have the right interpretation to understand them. If we shift our interpretation using that isotonic regression we used earlier, the 19-feature and three-feature models are basically tied, and both are much better than Valve's predictions. The primary reason for the calibration problem is that Radiant wins pub matches 52.7% of the time if you include all skill levels. In the last 500 pro matches, Radiant has only won 48.9% of games.
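
A small sketch of the three-input representation described above, showing why one team's net worth plus the difference is enough to recover both. The `GameState` type, the field names, the Radiant-minus-Dire sign convention, and encoding barracks as a count of standing buildings are all assumptions of this example, not details given in the video.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    radiant_net_worth: int
    dire_net_worth: int
    dire_barracks_alive: int  # assumed encoding: number still standing


def three_features(state):
    """Build the three inputs: net worth difference, Dire net worth, Dire barracks."""
    diff = state.radiant_net_worth - state.dire_net_worth  # assumed sign convention
    return [diff, state.dire_net_worth, state.dire_barracks_alive]


def recover_radiant_net_worth(features):
    """Radiant's net worth is implied by the difference and Dire's net worth."""
    diff, dire_net_worth, _ = features
    return diff + dire_net_worth
```

Since total net worth in a game generally rises over time, the absolute Dire value can also act as a rough proxy for game time, which fits the remark that the model no longer needs the clock.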

[00:39:32]Basically, the same advantage turns into a Radiant win about 4% less often in pro games than in pub games, for all sorts of reasons we won't get into here.
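
To make the roughly 4-point gap concrete, here is one simple way to re-express a pub-trained probability for pro games by shifting it in log-odds space. This is not the method used in the video, which refits an isotonic regression. It is a standard base-rate adjustment that assumes only the overall Radiant win rate differs between the two populations. The function name `shift_base_rate` is hypothetical; the 52.7% and 48.9% figures come from the transcript.

```python
import math


def logit(p):
    """Probability to log-odds."""
    return math.log(p / (1 - p))


def sigmoid(z):
    """Log-odds to probability."""
    return 1 / (1 + math.exp(-z))


def shift_base_rate(p, train_rate=0.527, target_rate=0.489):
    """Adjust a Radiant win probability learned where Radiant wins
    `train_rate` of games for a population where it wins `target_rate`."""
    return sigmoid(logit(p) - logit(train_rate) + logit(target_rate))
```

Under this adjustment a game the pub-trained model calls a coin flip comes out slightly in Dire's favor for pro play, and the ordering of predictions is unchanged, so accuracy as a ranking is untouched while calibration moves.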

[00:39:44]Real fast. I don't want to be managing a bunch of [ __ ] kickers in Discord, so I didn't want to point this out earlier in the video, but to get to 40 minutes, you're probably either a cool person or you're asleep. So now I feel safe pointing out that my Patreon does have a free tier. And with that free tier, you can get access to our community's Discord with like-minded gaming and data nerds. And you can chat directly with me there. Plus, I'll be posting some of the data about this study in future videos on Patreon as well. And some of that will even be available at the free tier.

[00:40:20]So, please do remember to like and subscribe, but there's also that option if you want to join a cool club for free. All right, so why doesn't Valve use this method to predict games? Well, for one thing, there's just a lot of fake games in Dota. Tons of bot matches, games with early abandons, and just general BS. I think it's clear that Valve has not always spent a ton of energy trying to find those games and fake accounts, but I did. I've removed something like 3 to 5% of the replays from our sample because I thought that that was bad data. That is probably where we get our real edge. In machine learning, we call this garbage in, garbage out. Anyway, does any of this even matter? Well, it does help us understand how to assign value to decisions that players make. If we know how to make a great predictive model, then we can accurately do studies to see how different decisions impact win rates compared to expected win rates. That's what we did with Divine Rapier and Hand of Midas in the last couple of months.
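
The data cleaning step mentioned above could be sketched as a simple filter. The video does not say which rules were used, so everything here is hypothetical: the field names `bot_players`, `duration_min`, and `first_abandon_min`, and both thresholds, are placeholders for whatever criteria identify bot matches and early abandons.

```python
def is_clean_match(match, min_duration_min=10.0, early_abandon_min=5.0):
    """Return False for matches that look like bad training data.

    All fields and thresholds are illustrative assumptions.
    """
    if match.get("bot_players", 0) > 0:
        return False  # bot match
    if match.get("duration_min", 0.0) < min_duration_min:
        return False  # too short to be a real game
    first_abandon = match.get("first_abandon_min")  # None if nobody left
    if first_abandon is not None and first_abandon < early_abandon_min:
        return False  # early abandon
    return True


def clean_sample(matches):
    """Filter a list of matches and report the share that was removed."""
    kept = [m for m in matches if is_clean_match(m)]
    removed_share = 1 - len(kept) / len(matches) if matches else 0.0
    return kept, removed_share
```

Tracking `removed_share` makes it easy to check that the filter is removing a plausible fraction of replays rather than quietly discarding most of the sample.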

[00:41:24]For another thing, there are ways you can leverage this in how you think about playing the game. Notice that our model does not care who on the team has the net worth or even what that net worth is overall. It just cares about the relative difference in net worth between teams and whether the barracks are still standing. If you are a carry and your mid gets off to a great start, you should be focusing around maximizing the advantage that mid can bring your team.

[00:41:51]And that might sometimes mean joining the fight in ways that you otherwise wouldn't. If your opponents have an underfarmed support, should you really choose to take 45 seconds to go gank them for the 15th time? Probably not, actually. Once you have a sufficient lead, an underfarmed support is probably worth less than you would get by pushing into your opponent's jungle and taking farm there. And if you are a support player, every decision that you make should be about what will give your team the most net worth advantage, not necessarily what gives your team the most net worth now. So aggressive vision should be about denying farm to opponents. And if you're a farming carry, you should usually be farming in the most aggressive way you can, as long as that way is safe. You should not be sitting in the safe jungle by your base when you have an advantage. You should be farming in a way that takes as many resources away from your opponents as possible and gives you the opportunity to jump into team fights when available.

[00:42:52]Net worth itself doesn't necessarily cause these wins, but the players who have learned to optimize net worth are the ones who are winning the vast majority of games. I honestly could keep adding data and expanding this indefinitely, but this has gotten way out of hand. I have to call it quits somewhere. Thank you very much for watching, and please let me know what you think. If you liked this video, there's some more on the screen that you will love. And please remember to leave me a like and a comment before you go.
