Qwen 3.7 Max demonstrates a significant leap from simple text generation to sophisticated architectural reasoning. Its ability to navigate complex, multi-constraint optimization problems marks a turning point where AI begins to handle genuine engineering logic with high precision.
Deep Dive
QWEN 3.7 MAX - Thinking: A Surprise
Hello community. We have a new Qwen 3.7 Max, so we have to test it out.
Now, this is the best Qwen we have ever had. Beautiful. I tested it on Alibaba Cloud's Model Studio. You might ask why: in the classical arena I have found some irregularities, so for the moment I don't trust those results. Therefore I went with the original, native environment at Alibaba. So here we go.
In their marketing material they compare themselves with Opus 4.6 Max, now that they have Qwen 3.7 Max. Look at this: coding agent, general agent. But what I'm interested in is science, STEM reasoning, and general capability. And compare this to Qwen 3.6 Plus, the generation before: on Humanity's Last Exam, the jump is from 28.8 to 41.
This is interesting. But also look here: from 8 to 44. So something happened here. Let's see if we can verify this. Yes, the last Qwen 3.6 Plus was not famous; I preferred GLM 5.1. I have a complete YouTube playlist on this.
Let's go to Artificial Analysis and its Intelligence Index, with 10 benchmarks.
This is the summation of 10 benchmarks, and they position themselves quite high. Look at this. We have GPT 5.5 or Opus 4.7 Max, Gemini 3.1 Pro Preview, then GPT 5.5 xhigh, and then we have Qwen 3.7 Max, ahead of Gemini 3.5 Flash, on these 10 benchmarks that have been known for multiple years. So this is absolutely impressive. It is ahead of Kimi, ahead of myo, ahead of Grok, ahead of anything from Meta, even ahead of Claude Sonnet 4.6 Max, ahead of DeepSeek version 4 Pro Max, ahead of everything. Just look here: the old Qwen 3.5, the 400 billion free trainable parameter model, was here, so the jump two generations later is quite impressive. But you know, these are old, known, published benchmarks, and we go now to the live testing of my real, complex benchmark. Let's go.
And as you see, Qwen 3.7 Max. Let's go with the flagship. Here we go. We activate thinking. Fast? No, we want the thinking solution. So let's paste this in and go. Beautiful, thinking.
So, let's see what the new Qwen 3.7 Max can do. Okay, "Evaluating the optimal path."
Oh, we have it here on the right hand side. Okay, no problem at all. Great.
"Navigating the layout mechanics." So this is not an open reasoning trace. They are summarizing it, because they don't want you to be able to extract the pure reasoning trace, a chain-of-thought sequence, and then maybe distill a smaller language model out of it. So they are protecting their intellectual property rights. Beautiful. "Exploring strategic floor transitions." "Proceeding with strategic movements to reach the target." So you see, this is more or less no information at all. "Evaluating the sequence of actions and their consequences step by step." "Proceeding with strategic navigation." So you can absolutely forget about the reasoning traces here. We will just have to look at the result and then, yeah. Ah, okay, they give us some tiny bits of information, like pressing the A, B, C sequence in succession. Okay. Yeah, there is a particular mathematical algorithm behind this. "Navigate to floor 29." This is a very good idea. This is a shortcut. Yes, you have to have the red and the green code card. "Reaching floor 50 with transition." Yes, let's do this.
"Assessing the implications of invoking the emergency exit at floor 29." I mean, for a flagship model now, at the end of May 2026, this should not be any problem at all. This test is more than a year old. I've never published it, but its complexity was designed for an o1 model. Can you imagine, the Strawberry model from OpenAI? And as you see, current models are, hopefully, able to solve this test. Let's see where we are. "Exploring the optimal path through a sequence of moves and constraints." "Navigating towards a strategic position while managing the state transitions" tells us nothing at all about what's going on. Yeah, they protect this. But at least, yeah: modulo 50 to steer clear of problematic floors.
Okay, we have a modulo 50. Yes, of course. I have three interwoven evaluation tracks here that the AI has to solve to come up with a solution. So, let's see what they're doing. "Exploring the optimal path to acquire the green code." Okay, the energy is limited, the tokens are limited, so everything is on a very tight budget. There are three interwoven optimization processes the AI has to run. "Proceed to floor 29 with completeness and updated status."
Beautiful. So any moment now it should have the verification and the first solution.
Optimize this sequence. Okay. And now I think exploring potential path to stabilize the sequence. Yes. I'm reassessing the sequence of action to avoid ambiguity.
Okay. Exploring viable sequences. Yeah.
Exploring strategic landing sequences.
Okay. Exploring strategic moves to navigate the challenge efficiently.
I think they're running out of words, because these are more or less all the same. Yeah. But this is the price you have to pay if you do not want to show your real reasoning trace: you just provide some non-informative summarization of something.
Okay. So the race is on for who builds the best small language model. Definitely not from this model. Okay. Fair. Qwen 3.7 Max. We have to wait for the end.
Reach the floor. Yeah. Navigate away and towards 29. Okay.
Navigating the sequence with precision.
Okay. [laughter] Okay. So, yeah, when will it come to an end? I think we're already some minutes in, but given the age of this test, I think it should not be any problem for the very latest flagship of Qwen to come up with a solution. Yeah. Ah, we have a little bit of information: moving from floor 18 to 19 is treated as a single landing on 19, bypassing any floor 22 activations.
Okay. Hey, careful. There could be some information you are providing here.
Qwen 3.7.
So, navigating the sequence with careful attention. Yes. Re-evaluating the path.
Okay.
"Reaching floor 15." Okay. So here we see it is also sequencing the complexity, of course, reducing it into multiple minor complexities. And it seems that floor 15 is the preferred starting target. "Avoiding high-risk floors like 25." Yes, I have defined floors like 25 that are especially problematic. Maybe you should avoid them. Maybe you should use them.
Hey, here we are exploring optimal floor transition to reach 29 again. Okay.
Yeah, there's an ABC result. Okay. 2n - 2. Yes, it's trying to understand the mathematical sequences behind the individual buttons in the elevator, going from floor 0 to floor 50 of a building that has exactly 50 floors.
Navigating the sequence with careful attention. My goodness, come on.
I'm assessing each possible move from the key floors 49, 48, and 47.
Now, what you see is a pattern. They start at the end, thinking that maybe if they start at the end, the sequence will become easier. But of course I built this test in a way that it does not, in any way.
Hey, we have a solution. We have a solution.
This is very nice. Beautiful. So let's go in full screen.
Okay. Let it just finish, and then we'll have a look at what we achieved.
So, where's the beginning? Where's the beginning? A B C A button sequence here.
Okay. So, we have an AB AB. Okay. Wow. This is nice. So, we have an AB AB AB CB. An emergency exit. Okay.
Interesting. Eight button presses and a special action: nine total actions.
Yeah, this is a really good solution. It is not the best solution, like we have seen with our latest Gemini, but it is a good solution. Step-by-step table now. Okay. It gives me the floor number, the energy package number. Okay. A little bit huge. Okay.
Final summary: total actions, nine.
Requirement start is pass. Final floor is 50. Energy packages is seven. All within the limits everything is passed.
This is nice. Code earned. Yeah. Green code and red code. Yep. Check. Check.
Check everything. Resource arithmetic audit. Beautiful.
Three key synergies. Beautiful. Yep.
Shortcut: emergency exit at 29. Now, we know from last year that this is a correct solution. But hey, you never trust any system. So let's just ask: hey, can you validate your result?
And so here we are. Let's start the validation run and see what happens. Oh yeah, we have the thinking process here. Yes, of course. "I evaluate the effect of pressing button A." "I evaluate the effect of pressing button B." [gasps] Okay, now this is going to be an absolutely fascinating analysis.
Let's just look at what happens at the end, whether it can give us a verification, a validation of its first result. We know that it is a correct result, but I just want to see the response of the AI system, because you're not going to believe it, but sometimes in the validation run the system changes its mind and says: hm, no, I don't think this is the correct solution. So yeah. Hey, nice. Rule applied, the cost, the landing state: it gives me the state information for each and every press. Okay, so consecutive A B C lockdown, final goals.
"Conclusion: 100% mathematically and logically valid. Every resource calculation balances perfectly. All conditional modifiers were correctly applied or intentionally avoided, and all victory conditions are met with resources to spare." Absolutely nice. Here we have it. Yes, this is the correct information. Let's look at the beginning.
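A validation run of this kind amounts to replaying the button sequence press by press and logging the landing state after each one. Below is a minimal sketch of such a replay. The three buttons, their effects, the energy budget, and the top floor are all hypothetical placeholders, since the rules of the actual test are not published in the video.

```python
from dataclasses import dataclass

# Hypothetical rules for illustration only: button -> (floor update, energy cost).
RULES = {
    "A": (lambda f: f + 1, 0),
    "B": (lambda f: f + 3, 1),
    "C": (lambda f: f * 2, 1),
}


@dataclass
class Step:
    press: str   # button pressed
    floor: int   # landing floor after the press
    energy: int  # energy left after the press


def replay(presses, start_floor=0, energy=1, top_floor=8):
    """Replay presses one by one; raise ValueError on the first rule violation."""
    floor, log = start_floor, []
    for i, name in enumerate(presses, start=1):
        move, cost = RULES[name]
        floor, energy = move(floor), energy - cost
        if energy < 0:
            raise ValueError(f"press {i} ({name}): out of energy")
        if floor > top_floor:
            raise ValueError(f"press {i} ({name}): above top floor")
        log.append(Step(name, floor, energy))
    return log


# Print one row per press: the same kind of table the model returned.
log = replay("AAAAC")
for step in log:
    print(step.press, step.floor, step.energy)
```

Because the replay is deterministic, a sequence either passes every check or fails at a specific press, which is a stricter test than asking the model to re-read its own answer.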
Validate. Okay. And now, we know that there is a better solution. So now we want to give the system time, and we ask: can you find a better solution? We start the optimization run.
Yeah, I know. Maybe you have to extend your exploration strategy, because it's not looking at the complete search space. So here we go. "Searching for a more efficient solution." Yeah.
Evaluating the optimal path to reach floor 50 to meet the objective.
"Exploring path phrase." "Evaluating the feasibility through sequential button presses." Yes, beautiful. No information given. A totally closed, proprietary system. Beautiful. But by now we can think that we have, in Qwen 3.7 Max, a model with really good performance. I mean, come on, it's the Max. It is the flagship model.
Otherwise, yeah, I would not have expected anything else from Qwen.
So, where are we? "Exploring alternative paths to reach floor 15." So you see, we have a sequencing of the task here, which is not really the best approach: if you have interwoven optimization cycles and you chunk the task into three or four pieces and try to solve each piece independently, you lose the ability to find the coherent solution for the complete sequence. And this is exactly how I built this test.
So maybe not the best idea, but yeah, this is just one of the many ways.
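To make the point about chunking concrete, here is a minimal sketch on a toy elevator. Everything in it is an assumption for illustration: the three buttons, their energy costs, the 8-floor building, and the waypoint at floor 4 are hypothetical and are not the rules of the unpublished test. It uses breadth-first search over (floor, energy) states, once segment by segment and once over the whole trip.

```python
from collections import deque

TOP_FLOOR = 8
# Hypothetical buttons: name -> (floor update, energy cost).
BUTTONS = {
    "A": (lambda f: f + 1, 0),
    "B": (lambda f: f + 3, 1),
    "C": (lambda f: f * 2, 1),
}


def shortest_presses(start_floor, start_energy, goal_floor):
    """BFS over (floor, energy) states; returns (presses, energy_left) or None."""
    start = (start_floor, start_energy)
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (floor, energy), presses = queue.popleft()
        if floor == goal_floor:
            return presses, energy
        for name, (move, cost) in BUTTONS.items():
            nxt = (move(floor), energy - cost)
            if nxt[0] > TOP_FLOOR or nxt[1] < 0 or nxt in seen:
                continue
            seen.add(nxt)
            queue.append((nxt, presses + name))
    return None


# Chunked: solve 0 -> 4 on its own, then 4 -> 8 from whatever state is left.
seg1, energy_left = shortest_presses(0, 1, 4)
seg2, _ = shortest_presses(4, energy_left, 8)
chunked = seg1 + seg2

# Joint: search the whole trip 0 -> 8 in one go.
joint, _ = shortest_presses(0, 1, 8)

print(chunked, len(chunked))  # ABAAAA 6
print(joint, len(joint))      # AAAAC 5
```

In this toy, the locally shortest first segment spends the only energy unit early, so the second segment has to crawl one floor at a time. The joint search keeps the energy for the doubling button at floor 4 and finishes one press sooner. The segments are coupled through the shared resource, which is the effect the chunked strategy cannot see.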
Anyhow, I will try to find the best solution. So we just have to give it time. Let it breathe. Let's see what it comes up with.
29. Okay. "Calculating the minimum steps to reach floor 15." So now it sees the first segment, from floor 0 to floor 15, and it is now trying the minimum step sequence. Okay, let's see if it comes up with a solution and compares it to the existing solution, and then you will see what is different. Now it understood that at floor 4 there are some critical sequences that it cannot really shorten.
Okay. Oh, here we are now: from floor 16 to floor 50, while managing the state changes. So now we're in the second leg, if you want. "Explore the optimal path." Hey, we have it. So, seven presses and, what? Wait a minute. This looks interesting. This looks interesting.
So, just let it finish.
Two codes, two transit, and one exit.
Now, let's have a look. Where are we?
So what solution did it find?
Okay, so here we are now. "I tried to find a shorter sequence." So, let's see what it came up with. Now, the key breakthrough is realizing that button C from floor 15 lands exactly on floor 8, and floor 8 is exactly an even number for button F. Yes, this is it.
Absolutely.
So, here's the optimized, shortest possible sequence. We go with A BB A B C F A and the emergency exit. So we have 2, 4, 6, 7 button presses and the emergency exit. So we're down one, and this is beautiful. Now, normally I would expect this to be found in the first run, but the second run is just beautiful. Well, it's the third run, because we had a validation. Yes. Okay. So the total number of actions, including the emergency exit, is eight. Absolute shortest path, getting the red code card. So it has some argumentation here: the transits, the minimums, the exit is mandatory. Yes. Every rule and every resource is perfectly respected, and this looks beautiful. And you know, the only thing left to do is the validation run. So we go full-fledged reasoning and thinking.
Oh: allocated quota exceeded. Please increase your quota limits. Okay. So, sorry. I was just trying to test this out for you. I'm not going to pay more now. But yeah, it looks like this is a really interesting model, Qwen 3.7 Max. It looks like it is up to the task alongside the other big flagship models of the global AI corporations. Congratulations.