I have the Asus Z13 Flow 128GB variant on order and “shipped”, to be delivered in a couple of weeks. I was on a wait list for almost 4 months to get this, and now with the new tariff news, I’m very glad I got one when I could.
One question I have for this article, which I doubt I’ll see answered, is: “In a desktop form factor, why can’t I use DDR5 in addition to the soldered RAM?” They gave only a few words to the discussion around the board, but it is really bugging me. Why couldn’t we have the best of both worlds with this?
The review is very performance focused, which is… fine I suppose. But it doesn’t cover the most relevant comparison, which would be the Asus versus Framework comparison.
Also, it’s good to see these reviews including the most important benchmarks: training and inference. But those benchmarks need to be scaled across model sizes, and they also need to come with some kind of accuracy and precision metrics. They’re using things like tokens per second or training times as the metric, but realistically, consumers have to make a choice about the size/price of a GPU and what models it will be able to hold. Larger models will likely perform better, but they come with significant cost, especially depending on how much can sit in VRAM versus system RAM.
It’s a good review, but I think we need to see more focus on benchmarks that are relevant to users. Tokens/second is important, but, like, train ImageNet and give me model statistics and performance. Do more, please. Load six models (a 4GB, an 8GB, a 16GB, a 32GB, a 64GB, and a 96GB+ model), give them all the same prompts, and display the results.
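A sweep like that could be sketched roughly as follows. This is only a minimal sketch, not the review's method: `ModelSpec`, `run_sweep`, the `generate` callback, and the fake backend are all hypothetical names, and a real run would plug in an actual inference library. It feeds the same prompts to every model, marks models that do not fit a given memory budget, and reports tokens per second for the rest.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ModelSpec:
    name: str       # label only, e.g. "16GB"
    size_gb: float  # approximate in-memory size of the weights (assumed known)


@dataclass
class Result:
    model: str
    fits: bool
    tokens_per_second: Optional[float]  # None when skipped or not measurable


def run_sweep(
    models: List[ModelSpec],
    prompts: List[str],
    generate: Callable[[ModelSpec, str], int],  # returns tokens produced
    memory_budget_gb: float,
) -> List[Result]:
    """Run the same prompts against every model that fits the budget."""
    results = []
    for spec in models:
        if spec.size_gb > memory_budget_gb:
            # Would spill out of fast memory, so record it as not fitting.
            results.append(Result(spec.name, False, None))
            continue
        total_tokens = 0
        start = time.perf_counter()
        for prompt in prompts:
            total_tokens += generate(spec, prompt)
        elapsed = time.perf_counter() - start
        tps = total_tokens / elapsed if elapsed > 0 else None
        results.append(Result(spec.name, True, tps))
    return results


def fake_generate(spec: ModelSpec, prompt: str) -> int:
    """Stand-in backend: pretends to emit one token per prompt word."""
    return len(prompt.split())


if __name__ == "__main__":
    sizes = [4, 8, 16, 32, 64, 96]
    models = [ModelSpec(f"{s}GB", s) for s in sizes]
    prompts = ["Explain soldered versus socketed RAM.", "Summarize this thread."]
    for r in run_sweep(models, prompts, fake_generate, memory_budget_gb=64):
        print(r)
```

Tokens per second alone still leaves out the quality side, so a fuller version would also score each model's answers against the same reference set.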
If I understand right, they actually tried to work with AMD to implement an experimental new socket standard or something to get the low latency they needed, but weren’t able to get the results they were after.
This device isn’t really for me, but the amount of RAM it comes with should last a very long time, and the fact that it’s soldered is, unfortunately, important to its unique performance.
If I were to hazard a guess, they’re reusing a mobile board design for this to some extent, and at least in mobile applications socketed DIMMs draw 30-50% more juice than soldered. It could also be the NPU or GPU requiring the 5-10% extra memory bandwidth they get from being soldered. I do agree that I don’t think it was worth the trade-offs from a consumer perspective, but Framework seems to generally make good choices, so I’m thinking there must’ve been some outside pressure at play affecting the financials of the project or something.
When I’ve seen this come up before, I think the explanation was that the AI Max needs soldered RAM due to the latency that socketed RAM introduces. There was an older post where an AMD engineer even hopped in, saying they had tested socketed RAM but it had a big impact on performance.
First time I’m hearing this, so thank you!
Soldered RAM is really faster than socketed RAM?
I can’t really see why, when only looking at the connection level.
But soldered RAM also means they only need to support exactly this RAM.
So they can leave out some stuff handling different speeds and can optimise for exactly this RAM.
Is that the reason?
I’ve actually never thought about that until now…
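For a rough sense of scale on the bandwidth side, peak theoretical memory bandwidth is just transfer rate times bus width. The sketch below assumes a 256-bit LPDDR5X-8000 interface for the soldered configuration and a 128-bit dual-channel DDR5-5600 setup for a typical socketed desktop. Both configurations are assumptions for illustration, not numbers from the review, and `peak_bandwidth_gbs` is a hypothetical helper. It says nothing about latency or signal integrity, which is what the comments above are actually debating.

```python
def peak_bandwidth_gbs(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # Assumed configurations, see the note above.
    soldered = peak_bandwidth_gbs(8000, 256)   # LPDDR5X-8000, 256-bit bus
    socketed = peak_bandwidth_gbs(5600, 128)   # DDR5-5600, dual channel
    print(f"soldered: {soldered:.1f} GB/s")
    print(f"socketed: {socketed:.1f} GB/s")
    print(f"ratio:    {soldered / socketed:.2f}x")
```

Under these assumptions most of the gap comes from the wider bus and higher transfer rate rather than from soldering as such. The usual argument is that those speeds are hard to reach reliably through a DIMM socket.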