Superior quality integration of charts, videos, and pictures! Let's go!
Thank you my friend! Being able to upload unlimited pictures and videos really makes a significant difference when telling a long story, e.g., when presenting a Deep Dive.
Agreed, so much better than a tweet!
Glad to hear! :)
Great article Agrippa! Looking forward to this journey.
Thank you Vlad! Going to be a good one ;)
Great content. It is so much better laid out this way than on X. Looking forward to reading more about the valuation, and the "AI bubble", in a future post.
Great to hear! The upcoming posts will be absolute bangers. I can’t wait to publish them.
Excellent detailed and high quality work. Thank you
Very much appreciate it, Mark!
Love the content quality. Please keep it up <3
Always 💪
Excellent breakdown set in context to others illustrating how and why IREN is elite
Thank you mate!
Much appreciate. Love to read your work.
Awesome, I'm glad you enjoy it :)
Great job Brother!! Fantastic analysis.
Thank you, Yerra!!
Still blown away by the quality of research. Aesthetically this is far superior to X. Upgrades people, upgrades!
Hahah my homie, thank you for the support brother. And yeah, this type of format is so much better.
Nice work!
Thanks!
Amazing work, truly incredible the amount of effort you put into this. 12/10
Thank you, Jeff!! :)
Thanks Agrippa for the great analysis! This was indeed in-depth yet simple enough for a regular guy like me 👌
I would like to ask your take on Nvidia shipment lead times. I've heard it can take many months to get GPUs from them. At the same time, Nvidia is seeing growing payment delays.
Is this normal in this rapidly growing sector, or something to be worried about?
IREN currently runs 8-GPU HGX H100 racks with (Quantum) InfiniBand networking. The new NVIDIA GB200 NVL72 has 72 GPUs connected at 1.8 TB/s. The latency of IREN's architecture is a killer for large models, so their use cases are somewhat limited. Once the NVIDIA GB200 NVL72 becomes widely available, the power advantage of IREN will vanish (at least for some use cases, if not most). For large models, NVIDIA claims the GB200 NVL72 delivers up to 30x faster performance and a 25x reduction in energy consumption compared to the HGX H100. Isn't this true?
I think you are getting mixed up with half-truths.
1) Claiming IREN’s latency is “insufficient for large models”.
You first have to specify what type of latency you are talking about: internal or external. Neither seems to be a problem for IREN. They are specialists in internal latency across their data centers (100 MW super-clusters come to mind). And in terms of external latency, their 6 ms RTT to Dallas seems VERY adequate as well.
And besides that, for training purposes, external latency isn't even a meaningful factor; it's all about internal latency optimization for that use case. External latency is a critical factor for inference, but IREN is covered there too with its 6 ms RTT to Dallas (a major hub) - so what's your claim based on?
2) You are also claiming that GB200s use LESS energy than H100s, which is just false.
I think you are misinterpreting Nvidia's "25x reduction in energy" statement. They are talking about energy used per token generated: these new GPU models are just much more productive / powerful relative to the energy they consume.
Yet the energy the GB200s consume is MULTIPLES of what the H100s were consuming. Even a single B200 / B300 GPU requires significantly more energy than the H100 / H200 processors.
So your entire “energy isn’t a bottleneck” argument falls apart - at least based on this line of reasoning.
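To make the per-token vs. absolute-power distinction concrete, here is a minimal sketch. All power and throughput figures below are illustrative placeholders I chose to make the arithmetic clean, not Nvidia specifications:

```python
# Illustrative only: the power and throughput numbers are placeholder
# assumptions picked to make the math clean, not official Nvidia specs.

def energy_per_token_joules(system_power_watts: float,
                            tokens_per_second: float) -> float:
    """Energy consumed per generated token = power / throughput."""
    return system_power_watts / tokens_per_second

# Hypothetical 8-GPU HGX H100 node vs. a GB200 NVL72 rack.
h100_power, h100_tps = 10_000, 1_000        # 10 kW, 1k tokens/s (assumed)
nvl72_power, nvl72_tps = 120_000, 300_000   # 120 kW, 300k tokens/s (assumed)

h100_ept = energy_per_token_joules(h100_power, h100_tps)     # 10.0 J/token
nvl72_ept = energy_per_token_joules(nvl72_power, nvl72_tps)  # 0.4 J/token

# Absolute power went UP 12x, yet energy per token dropped 25x.
print(nvl72_power / h100_power)  # 12.0
print(h100_ept / nvl72_ept)      # 25.0
```

Both statements are true at once: the rack draws multiples more power, while each token costs a fraction of the energy.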
Hope this helps, cheers!
FWIW, I am long IREN too. I am talking about internal latency. IREN currently deploys NVIDIA HGX H100 systems. In this architecture, 8 GPUs sit on one baseboard and talk via NVLink (at 900 GB/s). To talk to a 9th GPU (in the next server), they must leave the NVLink domain and go over the network. The NVL72 connects 72 Blackwell GPUs into a single NVLink domain. This means GPU #1 can talk to GPU #72, sharing memory at 1.8 TB/s (that's also 2x the speed with 9x more GPUs). InfiniBand is a specialized, ultra-low-latency network, but still nowhere near as fast as NVLink. IREN's current architecture would be 5-10x slower for rack-to-rack GPU communications. You are somewhat correct in saying the efficacy of the IREN HGX H100 architecture would be comparable for training (for smaller models). The H100 might also be used for inference. Basically, the current system will become Tier 2 as soon as the NVIDIA GB200 NVL72 becomes widely available, even if it survives its 3-5 year life expectancy.
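The bandwidth cliff at the NVLink domain boundary can be sketched roughly. The NVLink figures are the ones cited in this thread; the fall-back network figure is an assumed placeholder (roughly 400 Gb/s NDR InfiniBand per GPU ≈ 50 GB/s):

```python
# Rough sketch of the GPU-to-GPU bandwidth cliff. NVLink numbers are the
# ones cited in this thread; the per-GPU InfiniBand fall-back (50 GB/s)
# is an assumed placeholder.

def pairwise_bandwidth_gbs(gpu_a: int, gpu_b: int,
                           nvlink_domain_size: int,
                           nvlink_gbs: float,
                           fabric_gbs: float) -> float:
    """Peak GPU-to-GPU bandwidth: NVLink inside a domain, network outside."""
    same_domain = gpu_a // nvlink_domain_size == gpu_b // nvlink_domain_size
    return nvlink_gbs if same_domain else fabric_gbs

# HGX H100: 8-GPU NVLink domain at 900 GB/s, network beyond it.
print(pairwise_bandwidth_gbs(0, 7, 8, 900, 50))    # 900 (same baseboard)
print(pairwise_bandwidth_gbs(0, 8, 8, 900, 50))    # 50  (next server)

# GB200 NVL72: all 72 GPUs share one 1.8 TB/s NVLink domain.
print(pairwise_bandwidth_gbs(0, 71, 72, 1800, 50))  # 1800
```

The point of the sketch: on HGX H100 the 9th GPU falls off an 18x bandwidth cliff, while on NVL72 any pair of the 72 GPUs stays on the fast path.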
Note that there might be competition for inference (Google TPUs, Groq LPUs, AWS Inferentia, AMD Instinct MI355X, etc.) and training (TPU, Cerebras WSE-3, etc.). These are more energy efficient. I am guessing you are talking about smaller tasks where the H100 consumes less energy. That's not an apples-to-apples comparison. The right comparison is between the 8-GPU server and the 72-GPU system when both are used as intended.
Total GPUs in System -> 9x more (also NV Domain size)
Total System Compute (FP4*) -> Massive improvement with new precision support (1440 PFlops)
Total System Memory (HBM3e) -> 21x (13.5 TB)
LLM Training Speedup (e.g., GPT-MoE-1.8T) -> 4x faster
LLM Inference Speedup (Real-time, Trillion-Parameter) -> 30x faster
Energy Efficiency (Performance per Watt / Total Cost of Ownership) -> 25x
Interconnect Bandwidth (GPU-to-GPU Link) -> 2x
Great article and to the point!
Thank you very much!
CEO said 35% levered IRR on MSFT deal. Can you show us the calculation on how to arrive at it?
I’m glad you asked, since I have already done that (internal modeling). I just need to finalize my report. The breakdown should be out next week.
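For readers who want to follow along, the mechanics of a levered IRR are straightforward: it's the discount rate at which the equity cash flows have zero NPV. The cash flows below are pure placeholders for illustration, not a model of the actual IREN/Microsoft deal:

```python
# Minimal levered-IRR sketch via bisection. All cash flows are made-up
# placeholders, NOT the actual IREN/Microsoft deal economics.

def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value of cash flows at t = 0, 1, 2, ..."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows: list[float], lo: float = -0.99, hi: float = 10.0) -> float:
    """Rate where NPV = 0, found by bisection (assumes one sign change)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid  # rate too low, NPV still positive
        else:
            hi = mid
    return (lo + hi) / 2

# Levered view: the t=0 outflow is the equity check (capex minus debt
# drawn), and later inflows are contract cash flows net of debt service.
equity_flows = [-100.0, 35.0, 40.0, 45.0, 50.0, 55.0]
print(f"{irr(equity_flows):.1%}")
```

The levered IRR beats the unlevered one whenever the project earns more than the cost of the debt, which is why the CEO's quoted figure is an equity-level number.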
Cheers!
Hi Agrippa, great article. Liked your well-laid-out thoughts. I would love your take on ClusterMAX 2 from SemiAnalysis... a lot of not-very-inspiring comments. Would be good to break them down too. A few:
1. "While they’ve been very successful at selling GPU Cloud capacity, their economic returns aren’t what has commonly been depicted by market participants. Our AI Cloud TCO Model (trusted by the world’s largest GPU buyers and financial leaders) estimates the precise economics of the IREN/Microsoft deal."
2. "We have tested IREN in March 2025 and found the service to be severely lacking, with multiple basic configuration errors on the hardware such as ACS not being disabled and GPU Direct RDMA not being enabled. In March 2025, our two node NCCL test on the AllReduce collective showed that IREN machines had around 129.27GB/s at 128MiB msg size when the Nvidia reference numbers and our testing for top tier neoclouds is well above 300GB/s busBW."
3. "It is known within the industry that IREN offers below market rate prices compared to providers in the ClusterMAX Silver, Gold, or Platinum tiers. We think the reason is twofold:"
Cheaper-than-average cost structure, through ownership of the datacenter and site selection centered on areas with cheap power costs (typical in the Bitcoin mining business)
Inferior service quality, relative to the market average
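For context on the numbers in quote 2: NCCL's all-reduce "busBW" is the measured algorithm bandwidth (message size / time) scaled by a 2(n-1)/n correction factor, per NCCL's performance documentation. The message size below matches the quote; the timing and rank count are assumed examples:

```python
# How NCCL reports all-reduce "busBW": algorithm bandwidth (bytes / time)
# scaled by 2*(n-1)/n. Formula per NCCL's performance docs; the 1.0 ms
# timing and 16-rank count below are assumed for illustration.

def allreduce_bus_bw_gbs(msg_bytes: float, seconds: float,
                         n_ranks: int) -> float:
    alg_bw = msg_bytes / seconds  # bytes/s of payload reduced
    return alg_bw * 2 * (n_ranks - 1) / n_ranks / 1e9

# 128 MiB message across 16 GPUs (two 8-GPU nodes), hypothetical 1.0 ms.
msg = 128 * 2**20
print(round(allreduce_bus_bw_gbs(msg, 1.0e-3, 16), 1))  # 251.7 GB/s
```

In other words, the 129 GB/s vs. 300 GB/s figures being argued about are this derived busBW metric, not a raw link speed, so configuration issues like ACS or GPU Direct RDMA settings can roughly halve the reported number.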
Hi Batusai, thank you for the praise.
Regarding SemiAnalysis:
I and others have spoken with some very high-ranking folks at IREN on this topic, which I believe adds some very valuable insight.
The way I understood it:
Semi approached IREN very shortly before their ranking release, asking them to set up a test environment within hours. IREN obliged, but had never set up this kind of test environment before, and when they asked Semi what was being measured (raw performance, security, etc.), Semi never gave them a straight answer. They followed Nvidia guidelines 1:1, and this test environment apparently had some flaws, primarily lacking client security features. Semi subsequently ranked them very low.
IREN folks claim that Semi handled this entire situation extremely unprofessionally, likely because IREN doesn't spend a dime on advertising / sponsorships with SemiAnalysis. The entire incentive structure of ClusterMAX is severely flawed: they are literally ranking cloud providers, many of which are their direct customers (in terms of ad spend).
IREN never issued a public statement / rebuttal on the situation, because they didn't want to burn any bridges (e.g. with Nvidia, whose test environment guidelines they had followed). They didn't want to get political either and just took it on the chin.
Apparently, IREN's test environment wasn't in any way comparable to the actual live environment for clients. Again, think about it: if IREN's (actual) client environment lacked basic security features, why would any large-scale customer lease capacity from IREN?
The irony is so apparent that even cloud providers such as Fluidstack & MSFT (which Semi ranks top-tier) are CLIENTS of IREN's cloud business.
Sure, Azure isn't directly using IREN's capacity. Microsoft is rather utilizing Horizon 1-4 for training the hyperscaler's latest AI models – but isn't that even a bigger sign of confidence? MSFT is literally using IREN's capacity for INTERNAL purposes, instead of using their own cloud capacity for that use case.
IREN is building 100MW super-clusters with variable rack densities of 130kW-200kW for Microsoft. Please ask Semianalysis who in the industry is doing that at this scale.
With all this context in mind, it's very apparent to me that Semi is desperately trying to defend their bogus ranking during a time in which it has been proven to be severely flawed: Their highest ranking companies are CUSTOMERS of IREN (the lowest ranking one).
The irony is real and Semi knows this. Therefore, it's only natural to see them claim that IREN's deal with MSFT is inferior in terms of its economics, and that the cause is "IREN's inferior service quality".
It's obvious, however, that the folks at SemiAnalysis have no financing / investing background and can't even calculate basic IRR / ROE / MOIC metrics to judge a cloud deal's economic profile. Claiming that IREN's "economic returns" are inferior to market standards is a losing position that's easily disproven when you actually know how to calculate returns.
I'll publish my in-depth breakdown of the IREN x MSFT deal next week and lay it all out. Semi should subscribe & read it, as they'd certainly learn a thing or two.
Thanks for the reply. I am still digging further. On the point you make about the irony of IREN being ranked low while MSFT and Fluidstack, ranked high, still rent from them: I think both could be true simultaneously, because Semi is doing a general ranking of neoclouds and looks for a lot of software features; it is not purely a "bare metal" ranking. A lot of security features aren't needed if you have a single client (probably doing training) with no sharing of resources. So I wouldn't expect IREN to be top tier, because that is not their forte. IREN wants to be the best bare-metal neocloud. I'd like to know your thoughts.