Full research prompt
Research publicly available analyst models, investor presentations, and expert commentary on whether neocloud unit economics — including GPU utilization rates, revenue per GPU per day, power costs, and financing costs — remain viable as Blackwell and future architectures (Rubin/Vera Rubin) deliver dramatically higher performance-per-dollar. Specifically examine whether price-per-FLOP compression could make existing H100 fleets economically uncompetitive before debt is repaid, and what publicly estimated returns on invested capital look like across the industry.
The neocloud GPU-rental model is best read as a debt-financed wager on one accounting assumption (the useful life assigned to the GPUs) rather than as a durable business or a transitional structure. CoreWeave, Lambda, and Crusoe all depend on this leveraged position.
Neoclouds (specialized GPU cloud providers like CoreWeave, Crusoe, Lambda, Nebius, and Voltage Park) operate a high-fixed-cost, debt-financed rental model centered on NVIDIA GPUs. They purchase or lease servers, house them in colocation or purpose-built facilities, and sell compute by the GPU-hour (or via multi-year take-or-pay contracts). Viability hinges on the spread between realized revenue per GPU-hour and the sum of depreciation, power, colocation, operations, and interest costs.[1][2]
As of early-to-mid 2026, published H100 rates span budget-tier on-demand levels of roughly $2.85–$3.50 per GPU-hour (down 64–75% from 2023 peaks above $8/hour) up to premium contract rates at CoreWeave of $4.25 (PCIe, stable for years) and $6.16 (SXM). Blackwell-era GPUs (B200/GB200) initially command scarcity premiums, with B200s averaging ~$6.50/hour and GB200 rack-scale capacity around $17.85/hour in spot/contract data.[2][1]
CoreWeave’s 2025 revenue of $5.1 billion (on analyst estimates of 250,000–400,000 GPUs) implies a blended realized yield well below published rate cards, reflecting contract discounts, utilization below 100%, and mix. At 100% utilization and published rates, an 8-GPU H100 server could generate $241,000/year (Lambda) to $432,000/year (CoreWeave), but actuals are lower.[1]
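The per-server figures above can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the $3.44 (Lambda) and $6.16 (CoreWeave SXM) hourly rates are the published rates quoted elsewhere in this report, and the implied-yield calculation assumes the whole estimated fleet was installed for the full year, which understates the true yield for a fleet that grew during 2025.

```python
HOURS_PER_YEAR = 8760


def server_revenue_per_year(rate_per_gpu_hour, gpus=8, utilization=1.0):
    """Annual revenue for one server at a given hourly rate and utilization."""
    return rate_per_gpu_hour * gpus * HOURS_PER_YEAR * utilization


def implied_yield_per_gpu_hour(annual_revenue, gpu_count):
    """Revenue per installed GPU-hour, assuming the fleet was in place all year."""
    return annual_revenue / (gpu_count * HOURS_PER_YEAR)


# Rate-card ceiling for an 8-GPU H100 server at 100% utilization.
lambda_rev = server_revenue_per_year(3.44)      # roughly $241k
coreweave_rev = server_revenue_per_year(6.16)   # roughly $432k

# CoreWeave 2025: $5.1B revenue over an estimated 250k-400k GPUs.
yield_high = implied_yield_per_gpu_hour(5.1e9, 250_000)
yield_low = implied_yield_per_gpu_hour(5.1e9, 400_000)

print(f"Lambda: ${lambda_rev:,.0f}  CoreWeave: ${coreweave_rev:,.0f}")
print(f"Implied blended yield: ${yield_low:.2f}-${yield_high:.2f} per GPU-hour")
```

Under these assumptions the implied blended yield lands at roughly $1.46–$2.33 per installed GPU-hour, well under either rate card, which is the gap the paragraph above attributes to discounts, utilization, and mix.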
Power, colocation, and financing add meaningful variable and fixed costs, while depreciation dominates. Servers cost $250k–$400k+ for 8x H100 configurations (higher for Blackwell systems). Neoclouds typically depreciate over 4–6 years (CoreWeave uses 6; others shorter), creating annual per-server depreciation of $40k–$75k+. Interest on GPU-collateralized debt (often 9–15% initially, improving with scale and credit ratings) adds another layer; CoreWeave paid ~$1.2 billion in interest on $21.4 billion debt in 2025. Power and colocation are lower but critical—high-density AI racks (40–80 kW) command premium space, and electricity varies widely by location (sub-$0.05/kWh ideal).[1][3]
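A rough way to sanity-check the depreciation and interest figures above is straight-line arithmetic. This is a minimal sketch under stated assumptions: zero residual value, straight-line depreciation, and an effective interest rate computed against the year-end debt balance (the average balance during 2025 was presumably lower, so the true effective rate would be higher). The function name is hypothetical.

```python
def annual_depreciation(cost, life_years, residual=0.0):
    """Straight-line depreciation per year."""
    return (cost - residual) / life_years


# Corners of the ranges quoted above: $250k-$400k servers, 4-6 year lives.
scenarios = {
    (cost, life): annual_depreciation(cost, life)
    for cost in (250_000, 400_000)
    for life in (4, 6)
}
for (cost, life), dep in sorted(scenarios.items()):
    print(f"${cost:,} server over {life} years: ${dep:,.0f}/year")

# CoreWeave 2025: ~$1.2B interest on ~$21.4B year-end debt.
effective_interest_rate = 1.2e9 / 21.4e9
print(f"Effective interest rate on year-end balance: {effective_interest_rate:.1%}")
```

The corners run from about $41.7k/year (cheapest server, 6-year life) to $100k/year (most expensive server, 4-year life), so the "$40k–$75k+" range above covers most but not all combinations.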
Gross margins before depreciation run 55–65% in bare-metal analyses, but net economics are tight due to leverage and reinvestment.[2]
A debt-financed cluster typically breaks even around 70% utilization. Below that (e.g., 55%), monthly losses can reach hundreds of thousands per 1,024-GPU H100 cluster; above it (e.g., 85%), meaningful profits emerge. CoreWeave does not publicly disclose fleet-wide utilization, but revenue/GPU inferences and industry commentary point to blended realized utilization in the 60–85% range depending on provider and contract mix. Model FLOPS utilization (MFU, a key efficiency metric) is higher at specialized neoclouds—CoreWeave claims 20%+ advantages over baselines via better orchestration and networking.[2][1]
Take-or-pay contracts (multi-year commitments where customers pay regardless of usage) provide downside protection and enable cheaper debt, but on-demand/spot players face full price and utilization risk. Pricing has shown volatility, with H100 on-demand rates fluctuating and hyperscalers cutting prices (e.g., AWS reductions up to 45% in some cases).[4]
Blackwell (and future Rubin/Vera Rubin) architectures accelerate price-per-FLOP compression, pressuring older H100 fleets. Blackwell systems deliver substantially higher performance (e.g., via larger dies, faster memory, and rack-scale designs like GB200 NVL72), so providers can deliver more compute per GPU-hour. Analyst scenarios illustrate the effect: at $30k–$40k list per Blackwell GPU and 65–85% utilization, bare depreciation costs fall to ~$1.35–$2.35/hour before power/markup, enabling customer pricing of $2.50–$4+/hour while halving or bettering cost per TFLOP-hour versus prior generations (even before factoring throughput gains).[5]
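The ~$1.35–$2.35/hour depreciation range can be approximately reproduced with straight-line arithmetic. The source does not state the depreciation life behind the scenario; the 3-year life used below is an assumption chosen because it comes close to the quoted figures, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760


def depreciation_per_utilized_hour(gpu_price, utilization, life_years=3):
    """Straight-line depreciation spread over the hours actually sold.

    life_years=3 is an assumption, not a figure taken from the source.
    """
    return gpu_price / (life_years * HOURS_PER_YEAR * utilization)


best = depreciation_per_utilized_hour(30_000, 0.85)   # cheapest GPU, highest utilization
worst = depreciation_per_utilized_hour(40_000, 0.65)  # priciest GPU, lowest utilization

print(f"Depreciation per utilized GPU-hour: ${best:.2f}-${worst:.2f}")
```

A longer assumed life would lower both ends of the range proportionally, which is why the depreciation schedule matters so much to the comparisons that follow.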
This compresses pricing for equivalent performance. H100 fleets face a “depreciation cliff”—secondary/used market values have reportedly fallen sharply (up to 85% from peaks in some commentary), and on-demand rates have declined markedly. Older GPUs cascade to cheaper inference or lower-priority workloads, but sustained high utilization requires either long-term contracts locked in before compression or acceptance of lower rates. Multi-year take-or-pay backlogs (CoreWeave’s reached $99.4 billion by March 2026) shield contracted capacity, but uncontracted or expiring fleets risk economic uncompetitiveness before full debt repayment if demand does not absorb the performance uplift at scale.[6][7]
Neoclouds are aggressively securing Blackwell/Rubin allocations to ride the next scarcity premium while older silicon depreciates. However, rapid generational turnover shortens realistic economic lives below accounting assumptions (critics like Michael Burry have argued 2–3 years), amplifying risk for leveraged H100-heavy fleets.[2]
Public models and commentary show ROIC/returns that are modest to marginal, highly sensitive to assumptions on utilization, contract tenure, depreciation life, and residual GPU value. One illustrative 5-year H100 contract model at ~$3/hour yielded ~12.4% IRR under fixed pricing but dropped sharply (to low single digits or near zero) with 10% annual price declines or spot exposure. Kerrisdale Capital’s detailed GB200 NVL72 unit economics (drawing on SemiAnalysis TCO) showed EBIT margins of ~20.5% under CoreWeave’s 6-year depreciation and 87% utilization assumptions, falling to near 0% or negative with shorter (4–5 year) lives more reflective of obsolescence.[8][3]
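The sensitivity of IRR to price declines can be illustrated with a small cash-flow model. This sketch does not reproduce the cited 12.4% model, whose inputs are not given here. The capex per GPU ($50,000 all-in), operating cost ($1.30 per GPU-hour), full billing of contracted hours, and zero residual value are all hypothetical inputs chosen only to show the shape of the effect.

```python
HOURS_PER_YEAR = 8760


def npv(rate, cash_flows):
    """Net present value of annual cash flows, with flow 0 at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """IRR by bisection; assumes one outflow followed by inflows."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid   # NPV still positive, so the IRR is higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


def contract_cash_flows(capex, rate, opex_per_hour, years=5, annual_price_change=0.0):
    """Per-GPU cash flows: upfront capex, then yearly net revenue at full billing."""
    flows = [-capex]
    for year in range(years):
        price = rate * (1 + annual_price_change) ** year
        flows.append((price - opex_per_hour) * HOURS_PER_YEAR)
    return flows


# Hypothetical inputs (not from the cited model).
fixed = contract_cash_flows(capex=50_000, rate=3.00, opex_per_hour=1.30)
declining = contract_cash_flows(capex=50_000, rate=3.00, opex_per_hour=1.30,
                                annual_price_change=-0.10)

irr_fixed = irr(fixed)
irr_declining = irr(declining)
print(f"Fixed price IRR: {irr_fixed:.1%}")
print(f"10% annual price decline IRR: {irr_declining:.1%}")
```

With these made-up inputs the fixed-price case lands in the mid-teens and the declining-price case close to zero. That matches the direction of the result described above, though not its exact figures.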
CoreWeave reports strong adjusted EBITDA margins (~56% in Q1 2026) and hypergrowth but posts net losses driven by interest and depreciation; critics argue returns sit below the cost of capital when using realistic assumptions. Hyperscalers benefit from lower cost of capital and scale; neoclouds’ higher financing costs widen the gap. Industry-wide, sustained 15–25%+ ROIC appears challenging without premium differentiation or software layers.[9][3]
For new entrants or competitors, the model favors those securing long-term offtake, optimizing power (location, on-site generation, efficiency), deploying rapidly (modular/prefab saves months of idle depreciation), and differentiating via SLAs, networking, or orchestration. Pure commodity GPU-hour rental faces relentless price compression and utilization pressure. H100 fleets remain viable if heavily contracted or repurposed for inference, but unhedged exposure to generational shifts risks stranded capital. Power availability and speed-to-energize have become the binding constraints more than GPU supply. Overall economics remain viable for well-capitalized, contract-heavy players like CoreWeave through 2026–2027 backlogs, but thinner for smaller or spot-oriented operators as Blackwell/Rubin volumes ramp and performance-per-dollar improves.[2]
Additional research into specific TCO models (e.g., SemiAnalysis or updated Silicon Data pricing indices) or company filings would further refine fleet-level projections.
Recent Findings Supplement (June 2026)
Neocloud unit economics remain under pressure from rapid price compression on prior-generation GPUs, but newer architectures like Blackwell and Rubin command scarcity premiums that support near-term viability for well-capitalized players with take-or-pay contracts. H100 on-demand rates remain far below their 2023 peaks, although the one-year rental figures cited from SemiAnalysis moved up rather than down, from ~$1.70/GPU-hour in Oct 2025 to ~$2.35 by Mar 2026. Contracted rates for premium providers held higher still, and newer silicon (B200/GB200) rented at $6.50–$17.85/hour.[1][2] This dynamic allows fleets to cascade older hardware into lower-value inference while riding premiums on new allocations, but it tightens margins for pure H100 operators before debt maturities.
- H100 published/on-demand rates compressed 64–75% from 2023 peaks above $8/hour to a $2.85–$3.50/hour budget-tier norm by early 2026; CoreWeave maintained stable high rates (e.g., H100 SXM ~$6.16/hour after a late-2025 move, PCIe fixed at $4.25).[3]
- Revenue per 8-GPU H100 server at published rates and 100% utilization: ~$241k/year (Lambda $3.44/hr) to ~$432k/year (CoreWeave); real blended yields are lower due to utilization and negotiated contract discounts.[3]
- Newer GPUs sustain premiums: B200 ~$6.50/hr average, GB200 racks ~$17.85/hr in 2026, with Rubin expected to launch at 30–50%+ premium over Blackwell before compressing.[1][4]
For entrants or competitors: Secure early NVIDIA allocations and multi-year offtake to capture premiums; spot/on-demand players face faster erosion. Legacy H100 fleets require rapid re-contracting or workload migration to avoid negative cash flow as prices drop.
Debt-financed neoclouds require ~70% utilization for breakeven on depreciation + interest alone, with limited margin of safety; actual provider utilization varies widely but contracted backlogs provide some buffer. Illustrative models show a debt-financed H100 cluster breaks even in the high-$1/GPU-hour range before power, colocation, and opex; below ~70% utilization, monthly losses can reach hundreds of thousands on a 1k-GPU cluster.[1][5] Enterprise customer utilization is often low (e.g., Cast AI data showing ~5% average across clusters), but neocloud providers target much higher via take-or-pay deals.
- Breakeven threshold: ~70% utilization for debt-financed clusters; 55% vs. 85% utilization swings a 1,024-GPU H100 cluster from –$330k to +$340k/month.[1]
- Bitdeer reported 41% utilization (Jan 2026, mixed fleet) rising to 90–92% (May 2026); CoreWeave does not disclose but its ~$5.1B 2025 revenue on an estimated 250k–400k GPUs implies blended utilization/yield below rate-card levels.[6][7][3]
- Gross margins (bare-metal, pre-depreciation): 55–65%; adjusted EBITDA margins of ~60% at CoreWeave (2025) mask net losses driven by depreciation and interest.[5]
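The ~70% breakeven figure is consistent with the two monthly P&L points quoted for the 1,024-GPU cluster. The sketch below assumes monthly profit is linear in utilization, which is a simplification, and uses 730 hours per month. The function name is hypothetical.

```python
def fit_linear_pnl(u1, pnl1, u2, pnl2):
    """Fit pnl = slope * utilization + intercept through two points."""
    slope = (pnl2 - pnl1) / (u2 - u1)
    intercept = pnl1 - slope * u1
    return slope, intercept


# Quoted points: 55% utilization -> -$330k/month, 85% -> +$340k/month.
slope, intercept = fit_linear_pnl(0.55, -330_000, 0.85, 340_000)

breakeven_utilization = -intercept / slope
fixed_cost_per_month = -intercept

# Net contribution per utilized GPU-hour implied by the slope
# (1,024 GPUs, ~730 hours per month).
implied_rate = slope / (1024 * 730)

print(f"Breakeven utilization: {breakeven_utilization:.1%}")
print(f"Implied fixed cost: ${fixed_cost_per_month:,.0f}/month")
print(f"Implied net revenue per utilized GPU-hour: ${implied_rate:.2f}")
```

Under the linearity assumption, breakeven falls just under 70% and the implied contribution is about $3 per utilized GPU-hour, in line with the H100 rates discussed above.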
Implication: Entrants must prioritize contracted demand and operational excellence (e.g., orchestration for higher MFU) over raw capacity; low-utilization spot players are vulnerable to price swings.
Heavy GPU-collateralized debt (more than $20B industry-wide) combined with aggressive depreciation schedules creates refinancing and residual-value risks, especially if new architectures accelerate obsolescence. CoreWeave alone held ~$21.4B of total debt at end-2025 and paid ~$1.2B in annual interest; GPU-backed debt across neoclouds exceeds $20B, of which CoreWeave accounts for $14.2B+. If economic lives are closer to the 2–3 years Burry argues for than to the 4–6 years used for book depreciation, the true cost is front-loaded relative to reported expense.[8][5]
- CoreWeave FY2025: $5.13B revenue (+168% YoY), $66.8B backlog (end-2025; grew further in 2026), Q1 2026 revenue $2.08B (+112% YoY), but net loss ~$1.17B driven by ~$2.45B depreciation + interest.[9][3]
- 2026 capex guidance (CoreWeave $31–35B; Nebius $20–25B) signals continued leverage against backlogs while refreshing fleets.[10][11]
- Residual risk: H100 values could fall 30–50% below acquisition prices by 2027 in stress scenarios; lenders scrutinize resale/renewal assumptions.[12]
For competitors: Strong balance sheets or access to lower-cost capital (e.g., investment-grade GPU-backed facilities) are table stakes; smaller players face consolidation risk in a 2026 shakeout.
Blackwell and especially Rubin/Vera Rubin improve performance-per-dollar dramatically (NVIDIA claims up to 50x throughput/MW and 10x lower cost-per-token vs. Hopper/Blackwell in inference), which supports viability for new deployments but risks stranding or devaluing existing H100 fleets via faster price compression. H100s are cascading into cheaper inference, with A100s from 2020 still re-contracting near original rates in some cases, but overall generational compression (pricing drops ~50% over 5 years per McKinsey) shortens payback windows.[13][14]
- Rubin targets: 5x inference perf, 10x lower token cost, 3.5x training perf vs. Blackwell at rack scale; early availability H2 2026 with scarcity premiums.[15][16]
- Power density rising (B200 up to 1kW/GPU vs. ~700W H100) shifts constraint from GPUs to energized MW; modular builds save ~12 months and ~$8M dead depreciation per $50M cluster.[1]
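The "dead depreciation" figure in the last bullet follows from straight-line depreciation accruing on hardware that is paid for but not yet earning revenue. The 6-year life below is an assumption (it matches CoreWeave's stated schedule and reproduces the ~$8M figure); the 4-year case is shown for comparison. The function name is hypothetical.

```python
def idle_depreciation(cluster_cost, life_years, idle_months):
    """Straight-line depreciation accrued while a cluster sits unenergized."""
    monthly = cluster_cost / (life_years * 12)
    return monthly * idle_months


six_year = idle_depreciation(50_000_000, life_years=6, idle_months=12)
four_year = idle_depreciation(50_000_000, life_years=4, idle_months=12)

print(f"12 idle months, 6-year life: ${six_year:,.0f}")
print(f"12 idle months, 4-year life: ${four_year:,.0f}")
```

A shorter assumed life makes each idle month more expensive, so deployment speed matters more for operators who depreciate conservatively.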
Implication: Operators with early Rubin access and power secured can reset economics favorably; pure H100 fleets need utilization >70–80% and quick migration to inference workloads or face negative ROIC before full debt repayment.
Public filings show negative or low ROIC in buildout phases (e.g., CoreWeave –1.02% annualized ROIC as of Mar 2026 quarter) despite high EBITDA margins, reflecting heavy reinvestment; sustainability hinges on backlog conversion, utilization sustainment, and renewal rates at viable pricing. Nebius and others report sold-out capacity and pricing power in the near term, but industry-wide $3.1T capex needs (McKinsey to 2030) and continuous refresh cycles leave thin post-depreciation margins (14–16%).[17][14]
- CoreWeave emphasizes "attractive returns on capital" backed by backlog and declining WACC via investment-grade debt, but GAAP results reflect growth-phase losses.[18]
- No broad positive ROIC estimates surfaced for the sector; models stress conservative depreciation and high utilization assumptions.
Overall for market entrants: Success requires differentiated power access, software/operational edges for utilization/MFU, and diversified customer bases beyond hyperscaler concentration. The model is viable for leaders with scale but unforgiving for those without rapid time-to-revenue or strong financing.