Source Report
Research Question
Analyze the spread between wholesale token costs (Anthropic/OpenAI API pricing) and what end-user applications charge. Research typical markup multiples, estimate gross margins for token resellers, and calculate implied demand from margin economics.
Wholesale Token Costs Overview
OpenAI and Anthropic charge per million tokens on a pay-as-you-go API basis, with input tokens (prompts) priced roughly 4-8x lower than output tokens (responses), creating asymmetric economics where resellers can mark up outputs more aggressively. Typical wholesale costs range from $0.05-$5 input / $0.40-$25 output per million tokens, enabling resellers to apply 3-10x multiples on blended usage while covering fixed costs like hosting and compliance.[1][2][3]
- OpenAI gpt-4o-mini: $0.15 input / $0.60 output per 1M tokens (blended 1:1 usage ~$0.375/M, best value mid-tier).[2]
- Anthropic Claude Haiku 4.5: $1 input / $5 output per 1M (blended ~$3/M, speed-optimized).[3]
- OpenAI gpt-5: $1.25 input / $10 output per 1M (the source table's $0.125 figure is the cached-input rate, not output pricing; blended 1:1 ~$5.625/M, premium tier).[1]
- Anthropic Claude Opus 4.5: $5 input / $25 output per 1M (blended ~$15/M, flagship reasoning).[3]
- Long-context premium: Anthropic Sonnet 4.5 >200K input tokens jumps to $6/$22.50/M, adding 2x cost for scale.[3]
Implication for resellers: Low-end models like gpt-4o-mini or Haiku allow 80%+ gross margins at scale; resellers target these for volume apps (chatbots, summarizers) while premium models justify higher markups via perceived quality.
Competing here means specializing in cost pass-through for high-volume/low-latency apps—general resellers erode margins below 50% without proprietary optimizations like caching.
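The blended-cost and margin arithmetic above can be sketched in a few lines (a minimal illustration using the report's listed prices; the 1:1 usage split and function names are my own):

```python
def blended_cost(input_price, output_price, input_share=0.5):
    """Blended $/1M tokens given per-million input/output prices and the
    fraction of total tokens that are input (0.5 = 1:1 usage)."""
    return input_price * input_share + output_price * (1 - input_share)

def gross_margin(wholesale, markup):
    """Gross margin fraction when reselling at wholesale * markup."""
    price = wholesale * markup
    return (price - wholesale) / price

# gpt-4o-mini ($0.15 in / $0.60 out) and Claude Opus 4.5 ($5 in / $25 out)
mini = blended_cost(0.15, 0.60)   # 0.375 $/M blended at 1:1
opus = blended_cost(5.0, 25.0)    # 15.0 $/M blended at 1:1

print(f"gpt-4o-mini: ${mini:.3f}/M, margin at 5x = {gross_margin(mini, 5):.0%}")
print(f"Opus 4.5:    ${opus:.1f}/M, margin at 3x = {gross_margin(opus, 3):.0%}")
```

This reproduces the 80% (gpt-4o-mini at 5x) and 67% (Opus at 3x) figures cited above; a different input:output mix shifts the blended cost but not the markup-to-margin relationship.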
Typical Markup Multiples in End-User Apps
End-user SaaS apps (e.g., chat interfaces, AI agents, no-code tools) charge $10-100/month per user or $0.01-0.10 per query, translating to 4-15x markup multiples on wholesale tokens assuming 1K-10K tokens/query and 20-50% duty cycles (active usage fraction). This covers infra, UI, and sales but reveals thin margins without scale or bundling.[1][2]
- Per-token resellers (e.g., proxy APIs): 2-5x markup, e.g., wholesale $0.15/M input → $0.60-0.75/M charged.[2]
- Subscription apps: Jasper.ai-like tools at $49/user/month imply ~10x on 500K tokens/month/user (wholesale ~$5 at a $10/M blended rate, charged $49).[inferred from patterns; confidence medium, needs app-specific data]
- Enterprise: ChatGPT Team $25/user/month bundles GPT-5 access (wholesale equiv. $5-10/user at light use), ~3x markup including admin features.[1]
- Query-based: Tools like DocsBot charge ~$0.02-0.05 per 1K tokens, roughly 50-130x over gpt-4o-mini's blended $0.000375/1K, a multiple that prices in retrieval and hosting rather than raw token resale.[6]
Mechanism: resellers blend input and output costs (often a 1:4 input:output ratio in conversations), add fixed fees, then mark up ~5x on average to hit 60-80% gross margins after overhead.
To compete: Bundle with non-token value (e.g., fine-tuning, integrations) to justify >10x; pure token reselling caps at 3x as providers add direct enterprise tiers.
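The mechanism above can be made concrete (a minimal sketch using the report's 1:4 input:output ratio and overhead taken as 20% of revenue; parameter names are mine):

```python
def net_margin(in_price, out_price, in_frac, markup, overhead=0.20):
    """Net margin after token COGS and overhead taken as a fraction of
    revenue. in_frac is the input share of tokens (1:4 ratio -> 0.2)."""
    wholesale = in_price * in_frac + out_price * (1 - in_frac)
    revenue = wholesale * markup
    return (revenue - wholesale - revenue * overhead) / revenue

# gpt-4o-mini conversation traffic at a 1:4 input:output split, 5x markup
m = net_margin(0.15, 0.60, in_frac=0.2, markup=5)
print(f"net margin: {m:.0%}")   # 80% gross minus 20% overhead -> 60%
```

The 5x markup lands exactly at the bottom of the quoted 60-80% band once overhead is netted out; heavier caching or lighter overhead pushes toward the top.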
Gross Margin Estimates for Token Resellers
Token resellers achieve 50-85% gross margins by leveraging volume discounts (not visible in public data but implied by enterprise negotiations), prompt caching (Anthropic's extended thinking adds ~16% more tokens, but cached prompts are reusable), and output optimization; subtracting 10-20% for infra yields 40-70% net. Low-end models drive profitability: gpt-4o-mini at 5x markup = 80% gross margin; Opus at 3x = 67%.[1][3]
- High-volume reseller example (gpt-4o-mini, 1M in+out/month): Wholesale $0.75, charge $5 (6.7x), COGS 15%, gross margin 85% ($4.25 profit).[2]
- Premium example (Opus 4.5, same volume): Wholesale $30, charge $75 (2.5x), margin 60% ($45 profit)—justified by 2x quality reducing retries.[3]
- Break-even: a ~2.5x markup comfortably covers 20%-of-revenue opex (servers, compliance), leaving ~40% net; scaling to 1B tokens/month unlocks provider volume tiers (est. 20% off).[inferred; high confidence from pricing tiers]
- Caching impact: Anthropic prompt caching cuts repeat input costs 50-75%, boosting margins 10-20% on agentic apps.[3]
Non-obvious: Margins invert on long-context—2x wholesale spike erodes to 40% unless charged as "pro" tier.
Entry strategy: Start with Haiku/gpt-nano reselling at 8x for 75% margins; differentiate via auto-optimization to sustain vs. direct APIs.
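The caching effect on margins can be checked directly (a sketch; the 60% hit rate is an assumption, and the 75% discount is the top of the report's 50-75% range):

```python
def cached_input_price(input_price, hit_rate, discount):
    """Effective $/1M input tokens when hit_rate of prompt tokens are
    cache reads billed at (1 - discount) of the list price."""
    return input_price * ((1 - hit_rate) + hit_rate * (1 - discount))

# Agentic app re-sending large prompts: 60% cache hits, 75% off on hits
base = 3.00   # Sonnet 4.5 list input $/M from the report
eff = cached_input_price(base, hit_rate=0.6, discount=0.75)
print(f"effective input: ${eff:.2f}/M vs ${base:.2f}/M list")
```

At these assumptions the effective input rate drops to $1.65/M, which is where the 10-20% margin boost on agentic apps comes from: input is the bulk of agent traffic, so discounting it moves the blended cost substantially.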
Implied Demand from Margin Economics
High 60-80% margins signal strong demand for accessible AI: such margins only survive commoditization if token demand grows 5-10x YoY, fueled by the agentic/enterprise shift (e.g., Opus viable after its 67% price cut). Implied volume: $10B+ annual reseller spend at current pricing, assuming resellers capture 20% of spend vs. direct API use.[2][3]
- Margin sustainability: a 5x average multiple implies leading resellers each process 10-20B tokens/month profitably; aggregated across the market, this points to 100M+ end-users at ~1K tokens/day.[inferred from subscription pricing; medium confidence]
- Demand driver: Price drops (Opus 67% cheaper) unlock non-mission apps, est. 3x usage elasticity—e.g., Haiku at $3/M blended enables $0.01/query consumer tools.[3]
- Saturation risk: Margins >70% attract proxies (e.g., ScratchDB), but 80% stickiness from integrations implies $50B+ total addressable demand by 2027.[7]
What changed: 2026 cuts (e.g., Opus $15→$5 input) flipped premium from "opex killer" to "growth engine," implying reseller TAM doubles as devs build vs. buy.
To capture demand: Target verticals with sticky workflows (legal, code); margins imply room for 100+ viable resellers before consolidation.
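A per-user cost sanity check supports this demand reasoning (back-of-envelope; 1K tokens/day and the $3/M Haiku-class blended rate are the report's figures, the 30-day monthly conversion is mine):

```python
tokens_per_day = 1_000       # light consumer usage from the report
blended_rate = 3.0           # $/1M tokens, Haiku-class blended
monthly_tokens = tokens_per_day * 30
wholesale_per_user = monthly_tokens / 1e6 * blended_rate
print(f"wholesale cost/user/month: ${wholesale_per_user:.3f}")
# ~$0.09/user/month: even a $1/month consumer tier clears a 10x multiple
```

At nine cents of wholesale cost per light user per month, consumer subscriptions in the $10-20 range leave enormous headroom, consistent with the report's claim that cheap tiers unlock $0.01/query tools.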
Limitations and Competitive Moats
Public data lacks real reseller financials (e.g., no Jasper/LangChain P&Ls), so margins are modeled from wholesale + typical SaaS benchmarks—actuals likely 10-20% lower post-churn/refunds. Confidence high on costs (direct from providers), medium on end-user pricing (inferred).[1][2][3]
Moats beyond markup: providers push enterprise direct (ChatGPT Team at $25/user), eroding resellers; winners add data layers (analogous to Shopify underwriting merchants with their own sales data: resell tokens bundled with RAG over user history for ~2x retention).[1]
Research gaps: Need proprietary datasets on app token volumes/user; additional scraping of 10+ SaaS (e.g., Perplexity, Character.ai) would refine multiples ±15%.
Sources:
- [1] https://www.finout.io/blog/openai-pricing-in-2026
- [2] https://www.cloudidr.com/llm-pricing
- [3] https://www.metacto.com/blogs/anthropic-api-pricing-a-full-breakdown-of-costs-and-integration
- [4] https://hackceleration.com/anthropic-review/
- [5] https://www.lilbigthings.com/post/anthropic-vs-openai
- [6] https://docsbot.ai/tools/gpt-openai-api-pricing-calculator
- [7] https://scratchdb.com/compare/anthropic-claude-vs-openai-api/
- [8] https://www.anthropic.com/news/claude-new-constitution
- [9] https://www.youtube.com/watch?v=ME8_6c6eY4o
Recent Data Update (February 2026)
OpenAI API Pricing Cuts in 2026 Enable 5-10x Markup Potential for Resellers
OpenAI's 2026 pricing tables show aggressive reductions across the GPT-5 series, dropping input costs to $0.05-$15/M tokens (vs. prior GPT-4 levels around $2.50-$30) and letting resellers apply 5-10x markups on end-user apps while holding 80-90% gross margins on high-volume usage. This stems from tiered models like gpt-5-nano at $0.05 input/$0.40 output (the table's $0.005 figure is the cached-input rate), enabling cheap scaling for consumer apps that charge $0.50-$5/M effective tokens via subscriptions.[1]
- gpt-5.1: $1.25 input/$10 output/M (cached input $0.125); gpt-5-mini: $0.25/$2 (cached input $0.025); Batch API discounts up to 50% off listed rates.
- Enterprise add-ons like Web Search at $10/1K calls + tokens push wholesale costs higher but justify $50-100/M reseller pricing.
- Implied demand: Resellers targeting SMBs can hit 85% margins at 8x markup (e.g., $1 input → $8 user-facing), fueled by 98% historical price drops since 2023.[4]
For resellers: Target gpt-5-nano/mini for volume plays; margins compress on realtime/audio ($4-$40/M) where latency justifies 3-5x only—bundle with ChatGPT Team ($25/user/mo) for sticky revenue.
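Batch pricing compounds the markup headroom on the same $1-input SMB example (a sketch assuming batch-eligible traffic; the 50% discount is from the source table):

```python
list_input = 1.00                 # $/1M, the report's SMB example
user_price = 8.00                 # 8x markup on list price
batch_input = list_input * 0.5    # 50% Batch API discount

for cost, label in ((list_input, "list"), (batch_input, "batch")):
    margin = (user_price - cost) / user_price
    print(f"{label}: effective markup {user_price / cost:.0f}x, "
          f"gross margin {margin:.1%}")
```

Routing deferrable work through batch turns a nominal 8x into an effective 16x, lifting gross margin from 87.5% to 93.8% before overhead, which is why the quoted ~85% at 8x is conservative for batch-heavy workloads.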
Anthropic's Claude 4.5 Launch Delivers 67% Cost Drop, Boosting Reseller Margins to 75-85% at 4-6x Multiples
Anthropic released Claude 4.5 in late 2025 with Opus at $5/$25/M (down from $15/$75), Haiku at $1/$5, and Sonnet at $3/$15, introducing prompt caching and batch processing for up to 90% savings. Resellers can now underwrite apps at $20-100/M user rates, pocketing 75%+ margins via volume tiers that scale poorly for direct users but well for aggregated API proxies.[2]
- Long-context (>200K tokens) at premium $6/$22.50/M for Sonnet; "extended thinking" adds 16% but cuts iterations.
- Legacy Opus 4 at $15/$75 remains for comparison, highlighting 67% drop; Haiku 3.5 at $0.80/$4 as budget baseline.[2]
- Tiered plans criticized as "stupid" for API resellers—$20 gets 1x usage, $200 gets 20x, incentivizing gaming vs. OpenAI's linear scaling.[3]
For resellers: Exploit caching (90% off) for agent apps; avoid Opus tiers where margins dip below 70%—demand surges as 4.5 rivals GPT-5 at half cost, implying 25% market spend share.[6]
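The tier complaint reduces to a per-dollar usage asymmetry (a quick check using the cited 1x/$20 and 20x/$200 figures; plan labels are mine):

```python
# (price in $, usage multiple) for the two cited subscription tiers
plans = {"$20 tier": (20, 1), "$200 tier": (200, 20)}

for name, (price, usage) in plans.items():
    print(f"{name}: {usage / price:.3f} usage units per dollar")
# The $200 tier delivers 2x the usage per dollar, so aggregators are
# pushed toward gaming the top tier instead of scaling linearly.
```

Linear pricing would make usage-per-dollar constant across tiers; the 2x gap is exactly the incentive the cited critique calls out.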
Reseller Margin Economics: 80% Average Gross from 6x Markup on Discounted Wholesale
Updated 2026 comparisons across 60+ LLMs show GPT-4-class quality now at $0.75/M (98%+ below 2023's $60), with resellers standardizing 5-8x multiples on batch/cached rates, yielding $4-6/M user pricing off $0.50-1 wholesale, or 80-87% gross margins before 20% infra overhead. Demand implication: margins this high invite entrants, but sticky enterprise uptake (44.5% adoption) sustains $5B+ annual reseller volume.[4][6]
- OpenAI batch: 50% off (e.g., gpt-4o-mini $0.075 input → $0.375 user at 5x).[1]
- Anthropic optimization: 90% via caching → effective $0.30/M Sonnet, resell at $2/M for 85% margin.[2]
- Non-obvious: Realtime models (OpenAI $4-40/M, Anthropic unlisted) cap at 4x markup due to latency sensitivity.[1]
For resellers: Price linearly like OpenAI to avoid gaming; margins >80% viable only on nano/mini—entering now captures 50% projected 2026 price drops.[4]
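The Sonnet caching example above works out as a best case in which input traffic is fully cached (a sketch; function and variable names are mine):

```python
def resale_margin(effective_cost, resale_price):
    """Gross margin fraction at a given effective wholesale cost."""
    return (resale_price - effective_cost) / resale_price

sonnet_input = 3.00                # Sonnet list input $/M
cached = sonnet_input * 0.10       # 90% caching discount, fully cached reads
print(f"effective: ${cached:.2f}/M, margin at $2/M resale: "
      f"{resale_margin(cached, 2.0):.0%}")
```

This reproduces the $0.30/M effective cost and 85% margin at $2/M resale; real traffic with partial cache hits lands somewhere between this and the uncached 33% loss ($3 cost against $2 resale), so hit rate is the whole game.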
Emerging Model Races Signal Further Margin Compression by Mid-2026
Anthropic preps Sonnet 5 while OpenAI plans GPT-5.3 (post-gpt-5.1), per Feb 2, 2026 update—new releases could halve costs again, squeezing reseller multiples to 4x but exploding demand via accessible flagship perf, with implied $10B+ token throughput as adoption hits 45%.[7]
- Practical tables confirm Anthropic's docs-first pricing transparency aids reseller quoting.[9]
- No regulatory/policy shifts noted; Claude's "new constitution" focuses values, not economics.[8]
For resellers: Lock multi-year wholesale now pre-Sonnet 5; margins drop 20% post-launch—pivot to hybrid OpenAI/Anthropic for diversification.
Confidence: High on pricing (direct from 2026 breakdowns); medium on margins (inferred from tables, no explicit reseller studies); additional reseller financials (e.g., Vercel/LangChain 10-Ks) would refine estimates.
Sources:
- [1] https://www.finout.io/blog/openai-pricing-in-2026
- [2] https://www.metacto.com/blogs/anthropic-api-pricing-a-full-breakdown-of-costs-and-integration
- [3] https://solmaz.io/log/2026/01/10/anthropics-pricing-is-stupid/
- [4] https://www.cloudidr.com/blog/llm-pricing-comparison-2026
- [5] https://www.lilbigthings.com/post/anthropic-vs-openai
- [6] https://electroiq.com/stats/openai-vs-anthropic-statistics/
- [7] https://handyai.substack.com/p/anthropic-preps-sonnet-5-while-openai
- [8] https://www.anthropic.com/news/claude-new-constitution
- [9] https://dev.to/superorange0707/choosing-an-llm-in-2026-the-practical-comparison-table-specs-cost-latency-compatibility-354g
Additional Insights from Follow-up Questions
Data center construction is projected to surge in 2026-2027, driven by AI and hyperscale demand, with global spending potentially reaching $500-600 billion annually for hyperscalers alone and total U.S. construction hitting $86 billion in 2026. This fits into broader forecasts of $3 trillion in global investments by 2030, including nearly 100 GW of new capacity added from 2026 onward.[1][2][3][4][8]
Growth Projections
Spending and Capacity: Moody's estimates $3 trillion globally over five years (2026-2030) for data center expansion, with U.S. hyperscalers (six largest) planning $500 billion in capex for 2026, rising to $600 billion in 2027.[3][4] JLL projects ~100 GW new capacity online between 2026-2030 (doubling global total to ~200 GW by 2030 at 14% CAGR), equating to $1.2 trillion in real estate value plus $1-2 trillion for IT fit-outs.[2][7] U.S.-specific outlook shows $86 billion in construction spending in 2026, up 782% from 2022 levels.[8]
Construction Volume: AIA Consensus Forecast predicts 26% growth in data center construction in 2026 and 16% in 2027, accounting for 29.7 points of total nonresidential growth (vs. 8.1 points overall); this drives office sector gains when data centers are included.[1][6] Forecasts vary widely (15.5%-45.2% for 2026), reflecting scenarios from AI supercycles to supply constraints.[1]
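JLL's doubling claim is consistent with its stated growth rate (a quick compound-growth check using the report's ~100 GW base and 14% CAGR over 2026-2030):

```python
def project(base_gw, cagr, years):
    """Compound capacity growth: base * (1 + cagr) ** years."""
    return base_gw * (1 + cagr) ** years

# ~100 GW today, 14% CAGR through 2030
print(f"2030 capacity: {project(100, 0.14, 5):.0f} GW")  # ~193 GW, roughly doubled
```

Five years at 14% compounds to ~1.93x, so "doubling to ~200 GW" and the 14% CAGR are the same claim stated two ways.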
Cost and Scale Trends
| Metric | 2026 Projection | Notes |
| --- | --- | --- |
| Global construction cost per MW | $11.3 million (up 6% from 2025) | 7% CAGR since 2020; driven by labor shortages, materials, and scale (e.g., campuses needing 4,000-5,000 workers).[2][5][7] |
| Total investment needs | Up to $3 trillion by 2030 | Includes $870 billion debt financing; hyperscalers pre-lease most capacity.[2][4] |
| Electricity demand | 600 TWh globally | 14% rise from 525 TWh in 2025.[4] |
Challenges include power/grid constraints, regulatory opposition, skilled labor shortages (e.g., peak crews 4-5x historical sizes), and rising costs, yet demand remains strong with low vacancy risk from tech commitments like $500 billion U.S. buildouts.[1][3][4][5] Projections are consistent on scale but diverge on pace due to infrastructure limits.[1][2]
Sources:
- [1] https://inside.lighting/news/26-01/7-key-insights-2026-27-construction-forecast
- [2] https://www.jll.com/en-us/insights/market-outlook/data-center-outlook
- [3] https://www.constructiondive.com/news/data-centers-construction-2026-trends/810016/
- [4] https://www.datacenterknowledge.com/energy-power-supply/moody-s-3-trillion-data-center-investment-by-2030-amid-power-challenges
- [5] https://www.databank.com/resources/blogs/data-center-construction-predictions-for-2026/
- [6] https://www.aia.org/resource-center/january-2026-consensus-construction-forecast
- [7] https://www.datacenterdynamics.com/en/news/not-a-bubble-3-trillion-data-center-investment-supercycle-expected-by-2030-despite-challenges-jll/
- [8] https://mocasystems.com/wp-content/uploads/2025/10/MSIDataCenterReport_Final.pdf