Research all publicly available information about OpenAI's custom AI chip development program, including partnership with TSMC,…

OpenAI has developed a custom AI inference ASIC (initially codenamed Project Titan/XPU internally, publicly unveiled as “Jalapeño” on June 24, 2026) in partnership with Broadcom and TSMC. The stated goals are to reduce dependence on Nvidia GPUs, lower inference costs (targeting a ~90% reduction versus general-purpose GPUs), and support scaling of services like ChatGPT in gigawatt-scale data centers.[1]

This effort, accelerated by using OpenAI’s own models in the design process (achieving a nine-month schematic-to-tape-out cycle), positions the company to control more of its hardware stack for better unit economics on high-volume inference workloads.[1]

The chip is an application-specific integrated circuit (ASIC) optimized for large language model (LLM) serving rather than general-purpose compute.
It is co-developed with Broadcom (reported ~$10 billion partnership) and manufactured by TSMC.[2]
The hardware team, led by VP Richard Ho (formerly of Google's TPU program), grew to ~40 engineers.[3]

For competitors or new entrants, this demonstrates that frontier AI labs can rapidly prototype inference-optimized silicon when they combine deep model knowledge with experienced ASIC partners like Broadcom, bypassing the multi-year cycles typical of standalone chip design.

Partnership with TSMC and Manufacturing Details

OpenAI’s first-generation chip is fabricated on TSMC’s advanced 3nm (N3) process node, with a second generation planned for the more advanced A16 (1.6nm-class) node; TSMC handles production while Broadcom supports design, networking (e.g., Tomahawk silicon), and system aspects alongside partners like Celestica for integration.[3][4]

This leverages TSMC’s capacity already used for Nvidia and other AI accelerators, while OpenAI secures dedicated HBM4 memory supply (12-layer stacks) via an exclusive deal with Samsung representing a significant portion (~7%) of Samsung’s projected 2026 HBM output.[4]

Early reports (Feb 2025) confirmed TSMC 3nm fabrication with plans to tape out in 2025 for 2026 mass production.[3]
Mass production targeted for H2 2026, with initial deployments/rollouts by end of 2026 (December targeted in some reports).[4][5]
The design includes high-bandwidth memory (HBM) and extensive networking capabilities, using a systolic array architecture common in AI accelerators.[3]
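
To make the systolic array reference above concrete, the following is a minimal Python sketch of a generic output-stationary systolic array computing a matrix product. It is a textbook-style toy, not OpenAI's or Broadcom's design: the array dimensions, the dataflow, and the function name `systolic_matmul` are all assumptions chosen for illustration, since no die-level details have been published.

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary systolic array computing a @ b.

    Hypothetical toy model: processing element (i, j) holds one accumulator
    for output c[i][j]. Row i of `a` enters from the left delayed by i cycles,
    column j of `b` enters from the top delayed by j cycles, so at cycle t
    the element (i, j) sees the operand pair with index k = t - i - j.
    """
    n, k_dim, m = len(a), len(b), len(b[0])
    if any(len(row) != k_dim for row in a):
        raise ValueError("inner dimensions of a and b must match")

    acc = [[0] * m for _ in range(n)]
    # The last element (n-1, m-1) finishes its final multiply-accumulate
    # at cycle (n-1) + (m-1) + (k_dim-1), hence this many cycles in total.
    total_cycles = n + m + k_dim - 2

    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                k = t - i - j
                if 0 <= k < k_dim:  # operands have reached this element
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, total_cycles


if __name__ == "__main__":
    result, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(result)  # [[19, 22], [43, 50]]
    print(cycles)  # 4
```

The point of the structure is that each operand is passed between neighboring elements rather than re-read from memory for every multiply, which is why this layout is associated with reduced data movement in AI accelerators.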

This foundry relationship and memory supply agreement illustrate how AI companies are now competing directly for leading-edge capacity and specialized memory, creating supply-chain leverage that pure software players previously lacked; entrants must secure similar ecosystem partnerships early or face allocation risks.

The $500B Stargate Initiative

Stargate is a separate but complementary $500 billion AI infrastructure project, announced in January 2025 and led by OpenAI (operational responsibility), SoftBank (financing, with Masayoshi Son as chairman), Oracle, and MGX. It aims for 10 GW of U.S. data center capacity by ~2029 to power OpenAI’s compute needs, including deployment of custom chips like Jalapeño.[6][7]

It is not the chip program itself but the massive data center buildout (multiple sites including Abilene, TX flagship and others in TX, NM, OH, etc.) where these accelerators are expected to operate at scale alongside Nvidia and other hardware.[8]

An initial $100B deployment was announced, and by late 2025 progress toward the full $500B/10 GW commitment was reported as ahead of schedule (nearly 7 GW planned across sites).[7]
OpenAI continues partnerships with Microsoft Azure, Nvidia, AMD, etc., for training and mixed workloads while using custom silicon to optimize inference economics within Stargate-scale infrastructure.[1]

Stargate shows how custom chips fit into broader hyperscale infrastructure strategies; companies entering this space must align silicon roadmaps with multi-gigawatt data center timelines and power procurement, or risk mismatched capacity.

Chip Architecture, Specifications, and Timelines

Jalapeño/Titan uses a systolic array architecture with HBM4 memory and supports low-precision formats (FP4, INT8, BF16). The design emphasizes reduced data movement, tight networking integration, and performance-per-watt for inference, with early tests reportedly showing “substantially better” efficiency than state-of-the-art GPUs.[3][4]

Development was unusually fast due to hardware-software co-design using OpenAI models.[1]

Key specs (Gen 1): TSMC N3 (3nm); HBM4 (12-layer, Samsung); low-precision focus; target ~90% inference cost reduction vs. GPUs.[4]
Gen 2 (Titan 2): Planned TSMC A16; enhanced memory (HBM4E expected); further efficiency gains; 2027 target.[4]
Timeline milestones: Design work advanced by early 2025; tape-out targeted 2025; mass production H2 2026; initial data center rollout end-2026; testing of models like GPT-5.3-Codex-Spark reported by June 2026.[3][1]
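
The low-precision focus listed above matters because narrower number formats shrink the bytes moved per weight: INT8 takes 1 byte per value versus 2 bytes for BF16. As an illustration only, here is a minimal sketch of symmetric per-tensor INT8 quantization in Python. The scheme and the function names `quantize_int8` and `dequantize` are assumptions for this example; the sources do not describe which quantization methods the chip or OpenAI's serving stack actually uses.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative assumption).

    Maps each float x to an integer q in [-127, 127] such that
    x is approximately scale * q. Expects a non-empty list of floats.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid division by zero when every value is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from quantized integers."""
    return [scale * x for x in q]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Rounding to the nearest integer bounds the error at about half a step.
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(q, scale, worst)
```

The trade-off is a bounded rounding error (roughly half of `scale` per value) in exchange for halving memory traffic relative to BF16; FP4 would halve it again at a larger accuracy cost.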

The rapid iteration and model-assisted design process highlight a new competitive advantage for AI labs with proprietary models: they can optimize silicon specifically for their workloads faster than traditional semiconductor timelines allow.

Intended Use Cases, Strategic Implications, and Current Status (as of June 2026)

The chip is purpose-built primarily for inference (running trained LLMs for user queries, APIs, agents, etc.) rather than training. The aim is lower per-token cost to support broader deployment of capable models, with the chip initially playing a limited role alongside the Nvidia/AMD hardware that remains in use for training.[3][1]

It is intended for OpenAI’s internal operations (e.g., ChatGPT-scale serving) with potential external availability noted in announcements; early physical samples tested by mid-2026 with plans for end-of-year rollout.[1]

Power efficiency gains and cost targets address the high expense of inference for reasoning models, which has been a major driver of OpenAI’s compute spending.[4]
Part of a diversified hardware strategy (Nvidia for training, custom silicon for inference optimization) within Stargate and other infrastructure.[1]
Unveiled June 24, 2026, with Broadcom collaboration highlighted for LLM-optimized intelligence processing.[1]
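
The ~90% cost-reduction target cited in this report is easier to reason about as a multiplier: cutting cost per token by 90% is the same as serving roughly ten times as many tokens per dollar. The short Python sketch below works through that arithmetic. The baseline price of $1.00 per million tokens is a placeholder assumption, not a figure from any source, and the function names are hypothetical.

```python
def cost_after_reduction(baseline_cost, reduction):
    """Cost remaining after a fractional reduction (0.9 means 90% lower)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return baseline_cost * (1 - reduction)


def tokens_per_dollar_multiplier(reduction):
    """How many times more tokens one dollar buys after the reduction."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1 / (1 - reduction)


if __name__ == "__main__":
    baseline = 1.00  # placeholder: dollars per million tokens on GPUs
    target = 0.90    # the ~90% reduction target reported for the chip
    print(f"{cost_after_reduction(baseline, target):.2f}")   # 0.10
    print(f"{tokens_per_dollar_multiplier(target):.1f}x")    # 10.0x
```

Because the multiplier is 1 / (1 - reduction), small shortfalls against the target matter a lot: an 80% reduction yields about 5x tokens per dollar rather than 10x.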

For market participants, OpenAI’s move validates inference-specific ASICs as a viable path to sustainable economics at scale; new entrants should evaluate whether their workloads justify similar custom silicon investments or if they can leverage these efficiencies through partnerships or cloud offerings.

Public information is drawn from Reuters reporting (2025), VentureBeat and company announcements (June 2026), industry analyses (2026), and OpenAI/Stargate project updates (2025). No comprehensive public die-level specifications or exact performance benchmarks (e.g., tokens/sec or TOPS) have been disclosed beyond qualitative claims of superior efficiency. Further details may emerge with production ramps or additional announcements.

Recent Findings Supplement (June 2026)

OpenAI unveiled its first custom AI chip, "Jalapeño" (an LLM-optimized inference accelerator co-designed with Broadcom), on June 24, 2026. This is the most significant recent development.[1]

This marks a shift from earlier exploratory reports (pre-2026) to a concrete, named product with a rapid development timeline and clear inference focus. It builds on the 2025 Broadcom partnership and aligns with Stargate-scale infrastructure needs. No other major new announcements (e.g., training-focused chips, detailed node specs, or Stargate updates) appear in post-December 2025 sources.

Jalapeño is a purpose-built inference accelerator, not a general-purpose design or training chip. OpenAI led the architecture around LLM fundamentals (kernels, memory movement, networking, and serving patterns for frontier models like those powering ChatGPT and Codex), with Broadcom handling silicon implementation and networking (e.g., Tomahawk). Celestica contributes board/rack/system integration.[2]

It is explicitly optimized for inference workloads (answering user queries on models like GPT-5.3-Codex-Spark) rather than training; early lab samples run at target frequency/power and are designed for flexibility across current/future LLMs.[1]
Architecture emphasizes reduced data movement and balanced resources for utilization closer to theoretical peaks; early tests indicate substantially better performance-per-watt than current state-of-the-art (detailed report expected in coming months).[2]
Development cycle: ~9 months from initial design to TSMC tape-out (accelerated partly by OpenAI models aiding design/optimization)—claimed as one of the fastest high-performance ASIC cycles.[1]

Manufacturing ties directly to TSMC, with 2026 deployment targeted. The design was sent to TSMC for fabrication; earlier reports referenced TSMC N3 (3nm) for a "Titan" chip (likely the same program under an earlier name), with a second generation on A16 planned later.[3]

Deployment: Engineering samples active; production deployment planned by end of 2026 as the first in a multi-generation platform for gigawatt-scale data centers (with partners including Microsoft).[2]
Internal use only (not sold externally), complementing Nvidia/AMD GPUs to cut costs and scale inference.[1]

The $500B Stargate initiative provides the broader infrastructure context but shows no major post-2025 shifts. Announced in January 2025 (with SoftBank, Oracle, etc.), it targets 10 GW / $500B in U.S. AI data centers. By late 2025/early 2026 updates, it was on or ahead of schedule (~7 GW planned, >$400B committed), with sites expanding (e.g., Abilene, Texas flagship operational elements).[4]

Jalapeño and the Broadcom platform explicitly support gigawatt-scale rollout starting 2026, tying chip development to Stargate execution. No new regulatory, policy, or research publication updates on the program appear in recent sources.

Implications for competitors/entrants: This validates co-design models (hyperscaler + Broadcom-style partners) for fast inference ASICs and highlights software-hardware co-optimization (using AI models in chip design) as a differentiator. It intensifies pressure on Nvidia for inference efficiency while underscoring TSMC's central role. Detailed performance metrics and second-gen details (potentially A16) will clarify competitiveness. Stargate's scale suggests sustained demand for such custom silicon.
