OpenAI and Broadcom Unveil Jalapeño Custom AI Inference Chip
OpenAI and Broadcom have developed “Jalapeño,” a custom AI inference chip manufactured by TSMC. According to OpenAI, the chip aims to reduce the cost per inference token by roughly 50% compared to current hardware, with a target deployment by the end of 2026 to decrease the company’s reliance on Nvidia’s pricing and supply chain.
Why did OpenAI build the Jalapeño chip?
OpenAI created Jalapeño to slash its largest operating cost: the purchase and operation of Nvidia chips. While Nvidia provides the hardware necessary to train models, the process of serving responses to users—known as inference—now accounts for two-thirds of computing in AI data centers, according to company data.
OpenAI President Greg Brockman told CNBC that designing its own stack allows the company to serve more intelligence with greater efficiency. By moving away from general-purpose hardware, OpenAI intends to optimize specifically for the workloads of ChatGPT, Codex, and its upcoming agentic AI products.
How does Jalapeño compare to Nvidia and Google hardware?
Early lab testing indicates that Jalapeño delivers performance on par with Google’s Tensor Processing Units (TPUs) and Nvidia’s Blackwell processors. The primary difference is cost: OpenAI claims the chip operates at roughly 50% lower cost per inference token than competing hardware.

Unlike the custom silicon built by “hyperscalers” like Amazon (Trainium) or Google, which is designed to serve external cloud customers, Jalapeño is built exclusively for OpenAI’s own products. It is not a training chip and it isn’t general-purpose; it is engineered for one specific job at a massive scale.
| Feature | Nvidia Blackwell | OpenAI Jalapeño |
|---|---|---|
| Purpose | General-purpose AI (Training & Inference) | Dedicated Inference |
| Cost per Token | Baseline | ~50% Lower (Claimed) |
| Manufacturing | TSMC | TSMC (3nm node) |
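To put the claimed figure in context, here is a rough back-of-envelope sketch in Python. It assumes, purely for illustration, that inference's two-thirds share of computing translates directly into a two-thirds share of compute spending, and that training costs stay flat. Neither assumption comes from OpenAI, the 50% figure itself remains a claim, and the function name `blended_saving` is hypothetical.

```python
def blended_saving(inference_share: float, token_cost_cut: float) -> float:
    """Fraction of total compute spend saved when only inference gets cheaper.

    inference_share: fraction of compute spend that goes to inference (0..1).
    token_cost_cut:  fractional reduction in cost per inference token (0..1).
    Assumption: spend scales linearly with cost per token; training is unchanged.
    """
    if not (0.0 <= inference_share <= 1.0 and 0.0 <= token_cost_cut <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return inference_share * token_cost_cut


# Figures quoted in the article: inference is about two-thirds of computing,
# and the claimed cost-per-token reduction is about 50%.
saving = blended_saving(inference_share=2 / 3, token_cost_cut=0.5)
print(f"Overall compute spend would fall by about {saving:.0%}")
```

Under those assumptions, halving the cost per inference token would trim total compute spending by about one-third, not one-half, because the training share of the bill is untouched.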
What is the timeline for the new AI silicon?
The chip has been handed over to OpenAI, but it isn’t in users’ hands yet. Initial deployment is targeted for the end of 2026. Board and rack integration is currently being handled by Celestica.
While the hardware exists, the data supporting the 50% cost reduction hasn’t been released. OpenAI said a technical report with the underlying data is coming “in the months ahead.” Bloomberg reported on the chip’s potential to run models faster and more cheaply, but the performance and cost claims remain unverified by third-party benchmarks.
How will “gigawatt-scale” data centers change AI?
Broadcom CEO Hock Tan stated that the partnership with OpenAI enables the deployment of “gigawatt scale” data centers starting in 2026. These facilities, built with Microsoft and other partners, represent the largest infrastructure buildouts in the history of the industry.

This move signals a structural transition in the AI market. The industry is moving from a phase of building models to a phase of running them at an enormous, sustainable scale. For OpenAI, this is a strategic hedge against the supply chain volatility that has characterized the AI boom, such as the tight supply of AI memory chips seen throughout the year.
Frequently Asked Questions
Can Jalapeño be used to train new AI models?
No. According to OpenAI, the chip is designed specifically for inference workloads—serving responses to users—not for the initial training of models.
Who is manufacturing the chip?
The chip is manufactured by TSMC using a 3nm node, with design collaboration from Broadcom.
When will ChatGPT start running on Jalapeño?
OpenAI has targeted the end of 2026 for initial deployment.