OpenAI Ships Jalapeño, Its First Inference Chip, Built with Broadcom

OpenAI's first custom inference chip, named Jalapeño, promises significantly better performance per watt than any state-of-the-art alternative when running real-time coding models - exactly the workload that drives up inference costs.

Why Inference, Not Training, Gets the Custom Silicon

Jalapeño is an inference-only processor, designed and manufactured with Broadcom. That's a deliberate bet. Training still leans on Nvidia hardware, but inference - especially for agentic products like Codex - is where the cost curve kills margins. OpenAI president Greg Brockman explained the logic, saying "We have a deep understanding of the workload" and that the company is looking for "specific workloads that are underserved." Real-time code generation is one of those underserved workloads.

Small reductions in per-query inference cost compound fast when you're serving millions of developer sessions. OpenAI claims early benchmarks show Jalapeño beating current alternatives on efficiency. If those numbers hold in production, they would directly improve the company's bottom line while cutting its reliance on Nvidia's supply chain.

Jalapeño Completes the Full-Stack Play

This isn't just a chip. OpenAI's announcement makes the strategic frame explicit: "OpenAI is not only developing frontier models or building products on top of them; it is designing the infrastructure underneath them: chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience."

Custom silicon lets OpenAI optimize every layer toward one goal - making models faster, more reliable, and cheaper for users. Google and Amazon already did this with TPUs and Trainium. Now OpenAI has its own inference engine, and it uses its own AI models to help design the chip. That loop of model-informed chip design could produce advantages that third-party hardware can't replicate.

The Economics of Owning the Silicon

OpenAI is already building data centers and agentic products. Owning the inference chip closes the loop. Every inference query that runs on Jalapeño instead of a rented Nvidia GPU cuts the variable cost and locks in margin. The company didn't disclose exact wattage or price improvements, but even a hypothetical 15-20% gain in performance per watt at scale would make a dent in the billions spent on inference compute.
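
To put a 15-20% gain in perspective, here is a minimal sketch of the arithmetic in Python. It assumes energy per query scales inversely with performance per watt. The baseline energy bill is a made-up placeholder, not a figure from OpenAI, and the function names are illustrative only.

```python
def energy_saving_fraction(perf_per_watt_gain: float) -> float:
    """Fraction of energy saved per query for a given performance-per-watt gain.

    A gain of g means (1 + g) times as much work per joule, so the same
    query needs 1 / (1 + g) of the original energy.
    """
    if perf_per_watt_gain <= -1.0:
        raise ValueError("gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + perf_per_watt_gain)


def annual_energy_savings(baseline_energy_cost: float, perf_per_watt_gain: float) -> float:
    """Energy cost saved per year, in the same currency as the baseline."""
    return baseline_energy_cost * energy_saving_fraction(perf_per_watt_gain)


# Hypothetical baseline: a placeholder annual energy bill for inference.
# This number is an assumption for illustration, not a disclosed figure.
HYPOTHETICAL_BASELINE = 1_000_000_000.0

for gain in (0.15, 0.20):
    saved_fraction = energy_saving_fraction(gain)
    saved_cost = annual_energy_savings(HYPOTHETICAL_BASELINE, gain)
    print(f"gain {gain:.0%}: energy per query down {saved_fraction:.1%}, "
          f"saving {saved_cost:,.0f} per year on the placeholder baseline")
```

Under these assumptions, a 15% gain trims energy per query by about 13%, and a 20% gain by about 16.7%, slightly less than the headline percentage. Energy is also only one part of inference cost, so the effect on total spend would be smaller still.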

Expect future generations of Jalapeño to expand into training tasks - if the performance-per-watt gains hold up in production.

Source: OpenAI unveils its first custom chip, built by Broadcom
Domain: techcrunch.com
