A boom in AI demand and the accompanying shortage in memory supply are all anyone in the industry is talking about. At CES 2026 in Las Vegas, Nevada, they were also at the heart of Nvidia’s latest major product releases.
On Monday, the company officially launched the Rubin platform, made up of six chips that combine into one AI supercomputer, which company officials claim is more efficient than the Blackwell models and boasts increases in compute and memory bandwidth.
“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof,” Nvidia CEO Jensen Huang said in a press release.
Rubin-based products will be available from Nvidia partners in the second half of 2026, company executives said, naming AWS, Anthropic, Google, Meta, Microsoft, OpenAI, Oracle, and xAI among the companies expected to adopt Rubin.
“The efficiency gains in the NVIDIA Rubin platform represent the kind of infrastructure progress that enables longer memory, better reasoning, and more reliable outputs,” Anthropic CEO Dario Amodei said in the press release.
GPUs have become an expensive and scarce commodity as rapidly scaling data center projects drain the global memory chip supply. According to a recent report from Tom’s Hardware, gigantic data center projects consumed roughly 40% of global DRAM chip output. The shortage has reached the point where it is driving price hikes in consumer electronics and is rumored to be affecting GPU prices as well. According to a report from South Korean news agency Newsis, chipmaker AMD is expected to raise prices on some of its GPU offerings later this month, and Nvidia will allegedly follow suit in February.
Nvidia’s focus has been on working around this chip bottleneck. Just last month, the tech giant made its largest acquisition ever, buying Groq, a chipmaker that specializes in inference.
Now, with a product that promises high inference performance and the ability to train complex models with fewer chips at a lower cost, the company may be hoping to ease some of those shortage-driven worries in the industry. Company executives said Rubin delivers up to a tenfold reduction in inference token costs and a fourfold reduction in the number of GPUs needed to train models built on an AI architecture called mixture of experts (MoE), like DeepSeek.
On top of that, the company is also unveiling a new class of AI-native storage infrastructure designed specifically for inference, called the Inference Context Memory Storage Platform.
Agentic AI, the tech world’s hot new thing for the last year or so, has put an increased importance on AI memory. Rather than simply responding to single questions, AI systems are now expected to remember much more information about earlier interactions to autonomously carry out some tasks, which means there is more data to be managed during the inference stage.
The new platform aims to solve that by adding a new tier of memory for inference, storing some context data and extending the GPU’s memory capacity.
“The bottleneck is shifting from compute to context management,” said Dion Harris, Nvidia’s senior director of HPC and AI hyperscale infrastructure solutions. “To scale, storage can no longer be an afterthought.”
“As inference scales to giga-scale, context becomes a first-class data type, and the new Nvidia inference context memory storage platform is ideally positioned to support it,” Harris claimed.
Time will tell whether efficiency gains can address some of the bottlenecks brought about by the intense chip demand. But even if the memory problem is resolved, the AI industry will continue to face other constraints on its unprecedented growth, most notably the immense strain that data centers put on the U.S. power grid.
Original Source: https://gizmodo.com/nvidia-new-rubin-platform-shows-memory-is-no-longer-afterthought-in-ai-2000705639
Disclaimer: This article is a reblogged/syndicated piece from a third-party news source. Content is provided for informational purposes only. For the most up-to-date and complete information, please visit the original source. Digital Ground Media does not claim ownership of third-party content and is not responsible for its accuracy or completeness.
