This blog builds on the discussion of AI data center space requirements in our white paper on networking for AI data centers, providing additional information and pertinent figures on the growth in modern AI data center size.
Understanding the AI buildout
In 2020, global data creation and consumption was estimated to exceed 64 zettabytes (for context, a zettabyte equates to one billion terabytes). By 2025, factors such as the proliferation of internet-connected devices, the rise of cloud computing, and the surge in user-generated social media content are predicted to push data creation and consumption past 180 zettabytes. To keep pace with this evolving demand, hyperscale data centers must adapt.
The profound impact of AI buildouts translates into a shift from 5-10 kW server racks to new designs capable of handling 50-100 kW or more – some cutting-edge facilities now accommodate racks reaching a staggering 200 kW. Considering the average American home uses approximately 30 kWh per day, a single 50-100 kW cabinet running around the clock consumes as much daily energy as roughly 40 to 80 typical American homes. But which physical aspects of AI data centers create this growing need for more power? And how does the answer relate to an increase in AI data center size?
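A quick back-of-envelope check of that comparison, as a sketch using only the figures quoted above:

```python
# Back-of-envelope arithmetic comparing rack energy use to typical US homes,
# using only the figures cited in the text: 5-10 kW legacy racks, 50-100 kW
# (up to 200 kW) AI racks, and ~30 kWh per day for an average American home.

HOME_KWH_PER_DAY = 30  # average US household consumption, per the text

def homes_equivalent(rack_kw: float, hours: float = 24.0) -> float:
    """Express a rack's daily energy draw (kWh) as a count of average homes."""
    return rack_kw * hours / HOME_KWH_PER_DAY

for kw in (5, 10, 50, 100, 200):
    print(f"{kw:>3} kW rack ~ {homes_equivalent(kw):.0f} homes' daily energy")
```

Even the legacy 5-10 kW racks match several households; the newest AI racks draw more than a small neighborhood.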
Core revolution: The processing power shift
Traditional CPU-powered data centers provide the necessary computational heavy lifting for tasks including web hosting, virtualization (allowing multiple virtual machines to run on a single physical server), and database management. However, CPU-based servers prove insufficient for massively parallel workloads, such as the vast numbers of machine learning calculations performed during training and inference. Instead, AI data center operators turn to the more powerful GPU.
GPUs excel at parallel processing. GPU throughput is measured in FLOPS (floating-point operations per second) and varies between chips, but most modern GPUs reach teraflops (trillions of FLOPS), with some high-end chips reaching petaflops (quadrillions of FLOPS). As an example, the NVIDIA GeForce RTX 3090 performs at up to 35.6 teraflops, or 35.6 trillion floating-point calculations every second.
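To put those units in perspective, here is a short sketch converting the quoted RTX 3090 figure into plain numbers (the one-petaflop workload size is an arbitrary illustration, not a benchmark):

```python
# FLOPS scale conversions using the RTX 3090 figure quoted above.
TERA = 1e12   # a teraflop = one trillion floating-point operations
PETA = 1e15   # a petaflop = one quadrillion floating-point operations

rtx3090_flops_per_s = 35.6 * TERA  # peak throughput cited in the text

# Time for this GPU to work through one petaflop of operations (illustrative):
seconds = PETA / rtx3090_flops_per_s
print(f"1 petaflop of work at 35.6 TFLOPS takes ~{seconds:.1f} s")
```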
From CPU to GPU: Does faster mean bigger?
Both CPUs and GPUs contain individual processing units called ‘cores’, each capable of conducting independent calculations. GPUs comprise thousands of comparatively small cores, whereas CPUs contain a much smaller number – often up to 24 – of more powerful cores. While each CPU core is more powerful, the sheer number of available GPU cores translates into multitasking efficiency, ideal for managing the parallel processing tasks associated with modern AI data center workloads.
Each core is built from fundamental switching components called transistors. Because of the gulf in core count between CPUs and GPUs, the transistor count of a single GPU can dwarf that of a CPU, resulting in more components and a need for more physical space. For example, the Intel i7-9700K (CPU) contains roughly three billion transistors, whereas the NVIDIA Blackwell B100 accelerator (GPU) contains 208 billion transistors.
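The two transistor counts quoted above differ by roughly two orders of magnitude, as a one-line calculation shows:

```python
# Transistor-count comparison using the figures from the text.
cpu_transistors = 3e9    # Intel i7-9700K (CPU)
gpu_transistors = 208e9  # NVIDIA Blackwell B100 (GPU)

ratio = gpu_transistors / cpu_transistors
print(f"The B100 packs ~{ratio:.0f}x the transistors of the i7-9700K")
```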
More components not only require more space but also generate more heat, which in turn requires additional physical room to house specialized cooling equipment.
Power and cooling: Making room for the AI boom
The increased resourcing required for modern AI data center workloads demands greater consideration of power consumption and efficient cooling. The thousands of cores and billions of transistors making up advanced GPU chips place significant demands on power, with the 200 kW per rack mentioned above contrasting drastically with the 5-10 kW per rack typical of traditional CPU-powered hyperscale data centers. With increased power comes greater heat generation and the need for advanced cooling solutions.
Air cooling
Traditional air-cooling techniques rely on fans and heat sinks to disperse heat from data center components. Passive heat sinks increase the surface area around electrical components, enhancing heat transfer to the surrounding air, while heat pipes carry warmth away to cooler areas. Active cooling uses fans to direct airflow across the heat-generating components.
However, while traditional air cooling provides an adequate solution up to around 20 kW per rack, it proves insufficient for AI data center workloads. Instead, AI data center operators turn to liquid cooling techniques.
Liquid cooling
A recent study revealed that 38.6% of IT professionals expect to see a rise in liquid cooling techniques throughout data centers by 2026. Liquid cooling techniques include:
Direct-to-chip
Direct-to-chip technology delivers liquid coolant directly to the chip, providing sufficient cooling for large AI data centers with heat dispersal needs exceeding 100 kW per rack. By targeting heat-generating areas, this approach is significantly more efficient than traditional, less-focused air cooling.
Rear-door heat exchangers
Tailored for high-density data centers with rack densities ranging from 20 kW to 50 kW, rear-door heat exchangers use chilled water to remove heat from server exhaust air. Operators can retrofit existing racks with rear-door heat exchangers, avoiding a full revamp of the entire infrastructure. Depending on workloads, this approach may prove efficient in both AI data centers and older, traditional hyperscale facilities.
Immersion cooling
Immersion cooling techniques involve submerging servers in tanks filled with a biodegradable, non-toxic synthetic liquid. This method excels at heat extraction and can manage heat loads exceeding 250 kW. Despite its clear advantages for AI data centers, immersion cooling poses not only potential environmental risks but also compatibility issues between the cooling fluids and server components. As a result, immersion cooling is not yet widely adopted.
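The density thresholds quoted throughout this section can be collected into a small helper. This is an illustrative sketch only: the thresholds are the rough figures from the text, not hard engineering limits, and the text assigns no method to the 50-100 kW band.

```python
# Illustrative mapping from rack power density (kW) to the cooling approaches
# discussed above. Thresholds are the rough figures quoted in the text.

def cooling_options(rack_kw: float) -> list[str]:
    options = []
    if rack_kw <= 20:
        options.append("air cooling (fans + heat sinks)")
    if 20 <= rack_kw <= 50:
        options.append("rear-door heat exchanger")
    if rack_kw >= 100:
        options.append("direct-to-chip liquid cooling")
    if rack_kw >= 250:
        options.append("immersion cooling")
    return options  # note: the text leaves the 50-100 kW band unassigned

print(cooling_options(15))   # air cooling only
print(cooling_options(40))   # rear-door heat exchanger
print(cooling_options(150))  # direct-to-chip
```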
The demand for robust power and cooling infrastructure not only expands the physical footprint of AI data centers but leads to greater innovation in energy efficiency and thermal management. Efficient cooling methods will continue to advance alongside the growth in AI technologies.
Future AI data center size: Big will get bigger
The conversation around increasing AI data center size naturally turns to Stargate, OpenAI and Microsoft’s upcoming collaborative AI data center project. Set to launch in 2028, Stargate is planned to become the world’s largest AI supercomputing facility. It represents the fifth and final phase of an ambitious, multi-year plan, with the four-phase lead-up involving the creation of incrementally larger supporting supercomputers. Phase four, which begins next and is due for completion by 2026, involves building a supercomputer in Wisconsin, USA, to secure the chips needed for Stargate.
Although the exact location and dimensions of Stargate remain unknown, Microsoft’s recent purchase of 200 acres of land near its phase-four site fuels speculation that Stargate could span hundreds of acres and cost around $100 billion. This ambitious project sheds light on what to expect from future AI data center deployments, from the innovation and capex to the planning and physical space required to keep driving AI technology forward.