Data Center Cooling Solutions and Development Trends

With the explosive growth of artificial intelligence (AI) computing, the power density of AI chips – particularly GPUs and TPUs – is climbing at an astonishing rate, posing unprecedented challenges for thermal management technologies.

AI Chip Power Trends: Exponential Growth Trajectory

  • 2020-2023: NVIDIA A100 (400W) → H100 (700W)
  • 2024: The Blackwell-architecture B200 chip draws up to 1000W (single-chip power consumption); the GB200 superchip (2x B200 + Grace CPU) exceeds 2700W.
  • Future Forecast: By 2026-2027, per-chip power may exceed 1500-2000W, pushing rack power density toward 100-150kW per rack (the current limit for traditional air-cooled racks is roughly 20-30kW).
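
The rack-level arithmetic behind these figures can be sketched as follows. The superchip count per rack and the overhead factor are illustrative assumptions, not figures from this article; only the 2700W superchip power and the 20-30kW air-cooled limit come from the list above.

```python
# Rough rack power estimate.
# SUPERCHIPS_PER_RACK and OVERHEAD are illustrative assumptions;
# the 2700 W and 30 kW figures come from the article.
SUPERCHIP_POWER_W = 2700      # GB200 superchip (2x B200 + Grace CPU)
AIR_COOLED_LIMIT_KW = 30      # upper end of the 20-30 kW air-cooled range
SUPERCHIPS_PER_RACK = 36      # assumption for illustration only
OVERHEAD = 1.10               # assumed 10% for networking and power conversion


def rack_power_kw(n_superchips: int, superchip_w: float, overhead: float) -> float:
    """Total rack power in kW, including a flat overhead factor."""
    return n_superchips * superchip_w * overhead / 1000.0


rack_kw = rack_power_kw(SUPERCHIPS_PER_RACK, SUPERCHIP_POWER_W, OVERHEAD)
print(f"Estimated rack power: {rack_kw:.1f} kW")
print(f"Multiple of air-cooled limit: {rack_kw / AIR_COOLED_LIMIT_KW:.1f}x")
```

Under these assumptions a single rack lands at the low end of the 100-150kW forecast, several times what a traditional air-cooled rack can dissipate.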

Core Drivers of Soaring Power Density

  • Process Scaling Bottlenecks: While 3nm/2nm processes increase transistor density, leakage current and heat density rise sharply.
  • Chip Packaging Revolution: Advanced packaging and stacking technologies such as CoWoS (TSMC) and Foveros (Intel) concentrate heat in tiny areas, producing heat flux densities above 100W/cm² (versus only 10-30W/cm² for traditional CPUs).
  • Compute Arms Race: Large-model parameter counts grow roughly 10x annually (GPT-3 at 175B → GPT-4 reportedly at the trillion level), and training compute demand doubles every 6 months.
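
Two of the quantities above are simple ratios, sketched here as a minimal example. The die areas are hypothetical values chosen only to illustrate the calculation; they are not measurements of any real chip.

```python
def heat_flux_w_per_cm2(power_w: float, area_cm2: float) -> float:
    """Average heat flux: dissipated power divided by the heat-emitting area."""
    return power_w / area_cm2


def annual_growth_factor(doubling_months: float) -> float:
    """Growth over 12 months for a quantity that doubles every doubling_months."""
    return 2 ** (12 / doubling_months)


# Hypothetical areas, for illustration only.
accelerator_flux = heat_flux_w_per_cm2(power_w=1000, area_cm2=8.0)
cpu_flux = heat_flux_w_per_cm2(power_w=150, area_cm2=6.0)

print(f"Accelerator: {accelerator_flux:.0f} W/cm^2")
print(f"Traditional CPU: {cpu_flux:.0f} W/cm^2")
print(f"Compute demand growth per year: {annual_growth_factor(6):.0f}x")
```

A doubling period of 6 months compounds to 4x per year, which is why compute demand outpaces per-chip efficiency gains so quickly.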

Air cooling has reached its physical limits at both the chip and rack levels, because air has a low specific heat capacity and low thermal conductivity.
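
The effect of heat capacity can be made concrete with the sensible-heat relation Q = ρ · V · c_p · ΔT, solved for the volumetric flow V. The sketch below uses approximate room-temperature properties for air and water; the 100kW load and the 10 K coolant temperature rise are assumptions chosen for illustration.

```python
# Approximate fluid properties near room temperature.
AIR = {"rho": 1.2, "cp": 1005.0}       # density kg/m^3, specific heat J/(kg*K)
WATER = {"rho": 997.0, "cp": 4180.0}


def volumetric_flow_m3_s(heat_w: float, rho: float, cp: float, delta_t_k: float) -> float:
    """Flow needed to carry heat_w with a coolant temperature rise of delta_t_k."""
    return heat_w / (rho * cp * delta_t_k)


HEAT_W = 100_000.0   # assumed 100 kW rack
DELTA_T_K = 10.0     # assumed coolant temperature rise

air_flow = volumetric_flow_m3_s(HEAT_W, AIR["rho"], AIR["cp"], DELTA_T_K)
water_flow = volumetric_flow_m3_s(HEAT_W, WATER["rho"], WATER["cp"], DELTA_T_K)

print(f"Air:   {air_flow:.2f} m^3/s")
print(f"Water: {water_flow * 1000:.2f} L/s")
print(f"Air needs about {air_flow / water_flow:.0f}x the volume flow of water")
```

With these values, air must move several cubic metres per second through one rack, while water needs only a few litres per second for the same load.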

Liquid Cooling Emerges as the Only Viable Path

Liquid Cooling Technology Classification

  • Direct: Single-phase immersion, single-phase sealed, two-phase immersion, single-phase spray
  • Indirect: Cold plate

  1. Cold Plate Liquid Cooling (Mainstream Transition Solution)
    • Principle: Metal cold plates attach directly to GPU/CPU chips; coolant flows through microchannels to remove heat.
    • Capability: Supports single chips of 1000-1500W and heat flux densities of 500W/cm².
    • Representative Cases: NVIDIA DGX H100 series fully adopts cold plate cooling; Meta AI data centers deploy at scale.
  2. Immersion Cooling (Ultimate Solution)
    • Single-Phase Immersion: Servers submerged in dielectric fluorinated fluid; fluid temperature can reach 60°C (PUE≈1.05).
    • Two-Phase Immersion: The coolant absorbs heat and boils, drawing on the latent heat of phase change; the vapor rises to condensers, re-liquefies, and returns to the tank. Thermal efficiency is said to increase 5-10x.
    • Capability: Supports single racks of 200kW+ and heat flux densities of 1000W/cm², making it suitable for Blackwell-class chips.
    • Commercialization Progress: Google TPU v5 uses two-phase immersion; Microsoft Azure deploys single-phase immersion clusters.
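
The PUE figure quoted above can be read through the standard definition, PUE = total facility power / IT power. The following sketch shows what a PUE of 1.05 means in overhead terms; the 1MW IT load and the comparison PUE of 1.5 for an air-cooled facility are assumptions for illustration, not figures from this article.

```python
def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE value."""
    return it_power_kw * pue


def overhead_kw(it_power_kw: float, pue: float) -> float:
    """Power spent on everything other than IT equipment (cooling, losses)."""
    return it_power_kw * (pue - 1.0)


IT_LOAD_KW = 1000.0        # assumed 1 MW of IT equipment
PUE_IMMERSION = 1.05       # figure quoted in the article
PUE_AIR_ASSUMED = 1.5      # assumption for comparison only

print(f"Immersion overhead:  {overhead_kw(IT_LOAD_KW, PUE_IMMERSION):.0f} kW")
print(f"Air-cooled overhead: {overhead_kw(IT_LOAD_KW, PUE_AIR_ASSUMED):.0f} kW")
```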

Operational Details:

  • Cold Plate Cooling: Installs metal plates on heat-generating components like chips; circulating coolant directly removes heat.
  • Immersion Cooling: Servers are fully submerged in tanks filled with liquid that absorbs all generated heat.
    • Single-Phase: Heat transfers from hardware to liquid via convection; pumped liquid flows through servers, is cooled externally, and returns.
    • Two-Phase: Uses low-boiling-point liquids (30-50°C) which vaporize upon absorbing heat; vapor travels to condenser coils, re-condenses, and returns to the tank.
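
The difference between the two immersion modes comes down to sensible versus latent heat. A minimal sketch follows, comparing the coolant mass flow needed per chip in each mode. All fluid properties here are assumed placeholder values, not data for any specific coolant; real fluids should be taken from the supplier's datasheet.

```python
# Sensible (single-phase) versus latent (two-phase) heat removal.
# All fluid properties below are assumed placeholders.
CHIP_POWER_W = 1000.0        # B200-class chip, per the article
CP_J_PER_KG_K = 1100.0       # assumed specific heat of a dielectric fluid
DELTA_T_K = 10.0             # assumed single-phase temperature rise
H_FG_J_PER_KG = 100_000.0    # assumed latent heat of vaporization


def single_phase_flow_kg_s(power_w: float, cp: float, delta_t_k: float) -> float:
    """Mass flow that must pass the chip when heat only warms the liquid."""
    return power_w / (cp * delta_t_k)


def two_phase_flow_kg_s(power_w: float, h_fg: float) -> float:
    """Mass of liquid vaporized per second when heat boils the liquid."""
    return power_w / h_fg


single = single_phase_flow_kg_s(CHIP_POWER_W, CP_J_PER_KG_K, DELTA_T_K)
two = two_phase_flow_kg_s(CHIP_POWER_W, H_FG_J_PER_KG)

print(f"Single-phase: {single:.4f} kg/s")
print(f"Two-phase:    {two:.4f} kg/s")
print(f"Ratio: {single / two:.1f}x")
```

The ratio depends entirely on the assumed properties and temperature rise, so it should be read as an illustration of the mechanism rather than a performance claim.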

Environmental Impact Data (vs. Traditional Air Cooling):

  • Cold Plate: Reduces greenhouse gas (GHG) emissions by 15-16%, energy consumption by 15%, and water usage by 31-50%.
  • Single-Phase Immersion: Reduces GHG emissions by 13-16%, energy consumption by 15%, and water usage by 45-80%.
  • Two-Phase Immersion: Reduces GHG emissions by 20-21%, energy consumption by 20%, and water usage by 48-82%.
  • Note: When liquid cooling systems use 100% renewable energy, water savings increase by an additional 13%-48% due to the lower water footprint of clean energy sources.
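
To apply these percentages to a specific site, the ranges can be held in a small lookup table. In the sketch below the reduction figures are the ones listed above, while the baseline water consumption is a hypothetical number used only to show the calculation.

```python
# Reduction ranges (low, high) versus traditional air cooling, from the article.
REDUCTIONS = {
    "cold_plate": {"ghg": (0.15, 0.16), "energy": (0.15, 0.15), "water": (0.31, 0.50)},
    "single_phase_immersion": {"ghg": (0.13, 0.16), "energy": (0.15, 0.15), "water": (0.45, 0.80)},
    "two_phase_immersion": {"ghg": (0.20, 0.21), "energy": (0.20, 0.20), "water": (0.48, 0.82)},
}


def savings_range(baseline: float, technology: str, metric: str) -> tuple:
    """Return (low, high) absolute savings for a baseline in any unit."""
    low, high = REDUCTIONS[technology][metric]
    return baseline * low, baseline * high


BASELINE_WATER_M3 = 1000.0  # hypothetical annual water use of an air-cooled site

for tech in REDUCTIONS:
    low, high = savings_range(BASELINE_WATER_M3, tech, "water")
    print(f"{tech}: saves {low:.0f}-{high:.0f} m^3 of water per year")
```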

Technology Trade-offs:
Each cooling technology has distinct characteristics.

  • Cold Plate Systems: Offer the easiest retrofit for existing facilities and provide strong chip-level cooling without overhauling the entire infrastructure. However, “air handling systems remain significant power consumers,” researchers emphasize, “and cold plate copper components must be replaced with each IT refresh cycle.”
  • Single-Phase Immersion: Balances environmental benefits against system complexity and costs less than two-phase immersion. It requires flammable hydrocarbon oils, which may necessitate new safety regulations.
  • Two-Phase Immersion: Delivers optimal energy and water efficiency but relies on synthetic refrigerants containing Per- and Polyfluoroalkyl Substances (PFAS) – persistent “forever chemicals” under increasing regulatory scrutiny (EU REACH, US EPA) due to health risks.

Deployment Complexity:

  • Two-phase immersion best suits new, high-density builds.
  • Cold plate is the preferred choice for retrofits.
  • Single-phase immersion fits new, medium-density data centers.
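
The three rules above amount to a small decision table, sketched here. The function name and the density labels ("medium", "high") are hypothetical; the article does not define numeric thresholds for them, and it gives no recommendation for cases outside these three.

```python
from typing import Optional


def suggest_cooling(is_retrofit: bool, density: str) -> Optional[str]:
    """Map a build type and density label to the article's suggested technology.

    Returns None when the article gives no recommendation for the combination.
    """
    if is_retrofit:
        return "cold plate"
    if density == "high":
        return "two-phase immersion"
    if density == "medium":
        return "single-phase immersion"
    return None


print(suggest_cooling(is_retrofit=True, density="high"))
print(suggest_cooling(is_retrofit=False, density="high"))
print(suggest_cooling(is_retrofit=False, density="medium"))
```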

Coolant Fluids & Material Compatibility Requirements:

Currently the liquid cooling solutions used in supercomputer centers include the following types of coolant:

  • Fluorine-containing coolants: Perfluoropolyether, Trimer
  • Non-fluorine-containing coolants: Mineral oil, synthetic hydrocarbon oil, modified silicone oil, ester oil
  • Single-phase: Ethylene glycol, propylene glycol
  • Two-phase: Fluorinated fluid, water


Early engagement with coolant fluid suppliers for HPC/AI center builds is critical. Key considerations include:

  1. Chemical Composition, Disposal Plans & Compliance Risks: Full lifecycle assessment is essential.
  2. Socio-Economic, Community & Business Impact: Evaluate broader implications.
  3. Environmental Metrics:
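    • Ozone Depletion Potential (ODP) and Global Warming Potential (GWP): Prioritize fluids with near-zero ODP and low GWP. HFCs, for example, have zero ODP but high GWP, so ODP alone is not a sufficient screen.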
    • Fluid Properties: Rigorously assess viscosity, flammability, volatility.
    • Toxicity & Persistence: Prioritize fluids with low bioaccumulation potential and low terrestrial/aquatic toxicity.
  4. Material Compatibility (Critical for Reliability):
    • Seal Integrity: Coolants must be chemically compatible with rubber seals and gaskets to prevent degradation, swelling, shrinkage, or embrittlement over the system’s operational lifetime and temperature range (potentially up to 60-80°C outlet temperatures).
    • Key Requirements for Sealing Materials:
      • Chemical Resistance: Must withstand prolonged exposure to specific coolant chemistry (e.g., fluorinated fluids, dielectric oils, hydrocarbons, PFAS-based fluids) without significant degradation.
      • Thermal Stability: Maintain elasticity and sealing force across the full operating temperature range (e.g., -40°C to 120°C+).
      • Low Permeability: Minimize fluid permeation/leakage through the seal.
      • Long-Term Durability: Resist compression set and maintain sealing performance for years under pressure and thermal cycling.
    • OBT Rubber Seal possesses extensive experience in developing and supplying advanced elastomer sealing materials (e.g., specialized Fluoroelastomers (FKM), Ethylene Propylene Diene Monomer (EPDM)) engineered for demanding compatibility with diverse cooling fluids used in high-performance liquid and immersion cooling systems, ensuring long-term reliability.
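
The sealing requirements listed above can be turned into a simple screening check, sketched below. The class, field names, and the swell and compression-set limits are hypothetical placeholders; only the -40°C to 120°C operating range comes from this article. Actual acceptance limits should come from soak and compression-set testing in the specific coolant.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SealCandidate:
    name: str
    min_temp_c: float            # lowest rated service temperature
    max_temp_c: float            # highest rated service temperature
    volume_swell_pct: float      # measured after soaking in the target coolant
    compression_set_pct: float   # measured after aging under compression


def screen(candidate: SealCandidate,
           op_min_c: float = -40.0,          # range from the article
           op_max_c: float = 120.0,
           max_swell_pct: float = 10.0,      # placeholder limit
           max_compression_set_pct: float = 25.0) -> List[str]:  # placeholder limit
    """Return the list of failed requirements; an empty list means it passes."""
    failures = []
    if candidate.min_temp_c > op_min_c or candidate.max_temp_c < op_max_c:
        failures.append("thermal stability")
    if abs(candidate.volume_swell_pct) > max_swell_pct:
        failures.append("chemical resistance (swell/shrinkage)")
    if candidate.compression_set_pct > max_compression_set_pct:
        failures.append("long-term durability (compression set)")
    return failures


# Hypothetical test results for two unnamed compounds.
sample_a = SealCandidate("compound A", -45.0, 150.0, 3.0, 15.0)
sample_b = SealCandidate("compound B", -20.0, 130.0, 18.0, 15.0)
print(sample_a.name, screen(sample_a))
print(sample_b.name, screen(sample_b))
```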


Suzhou Obtiv Technology Co.,LTD

No.211 Zhujiang Road, Suzhou City, China
