Direct-to-Chip Liquid Cooling for the AI Data Center
Demand for accelerated computing, AI, ML, and high-performance computing (HPC) is skyrocketing, driving the need for liquid cooling. Customers now face important decisions about new cooling strategies that enable higher rack power densities, with hyperscale data center operators considering racks of 100 kW or more. Liquid cooling helps solve the performance, efficiency, and space-utilization challenges associated with air cooling. Adopting it involves several key tradeoffs (overall performance, capex, opex, floor space, and water consumption) between single-phase and two-phase technologies, and between direct-to-chip and immersion cooling. Because accelerated computing is a spectrum that requires different hardware depending on the use case (training vs. inference), a cooling strategy that can accommodate these differences is critical. Safety and sustainability, particularly for new fluids, must also be considered when developing a liquid cooling strategy.
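To give a feel for why 100 kW racks push operators toward liquid, the following is a minimal sketch that applies the standard sensible-heat relation (heat = mass flow × specific heat × temperature rise) to water and to air. The fluid properties, the 10 K coolant temperature rise, and the function name `volumetric_flow_m3_per_s` are illustrative assumptions, not figures from this article, and real designs would also account for pumps, fans, cold-plate resistance, and fluid choice.

```python
# Back-of-the-envelope comparison of the coolant flow needed to remove
# rack heat with water versus air, using Q = m_dot * c_p * delta_T.

# Approximate properties near room temperature (assumed values).
WATER = {"cp": 4186.0, "rho": 997.0}  # J/(kg*K), kg/m^3
AIR = {"cp": 1005.0, "rho": 1.2}      # J/(kg*K), kg/m^3


def volumetric_flow_m3_per_s(heat_w, delta_t_k, fluid):
    """Return the volumetric flow (m^3/s) that carries heat_w watts
    when the fluid warms by delta_t_k kelvin."""
    if heat_w < 0 or delta_t_k <= 0:
        raise ValueError("heat_w must be >= 0 and delta_t_k must be > 0")
    mass_flow_kg_per_s = heat_w / (fluid["cp"] * delta_t_k)
    return mass_flow_kg_per_s / fluid["rho"]


if __name__ == "__main__":
    rack_heat_w = 100_000.0  # the 100 kW rack mentioned in the text
    delta_t_k = 10.0         # assumed coolant temperature rise

    water = volumetric_flow_m3_per_s(rack_heat_w, delta_t_k, WATER)
    air = volumetric_flow_m3_per_s(rack_heat_w, delta_t_k, AIR)

    print(f"water: {water * 1000:.2f} L/s")
    print(f"air:   {air:.2f} m^3/s")
    print(f"air needs about {air / water:.0f}x the volume of water")
```

Under these assumptions the rack needs roughly 2.4 L/s of water but about 8.3 m³/s of air, a difference of more than three orders of magnitude in volume. That gap is one way to read the space-utilization and efficiency challenges the paragraph attributes to air cooling.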
What Is a Data Center?
A data center is a physical facility that houses high-performance servers, storage systems, networking equipment, and other infrastructure. Organizations use data centers to store, manage, and distribute data, and to support large-scale applications as well as cloud computing, colocation, content delivery, and more. Modern data centers use virtualization, automation, artificial intelligence (AI)/machine learning (ML), and other technologies to optimize availability, scalability, security, and efficiency.
What Are the Core Components of a Data Center?
The core components of a data center include:
Servers, which are the primary computing devices that process and manage data;
Storage devices, which are used to house large volumes of data;
Networking equipment, such as routers, switches, and other components that connect the various devices within the data center and enable them to communicate with each other; and
Cooling systems, including air conditioning, ventilation, and in some cases liquid cooling, that maintain optimal temperature and humidity levels within the facility to prevent overheating and equipment failure.
In addition to these components, data centers also require backup power sources, such as generators or uninterruptible power supplies (UPS), to ensure that operations continue without disruption in the event of a power outage.