Optimizing Data Centers

The data centers that run AI systems consume an enormous amount of energy. Learn about the new hardware that could drastically improve performance and efficiency.

By Keren Bergman


Data centers use an enormous amount of energy — and the demand is growing exponentially. From 2022 to 2026, the amount of energy consumed by U.S. data centers is projected to more than double to 1,000 terawatt-hours per year.

To take one example, the data centers used to train GPT-4 consumed more energy over that four-month training run than New York City uses in the heat of the summer. While data centers do everything from hosting streaming video to managing air traffic, AI and machine learning models are now the main drivers of their energy consumption.

My colleagues and I have grown increasingly concerned about these trends. With exponentially larger models consuming exponentially larger amounts of energy, data centers are on track to consume more than 10% of the global energy supply within the next 10 years.
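To see how quickly that compounding adds up, here is a minimal back-of-envelope sketch in Python. The starting share of global electricity and the four-year doubling time are illustrative assumptions for this sketch, not figures reported in this article.

    # Back-of-envelope: what a roughly four-year doubling time implies over a decade.
    # Both inputs are illustrative assumptions, not measurements.
    start_share = 0.02          # assumed: data centers' current share of global electricity
    doubling_time_years = 4.0   # assumed: consumption doubles about every four years
    years = 10

    annual_growth = 2 ** (1 / doubling_time_years)        # ~1.19x per year (~19% growth)
    share_after = start_share * annual_growth ** years    # ~0.11, i.e. more than 10%

    print(f"Implied annual growth: {annual_growth - 1:.0%}")
    print(f"Projected share after {years} years: {share_after:.0%}")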

Image: The total energy used to power U.S. data centers is growing exponentially. Source: McKinsey

With today’s state-of-the-art technology, energy use and model size increase in lockstep. We desperately need more efficient data centers to bend the curve of energy consumption so that AI applications can continue to grow without consuming an undue share of the global energy supply.

The inefficiency of today's data centers

Data centers contain three essential components: computation, memory, and communication among those subunits. Training very large models requires connecting as many as 10,000 computing elements in complex networks. 

In today’s data centers, most of that data travels as electrons over wires, and simply moving it is expensive in terms of energy. Even worse, the physics of electronic communication sharply limits the distance those high-data-rate signals can travel, so computation and memory have to be packaged very closely together on a common interposer substrate and in sockets. This causes a couple of problems. For one, we have to connect many of these sockets in such a way that all components are powered on all of the time. And because many of the sockets sit relatively far from one another, the communication channels connecting them have low bandwidth.

The result is that the elements inside the sockets operate inefficiently: imagine thousands of powerful computers and memory units wasting energy as they sit idle, waiting for data to trickle through the narrow straws that connect them.

Even when conventional fiber-optic cables are used to connect the sockets, it’s hugely inefficient for data to travel the relatively short distance from the chip to the connection point at the socket’s edge.

The promise of integrated photonics

We can address these inefficiencies using integrated photonic technologies to transfer data directly from one chip to another. This approach increases bandwidth by more than 100 times, effectively bringing the type of bandwidth currently available within a single chip to an entire data center. 

With modern photonics, we can continue to grow AI performance by orders of magnitude while keeping energy consumption essentially flat. 
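One way to see why bandwidth can grow while power stays flat: the power a link draws is roughly its bandwidth multiplied by the energy it spends per bit. The sketch below works through that relationship. The 10 pJ/bit electrical figure, the 0.1 pJ/bit photonic figure, and the per-socket bandwidth are rough assumptions chosen for illustration, not measurements from this article or any specific product.

    # Link power ~= bandwidth * energy per bit.
    # All numbers below are rough, illustrative assumptions.
    electrical_pj_per_bit = 10.0   # assumed energy cost of an off-package electrical link
    photonic_pj_per_bit = 0.1      # assumed target for a chip-to-chip photonic link

    electrical_bandwidth_tbps = 5.0  # assumed per-socket I/O bandwidth today
    power_budget_w = electrical_bandwidth_tbps * 1e12 * electrical_pj_per_bit * 1e-12
    # -> 50 W per socket spent just on moving data electrically

    # Holding that same power budget, how much bandwidth could a photonic link carry?
    photonic_bandwidth_tbps = power_budget_w / (photonic_pj_per_bit * 1e-12) / 1e12

    print(f"Electrical: {electrical_bandwidth_tbps:.0f} Tb/s at {power_budget_w:.0f} W")
    print(f"Photonic:   {photonic_bandwidth_tbps:.0f} Tb/s at the same {power_budget_w:.0f} W")
    # Under these assumptions, roughly 100x more bandwidth for the same power.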

The reason for the higher bandwidth lies in the fundamental properties of light. With photonics, it’s possible to send multiple streams of data at the same time using different colors of light. Those signals don’t interfere with each other because photons, unlike electrons, don’t interact with one another, so the channels don’t have to be physically separated. The result is almost unlimited bandwidth density.
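As a concrete illustration of that color multiplexing, the sketch below adds up the capacity of a single fiber when several wavelengths each carry their own data stream. The channel count and per-channel data rate are illustrative assumptions, not a specification of any particular link.

    # Wavelength-division multiplexing: each color (wavelength) carries its own stream,
    # and the streams share one fiber without interfering.
    # Illustrative assumptions, not a product specification:
    num_wavelengths = 64          # assumed number of colors multiplexed on one fiber
    gbps_per_wavelength = 100     # assumed data rate carried by each color

    fiber_capacity_tbps = num_wavelengths * gbps_per_wavelength / 1000
    print(f"One fiber: {num_wavelengths} colors x {gbps_per_wavelength} Gb/s "
          f"= {fiber_capacity_tbps:.1f} Tb/s")
    # Adding more colors or more fibers scales this capacity further, which is why
    # photonic links can offer such high bandwidth density.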

Bringing modern photonics to real-world data centers

Developing this technology is one thing — seeing it implemented in data centers is another. Two years ago, Columbia Engineering colleagues Michal Lipson, Alex Gaeta, and I co-founded the startup Xscape Photonics. Our goal is to commercialize these ideas by bringing photonics inside the AI compute socket. If successful, this innovation will revolutionize data centers by providing almost unlimited communication bandwidth, opening the door for scaling while making the entire system significantly more energy efficient.

I also direct the Center for Ubiquitous Connectivity, which we call CUbiC. This systems-level research center unites the efforts of 24 PIs, more than 114 PhD students, and 17 undergraduates across 15 universities. Our outstanding team of researchers strives to close the computation-communication gap, delivering seamless edge-to-cloud connectivity with transformative reductions in global system energy consumption.

The rate of growth in energy consumption by data centers is frightening. With the size of models and datasets growing exponentially and no end in sight, it is essential that we develop technologies that decouple growth in computation from growth in energy use. Luckily, technologies to flatten the curve of data center energy consumption are on the horizon. As these technologies come to fruition, we must ensure that they are implemented as quickly and widely as possible.

Keren Bergman
Charles Batchelor Professor of Electrical Engineering; Scientific Director, Center for Integrated Science and Engineering