Google is using machine learning and artificial intelligence to wring even more efficiency out of its mighty data centers.
In a presentation today at Data Centers Europe 2014, Google’s Joe
Kava said the company has begun using a neural network to analyze the
oceans of data it collects about its server farms and to recommend ways
to improve them. Kava is the Internet giant’s vice president of data
centers.
In effect, Google has built a computer that knows more about its data
centers than even the company’s engineers. The humans remain in charge,
but Kava said the use of neural networks will allow Google to reach new
frontiers in efficiency in its server farms, moving beyond what its
engineers can see and analyze.
Google already operates some of the most efficient data centers on
earth. Using artificial intelligence will allow Google to peer into the
future and model how its data centers will perform in thousands of
scenarios.
In early usage, the neural network has been able to predict Google’s
Power Usage Effectiveness (PUE) with 99.6 percent accuracy. Its
recommendations have led to efficiency gains that appear small, but can
lead to major cost savings when applied across a data center housing
tens of thousands of servers.
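To see why small gains matter, recall that PUE is the ratio of a facility’s total energy use to the energy consumed by its IT equipment, so a PUE of 1.0 would mean no overhead at all. The sketch below works through that arithmetic with hypothetical figures (a 10-megawatt IT load and a PUE drop from 1.12 to 1.10); none of these numbers come from Google, and the function names are illustrative.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def annual_overhead_kwh(it_load_kw: float, pue_value: float,
                        hours: float = 8760.0) -> float:
    """Energy spent in a year on everything except the IT gear itself."""
    return it_load_kw * (pue_value - 1.0) * hours


# Hypothetical figures for illustration only; none come from Google.
IT_LOAD_KW = 10_000.0   # assumed constant IT load of 10 MW
PUE_BEFORE = 1.12       # assumed starting PUE
PUE_AFTER = 1.10        # assumed PUE after tuning

saved_kwh = (annual_overhead_kwh(IT_LOAD_KW, PUE_BEFORE)
             - annual_overhead_kwh(IT_LOAD_KW, PUE_AFTER))
print(f"Overhead energy avoided per year: {saved_kwh:,.0f} kWh")
```

Under those assumed numbers, a 0.02 improvement in PUE trims about 1.75 million kilowatt-hours of overhead energy per year, which is how a gain that looks small on paper can turn into a meaningful cost saving.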
Why turn to machine learning and neural networks? The primary reason
is the growing complexity of data centers, a challenge for Google, which
uses sensors to collect hundreds of millions of data points about its
infrastructure and its energy use.
“In a dynamic environment like a data center, it can be difficult for
humans to see how all of the variables interact with each other,” said
Kava. “We’ve been at this (data center optimization) for a long time.
All of the obvious best practices have already been implemented, and you
really have to look beyond that.”
Enter Google’s ‘Boy Genius’
Google’s neural network was created by Jim Gao, an engineer whose
colleagues have given him the nickname “Boy Genius” for his prowess
analyzing large datasets. Gao had been doing cooling analysis using
computational fluid dynamics, which uses monitoring data to create a 3D
model of airflow within a server room.
Gao thought it was possible to create a model that tracks a broader
set of variables, including IT load, weather conditions, and the
operations of the cooling towers, water pumps and heat exchangers that
keep Google’s servers cool.
“One thing computers are good at is seeing the underlying story in
the data, so Jim took the information we gather in the course of our
daily operations and ran it through a model to help make sense of
complex interactions that his team – being mere mortals – may not
otherwise have noticed,” Kava said in a
blog post.
“After some trial and error, Jim’s models are now 99.6 percent accurate
in predicting PUE. This means he can use the models to come up with new
ways to squeeze more efficiency out of our operations.”
A graph showing how the projections by Google’s neural network tool aligned with actual PUE readings.
How it Works
Gao began working on the machine learning initiative as a “20 percent
project,” a Google tradition of allowing employees to spend a chunk of
their work time exploring innovations beyond their specific work duties.
Gao wasn’t yet an expert in artificial intelligence. To learn the fine
points of machine learning, he took a
course from Stanford University Professor Andrew Ng.
Neural networks mimic how the human brain works, allowing computers
to adapt and “learn” tasks without being explicitly programmed for them.
Google’s
search engine is often cited as an example of this type of machine learning, which is also a
key research focus at the company.
“The model is nothing more than a series of differential calculus
equations,” Kava explained. “But you need to understand the math. The
model begins to learn about the interactions between these variables.”
Gao’s first task was crunching the numbers to identify the factors
that had the largest impact on energy efficiency of Google’s data
centers, as measured by PUE. He narrowed the list down to 19 variables
and then designed the neural network, a machine learning system that can
analyze large datasets to recognize patterns.
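For readers curious what such a model can look like in code, here is a minimal sketch of a small neural network that maps 19 inputs to a predicted PUE. It is not Gao’s model: the article does not describe his architecture or training setup, so the single hidden layer, its size, the learning rate, and the entirely synthetic data are all assumptions made for illustration. The training loop also shows where the calculus Kava mentions comes in, since each update is computed by applying the chain rule layer by layer.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 19   # the article reports 19 input variables
N_HIDDEN = 8      # assumption: the architecture is not described in the article

# Synthetic stand-in data, NOT real telemetry: normalized "sensor" readings
# and an invented nonlinear, PUE-like target near 1.1.
X = rng.normal(size=(500, N_FEATURES))
secret_weights = rng.normal(size=N_FEATURES) / np.sqrt(N_FEATURES)
pue_values = 1.1 + 0.05 * np.tanh(X @ secret_weights)

# Hold out the last 100 rows to check predictions on unseen data.
X_train, X_test = X[:400], X[400:]
pue_train, pue_test = pue_values[:400], pue_values[400:]

# Standardize the target so plain gradient descent converges quickly.
pue_mean, pue_std = pue_train.mean(), pue_train.std()
t_train = (pue_train - pue_mean) / pue_std

# Weights of a network with one hidden tanh layer and a linear output.
W1 = rng.normal(scale=0.3, size=(N_FEATURES, N_HIDDEN))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(scale=0.3, size=N_HIDDEN)
b2 = 0.0


def forward(inputs):
    """Return hidden activations and the standardized prediction."""
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, hidden @ W2 + b2


def predict_pue(inputs):
    """Map the network output back to the PUE scale."""
    return pue_mean + pue_std * forward(inputs)[1]


initial_loss = float(np.mean((forward(X_train)[1] - t_train) ** 2))

LEARNING_RATE = 0.05
for _ in range(2000):
    hidden, out = forward(X_train)
    # Backpropagation: apply the chain rule from the output back to the input.
    d_out = 2.0 * (out - t_train) / len(t_train)
    d_hidden = np.outer(d_out, W2) * (1.0 - hidden ** 2)
    grad_W1, grad_b1 = X_train.T @ d_hidden, d_hidden.sum(axis=0)
    grad_W2, grad_b2 = hidden.T @ d_out, d_out.sum()
    W1 -= LEARNING_RATE * grad_W1
    b1 -= LEARNING_RATE * grad_b1
    W2 -= LEARNING_RATE * grad_W2
    b2 -= LEARNING_RATE * grad_b2

final_loss = float(np.mean((forward(X_train)[1] - t_train) ** 2))

# One possible reading of "accuracy": 1 minus the mean relative error.
rel_error = float(np.mean(np.abs(predict_pue(X_test) - pue_test) / pue_test))
print(f"training loss: {initial_loss:.3f} -> {final_loss:.3f}")
print(f"held-out accuracy: {100 * (1 - rel_error):.2f}%")
```

The accuracy line uses one plausible definition (100 percent minus the mean relative error on held-out rows); the article does not say how the 99.6 percent figure was calculated, so the number this sketch prints is not comparable to Google’s.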