When I first heard that Google had designed a new breed of computer chip to supercharge its machine learning models, I immediately wanted to learn more. Just how different are these chips, exactly? How are they made? What are they capable of? And what will they mean for the future of machine learning?
As you probably know, Google’s data centres underpin countless online services of unprecedented size and scope, which no doubt demands an exceptionally powerful and efficient breed of hardware. So, over the past decade, the internet giant has been designing all sorts of innovative hardware to push data centre efficiency. The result: the company has built a powerful custom computer chip that is accelerating its AI models.
According to Google, the growing use of compute-intensive machine learning across its applications and services drove it to custom-design an entirely new machine learning accelerator, announced in May 2016. The chip is dubbed a tensor processing unit (TPU) because it helps run TensorFlow, the software engine that drives Google’s deep neural networks. Beyond powering TensorFlow, TPUs handle text processing for Google Street View and image processing for Google Photos, where a single TPU can process more than 100 million photos per day. The chips also power RankBrain to deliver fast, high-quality search results, and Google has stated that TPUs were used in the AlphaGo program that beat Go champion Lee Sedol.
Earlier this year, the company announced that it would release the chip beyond its own walls for use by other businesses. With this release, not only will we see more TPUs in designs around the globe, but we’ll likely see other companies follow suit and create their own AI chips for their products in the years to come.

How Different and Powerful Is Google’s TPU?
Billed as the first custom accelerator ASIC for machine learning, the TPU is purpose-built for high performance and power efficiency when running TensorFlow. It is extremely fast compared to present-day central processing units (CPUs) and graphics processing units (GPUs), delivering 15 to 30 times higher performance. In performance per watt, an efficiency metric, TPUs outperform standard processors by 30 to 80 times.
In widespread use since the 1960s, CPUs are the electronic circuits that carry out program instructions by performing a computer’s basic arithmetic, logic, control, input, and output operations. They have limited parallelism and are largely confined to serial, linear processing. GPUs, on the other hand, appeared in the late 1990s and are designed to accelerate computer graphics, rapidly manipulating memory to speed up image creation, especially for video game consoles. They have many cores, apply a texture-mapping method for detail and colour, run processes in parallel, and can churn through far more image and graphical data than CPUs.
By 2015, general-purpose GPUs were in wide use for training convolutional neural networks. However, neural networks demand ever more computational power, and this is exactly the point where TPUs show what they are built for.
TPUs: Fast, Low-Power Machine Learning Chips
TPUs implement a technique called quantization, which approximates arbitrary values between a preset minimum and maximum with 8-bit integers. Using fewer bits per value means fewer transistors per operation, which saves chip area and allows more operations per second while greatly reducing energy consumption. As a result, TPUs make predictions very fast.
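To make that concrete, here’s a minimal sketch of 8-bit quantization in Python. The function names and the [-1, 1] range are illustrative choices of mine, not anything from Google’s actual toolchain:

```python
import numpy as np

def quantize(values, v_min, v_max):
    """Map floats in [v_min, v_max] onto 8-bit codes 0..255."""
    scale = (v_max - v_min) / 255.0
    codes = np.round((values - v_min) / scale)
    return np.clip(codes, 0, 255).astype(np.uint8), scale

def dequantize(codes, v_min, scale):
    """Recover approximate floats from the 8-bit codes."""
    return codes.astype(np.float32) * scale + v_min

weights = np.array([-0.82, 0.11, 0.56, 0.97], dtype=np.float32)
codes, scale = quantize(weights, v_min=-1.0, v_max=1.0)
print(codes)                           # [ 23 142 199 251]
print(dequantize(codes, -1.0, scale))  # close to the original weights
```

Each weight now fits in a single byte, and the round trip recovers values within about half a scale step of the originals, which is plenty of precision for serving neural network predictions.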
Unlike most CPUs and GPUs, which follow a reduced instruction set computer (RISC) design that defines simple instructions and executes them as fast as possible, TPUs adopt a complex instruction set computer (CISC) design, with high-level instructions that run much more complex tasks across many kinds of networks, including LSTM, convolutional, and fully connected models. At the heart of this is a matrix processor that can perform hundreds of thousands of matrix operations in a single clock cycle. Think of it like this: CPUs and GPUs print a document word by word or line by line, while a TPU prints the entire page at once.
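To make the printing analogy concrete, here’s a hedged sketch in Python with NumPy: the nested loop mirrors a scalar processor issuing one multiply-accumulate at a time, while the single matmul call mirrors a matrix unit consuming whole matrices per instruction. It’s an illustration of the idea, not Google’s actual TPU code:

```python
import numpy as np

n = 64
A = np.random.rand(n, n).astype(np.float32)
B = np.random.rand(n, n).astype(np.float32)

# Scalar-style: one multiply-accumulate per step (n**3 = 262,144 of them),
# the way a simple CPU loop issues the work.
C = np.zeros((n, n), dtype=np.float32)
for i in range(n):
    for j in range(n):
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]

# Matrix-style: one high-level operation hands whole matrices to
# optimized machinery, roughly how a TPU's matrix unit consumes them.
C_fast = A @ B

assert np.allclose(C, C_fast, atol=1e-3)
```

Both paths compute the same result; the difference is how many instructions it takes to get there, which is exactly the gap the TPU’s matrix unit exploits.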
In May 2017, Google announced its second-generation TPU, dubbed the “cloud TPU,” available through Google Compute Engine to accelerate machine learning workloads, covering both training and running models. The first-generation TPU could only run already-trained neural networks (inference); cloud TPUs handle training as well. Cloud TPUs connect over a custom network that allows the construction of machine learning supercomputers called TPU pods. A pod packs 64 devices into a supercomputer that can achieve 11.5 PFLOPS with 4 TB of high-bandwidth memory.
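As a quick sanity check on those pod numbers, assuming the 180-teraflops-per-device peak Google announced alongside the cloud TPU:

```python
devices_per_pod = 64       # pod size stated in Google's announcement
tflops_per_device = 180    # announced peak for a second-generation device
print(devices_per_pod * tflops_per_device / 1000, "PFLOPS")  # 11.52 PFLOPS
```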

New Chip Markets?
Google isn’t the only company designing its own chips for custom purposes. Apple, Facebook, and Amazon have followed suit, and Microsoft hasn’t been left behind either, announcing that it is hiring engineers to work on AI chip designs for its Azure cloud. With this trend, a question arises: will cloud companies simply design their own chips rather than buy them from the established chip giants?
If you ask me, what Google started sounds like a threat to the leading chip-making giants, namely Intel and Nvidia. As a matter of fact, Facebook is designing its own chip to better analyze video and lessen its dependence on Intel. Apple is focused on designing AI chips for Mac and iPhone devices, while Amazon is dedicated to making chips that will run its video-streaming cloud services more efficiently.
As for Google, they believe software becomes far more efficient when it runs on hardware built for it, and they are now on their third generation of TPUs. By building custom hardware for machine learning, they can tackle new research problems and expand the potential of future machine learning applications. For now, customers can use Google’s AI chip in the cloud, though it comes at a price of $6.50 per TPU hour.
What Does the Future Look Like?
The tech industry just never stops. By unveiling its TPU, Google may have triggered a new chip revolution. The tables have turned, and cloud giants may soon become chip giants. I can almost guarantee that we’ll see more powerful custom chip innovations from all of these companies in the years ahead. And if chip design and development heads down this road, we’re also bound to see laptops and smart devices sporting even more power-packed chips.