Google has announced the Cloud TPU v5p, its most powerful AI accelerator yet.

The company announced today that it has launched its new Gemini large language model and, with that, the new Cloud TPU v5p, an upgraded version of the Cloud TPU v5e that launched into general availability earlier this year. A v5p pod will consist of 8,960 chips and will be backed by Google's fastest interconnect yet, with up to 4,800 Gbps per chip.

Of course, Google claims that these chips are much faster than the v4 TPUs, citing a 2x improvement in FLOPS and a 3x improvement in high-bandwidth memory for the v5p. That's a bit like comparing the new Gemini model to the older OpenAI GPT-3.5, though: Google itself had already moved the state of the art beyond the TPU v4. In many respects, the v5e pods were a step back from the v4 pods, featuring just 256 v5e chips per pod versus 4,096 in the v4 pods, and 197 TFLOPS of 16-bit floating-point performance per v5e chip versus 275 for the v4 chips. For the new v5p, Google promises up to 459 TFLOPS of 16-bit floating-point performance, backed by the faster interconnect.
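
For a rough sense of scale, here is a back-of-the-envelope calculation that multiplies the per-chip figures above by the pod sizes reported in this article. These are peak theoretical numbers only; they ignore interconnect bandwidth, memory, and real-world utilization.

```python
# Back-of-the-envelope pod math from the per-chip figures above
# (16-bit TFLOPS per chip, chips per pod). Peak theoretical only.
pods = {
    "v4":  {"chips": 4096, "tflops": 275},
    "v5e": {"chips": 256,  "tflops": 197},
    "v5p": {"chips": 8960, "tflops": 459},
}

for name, p in pods.items():
    total_pflops = p["chips"] * p["tflops"] / 1000  # TFLOPS -> PFLOPS
    print(f"{name}: {total_pflops:,.0f} PFLOPS peak per pod")

# v4:  ~1,126 PFLOPS; v5e: ~50 PFLOPS; v5p: ~4,113 PFLOPS,
# which is why the v5p pod, not the per-chip number alone, is the headline.
```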

Google says all of this means the TPU v5p can train a large language model like GPT-3 (175 billion parameters) 2.8 times faster than the TPU v4, and do so more cost-effectively, too (though the TPU v5e, while slower, actually offers more relative performance per dollar than the v5p).

"In our early stage usage, Google DeepMind and Google Research have seen 2X speedups for LLM training workloads using TPU v5p chips compared with the performance on our TPU v4 generation," writes Jeff Dean, chief scientist, Google DeepMind and Google Research. "The robust support for ML Frameworks (JAX, PyTorch, TensorFlow) and orchestration tools enables us to scale even more efficiently on v5p.". We also do see important gains in the performance of embeddings-heavy workloads with the 2nd generation of SparseCores. TPUs are crucial to allow our largest-scale research and engineering efforts on cutting-edge models like Gemini."

The new TPU v5p isn't yet available to the general public, so developers will have to reach out to their Google account manager to get on the list.
