Meta, the re-branded Facebook that signals a new era for the company, needs a powerful new AI research supercomputer to help it build better AI models for a range of purposes: text analysis in multiple languages, images, video, augmented reality, and everything needed to pave the way toward a computing platform for the so-called metaverse.
The AI will back features like real-time translation, high-level collaboration, and more. The supercomputer will also train new AI models to help determine what content is harmful or benign, keeping the virtual citizens of the metaverse safe through real-time content moderation.
To build this new cluster, Meta has partnered with NVIDIA, an established player in the field, which has promised a commissioning date before the end of 2022. The new AI supercomputer will have the following technical features:
- 760 NVIDIA DGX A100 compute nodes
- 6,080 NVIDIA A100 GPUs linked over an NVIDIA Quantum 200 Gb/s InfiniBand fabric
- Cache storage of 46 petabytes
- Max computing performance of 1,895 petaflops
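The listed figures are internally consistent, which is worth a quick sanity check. The sketch below derives the GPU count per node and the per-GPU share of peak performance from the numbers above; the assumption (flagged in the comments) is that the 1,895-petaflop figure refers to aggregate TF32 throughput, which lines up with the A100's advertised ~312 TFLOPS TF32 peak.

```python
# Sanity check of the published cluster figures.
# Assumption: the 1,895-petaflop peak is aggregate TF32 throughput.
nodes = 760
gpus = 6_080
peak_pflops = 1_895

gpus_per_node = gpus // nodes                 # DGX A100 systems pack 8 GPUs each
per_gpu_tflops = peak_pflops * 1_000 / gpus   # per-GPU share of the peak

print(gpus_per_node)              # 8
print(round(per_gpu_tflops, 1))   # 311.7
```

The per-GPU result of roughly 312 TFLOPS matches the A100's published TF32 peak (with sparsity), which suggests the headline numbers were derived exactly this way.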
For the record, NVIDIA previously built an AI supercomputer for Facebook back in 2017; it used 22,000 NVIDIA V100 Tensor Core GPUs and handled 35,000 separate training jobs a day. Once both of its phases are completed, the new system is expected to handle 20 times that number of jobs, three times faster.
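Taking the stated multiplier at face value, the implied daily capacity works out as follows (a back-of-the-envelope projection from the article's own figures, not an official specification):

```python
# Implied daily training-job capacity of the completed system,
# assuming the stated 20x multiplier applies to job throughput.
v100_cluster_jobs_per_day = 35_000
scale_factor = 20

print(v100_cluster_jobs_per_day * scale_factor)  # 700000 jobs per day
```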
In an age of shortages induced by the direct and side effects of the COVID-19 pandemic, it's impressive that NVIDIA managed to build the first part of the supercluster in just 18 months, so the project's completion timeline remains optimistic.
For Meta, the new system means the company can continue along the path it has been walking since 2013: AI research and model training. The more complex and rich its platforms become, the more computing power is required to carry out this research and develop practical AI tools.
In the case of the metaverse, the platform and the interactions that will take place there are of complexity comparable to the real world's, so Meta needs all the supercomputing power it can get.