AI technology is evolving fast. Data Centres need to keep up.
The AI industrial revolution is in full swing.
If ChatGPT made waves in the AI industry, new AI technologies like Groq and Sora are going to make a tsunami.
AI chip company Groq recently demonstrated Meta's LLaMA large language model (LLM) running on its hardware, showing how Groq can significantly accelerate the performance of AI inferencing. The building blocks of natural language processing models are 'tokens': small chunks of text, such as words or word fragments, encoded as numbers so the AI can analyse and generate human language. The tests showed Groq generating around 500 tokens a second, roughly 20x faster than ChatGPT, according to benchmark comparisons by ArtificialAnalysis.com. A speed boost of this scale makes generative AI models far more responsive and opens up practical, real-world uses for AI inferencing.
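To make the token arithmetic concrete, here is a minimal Python sketch using OpenAI's open-source tiktoken tokeniser. Groq and LLaMA use their own tokenisers, so the exact token counts, and the 400-token response size below, are illustrative assumptions only.

```python
# A rough illustration of tokenisation and generation speed. The tokeniser
# here is OpenAI's open-source tiktoken; other models tokenise differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Data centres need to keep up with AI."
token_ids = enc.encode(text)

print(token_ids)               # a list of integer IDs, the form the model actually processes
print(len(token_ids), "tokens")

# At a claimed 500 tokens/second, a 400-token answer (an assumed response
# size) arrives in under a second:
tokens_per_second = 500
answer_length = 400
print(f"~{answer_length / tokens_per_second:.1f} s to generate {answer_length} tokens")
```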
In the same week, OpenAI released Sora, a generative AI model that creates video from text. From just a short text prompt, Sora can create video clips of up to 60 seconds, and it can also animate static images and fill in the blanks in existing videos. The examples I've seen are highly realistic and exceptionally creative.
If this is where we are today, how far will we go in the future? Already AI technologists are exploring the possibility of Artificial General Intelligence (AGI), where we will move from having different AI tools for specific problems to all-encompassing systems that can solve any problem a human can.
Surf's up for the new waves of AI infrastructure.
NVIDIA is currently leading the charge in AI chip design and has become a household name with the largest share of the GPU (Graphics Processing Unit) market. However, the huge demand for chips to support the growth in AI has created a gap in the market for new entrants with alternate designs. This leaves little doubt that, as the speed and accuracy requirements of these AI tools continue to evolve, so too will the physical architectures on which they are built.
Groq's technology is made possible by its new Tensor Streaming Processor design, the GroqChip, which the company refers to as a Language Processing Unit (LPU): a whole new processor category altogether. Unlike the GPUs that ChatGPT and Gemini run on, LPUs have a more streamlined processing architecture that enables AI inferencing to run even faster. The LPU is just one example of a new competitor challenging the status quo.
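One way to see why a streamlined architecture matters: when a model generates text one token at a time, the processor must stream the model's weights through memory for every token, so memory bandwidth, not raw compute, often sets the ceiling on single-stream inference speed. The sketch below captures that back-of-envelope reasoning; all figures in it are illustrative assumptions, not published Groq or NVIDIA specifications.

```python
# Back-of-envelope model of why inference throughput is often limited by
# memory bandwidth rather than raw compute. All figures are illustrative.

def max_tokens_per_second(model_params_billion: float,
                          bytes_per_param: float,
                          memory_bandwidth_gb_s: float) -> float:
    """Each generated token requires streaming (roughly) all model weights
    through the processor once, so bandwidth caps single-stream throughput."""
    model_bytes = model_params_billion * 1e9 * bytes_per_param
    bandwidth_bytes = memory_bandwidth_gb_s * 1e9
    return bandwidth_bytes / model_bytes

# An assumed 70B-parameter model at 2 bytes per weight (FP16) on an assumed
# 3,000 GB/s of memory bandwidth:
print(f"{max_tokens_per_second(70, 2, 3000):.0f} tokens/s upper bound")
```

Architectures that raise effective memory bandwidth, for example by keeping model weights in fast on-chip memory, lift that ceiling, which is part of how new processor designs push inferencing speeds up.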
What does this mean for data centres?
AI technologies like the GroqChip and NVIDIA's H100 GPU are deployed in super-dense architectures to reduce latency and support parallel processing and high-speed connectivity. These AI environments require much more power and cooling than the cloud architectures we are familiar with, demands which will only increase as these systems become more embedded in our everyday work practices. This is fuelling innovative new thinking around data centre design and technology. One example is liquid cooling (such as immersion cooling or direct-to-chip cooling), which cools AI and cloud infrastructure more efficiently while using far less power than traditional air cooling. NVIDIA CEO Jensen Huang recently confirmed that the company's next DGX server family will be liquid cooled.
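To put "much more power and cooling" in perspective, here is a rough rack-density calculation. The server and rack figures are assumptions chosen for illustration, not vendor specifications.

```python
# Illustrative rack-density arithmetic. Every figure below is an assumption
# for the sake of the example, not a published vendor specification.
server_power_kw = 10.2        # assumed draw of one 8-GPU AI server
servers_per_rack = 4
ai_rack_kw = server_power_kw * servers_per_rack

typical_cloud_rack_kw = 8     # assumed conventional cloud rack
print(f"AI rack: {ai_rack_kw:.0f} kW vs cloud rack: {typical_cloud_rack_kw} kW "
      f"({ai_rack_kw / typical_cloud_rack_kw:.0f}x the power and heat per rack)")
```

Every extra kilowatt drawn per rack is also a kilowatt of heat that has to be removed, which is why density and cooling have to be designed together.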
Given the increasing need for high-density solutions and evolving chip designs, not all data centres will be able to support these super-dense architectures. AI data centres need a combination of highly resilient power and cooling designs that are robust enough to meet hyperscale compute demands, can adapt rapidly to new architectures, and can maintain unparalleled uptime. Unlike traditional data centres with more rigid, conventional designs, AI-ready data centres are designed from the ground up to accommodate large-scale deployments of new technologies for AI training and inferencing.
Data centres that are purpose-built for AI, like Macquarie Data Centres' IC3 Super West, are designed to accommodate these new technologies and optimise the power/cooling equation so that AI workloads run at the lowest possible Power Usage Effectiveness (PUE), saving power costs and improving environmental outcomes.
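PUE is simply the ratio of total facility power to the power consumed by the IT equipment itself, so a PUE of 1.0 would mean every watt goes to compute. A short sketch of the arithmetic, using assumed facility figures rather than any operator's actual numbers:

```python
# PUE arithmetic: PUE = total facility power / IT equipment power.
# The overhead figures below are assumptions for illustration only.

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    return total_facility_kw / it_load_kw

it_load = 1000                                # kW of AI servers
conventional = pue(it_load + 600, it_load)    # assumed 600 kW overhead (cooling, power losses)
optimised = pue(it_load + 200, it_load)       # assumed 200 kW overhead with efficient cooling

print(f"conventional PUE: {conventional:.2f}, optimised PUE: {optimised:.2f}")
# Every 0.1 reduction in PUE saves ~100 kW of overhead per MW of IT load.
```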
This is not the beginning of AI, but ChatGPT has certainly brought AI into the collective consciousness, and the pressure is on for data centres to keep up with the fast pace of AI innovation.