Hi, this is Wayne again, and today's topic is “The Insane Hardware Behind ChatGPT.”
When you fire up ChatGPT, you're connecting to a big silicon brain that lives somewhere and contains something. So what exactly is that thing? What is the hardware that runs the chatbot you've fallen in love with? There's actually a basic building block of the ChatGPT infrastructure: the NVIDIA A100 GPU. And if you thought the graphics card in your computer was expensive, the A100 goes for around ten thousand dollars a pop, roughly the same as six RTX 4090s. Artificial intelligence applications often use GPUs because GPUs are very good at doing lots of math calculations at once, in parallel, and NVIDIA's newer models have tensor cores, which are good at the matrix operations that AIs frequently use. So even though the A100 is called a GPU, it's built specifically for AI and analytics applications, and as such, you can't realistically game on it. It doesn't even have a display output. You can get the A100 in a PCI Express version, such as the one Linus is holding up here.
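To give a taste of the kind of work that keeps those tensor cores busy, here's a tiny PyTorch sketch: one big half-precision matrix multiply, the operation transformer models spend most of their time on. The matrix sizes are arbitrary, and it assumes any CUDA-capable GPU:

```python
# Illustrative only: the sort of workload an A100 is built for.
import torch

a = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
b = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")

# A single call that performs roughly 1.1 trillion floating-point
# operations (2 x 8192^3), all independent enough to run massively
# in parallel -- exactly what GPUs are good at. On tensor-core
# hardware, PyTorch routes FP16 matmuls through the tensor cores.
c = a @ b
```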
In data centers, though, it's more common for them to come in a form factor called SXM4. Unlike a normal graphics card, SXM4 cards lie flat and connect to a large, motherboard-like PCB using a pair of sockets, with the connectors sitting on the underside of the card. Although SXM is just a connector, and data is still carried over a PCI Express interface, SXM4 is preferred over the traditional PCIe slot in data centers because the socket can handle more electrical power. The PCIe version of the A100 can use a maximum of 300 watts, but the SXM4 version handles up to 500 watts, leading to higher performance. An SXM4 A100 has 312 teraflops of FP16 processing power. To put that in context, that's nearly four times as much as an RTX 4090, the most powerful consumer GPU on the market at the time of filming. Additionally, these GPUs are linked up with a high-speed NVLink interconnect, so the GPUs sitting on a single board can act like one gigantic chungus GPU.
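If you want to sanity-check that "nearly four times" figure, the back-of-the-envelope math is simple. The 4090 number below is our assumption from commonly quoted spec-sheet figures, not something stated in the video:

```python
# Quick check on the "nearly four times" claim using commonly
# quoted spec-sheet numbers (assumptions, not disclosed figures).
A100_SXM4_FP16_TFLOPS = 312  # tensor-core FP16, dense
RTX_4090_FP16_TFLOPS = 83    # roughly, per public spec sheets

print(A100_SXM4_FP16_TFLOPS / RTX_4090_FP16_TFLOPS)  # ~3.8x
```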
Now you know what lies at the heart of these GPU servers. But exactly how many A100s are needed to keep the service running for a hundred million users? We'll tell you right after we thank Circuit Specialists for sponsoring this video. Circuit Specialists provides tools and supplies to the STEM community at competitive prices. Explore their wide range of offerings, including resistors, capacitors, soldering stations, oscilloscopes, and more.
Leveraging their technology and sourcing expertise, their goal is to give customers access to tools and parts that may otherwise be unavailable or too expensive, and their commitment to quality ensures that you receive reliable, high-performance products suited to your needs. So let Circuit Specialists help you upgrade your electronics toolkit by checking them out at the link below and using code LMG for 10% off. It turns out you can run ChatGPT just fine on a single NVIDIA HGX A100 unit all on its own. These units typically contain eight A100 GPUs in one machine, powered by a pair of server CPUs that each feature a few dozen cores.
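For a feel of how one of those eight-GPU boxes gets used, here's a minimal, hypothetical sketch of splitting a transformer's layers across the GPUs, pipeline-style. Real deployments rely on dedicated inference frameworks, and the layer count and sizes here are made up for illustration:

```python
# Minimal sketch: pipeline-style sharding of a model across the
# eight GPUs in one HGX A100 box. All sizes are hypothetical.
import torch
import torch.nn as nn

NUM_GPUS = 8     # A100s in a single HGX A100 system
NUM_LAYERS = 96  # hypothetical transformer depth
D_MODEL = 4096   # hypothetical hidden size

# One generic transformer block per layer.
layers = [nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=32)
          for _ in range(NUM_LAYERS)]

# Park an equal slice of the layers on each GPU.
for i, layer in enumerate(layers):
    layer.to(f"cuda:{i * NUM_GPUS // NUM_LAYERS}")

def forward(x: torch.Tensor) -> torch.Tensor:
    # Activations hop from GPU to GPU as they move through the
    # stack; on an HGX board, those hops ride the NVLink fabric.
    for i, layer in enumerate(layers):
        x = x.to(f"cuda:{i * NUM_GPUS // NUM_LAYERS}")
        x = layer(x)
    return x
```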
But the issue is that with so many users, you need a lot more processing power to ensure the chatbot can answer queries smoothly for everyone. OpenAI and Microsoft, who are behind the ChatGPT project, haven't disclosed exact numbers about their hardware, but given the processing capacity of these HGX A100 systems, ChatGPT likely uses somewhere around 30,000 A100s to keep up with demand. To put that into context, that's a heck of a lot more than the roughly four or five thousand they likely needed to train the language model in the first place. Training is the process of feeding the AI lots of information to build it out before it can be used publicly. Intuitively, it might seem like the training process would need more processing power than actually running the model, but because of the massive amount of I/O ChatGPT has to handle with 100 million users, it ends up requiring roughly six times more GPUs to run than to train.
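For a rough sense of scale, here's the napkin math implied by those figures. Every input is an estimate from this video, not a disclosed spec:

```python
# Fleet math behind the estimates (all figures approximate).
gpus_for_inference = 30_000  # estimated A100s serving ChatGPT
gpus_for_training = 5_000    # estimated A100s used for training
gpus_per_hgx_box = 8         # A100s in one HGX A100 system
price_per_gpu = 10_000       # rough street price in dollars

print(gpus_for_inference / gpus_for_training)   # ~6x more to run it
print(gpus_for_inference // gpus_per_hgx_box)   # ~3,750 HGX servers
print(f"${gpus_for_inference * price_per_gpu:,}")  # ~$300,000,000 in GPUs alone
```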
And with systems as pricey as these, you can bet this meant a massive investment on Microsoft's part. While the actual dollar amount hasn't been disclosed, we do know it was in the hundreds of millions of dollars, in addition to several hundred grand a day just to keep the system running. And lest you think Microsoft is going to stop there, the company is also integrating the newer NVIDIA H100 GPUs into its Azure cloud AI service. The H100 dwarfs the A100's FP16 performance by a factor of six, in addition to adding FP8 support, which should prove very useful for AI given how the math calculations involved in running AI models work. Not only will this ensure that more people can use ChatGPT and other AI services, but it will also allow Microsoft to train more complicated large language models. Maybe soon you'll be able to completely replace those pesky real-life friends of yours. So thanks for watching, guys. If you liked this video, hit like, hit subscribe, and hit us up in the comments section with your suggestions for topics we should cover in the future.