Nvidia’s 2024 Computex Keynote: Everything Revealed in 15 Minutes

Nvidia's 2024 Computex Keynote: Everything Revealed in 15 Minutes

Hi, this is Wayne again with a topic “Nvidia’s 2024 Computex Keynote: Everything Revealed in 15 Minutes”.
Just last week Google announced that they’ve put CDF in the cloud and accelerate pandas. Pandas is the most popular data science library in the world. Many of you in here probably already use pandas. It’S used by 10 million data scientists in World downloaded 170 million times each month. It is the Excel that is the spreadsheet of data scientists. Well, with just one click. You can now use pandas in C collab, which is Google’s cloud data, centers platform accelerated by CDF. The speed up is really incredible: let’s take a look that was a great demo right didn’t take long. This is Earth two, the idea that we would create a digital twin of the earth that we would go and simulate the Earth so that we could predict the future of our planet to better avert disasters or better understand the impact of climate change, so that we Can adapt better so that we could change our habits now? This digital twin of Earth is probably one of the most ambitious projects that the world’s ever undertaken and we’re taking taking step large steps every single year and I’ll show you results every single year, but this year we made some great breakthroughs. Let’S take a look on Monday: the storm will via North again and approach Taiwan. There are big uncertainties regarding its path. Different paths will have different levels of impact on Taiwan someday in a near future, we will have continuous weather prediction at every at every square. Kilometer. On the planet, you will always know what the climate’s going to be. You will always know, and this will run continuously because we’ve trained the AI and the AI requires so little energy in the late 1890s Nicola Tesla invented an AC generator. We invented an AI generator. The AC generator generated electrons nvidia’s, AI generator, generates tokens. Both of these things have large Market opportunities. It’S completely fungible in almost every industry and that’s why it’s a new Industrial Revolution, and now we have a new Factory, a new computer and what we will run on top of this is a new type of software and we call it Nims Nvidia inference microservices.

Now, what what happens is the Nim runs inside this Factory and this Nim is a pre-train model. It’S an AI! Well, this AI is, of course, quite complex in itself, but the the Computing stack that runs AI are insanely complex when you go and use chat, GPT underneath their stack is a whole bunch of software, underneath that prompt is a ton of software and it’s incredibly complex Because the models are large billions to trillions of parameters, it doesn’t run on just one computer. It runs on multiple computers.

Nvidia's 2024 Computex Keynote: Everything Revealed in 15 Minutes

It has to distribute the workload across multiple gpus tensor parallelism pipeline parallelism data, parallel, all kinds of parallelism, expert parallelism, all kinds of parallelism, Distributing the workload across multiple gpus processing it as fast as possible, because if you are in a factory, if you run a factory, Your throughput directly correlates to your revenues. Your throughput directly correlates to quality of service, and your throughput directly correlates to the number of people who can use your service. We are now in a world where data center throughput utilization is vitally important.

Nvidia's 2024 Computex Keynote: Everything Revealed in 15 Minutes

It was important in the past, but not violently important. It was important in the past, but people don’t measure it it today. Every parameter is measured, start time, uptime utilization throughput idle time. You name it because it’s a factory when something is a factory.

Its operations directly correlate to the financial performance of the company, and so we realize that this is incredibly complex for most companies to do so. What we did was we created this AI in a box and it contained ERS and incredible. Ams of software inside this container is Cuda CNN tensor, RT Triton for inference Services. It is cloud native so that you could Auto scale in a kubernetes environment. It has Management Services and hooks so that you can monitor your AIS. It has common apis standard API so that you could literally chat with this box.

We now have the ability to create large language models and pre-train models of all kinds, and we we have all of these various versions, whether it’s language based or Vision, based or Imaging based, or we have versions that are available for healthc care, digital biology. We have versions that are digital humans, that I’ll talk to you about and the way you use this of just come to ai. nvidia.com and today we uh just posted up in hugging, face the Llama 3 Nim, fully optimized, it’s available there for you to try and You can even take it with you, it’s available to you for free and so and finally, AI models that reproduce lifelike appearances, enabling real-time path, traced, subsurface, scattering to simulate the way light penetrates the skin, scatters and exits at various points, giving skin its soft and translucent Appearance Nvidia Ace is a suite of digital human Technologies, packaged as easy to deploy fully optimized, microservices or Nims developers can integrate Ace Nims into their existing Frameworks, engines and digital human experiences, neotron slm and llm Nims to understand our intent and orchestrate other models. Rea speech Nims for interactive speech and translation audio to face and gesture Nims for facial and body animation, an Omniverse RTX with dlss for neural, rendering of skin and hair, and so we installed every single RTX GPU with tensor core G tensor, core processing. And now we have 100 million GeForce RTX AIP, PCS in the world and we’re shipping 200, and this this copy Tex we’re featuring four new amazing laptops. All of them are able to run AI.

Your future fure laptop, your future PC, will become an AI it’ll, be constantly helping you assisting you in the background, ladies and gentlemen, this is Blackwell. Black Blackwell is in production, incredible amounts of Technology. This is our production board. This is the most complex highest performance computer.

The world’s ever made this is the gray CPU, and these are you could see each one of these blackw dieses, two of them connected together. You see that it is the largest die, the largest chip the world makes, and then we connect two of them together with a 10 terabyte per second link. So this is a dgx Blackwell. This has.

This is air cool? Has eight of these gpus inside look at the size of the heat sinks on these gpus about 15 kilowatt, 15,000 watts and completely air cooled? This version supports x86 and it’s it goes into the infrastructure that we’ve been shipping Hoppers into. However, if you would like to have liquid cooling, we have a new system and this new new system, it’s based on this board and we call it mgx for modular and this modular system. You won’t be able to see this this. Can they see this? Can you see this, you can the are you okay, I say, and so this is the mgx system and here’s the two uh black Blackwell boards.

Nvidia's 2024 Computex Keynote: Everything Revealed in 15 Minutes

So this one node has four Blackwell chips, these four blackwall chips. This is liquid, cooled nine of them, nine of them. Uh well, 72 of these 72 of these gpus 72 of these G gpus are then connected together with a new MV link. This is MV link, switch fifth generation and the MV link switch is a technology Miracle. This is the most advanced switch the world’s ever made. The data rate is insane, and these switches connect every single one of these black Wells to each other, so that we have one giant 72, GPU Blackwell. Well, the benefit. The benefit of this is that in one domain, one GP domain – this now looks like one GPU.

This one GPU has 72 versus the last generation of eight, so we increased it by nine times the amount of bandwidth we’ve increased by 18 times. The AI flops we’ve increased by 45 times, and yet the amount of power is only 10 times. This is 100 kilow, and that is 10 kilow.

This is one GPU, ladies and gentlemen, dgx GPU, you know the back of this GPU is the MV link spine. The mvlink spine is 5,000 wires, two miles and it’s right here. This is an MV link spine and it connects 702 gpus to each other.

This is a electrical mechanical Miracle. The transceivers makes it possible for us to drive the entire length in Copper and, as a result, this switch the Envy switch Envy link switch driving. The Envy link spine in Copper makes it possible for us to save 20 kilow in one rack. 20. Kow can now be used for processing just an incredible achievement, so we have code names in our company and we try to keep them very secret. Uh often times uh most of the employees don’t even know, but our next Generation platform is called Reuben.

The Reuben platform, the Reuben platform um – I’m I’m not going to spend much time on it uh. I know, what’s going to happen, you’re going to take pictures of it and you’re going to go, look at the fine prints, uh and feel free to do that. So we have the Ruben platform, and one year later we have the Ruben um Ultra platform.

All of these chips that I’m showing you here are all in full development, 100 % of them and the rhythm is one year at the limits of Technology, all 100 % architecturally compatible. So this is, this is basically what Nvidia is building a robotic Factory is designed with three computers: train the AI on Nvidia AI. You have the robot running on the PLC systems uh for orchestrating the the the factories, and then you of course, simulate everything inside Omniverse.

Well, the robotic arm and the robotic amrs are also the same way. Three computer systems. The difference is the two omniverses will come together. So they’ll share one virtual space when they share one virtual space that robotic arm will become inside the robotic Factory and again three three uh three computers and we provide the the computer, the acceleration layers and pre-train uh pre-trained AI models we’ve well, I think we have Um some robots that we like to uh welcome. Here we go about my [ Applause ] size and we have. We have some friends to join us, so the fut, the future of robot robotics, is here the next wave of AI and and of course you know, Taiwan builds computers with keyboards.

You build computers for your pocket. You build computers for data centers in the cloud in the future, you’re going to build computers that walk and computers that roll you know around and um. So these are all just computers and, as as it turns out uh, the technology is very similar to the technology of building. U all of the other computers that you already buil today, so this is going to be a really extraordinary uh Journey for us. .