Nvidia 2024 AI Event: Everything Revealed in 16 Minutes

Hi, this is Wayne again with a topic “Nvidia 2024 AI Event: Everything Revealed in 16 Minutes”.
I hope you realize this is not a concert; you have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. Blackwell is not a chip. Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. This is Hopper.

Hopper changed the world. This is Blackwell. It's okay, Hopper. 208 billion transistors, and you can see there's a small line between the two dies.

This is the first time two dies have abutted together in such a way that the two dies think they're one chip. There are 10 terabytes per second of data flowing between them, so the two sides of the Blackwell chip have no clue which side they're on. There are no memory locality issues, no cache issues; it's just one giant chip. It goes into two types of systems. The first is form-, fit-, and function-compatible with Hopper: you slide out a Hopper and push in a Blackwell.

That's the reason why ramping is going to be so efficient: there are installations of Hoppers all over the world, and they can use the same infrastructure, the same design. The power, the electricals, the thermals, the software are identical; push it right back in. So this is a Blackwell version for the current HGX configuration, and this is what the second system looks like.

This is a prototype board. This is a fully functioning board, and I'll just be careful here. This right here is, I don't know, 10 billion dollars. The second one's five. It gets cheaper after that, so, any customers in the audience, it's okay. The Grace CPU has a super fast chip-to-chip link.

What's amazing is that this computer is the first of its kind where this much computation fits into this small a place. Second, it's memory coherent: the chips feel like one big happy family working on one application together. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call this format FP4. The rate at which we're advancing computing is insane, and it's still not fast enough.
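As a rough illustration of what a 4-bit floating-point format like FP4 can represent, here is a minimal Python sketch. It assumes an E2M1 layout (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit); the layout details are an assumption for illustration, since the talk only names the format.

```python
# Sketch of an FP4 (E2M1) number format: enumerate every representable
# value and round arbitrary reals to the nearest one. The E2M1 layout is
# an assumption for illustration; the keynote only names "FP4".

def fp4_e2m1_values():
    """All distinct values encodable in 4 bits under an E2M1 layout."""
    values = set()
    for sign in (1, -1):
        for exp in range(4):        # 2 exponent bits
            for man in range(2):    # 1 mantissa bit
                if exp == 0:        # subnormal: no implicit leading 1
                    mag = man * 0.5
                else:               # normal: implicit leading 1, bias 1
                    mag = (1 + man * 0.5) * 2 ** (exp - 1)
                values.add(sign * mag)
    return sorted(values)

def quantize_fp4(x):
    """Round x to the nearest representable FP4 value (saturating)."""
    return min(fp4_e2m1_values(), key=lambda v: abs(v - x))

print(fp4_e2m1_values())   # 15 distinct values between -6 and 6
```

With so few representable values, the appeal is density: two weights per byte, which is why a 4-bit format matters for token generation throughput.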

So we built another chip. This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors, almost the size of Hopper all by itself.

This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is this chip for? With such a chip, every single GPU can talk to every other GPU at full speed at the same time. You can build a system that looks like this. Now, this system is kind of insane. This is one DGX.
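To put the 1.8 TB/s per-link figure in perspective, here is a back-of-the-envelope sketch. The 70B-parameter model is a hypothetical example, not a number from the talk:

```python
# Rough arithmetic on the NVLink figures quoted above: 1.8 TB/s per link,
# four links per switch chip. The model size is a hypothetical example.

TBps = 1.8e12               # one NVLink, bytes per second (decimal TB)
links_per_switch = 4

params = 70e9               # hypothetical 70B-parameter model
bytes_per_param = 0.5       # FP4: 4 bits per weight

weights_bytes = params * bytes_per_param          # 35 GB of weights
transfer_s = weights_bytes / TBps

print(f"{links_per_switch * 1.8} TB/s through one switch chip")
print(f"{transfer_s * 1e3:.1f} ms to stream all weights over one link")
```

The point of the all-to-all fabric is that this per-link number holds for every GPU pair simultaneously, rather than being shared across the system.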

This is what a DGX looks like now. Just so you know, there are only a couple, two, three exaflops machines on the planet as we speak, and so this is an exaflops AI system in one single rack. I want to thank some partners that are joining us. AWS is gearing up for Blackwell: they're going to build the first GPU with secure AI, and they're building out a 222-exaflops system. We're CUDA-accelerating SageMaker AI and CUDA-accelerating Bedrock AI. Amazon Robotics is working with us using Nvidia Omniverse and Isaac Sim.

AWS Health has Nvidia Health integrated into it, so AWS has really leaned into accelerated computing. Google is gearing up for Blackwell.

GCP already has A100s, H100s, T4s, L4s, a whole fleet of Nvidia CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP: we're accelerating Dataproc, the data processing engine, as well as JAX, XLA, Vertex AI, and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives. Oracle is gearing up for Blackwell.

Oracle is a great partner of ours for Nvidia DGX Cloud, and we're also working together to accelerate something that's really important to a lot of companies: the Oracle database. Microsoft is accelerating, and Microsoft is gearing up for Blackwell. Microsoft and Nvidia have a wide-ranging partnership; we're accelerating all kinds of services. When you chat with the AI services in Microsoft Azure, it's very likely Nvidia is in the back doing the inference and the token generation. They built the largest Nvidia InfiniBand supercomputer, basically a digital twin of ours, or a physical twin of ours.

We're bringing the Nvidia ecosystem to Azure, Nvidia DGX Cloud to Azure; Nvidia Omniverse is now hosted in Azure, Nvidia Healthcare is in Azure, and all of it is deeply integrated and deeply connected with Microsoft Fabric. A NIM is a pre-trained model, so it's pretty clever, and it is packaged and optimized to run across Nvidia's install base, which is very, very large. What's inside it is incredible: all these pre-trained state-of-the-art models. They could be open source, they could be from one of our partners, or they could be created by us. It is packaged up with all of its dependencies: CUDA, the right version; cuDNN, the right version; TensorRT-LLM, distributing across multiple GPUs; Triton Inference Server; all completely packaged together.

It's optimized depending on whether you have a single GPU, multi-GPU, or multi-node of GPUs, and it's connected up with APIs that are simple to use. These packages, incredible bodies of software, will be optimized and packaged, and we'll put them on a website. You can download it, take it with you, run it in any cloud, run it in your own data center, or run it on workstations if it fits. All you have to do is come to ai.nvidia.com. We call it Nvidia Inference Microservice, but inside the company we all call it NIM.

We have a service called Nemo Microservice that helps you curate and prepare the data so that you can teach and onboard this AI. You fine-tune it, and then you guardrail it; you can even evaluate its answers, evaluate its performance against other examples. So we are effectively an AI foundry. We will do for you and the industry on AI what TSMC does for us building chips: we go to TSMC with our big ideas.
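As a sketch of what using one of these downloaded microservices might look like: NIM containers expose an OpenAI-compatible HTTP endpoint, but the host, port, and model name below are assumptions for illustration, not values from the talk.

```python
# Hypothetical sketch of calling a NIM's OpenAI-compatible chat endpoint.
# The base URL and model name are illustrative assumptions.
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "example-nim-model") -> dict:
    """Build an OpenAI-style chat-completions payload (pure, easy to test)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the payload to the running microservice and return the reply."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the container bundles CUDA, cuDNN, TensorRT-LLM, and the inference server, the caller only sees this one HTTP surface regardless of how many GPUs sit behind it.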

They manufacture, and we take it with us; exactly the same thing here with the AI foundry, and the three pillars are NIMs, Nemo Microservice, and DGX Cloud. We're announcing that Nvidia AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce; basically, the world runs on SAP, and we run on SAP. Nvidia and SAP are building SAP Joule copilots using Nvidia Nemo and the DGX Cloud service.

Now, ServiceNow: 80, 85% of the world's Fortune 500 companies run their people and customer service operations on ServiceNow, and they're using Nvidia AI Foundry to build ServiceNow Assist virtual assistants. Cohesity backs up the world's data; they're sitting on a gold mine of data, hundreds of exabytes from over 10,000 companies. Nvidia AI Foundry is working with them, helping them build their Gaia generative AI agent.

Snowflake is a company that stores the world's digital warehouse in the cloud and serves over three billion queries a day for 10,000 enterprise customers. Snowflake is working with Nvidia AI Foundry to build copilots with Nvidia Nemo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. Nvidia AI Foundry is helping them build chatbots and copilots, like those vector databases and retrievers, with Nvidia Nemo and NIMs. And we have a great partnership with Dell.

Everybody who is building these chatbots and generative AI, when you're ready to run it, is going to need an AI factory, and nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. Any company, every company, will need to build AI factories, and it turns out that Michael is here; he's happy to take your order. We need a simulation engine that represents the world digitally for the robot, so that the robot has a gym to go learn how to be a robot. We call that virtual world Omniverse, and the computer that runs Omniverse is called OVX. The OVX computer itself is hosted in the Azure cloud. The future of heavy industries starts as a digital twin.

The AI agents helping robots, workers, and infrastructure navigate unpredictable events in complex industrial spaces will be built and evaluated first in sophisticated digital twins. Once you connect everything together, it's insane how much productivity you can get, and it's just really wonderful. All of a sudden, everybody's operating on the same ground truth. You don't have to exchange data, convert data, make mistakes. Everybody is working on the same ground truth, from the design department to the art department, the architecture department, all the way to engineering and even the marketing department.

Today we're announcing that Omniverse Cloud streams to the Vision Pro, and it is very, very strange that you walk around virtual doors when I was getting out of that car; everybody does it. It is really quite amazing. Vision Pro connected to Omniverse portals you into Omniverse, and because all of these CAD tools and all these different design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible.

This is Nvidia Project GR00T, a general-purpose foundation model for humanoid robot learning. The GR00T model takes multimodal instructions and past interactions as input and produces the next action for the robot to execute. We developed Isaac Lab, a robot learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. The GR00T model will enable a robot to learn from a handful of human demonstrations, so it can help with everyday tasks and emulate human movement just by observing us. All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T, built for the future. With Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics. The soul of Nvidia.
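The input/output contract described here, multimodal instructions plus past interactions in, next action out, can be sketched as a minimal policy interface. Every name in this snippet is invented for illustration; the stub simply holds position where a real model would predict:

```python
# Hypothetical sketch of the policy interface described for Project GR00T:
# observations (instruction + state) accumulate as history, and each call
# yields the next action. All class and field names are invented.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Observation:
    instruction: str              # e.g. a language command
    joint_positions: List[float]  # current robot state

@dataclass
class Policy:
    history: List[Observation] = field(default_factory=list)

    def next_action(self, obs: Observation) -> List[float]:
        """Return next joint-position targets (stub: hold current pose)."""
        self.history.append(obs)  # past interactions feed the next step
        return obs.joint_positions  # placeholder where a model would predict
```

The history field is the point: the model conditions on past interactions, not just the current frame, which is what lets a handful of demonstrations shape behavior.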

The intersection of computer graphics, physics, artificial intelligence: it all came to bear at this moment. The name of that project? General Robotics 003. I know, super good, super good.

Well, I think we have some special guests. Do we? Hey, guys! So I understand you guys are powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney Research. Come on, you guys, let's wrap up. Let's go. Five things. Where are you going? I sit right here. Don't be afraid. Come here, Green. Hurry up. What are you saying? No, it's not time to eat.

It's not time to eat. I'll give you a snack in a moment. Let me finish up real quick. Come on, Green. Hurry up. Stop wasting time. This is what we announced to you today. This is Blackwell. This is the platform: amazing processors, NVLink switches, networking systems, and a system design that is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind.