Hi, this is Wayne again with a topic “Google I/O 2024: Everything Revealed in 12 Minutes”.
Welcome to Google I/O. It’s great to have all of you with us. More than 1.5 million developers use Gemini models across our tools; you’re using it to debug code, get new insights, and build the next generation of AI applications. We’ve also been bringing Gemini’s breakthrough capabilities across our products in powerful ways. We’ll show examples today across Search, Photos, Workspace, Android and more. Today we have some exciting new progress to share about the future of AI assistants, which we’re calling Project Astra, building on our Gemini model.
We developed agents that can process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this for efficient recall. “Tell me when you see something that makes sound.” “I see a speaker, which makes sound.” “Do you remember where you saw my glasses?” “Yes, I do. Your glasses were on the desk near a red apple.” “What can I add here to make this system faster?” “Adding a cache between the server and the database could improve speed.” “What does this remind you of?” “Schrödinger’s cat.”
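To make that “timeline of events” idea concrete, here is a minimal sketch in Python of a rolling cache of encoded observations with keyword-based recall. The TimelineEvent and EventTimeline names and the simple string search are illustrative assumptions only, not Project Astra’s actual implementation, which recalls events with a model rather than string matching.

```python
from collections import deque
from dataclasses import dataclass
import time

@dataclass
class TimelineEvent:
    """One encoded observation from the video or audio stream."""
    timestamp: float
    modality: str       # "video" or "speech"
    description: str    # e.g. a caption produced by an encoder model

class EventTimeline:
    """Rolling cache of recent events, supporting simple keyword recall."""
    def __init__(self, max_events: int = 1000):
        self.events = deque(maxlen=max_events)

    def add(self, modality: str, description: str) -> None:
        self.events.append(TimelineEvent(time.time(), modality, description))

    def recall(self, keyword: str) -> list:
        """Return cached events whose description mentions the keyword."""
        return [e for e in self.events if keyword.lower() in e.description.lower()]

# Usage: the assistant logs what its encoders see and hear, then answers
# "Do you remember where you saw my glasses?" by searching the cache.
timeline = EventTimeline()
timeline.add("video", "glasses on the desk near a red apple")
timeline.add("speech", "user asks how to make the system faster")
print(timeline.recall("glasses")[0].description)
```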
Today, I’m excited to announce our newest and most capable generative video model, called Veo. Veo creates high-quality 1080p videos from text, image and video prompts. It can capture the details of your instructions in different visual and cinematic styles. You can prompt for things like aerial shots of a landscape or a time lapse, and further edit your videos using additional prompts. You can use Veo in our new experimental tool called VideoFX, where we’re exploring features like storyboarding and generating longer scenes. Veo gives you unprecedented creative control.
The core technology is Google DeepMind’s generative video model, which has been trained to convert input text into output video. It looks good. We are able to bring ideas to life that were otherwise not possible. We can visualize things on a time scale that’s 10 or 100 times faster than before. Today we are excited to announce the sixth generation of TPUs, called Trillium. Trillium delivers a 4.7x improvement in compute performance per chip over the previous generation.
It’s our most efficient and performant TPU to date. We’ll make Trillium available to our Cloud customers in late 2024. Alongside our TPUs, we are proud to offer CPUs and GPUs to support any workload.
That includes the new Axion processors we announced last month, our first custom Arm-based CPU with industry-leading performance and energy efficiency. We are also proud to be one of the first cloud providers to offer Nvidia’s cutting-edge Blackwell GPUs, available in early 2025. One of the most exciting transformations with Gemini has been in Google Search. In the past year, we answered billions of queries as part of our Search Generative Experience.
People are using it to search in entirely new ways and asking new types of questions: longer and more complex queries, even searching with photos, and getting back the best the web has to offer. We’ve been testing this experience outside of Labs, and we are encouraged to see not only an increase in search usage but also an increase in user satisfaction. I’m excited to announce that we’ll begin launching this fully revamped experience, AI Overviews, to everyone in the US this week, and we’ll bring it to more countries soon. Say you’re heading to Dallas to celebrate your anniversary and you’re looking for the perfect restaurant. What you get here breaks AI out of the box and brings it to the whole page. Our Gemini model uncovers the most interesting angles for you to explore and organizes the results into helpful clusters you might never have considered, like restaurants with live music or ones with historic charm.
Our model even uses contextual factors like the time of year, so since it’s warm in Dallas, you can get rooftop patios as an idea, and it pulls everything together into a dynamic whole-page experience. You’ll start to see this new AI-organized search results page when you look for inspiration, starting with dining and recipes and coming to movies, music, books, hotels, shopping and more. I’m going to take a video and ask Google why this will not stay in place, and in a near instant Google gives me an AI Overview. I get some reasons this might be happening and steps I can take to troubleshoot. It looks like, first, this is called a tonearm, which is very helpful, and it looks like it may be unbalanced, and there are some really helpful steps here. And I love that, because I’m new to all this, I can check out this helpful link from Audio-Technica to learn even more. And this summer, you can have an in-depth conversation with Gemini using your voice.
We’re calling this new experience Live. Using Google’s latest speech models, Gemini can better understand you and answer naturally. You can even interrupt while Gemini is responding, and it will adapt to your speech patterns. And this is just the beginning. We’re excited to bring the speed gains and video-understanding capabilities from Project Astra to the Gemini app: when you go Live, you’ll be able to open your camera so Gemini can see what you see and respond to your surroundings in real time. Now, the way I use Gemini isn’t the way you use Gemini, so we’re rolling out a new feature that lets you customize it for your own needs and create personal experts on any topic you want. We’re calling these Gems. They’re really simple to set up: just tap to create a Gem, write your instructions once, and come back whenever you need it.
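Gems are a consumer feature in the Gemini app, but the same “write your instructions once, reuse them” pattern is available to developers as system instructions in the Gemini API. Below is a minimal sketch, assuming the google-generativeai Python SDK and a GOOGLE_API_KEY environment variable; the persona text is made up for illustration and is not how Gems are implemented internally.

```python
# Illustrative only: a reusable "personal expert" defined once via a system
# instruction, analogous in spirit to a Gem in the Gemini app.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

coach = genai.GenerativeModel(
    "gemini-1.5-flash",
    system_instruction=(
        "You are a friendly pickleball coach. Explain rules simply and "
        "suggest one practice drill with every answer."
    ),
)

# The instructions persist across the whole chat session.
chat = coach.start_chat()
print(chat.send_message("How do I improve my serve?").text)
```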
We’ve embarked on a multi-year journey to reimagine Android with AI at the core, and it starts with three breakthroughs you’ll see this year. First, we’re putting AI-powered search right at your fingertips, creating entirely new ways to get the answers you need. Second, Gemini is becoming your new AI assistant on Android, there to help you any time. And third, we’re harnessing on-device AI to unlock new experiences that work as fast as you do, while keeping your sensitive data private. One thing we’ve heard from students is that they’re doing more of their schoolwork directly on their phones and tablets, so we thought: could Circle to Search be your perfect study buddy? Let’s say my son needs help with a tricky physics word problem like this one. My first thought is, oh boy, it’s been a while since I’ve thought about kinematics. If he’s stumped on this question, instead of putting me on the spot, he can circle
the exact part he’s stuck on and get step-by-step instructions right where he’s already doing the work. Now we’re making Gemini context-aware, so it can anticipate what you’re trying to do and provide more helpful suggestions in the moment; in other words, to be a more helpful assistant. So let me show you how this works, and I have my shiny new Pixel 8a here to help me. My friend Pete is asking if I want to play pickleball this weekend, and I know how to play tennis, sort of (I had to say that for the demo), but I’m new to this pickleball thing. So I’m going to reply and try to be funny, and I’ll say, “Is that like tennis, but with pickles?” This would actually be a lot funnier with a meme, so let me bring up Gemini to help with that, and I’ll say, “Create an image of tennis with pickles.”
Now, one thing you’ll notice is that the Gemini window hovers in place above the app, so that I stay in the flow. Okay, that generated some pretty good images, and what’s nice is I can then drag and drop any of these directly into the Messages app below, like so. Now I can ask specific questions about the video, so, for example, I’ll type “What is the two-bounce rule?”, because that’s something I’ve heard about but don’t quite understand in the game. By the way, this uses signals like YouTube’s captions, which means you can use it on billions of videos. Give it a moment, and there we get a nice, succinct answer: the ball must bounce once on each side of the court after a serve. So, instead of trawling through this entire document, I can pull up Gemini to help, and again Gemini anticipates what I need and offers me an “Ask this PDF” option. If I tap on that, Gemini now ingests all of the rules to become a pickleball expert, and that means I can ask very esoteric questions, like, for example, “Are spin serves allowed?” And there you have it. It turns out, nope, spin serves are not allowed, so Gemini not only gives me a clear answer to my question, it also shows me exactly where in the PDF to learn more.
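“Ask this PDF” is an on-device app feature, but the same ingest-then-ask pattern can be sketched with the Gemini API. A minimal example follows, assuming the google-generativeai Python SDK, a GOOGLE_API_KEY environment variable, and a hypothetical local file pickleball_rules.pdf; it is not the mechanism the demo itself uses.

```python
# Minimal sketch of the "ingest a PDF, then ask it questions" pattern.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Upload the rulebook once; the returned file can be referenced in prompts.
rulebook = genai.upload_file("pickleball_rules.pdf")  # hypothetical local file

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    [rulebook, "Are spin serves allowed? Cite the relevant section."]
)
print(response.text)
```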
Building Google AI directly into the OS elevates the entire smartphone experience, and Android is the first mobile operating system to include a built-in, on-device foundation model. This lets us bring Gemini goodness from the data center right into your pocket, so the experience is faster while also protecting your privacy. Starting with Pixel later this year, we’ll be expanding what’s possible with our latest model, Gemini Nano with Multimodality. This means your phone can understand the world the way you understand it, so not just through text input, but also through sights, sounds and spoken language.
Before we wrap, I have a feeling that someone out there might be counting how many times we have mentioned AI today. [Applause] And since the big theme today has been letting Google do the work for you, we went ahead and counted so that you don’t have to. [Applause] That might be a record for how many times someone has said AI.