Hi, this is Wayne again, with the topic "AI Generated Videos Just Changed Forever".
All right, so this is simultaneously really impressive and really frightening at the same time, and it's hitting me in ways that I didn't really expect. So, do you remember Will Smith eating spaghetti? Do you remember when this was what AI-generated videos looked like? Remember when we said, okay, this AI stuff is cool and all, but clearly there's a long way to go before there's any need for concern? Well, welcome to the future, people, because this is also an AI-generated video. And so is this, completely synthesized out of thin air by computers. This one too. This is not real. Absolutely ridiculous.
How far we've come in literally one year. This does feel like another ChatGPT or DALL·E moment for AI. And maybe I'm overreacting because, okay, I'm a video creator, so an AI that's actually doing my job, maybe that feels a little more threatening, so I'm particularly impressed by it. But also, this stuff is really good.
So today, Sam Altman and OpenAI announced a new model called Sora, and it can generate full video clips, up to one minute long, from just text input. The same way DALL·E was able to understand our text input and turn it into a photorealistic or stylized image or whatever you want, it's the same thing with Sora.
But now, since it's videos, it also needs to understand how all these things like reflections and textures and materials and physics interact with each other over time to make a reasonable-looking video. And of course, right away there's a bunch of examples on their website that are crazy. Now, before I show you these, I just need you to keep this in mind: you're about to watch a bunch of AI-generated videos, and you know that you're about to watch a bunch of AI-generated content. So your brain is already looking for this stuff, and it's not perfect; you will find imperfections. But not everybody who sees AI-generated content on the internet knows to be looking for that, so keep that in mind too. This is also the worst that this technology is going to be from here on out.
So, okay, here's one of the videos. There's no audio to any of these clips, but the prompt for this one is: a stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots. This video is already miles ahead of where we were.
It has accurate lighting, materials, skin tones, and movements; it even has reflections all over the place. Now, of course, if you look at it very closely for more than about 10 seconds, there are lots of giveaways: this dude in the background kind of looks like he's gliding in a weird way, the frame rate and the reflections in the water are for some reason lower than the rest of the video, the camera movement overall is just a bit inconsistent, and it just kind of feels a little bit off. But then again, this is where we were one year ago, so just keep that in the back of your head for all of this. Okay, how about this one? This is another one, which has a long prompt about a camera following behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road. This is also, again, really good. It kind of looks a little more video-gamey because of how rock-solid the drone footage is, but it's clearly very usable. Here's another one: a litter of golden retriever puppies playing in the snow, their heads popping in and out of the snow, covered in it. It's so good. The physics of the fur and the ears and everything, with the snow flying around in slow motion, is incredible. I've looked through all of the sample videos on OpenAI's website, and clearly these are the handpicked best ones that they chose to share, where they just put in some text, get a video, and don't modify it, but there's really impressive stuff in there.
Some of it has humans, some of it doesn't. Some of it feels more realistic, like the truck driving one, but some of them are more video-gamey or more stylized. A lot of it is slow motion.
I just have to say, how insanely fast these models are improving is genuinely the shocking part. Like, I remember not even that many months ago, DALL·E 3 was really, really high-end, and you could always still find something off about it, especially if you asked it for something like a photorealistic image of a human: something about the hands or the ears would always just be a little bit off, never mind the physics. But even this video here is crazy at first glance. The prompt for this AI-generated video is: a young man in his 20s is sitting on a piece of a cloud in the sky, reading a book. This one feels like 90% of the way there for me, like it's beyond the uncanny valley of Apple's Personas, which are actually based on real humans.
This is a made-up person. I mean, his eyes are kind of weird, and the motion of the pages in the book is kind of odd, and yeah, obviously he's sitting in a cloud, and that's a giveaway. But the lighting and the shadows and the skin tones, and then all the realism of the textures on the shirt, and the way the shirt and the pants move, and the hair: they're all really impressive. And then for this one, they typed in: a movie trailer featuring the adventures of the 30-year-old spaceman wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film.
The close-up of his face, the fabric on the helmet, the film grain through every shot, and the cinematic style: this is one of the most convincing AI-generated videos I've ever seen, minus maybe the weird physics of that dude walking kind of in fast motion. So, Sam Altman, if you follow him on Twitter, he's going through a whole bunch of people's requests and posting a bunch more generated videos.
And so, if you want to check out his profile, you can see those. But here's the thing about these AI-generated videos: as good as they've gotten to this point, they can and will pass as real videos to people who are not looking for AI-generated videos. Now, that is obviously insanely sketchy during an election year in the US, and also terrifying for a bunch of other internet-related reasons. But it's also perfect for stock footage. There are already all kinds of presentations and advertisements and PowerPoints that are in need of oddly specific stock videos, and these AI-generated videos are already good enough to 100% pass for that purpose. Like, look at this one, the one with the waves at Big Sur, this drone shot. Honestly, if I saw this on Twitter, I wouldn't even think twice. I'd be like, oh, nice drone shot, dude. I wouldn't even think about AI.
If I wasn't pixel-peeping at, like, the way the water was moving, this is a totally usable video in an ad for some California-based product. And that has all sorts of implications for the drone pilot who no longer needs to be hired, and for all the photographers and videographers whose footage no longer needs to be licensed to show up in that ad that's being made. It's already that good. There's other stuff, like this wall of TVs, which would be a totally expensive and difficult thing to shoot with a camera and all these old, expensive props.
But if you can just generate it this well, with reflections and the environment and everything else around it, I mean, why do it any other way? It's also very capable of historical-themed footage. This is supposed to be California during the Gold Rush. It's AI-generated, but it could totally pass for the opening scene in an old western with the right music over it. How long until an entire ad, every single shot, is completely generated with AI? Or what about an entire YouTube video, or an entire movie? I'm tempted to say we're a long way away from that, because this still clearly has flaws, and there's no sound, and there's a long way to go with the prompt engineering to iron these things out. But then again, the spaghetti was like a year ago. Now, I actually like that OpenAI, on their website, shows some of the downfalls of this particular model too, because who would know better than the people who have been using it? This is a very private tool right now, by the way; it's in super limited access, so it's in the hands of red teamers, which basically means people testing it, pushing the limits, trying to break it, and a few trusted creators. But they have found plenty of weird edge cases, like this clip here of a bunch of gray wolf pups. It looks normal at first, but then it's pretty clear that something's kind of off with the way they're just appearing out of nowhere and walking through each other. That's kind of weird. Or this clip of a guy running on a treadmill, which, I mean, I don't really have to say much more about why this one's weird. But this is my favorite one. So again, just try to put yourself in the mind of someone who's not expecting AI; you're just scrolling through Facebook or Twitter or something, right?
So you just see this video. First, I just want you to watch this clip as if it's just a stock video you found of a grandma celebrating her birthday, and just try to think, like, I wonder what birthday she's celebrating, right? I don't know, how old do you think she is? 60? 65? Maybe it's the big 70. She seems to really like that cake. And did you see it? Did you catch that? I'm going to play it again, but this time, watch the video knowing that AI-generated photos and videos have trouble accurately doing hands. I'll play it again. And now it feels super obvious. Every time you watch it, watch a different set of hands; it gets weirder and weirder. You can watch it like five times, and there's dead giveaway after dead giveaway, not even mentioning the weird inconsistencies with the direction of the wind on the candles. But even as I'm saying all that, even as it's coming out of my mouth, I can't help but remember that 12 months ago we were critiquing this.
So what does this all mean? Well, there's what it means now, and there's what it means for the future. Now, Sora, this thing that they've made, is clearly a really impressive video-generation AI tool that is both going to fool people and also be very useful. There's also a watermark in the bottom corner of every video generated by it. So if you see one of those videos, and ideally it hasn't been cropped out, then that's at least a pretty clear indicator that it's a generated Sora video. But also, I do think they're going to have to be very careful with this. They're going to have a whole bunch of safety stuff to keep in mind. I think they'll probably have to be even more careful than with DALL·E. Like, you shouldn't be able to generate people's likenesses; you shouldn't be able to make a politician look like they're doing something on video, especially this year. You probably won't be able to make Will Smith eating spaghetti. But it also definitely means stock video generation is absolutely going to put a dent in video licensing. Like, I can basically guarantee that, logistically: why would anyone making something pay for footage of a house on the cliffs when they can generate one for free, or for a small subscription price? That is the real scary part of what this tool implies. But in the future, it gets pretty existential, man. I mean, okay, if this is trained on all the videos that have ever been made by humans, then surely it can't be innovative or creative in ways that humans haven't already been, right? I don't know. Either way, I'll have all the links below for all the Sora stuff and the OpenAI stuff, and I guess I'll talk to you next year, when we look back and go, remember that first version of Sora, and how bad those wolf pups looked when they spawned out of nowhere? Just remember: this is the worst that this technology is going to be from here on out. That's hot. Thanks for watching. Catch you in the next one. Peace.