AMD’s Hidden $100 Stable Diffusion Beast!

Hi, this is Wayne again with the topic “AMD’s Hidden $100 Stable Diffusion Beast!”.
Things are moving so fast in the machine learning space that we could actually see general artificial intelligence, or at least something that resembles general artificial intelligence, within the next five years. I know we’ve been saying that since the 80s, or certain people have been saying that since the 80s, but maybe it’s actually really happening this time. If you want to experiment with this, it gets a little tricky: you can use gamer GPUs, but you don’t have a lot of VRAM, or you can try to piece other things together. Nvidia gets all the attention, but AMD is actually catching up fast.

But make no mistake: they’ve always been there in the supercomputer space. I mean, there’s a reason Oak Ridge is using the AMD stack for all of their stuff. But those are the smartest guys in the room, and sometimes it’s exhausting being the smartest guys in the room. So what do you do? Well, the Instinct MI25s are about a hundred dollars on eBay, because the one-percenters don’t want those anymore. They don’t want those in the data center; they’re busy buying forty-thousand-dollar GPUs or twenty-five-thousand-dollar GPU systems, like the MI210. I took one apart with Gamers Nexus and we did some builds: our Supermicro BigTwin system, six MI210s in 2U.

That is an absolutely ridiculous system. AMD, for their part, partnered with PyTorch, so if you use Python for machine learning or anything like that, you can drop in and you’re ready to go. It’s a little bit more of an uphill battle getting an Instinct MI25 to work with that setup. But if you’re willing to put in the work on a hundred-dollar Instinct MI25, you can flash the VBIOS on it to be a WX 9100, and it does actually have a single Mini DisplayPort out which will work with that BIOS.
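Once you have a ROCm build of PyTorch installed, a quick sanity check looks roughly like this. This is a minimal sketch, not the Level One guide itself; the pip index URL shown in the comment is just an example, and the exact ROCm version you need depends on your setup. Note that the ROCm backend still presents itself through the torch.cuda API:

```python
# Quick sanity check that PyTorch can see an AMD GPU through ROCm.
# Assumes a ROCm build of PyTorch is installed, e.g. something like:
#   pip install torch --index-url https://download.pytorch.org/whl/rocm5.4.2
# (the exact ROCm version in that URL depends on your setup)
import torch

print("PyTorch:", torch.__version__)              # ROCm builds report a "+rocm" suffix
print("GPU visible:", torch.cuda.is_available())  # ROCm reuses the torch.cuda API

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("Device:", props.name)
    print("VRAM:", round(props.total_memory / 2**30, 1), "GiB")  # ~16 GiB on an MI25

    # Tiny smoke test: a matmul on the GPU
    x = torch.randn(4096, 4096, device="cuda")
    y = x @ x
    torch.cuda.synchronize()
    print("Matmul OK:", y.shape)
```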

You can almost double the power limit of the card, and as long as you can keep it cool with whatever madness you happen to be running, it will actually be pretty stable. Now, gigabuster on our forum is the one that put this together and figured out the dependencies and all of the software. See, the MI25s are so old that they’re right on the edge of software support, and AMD has been adding new software and new features and new everything for their Instinct line, for the MI100 and the MI200, and now we’re on the precipice of the MI300, so those are the cards that are getting the most attention.

The MI25 is based around Vega 10, so that’s GCN 5.0, but it has 16 gigabytes of VRAM. Yes, 16 gigabytes of VRAM, and the memory bandwidth is 462 gigabytes per second. You can do a lot with that in machine learning, even though some of these models take like 40 gigabytes of VRAM. You can still do a lot of Stable Diffusion: AUTOMATIC1111 running in your local browser, doing your own stuff. You can get a bunch of previews; I mean, it takes like 20 minutes to get 16 previews at very high fidelity at 768 by 768. I’ll show you; it’s worth it, I promise.
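If you want to poke at the power limit from Linux rather than through the VBIOS flash, a minimal sketch along these lines can work. This is an assumption on my part, not part of gigabuster’s guide: it relies on your ROCm install’s rocm-smi exposing the --showpower and --setpoweroverdrive options, which can vary between releases, so check `rocm-smi --help` first.

```python
# Hedged sketch: inspect and raise the power cap using rocm-smi from Python.
# Assumes rocm-smi is on PATH and supports --showpower / --setpoweroverdrive;
# flag names differ between ROCm releases, so verify with `rocm-smi --help`.
import subprocess

def show_power():
    # Print current power draw / cap as reported by rocm-smi
    subprocess.run(["rocm-smi", "--showpower"], check=True)

def set_power_cap(watts: int):
    # Needs root; --autorespond y answers the confirmation prompt
    subprocess.run(
        ["sudo", "rocm-smi", "--setpoweroverdrive", str(watts), "--autorespond", "y"],
        check=True,
    )

if __name__ == "__main__":
    show_power()
    set_power_cap(170)  # only with adequate cooling, e.g. the blower shroud below
    show_power()
```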

Now, the MI25 has dual 8-pin power connectors, and fortunately they’re the standard GPU-style connector. Enterprise cards a lot of the time will have a CPU-style 8-pin connector, which has different wiring than a GPU-style 8-pin connector.

But these two are the GPU-style 8-pin connector, so it’s pretty easy to hook up in an existing system. The biggest problem is cooling, so we’ve got the NZXT bracket here that we’ve modified a little bit.

Getting the GPU mounting pressure just right when you do this is a little tricky: definitely not recommended, not for the faint of heart, and probably not your first project. In an ideal world, the more accessible solution is to download this 3D-printable shroud and bum somebody’s 3D printer if you don’t have one. It’ll mount here on the end of your card, and then you can pick up a standard blower. This is a BFB1012H brushless blower motor; it’s three-pin, so it’s wired for your motherboard, and then boom.

Look at that. Now, this is the longest GPU ever, but it will work in cases such as the Fractal Meshify (the big one), and as long as this fan is running at full tilt, you can run 170 watts through this card without too much issue. Now, Stable Diffusion is a lot of fun, and you can run 768 by 768 models with this. It’s actually surprisingly competent.
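If you want to reproduce that kind of run outside of the AUTOMATIC1111 web UI and measure it yourself, a rough sketch with the Hugging Face diffusers library looks something like this. The Stable Diffusion 2.1 checkpoint, the prompt, and the crude timing approach are my assumptions, not what the Level One guide uses:

```python
# Rough FP32 benchmark sketch using Hugging Face diffusers (not AUTOMATIC1111).
# Assumes the ROCm build of PyTorch plus `pip install diffusers transformers accelerate`.
import time
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

MODEL = "stabilityai/stable-diffusion-2-1"  # assumed 768x768-capable checkpoint

pipe = StableDiffusionPipeline.from_pretrained(MODEL, torch_dtype=torch.float32)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)  # Euler sampler
pipe = pipe.to("cuda")  # ROCm devices show up through the torch.cuda API

def bench(width, height, steps=20):
    torch.cuda.synchronize()
    start = time.time()
    image = pipe(
        "a photo of an astronaut riding a horse",
        width=width, height=height, num_inference_steps=steps,
    ).images[0]
    torch.cuda.synchronize()
    elapsed = time.time() - start
    print(f"{width}x{height}: {elapsed:.1f} s total, ~{steps / elapsed:.2f} it/s")
    return image

bench(512, 512)   # roughly 2.5 it/s on the MI25, per the numbers below
bench(768, 768)
```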

At FP32, 512 by 512 with Euler at 20 steps is about 2.56 to 2.57 iterations per second; at 768 by 768 it’s more like 27 seconds. So not bad, and it’s only using 12 gigabytes of VRAM, so you’re staying well under the 16-gigabyte limit. For comparison, to show how far we’ve come: that Supermicro BigTwin system (if you didn’t see those videos, be sure to check them out), at FP32 and 20 steps, takes 2 seconds for 512 by 512 and 6 seconds for 768 by 768. That’s pretty fast. So once you follow the guide and get everything up and running, it works really well. Now, if you’re using newer hardware, you don’t really have to worry about the versions as much, again because AMD is supporting the PyTorch Foundation, because they’re supporting AI in general, and because, you know, AMD’s coming. We did this fun clip of The Shining for a video that we released on Halloween last year, and it’s even more like that today. I’ve got this thing running generating, you know, fun, interesting Danny DeVito images, because if you’ve watched Level One for a long time, you know our benchmark for AI is when we get to an AI agent.

That is, an AI agent where you can just say: here are the Lord of the Rings movies from Peter Jackson; I would like for you to replace every character in this movie with Danny DeVito. And we’re basically at the point where AI can do that, a lot sooner than I expected. So that’s why I say artificial general intelligence is probably coming a lot sooner than I expected, probably on the order of five years or so; or at least you can have a personal assistant that is indistinguishable from general artificial intelligence. Maybe, I don’t know, we’ll see. Because you can do this on an MI25; it’s the software that’s catching up.

The hardware that we have today is what’s going to run that. That’s probably why people are buying these GPUs for $25,000, $35,000, $45,000, even on eBay. Well, the newer ones, not the MI25s; these are, you know, a hundred dollars. Oh, actually, this is the Radeon Pro V540; Amazon is getting rid of these right now.

This is not the kind of GPU that you will want to do this stuff on, but it is a dual-GPU solution. Amazon used to have these, and you can get your hands on these as well.

This is going to be a different video, though; these are maybe not for machine learning, and it’s a little tricky to get the drivers for them. If you can help with the Windows drivers for this, please do, because they’re in the Amazon cloud and that’s pretty much it; they’re not on the AMD website, because this is, you know, a V540, which is a dual-GPU version of another AMD GPU. So it’s a little weird, but maybe it’s a good candidate for our VFIO GPU passthrough stuff, which, by the way, is making a lot of progress. Look out for a video on that soon.

Yeah, the Instinct MI25 for $100: that you’re able to do this at a reasonable speed is genuinely very impressive. And yeah, you can do it on a gamer GPU, but 16 gigabytes of HBM2 for a hundred dollars? That’s a really good deal. I don’t know that I would pay a lot more than $100, because you will put in a lot of work in order to get it actually working, and follow the Level One guide.

Again, thanks, gigabuster. But yeah, you can build kind of a beastly machine, assuming that you can keep them cool. Stable Diffusion on AMD hardware, both old and new, is basically ready and shockingly good, and it’s a preview of what’s next. I’ve also written a little guide on getting Open Assistant working with one of their open-source models. Yeah, there’s a model that’s really good, but it’s sort of encumbered by some licensing issues for commercial and other use, but they do have fully open models. So you can download one of the 12-billion-parameter models there and be able to run it, but you will need a beefy GPU; it runs out of memory even with 16-gig-VRAM GPUs. AMD is working on proper ROCm support for 7000-series GPUs and beyond, so 20 gigs, 24 gigs.
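For reference, a minimal sketch of loading one of the fully open Open Assistant checkpoints with Hugging Face transformers might look like the following. The specific Pythia-12B-based model name, the half-precision setting, and the prompt tags are my assumptions rather than something from the guide, and as noted above, a 12B model will not fit in 16 GB of VRAM without offloading:

```python
# Hedged sketch: run an Open Assistant model with Hugging Face transformers.
# The model name is an assumption (one of the Pythia-12B-based SFT checkpoints);
# float16 roughly halves memory versus FP32, but 12B weights are still ~24 GB in
# half precision, so device_map="auto" is used to spill layers to CPU RAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=torch.float16,
    device_map="auto",  # requires `accelerate`; offloads what doesn't fit in VRAM
)

# The Pythia-based SFT models use <|prompter|> / <|assistant|> style tags.
prompt = "<|prompter|>Explain what ROCm is in one paragraph.<|endoftext|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```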

But just understand that AMD has their CDNA and their RDNA, and those are separate lines. These are CDNA cards, compute DNA, and that’s what they still have in the data center. That’s what our MI210 is; that’s what the MI300s are. Eventually those roads may come back together, but fundamentally your CDNA cards and your gaming cards are different things. So it’s there for experimentation, but it’s a little different. This has been a project.

I mean, where else can you play with an angle grinder and 3D-printed parts and also machine learning, toward our ultimate goal of being able to just ask an AI agent to substitute your favorite actors and characters into whatever movie and genre you want, to create any kind of mashup or meme that you want, much to the horror of literally everybody who is a normal human being? I’m not one of those. This has been some fun with the AMD Instinct MI25, and it shows that if you’re just going to run PyTorch, you’re basically good to go on AMD CDNA cards at this point, and it’s very good; there’s a reason that Oak Ridge is using this. I’m Wendell, this is Level One, and I’m signing out. You can find me in the Level One forums.