The Supermicro Dual-Socket AS -2125HS-TNR featuring Genoa


Hi, this is Wayne again, with today's topic: "The Supermicro Dual-Socket AS -2125HS-TNR featuring Genoa".
This is the Supermicro 2125HS-TNR. Listen, until Supermicro comes up with better names for these things, we're going to call this the Supermicro Beefcake, because this is Genoa: 128 PCIe lanes and about 70 gigabytes per second of storage interface speed. We can do it, okay? Spoiler alert. I picked this up as a barebones system from Supermicro, but I've been adding my own Genoa CPUs and DDR5 memory. Let's take a closer look.

Did he add an Intel Quick Assist accelerator to a Genoa system? Uh, yes, I did.


This is an Intel Quick Assist 8970. That's another video, though; let's focus on the chassis here. So obviously this is a storage-oriented platform, but one that focuses on compute. You see, things with the PCI Express lanes are not a lot different with Genoa versus Milan, the previous generation. I mean, we've moved to PCI Express 5.0 and DDR5, so okay, we've doubled the amount of PCIe bandwidth that we have, but it's still 128 lanes. We have 128 lanes whether you're running a single-socket or a dual-socket configuration; most of the default configurations are 128 PCI Express lanes. The one exception is when you reduce the links between the CPU sockets: you can take away one of the links between the sockets and have a total of 160 usable PCI Express lanes in a two-socket system.


However, that is not what Supermicro has opted to do here. There are four interconnect links between the two sockets, which might seem counter to what your expectations would be.


It's like, oh, this is a storage-oriented system. But I've actually been running this system for a few months, and for any kind of job where you've got heavy processing and you rely heavily on moving information from socket to socket, that fourth high-speed xGMI link between the sockets is much appreciated. Plus, we can still get the job done with 24 lanes.
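To make that trade-off concrete, here is a quick back-of-the-envelope sketch of the lane math being described: how many lanes are left for slots and drives with four versus three inter-socket links, and what the Gen 4 to Gen 5 doubling means for a x16 slot. The per-lane throughput figures are rough, commonly cited approximations, not numbers from the Supermicro manual.

```python
# Back-of-the-envelope lane math for a dual-socket SP5 (Genoa) system.
# Per-lane throughput numbers are rough approximations, per direction.
GEN4_GB_PER_LANE = 2.0   # ~2 GB/s per PCIe 4.0 lane
GEN5_GB_PER_LANE = 4.0   # ~4 GB/s per PCIe 5.0 lane

def usable_lanes(xgmi_links: int, lanes_per_socket: int = 128,
                 lanes_per_link: int = 16) -> int:
    """PCIe lanes left for slots/drives in a 2P system.

    Each socket exposes 128 flexible lanes; every xGMI link between the
    sockets consumes 16 of them on each socket.
    """
    return 2 * (lanes_per_socket - xgmi_links * lanes_per_link)

print(usable_lanes(4))  # 128 usable lanes with all four xGMI links (this chassis)
print(usable_lanes(3))  # 160 usable lanes if one link is traded for PCIe

# What doubling the per-lane rate means for a single x16 slot:
print(16 * GEN4_GB_PER_LANE, "GB/s (Gen 4 x16) vs", 16 * GEN5_GB_PER_LANE, "GB/s (Gen 5 x16)")
```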

That’S no problem. We’Ve got a very simple but effective mechanism for routing our cooling over our CPUs. We’Ll talk more about that in a minute, as well as our ddr5 memory, which does use quite a bit more power than the last generation of ddr4. That’S sort of par for the course isn’t it.

DDR5-4800. That's what this supports out of the box.

You can do DDR5-5600 via the BIOS, but that is not an officially supported configuration when we're talking about Genoa, although it does work. If you want that to be stable, you will have to run a higher SoC voltage, and that's not officially supported either. Are we overclocking Genoa at this point? Well, you shouldn't. You can, but you shouldn't; that is an off-label use, not officially supported or condoned by Supermicro, so don't get your hopes up. Now, one of the reasons I got this platform is because it's a relatively low-cost, high-performance storage platform. Two sockets mean we have a ton of CPU horsepower available to play with, and yes, I've run this with 96-core CPUs. We'll talk about the specific configurations in just a second. It also means that our remote management solution, our out-of-band solution and all that, is built into the motherboard; it's not removable or modular.
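For context on why DDR5-4800 out of the box is already plenty, here is the rough per-socket bandwidth arithmetic. It assumes all twelve channels per socket are populated, and these are theoretical peaks; real-world numbers will of course be lower.

```python
# Theoretical peak memory bandwidth per Genoa socket.
CHANNELS_PER_SOCKET = 12   # SP5/Genoa memory channels
BYTES_PER_TRANSFER = 8     # 64-bit data bus per channel

def peak_gb_per_s(mt_per_s: int) -> float:
    """Peak GB/s per socket at a given DDR5 transfer rate (in MT/s)."""
    return mt_per_s * 1e6 * BYTES_PER_TRANSFER * CHANNELS_PER_SOCKET / 1e9

print(peak_gb_per_s(4800))  # ~460.8 GB/s per socket at DDR5-4800 (supported)
print(peak_gb_per_s(5600))  # ~537.6 GB/s per socket at DDR5-5600 (off-label)
```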

We do have a single OCP 3.0 slot in this configuration. If you check out pages 13 through 15 in the manual for this chassis, Supermicro goes over the flexible configurations for this platform; you can have a lot of options here at the rear. As configured, my system just has a slot 3 and slot 4 riser.

With that riser here at the back, I can run a QAT card, a 200-gig card, or a 400-gig card, and we've also got our OCP 3.0 slot, which is 16 PCI Express 5.0 lanes. Yes, that's right: if we were to run a dual 400-gigabit card in this, we can achieve about 45 to 50 gigabytes per second off of this chassis, assuming we're using an appropriate number of PCIe Gen 4 drives in this chassis. So, like the Kioxia CM7, doing about six to seven gigabytes per second each; multiply that across the drives and that's a lot of gigabytes per second! Hopefully you've got relatively low software overhead doing that, and if not, you've got up to two times 96 cores to do all your processing and offload and everything else.
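To sanity-check that 45 to 50 gigabytes per second figure, here is a hedged sketch of the arithmetic. The per-drive number comes from the CM7 figure quoted above; the drive count and the network efficiency factor are assumptions for illustration only.

```python
# Rough "can the network keep up with the drives?" arithmetic.
drive_gb_per_s = 6.5           # ~6-7 GB/s per drive, as quoted above
drive_count = 8                # assumed number of busy drives, for illustration

nic_gigabits = 2 * 400         # dual-port 400 GbE
nic_line_rate = nic_gigabits / 8   # 100 GB/s raw line rate
realized_fraction = 0.5        # assumed protocol/software overhead

print(f"storage: ~{drive_gb_per_s * drive_count:.0f} GB/s from {drive_count} drives")
print(f"network: ~{nic_line_rate * realized_fraction:.0f} GB/s realized from 2x400GbE")
```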

One of the reasons to run dual socket for a storage server is that you have a lot of processing overhead to run, double the memory bandwidth, and, theoretically, double the network exit bandwidth. So this actually would be a good candidate chassis to run a NIC attached to each CPU socket and configure it accordingly if you want to run an NVMe Ceph cluster, because you've got a lot of CPU overhead there and you might be able to tolerate some latency. Two 400-gig NICs in this platform, one PCI Express and one OCP 3.0, would probably be a pretty good choice. I'm not there yet personally; I'm just running one 200-gigabit NIC for experimentation purposes, so I'll get there. But that's a relatively pedestrian, you know, 20 to 40 gigabytes per second, not super impressive in this day and age. Or, conversely, you can say that the PCIe Gen 5 platform that Genoa brings to the table is so over-the-top impressive that merely 32 gigabytes per second from your network interface just feels pedestrian, because, you know, we are talking about over 150 gigabytes per second of main memory bandwidth realized in just about every application you can think of.
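If you do go with one NIC per socket, one in the PCIe riser and one in the OCP 3.0 slot, it helps to confirm which NUMA node each port actually hangs off before pinning Ceph OSDs or other storage daemons. A minimal Linux sketch follows; the interface names are hypothetical examples.

```python
# Print the NUMA node each network interface is attached to (Linux sysfs).
# A value of -1 means the platform did not report a node for that device.
from pathlib import Path

def nic_numa_node(iface: str) -> int:
    node_file = Path(f"/sys/class/net/{iface}/device/numa_node")
    return int(node_file.read_text().strip()) if node_file.exists() else -1

for iface in ("enp1s0f0", "enp193s0f0"):   # hypothetical port names
    print(iface, "-> NUMA node", nic_numa_node(iface))
```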

Truly, we are living in the future. If you dig right into it, you'll actually notice the motherboard is scarcely bigger than the CPU sockets and the actual memory interface.

This is really just a glorified adapter for plugging CPUs directly into storage and whatever your network interface is, and that's basically it. I mean, the motherboard really is just connectors and power regulation, although even some of that is on the DIMMs these days. Now, let's dive into the nitty-gritty of the operational parameters for this chassis. Like I say, I've been running this for a while, over a wide variety of temperature conditions, and also with CPUs at a cTDP of about 400 watts. Now, if you are going to run this platform with 400-watt CPUs, there are some things to keep in mind. First, take a good look at your heatsink configuration: remember, along with our plastic ducts, we've got these two San Ace fans.

These fans direct a significant amount of airflow over our CPU coolers. However, ambient temperature matters: I did tests in our server closet at 76 degrees Fahrenheit (about 24 degrees Celsius) as well as 62 degrees Fahrenheit (about 17 degrees Celsius). Sorry, the air conditioning system is in Fahrenheit.

I probably could change it, but yeah, don't worry about it. At 62 degrees and 400 watts, the server has no problems at all: no throttling, and everything else is pretty good. If you run it at 76 degrees, however, you will have to use the remote management to dial up the fan profile in order to prevent the CPUs from starting to throttle. With the 96-core Genoa CPUs, you can lose up to three percent performance because the CPUs will not boost as high at the cTDP of 400 watts. Now, whether that's the CPU throttling or the VRM on the motherboard throttling with the thermal headroom that's available, I'm not really sure.

It’S not really worrying, it’s almost, but not quite margin of error territory. I double checked our Eaton E3 pdu just check power usage and the power usage was a bit lower in test runs in the warmer server environment. Super micro specifications say that the operating temperatures are good up to 95 degrees Fahrenheit, which is quite toasty. There are some data centers out there that are designed to run in about 80 degrees Fahrenheit. I would just say that, if you’re going to do ctdp at 80 degrees Fahrenheit, this is probably not the chassis that you want or you’re going to have to run your fans at a higher RPM and they will use significantly more power.

I think if you’re running you know a 72 or 73 degree Data Center and there’s plenty of cool air coming in the front of this chassis, it’s not going to be an issue in addition to a configuration with our kyoxia drives. We also set up solidime storage. Now, solidime consumer storage required the use of adapters, and I used these icdoc adapters. You can check out. These are great. These will let you use really consumer level ssds in your fancy, pants storage server. This is not something I recommend you shouldn’t actually do this, I’m just chasing that world record unicorn which, by the way we were able to achieve.

With this platform, we were able to use Microsoft SQL Server and do a two-terabyte database backup in under 15 seconds. Yeah, Supermicro got it done. Supermicro got it done to such an extent that it actually broke the internal thread scheduler in SQL Server; Microsoft SQL Server is not really designed for use on 192 cores, it sort of tops out at about 64 cores. Oh, I know the Microsoft SQL MVPs will say, oh yeah, you can do it, but they don't actually test it on there. They expect you to run a virtual machine cluster of those instances, and actually that would be a better choice if you were going to run something really high performance that also wasn't licensed per socket.
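Just to put the two-terabyte-in-under-15-seconds figure in perspective, the effective rate works out like this (keep in mind that SQL Server backup compression means the bytes actually hitting the drives can be far fewer than the logical database size):

```python
# Effective logical throughput of a 2 TB backup completing in 15 seconds.
database_gb = 2 * 1000
seconds = 15
print(f"~{database_gb / seconds:.0f} GB/s effective")   # roughly 133 GB/s logical rate
```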

You would be better off stacking two or three of these chassis and building a cluster with a lower core count, you know, 16 or 32 cores per system, versus loading up on 64 or 96 cores in a single chassis for the database system. The reason for that is that when you run your database workload distributed as a cluster, you don't have a single point of failure.

Listen if you’re shelling out a hundred thousand dollars plus for your database server license. I think you can shell out to get a redundant High availability solution where you’re not just running it on one single box, it’s not really ever the case that you would not want to redundant solution, a highly available solution. Uh at those kind of price points.

I mean even if you’re you’re, rocking an open source database system, you make sense to go cluster first. The other use case that I considered was using this as some kind of a a head like a processing head or a front end for a software defined storage. Server or a software-defined storage stack, this chassis actually would make a good storage node that used both nvme and spinning rust, meaning that, you would add, you know, a regular old serial attached, scuzzy or some other kind of an interface card in your pcie slot here and Use dual 100 Gig: dual 200 gig: dual 400 gig ocp3.

You could have, you know, upwards of half a petabyte in those 24 bays at the front and then everything else connected to external disk shelves. I mean, this dual-socket configuration could easily handle 96 drives, such as the dual-actuator Seagate Exos drives that we took a look at previously, and then you build two of those, because you need redundancy and failover and replication and everything else, and you would be hard-pressed to find a better solution for any kind of storage need that you might have, both in terms of speed and the volume of information. I mean, the fact is that we have PCI Express Gen 5 plus DDR5 plus the fastest Zen 4 cores; there's nothing else faster and lower latency that you can get in the server space right now. Couple that with true enterprise storage and there's really probably not a better storage solution out there, unless you get into really insanely expensive systems. This would also make an excellent platform for doing those kinds of experimentations if you're doing something like Kubernetes, for example, and you want to do multi-tenant Kubernetes where everybody gets an SSD, you know, a 15- or 30-terabyte M.2.
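As a rough sketch of the capacity numbers in play here, with the drive sizes chosen purely as illustrative assumptions:

```python
# Illustrative capacity math for the configuration described above.
front_bays = 24
nvme_tb_each = 30.72                 # assumed 30.72 TB enterprise NVMe drives
print(front_bays * nvme_tb_each)     # ~737 TB up front, comfortably past half a petabyte

external_hdds = 96
hdd_tb_each = 18                     # assumed dual-actuator Exos 2X18 capacity point
print(external_hdds * hdd_tb_each)   # ~1.7 PB of spinning rust on external shelves
```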

You don’t have to fill it up. Just say you use 12 of your 24 plus you throw in some 64 core CPUs here with the memory footprint and the memory configuration I mean remember, you can run up to six terabytes per socket here, but you know you sort of the the configuration you get In 2023 is probably about two terabytes per socket. You get a rack of these and you’re basically starting to compete with lenode. That is an option from the super micro manual. We know that there are some other configurations that do support gpus for this uh g-raid. As a GPU based raid solution, you know I’m surprised that the number of vendors that are building in solutions to use pcie accelerators to be able to do nvme raid the traditional way.

I have some thoughts about that, but those thoughts I will share in a different video. For now, this is a configuration flexible enough to support breaking away some of your PCI Express lanes from the front and delivering more slots at the rear; it just does that with cabling and connectors, as we see at the midboard on this system here. Beyond that, there's not really a lot else to say.

Even in excruciating, grueling torture-test scenarios, I was able to get full 400-watt performance out of our 96-core CPUs. I think most people are probably going to be running the frequency-optimized F-series CPUs, which makes this a 2x 32-core system. Which, by the way: if you want the fastest single-core clock speeds, the dual-socket configuration is probably a better choice over the single-socket configuration when we're talking about merely 32 cores, because you've spread the heat load out and you've also doubled your memory bandwidth. That can be an optimization for some types of workloads.

For something like a Kubernetes cluster, the 48- or 64-core parts, something like that, are probably a good fit for this platform if you're going for density. Well, I mean, you could go 96 cores and still have something that is insanely high performance. If you do go something that high density, you probably need something faster than Ethernet.

You know, 200- or 400-gig InfiniBand; I've even been experimenting with Omni-Path. Those kinds of things will work better in this configuration. For all of our benchmarks, we used the Phoronix Test Suite.
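If you want to reproduce that approach, a minimal sketch of driving the Phoronix Test Suite from a script is below. The test profile names are just examples; check phoronix-test-suite list-available-tests for what you actually want to run, and note the runner may still prompt for test options.

```python
# Run a couple of example Phoronix Test Suite profiles from Python.
# Profile names are examples; results end up under ~/.phoronix-test-suite/.
import subprocess

for profile in ("pts/fio", "pts/stream"):   # example storage and memory tests
    subprocess.run(["phoronix-test-suite", "benchmark", profile], check=True)
```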

It’S pretty great. Our test results are up on open benchmarking.org and for the windows side glenberry, you should check out Glenn Berry’s Channel he’s got some of the the benchmarks and results from the system when we did Microsoft, SQL Server testing and all of the configuration there and and for Microsoft, SQL Server, this system screams, basically no matter what you put in it. It’S insanely fast world record, I’m Wendell. This is level one. This has been a quick look at the super micro 2125 HS TNR. Let me know if you have any questions or you want to take a deeper look at anything, but just how cool is it? The pcie slots are connected with cables, because cables are easy to do for pcie, Gen 5.

rather than a big old giant PCB?