01-03-2024 09:47 PM
01-03-2024 11:11 PM - edited 01-03-2024 11:13 PM
Hi @moddingfox ,
Judging by your memory configuration, you're applying EXPO and then experiencing memory instability. To save you some soak time: all you really need to do is disable EXPO and then evaluate stability. Disabling overclocking when experiencing stability issues should be troubleshooting 101. Overclocking 128GB densities can be challenging, and manual tuning may be required.
01-11-2024 06:30 PM
Okay, so a follow-up here. I found a way to intentionally cause the crash: installing FFXIV and using the alt launcher 'xiv launcher' to patch from the base install to the latest version. For some reason, about 5-7 patches into the 207 patches, it reliably dies. This only seems to apply when virtualizing in QEMU with the CPU type set to host, though (a required part of my stack, as this system is intended to be a hypervisor). I'm going to run with the host CPU type off for a bit, and if that is the issue I'll go annoy the folks on the QEMU forums, as at that point it's probably the more appropriate spot ^.^ I'm still not 100% sure if this is the only thing going on or just a part of it, but it's what I have been able to dig up thus far. I'm also not totally convinced that the red DRAM diagnostic light cycling at crash is an indicator of an error rather than part of the reset sequence somehow, but truthfully I haven't seen any documentation that describes the behavior of the lights in more depth than the tl;dr of "read the component name by the red light and associate the error with that somehow" (not super great in my opinion, but whatever).
01-21-2024 08:10 PM
So, similar setup here. 7950X3D, 64GB GSKILL F5-6000J3040G32GX2-TZ5NR on EXPO, 980 Pro 2TB, 7900XT, NZXT Kraken 240 AIO. And I've had numerous crashes over months. I've uninstalled and reinstalled programs, updated my PSU, reseated everything, and run every memtest and free tool I can find, and I can't consistently replicate the crashes, either. I did note in my reliability monitor that ASUS' Noise Canceling Service seemed to crash at the same time as every crash, which made me think it might be related, but I'm just going to keep trying. I'm considering starting an RMA with ASUS, since it *seems* like it could be a bad-capacitor-related issue.
01-21-2024 10:15 PM
My setup has been stable since my last post, with what is now a 10-day uptime. I still haven't nailed down precisely why Windows VMs in QEMU trigger crashes, but it seems it only happens when they are given access to host CPU features. I went back and checked some of the initial weirdness: the Corsair and NZXT RGB controller applications did tend to make the crash happen more often. If you use those, it may be worth looking at them as suspects. After switching to x86-64-v4, all of those seemed fine as well.
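For anyone wanting to try the same switch: as far as I understand it, x86-64-v4 is a fixed feature baseline, while host passes through whatever the physical CPU has. Before switching, it may be worth checking that the host actually exposes the AVX-512 subsets the v4 level requires. Below is a rough Python sketch, assuming a Linux host where the flags are listed in /proc/cpuinfo. The function names `parse_flags` and `missing_v4_flags` are made up for illustration and are not part of QEMU or any other tool.

```python
# AVX-512 subsets that the x86-64-v4 level adds on top of x86-64-v3,
# spelled the way Linux lists them in /proc/cpuinfo.
V4_FLAGS = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (hypothetical case: non-x86 or unexpected format).
    return set()


def missing_v4_flags(cpuinfo_text):
    """Return a sorted list of v4-specific flags that the text does not list."""
    return sorted(V4_FLAGS - parse_flags(cpuinfo_text))
```

To use it, read /proc/cpuinfo on the hypervisor and pass the text in, e.g. `missing_v4_flags(open("/proc/cpuinfo").read())`. An empty list means the AVX-512 part of the v4 baseline is covered. It only checks the flags v4 adds, not the lower levels, and it says nothing about why host passthrough crashes.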