My current build is mostly new. The CPU (Ryzen 7 5800X), RAM (Kingston 64GB DDR4 3200), mobo (Asus Crosshair VIII Dark Hero), video card (ASRock Phantom Gaming RX 6900 XT) and NVMe drive (2TB Samsung 970 Evo Plus) are all less than 12 months old, and some are less than 6 months old. The only "old" parts of the system are the PSU (Corsair AX760i, which still seems OK), the case (Corsair 900D supertower) and the SSDs and HDD. I'm running a Noctua NH-D15 cooler and I'm not overclocking, though I am running the RAM with DOCP settings (3200MHz instead of 2000MHz).
The PC was running stably with no crashes for several months until the end of last year. Sometime in early January I installed new audio drivers from the Asus site, and around the same time I swapped my old Samsung 28" 4K monitor for an MSI Optix MPG321UR-QD on DisplayPort. The new monitor supports 48-144Hz VRR and I've got it set to 120Hz refresh in Windows (10 Pro 64-bit). I also have a second monitor (Dell S3221QS) running on HDMI at 60Hz. I should also say that I updated the Radeon drivers at some point in January too, though I'm not sure whether that was before or after the crashes started.
Since then I've had numerous random hard lockups, but only while gaming (World of Warcraft). The PC completely locks up with a short piece of audio repeating over the headphones or the Dell speakers (depending on which I'm routing audio through). When I say complete lockup I mean nothing responds: the mouse won't move, I can't alt-tab or bring up Task Manager, and I have to power off the PC to reboot.
All the temps and voltages appear to be fine. I've had Corsair iCUE running on the second monitor and nothing appears out of the ordinary in that regard. CPU temps are around 56-60°C, and the video card is about the same. PSU temps are in the mid 30s and power draw is around 300W on a 760W PSU. NVMe drive temps are in the high 40s or low 50s. The only temperature that seems a bit iffy is the southbridge on the mobo: the heatsink on it feels quite warm, though I can't say exactly what temperature it is running at.
My initial thought was that something weird was happening with the audio drivers, and that seemed to be backed up when I saw they'd been pulled from the Dark Hero support site. So I rolled back to the previous drivers but still got random lockups, including one that was so bad that Windows was corrupted and I had to format the NVMe drive and reinstall from scratch.
So now I have a completely clean install of Windows 10 Pro 64-bit with the latest AMD drivers, the "good" Realtek audio drivers and all Windows updates done, and I'm still getting the random crashes. I've literally just pulled the PC apart and reseated everything on the off chance that something was "loose", but I'm not holding my breath.
One thing I've noticed during a couple of the crashes is that the screen may flicker to black and come back a few seconds before the lockup. I checked the event logs and I can't see anything like a "driver stopped responding" error before the crash. Basically the only error showing in the logs is "windows restarted after an unexpected shutdown".
So, any thoughts? Could it be related to running two monitors with vastly different refresh rates?
While it is summer here, it's not super hot, and the temps I'm monitoring in the PC don't seem that bad either. Just sitting in Windows typing this, the CPU is at 34.8°C and the video card at 39°C (though it has yet to crash on the Windows desktop). So temp problems seem unlikely.
I can run Memtest, but I'd be surprised if RAM that was good for six months spontaneously decided to go bad.
Is there anything else I should be installing that I've overlooked? I'm pretty sure Ryzen Master was installed alongside the Radeon drivers.
I can try either stress testing the system with 3DMark or playing another game for hours on end, but it's a bit difficult because the crashes appear to be random. The longest I've gone since the start of the year without a crash is probably 4 or 5 days, but other times I've had two on the same day a few hours apart.
edit: I have crash dumps turned on but I can't find any dump files. So it looks like they're not being written?
I'm running UEFI, not compatibility mode, in case that makes a difference.
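For anyone wanting to double-check the missing crash dumps mentioned in the edit, here is a minimal Python sketch of the kind of check I mean. It assumes the default Windows dump locations (MEMORY.DMP directly under the Windows folder, and small dumps in the Minidump subfolder); if the dump path has been changed in the Startup and Recovery settings, this won't find anything. The function name find_crash_dumps is just something made up for illustration.

```python
import os
from pathlib import Path


def find_crash_dumps(system_root):
    """Return a list of dump files found under the given Windows folder.

    Assumes the default locations: <system_root>/MEMORY.DMP for a
    kernel/complete dump and <system_root>/Minidump/*.dmp for small dumps.
    """
    root = Path(system_root)
    found = []

    # Kernel or complete memory dump (single file, overwritten each crash)
    full_dump = root / "MEMORY.DMP"
    if full_dump.is_file():
        found.append(full_dump)

    # Small memory dumps, one file per crash
    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        found.extend(sorted(minidump_dir.glob("*.dmp")))

    return found


if __name__ == "__main__":
    # SystemRoot is normally set on Windows; fall back to the usual path
    windows_dir = os.environ.get("SystemRoot", r"C:\Windows")
    dumps = find_crash_dumps(windows_dir)
    if dumps:
        for dump in dumps:
            print(dump)
    else:
        print("No dump files found under", windows_dir)
```

Reading the Minidump folder may need an elevated (administrator) prompt. An empty result only says there is nothing in the default locations, not why no dump was written.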