Z690-I + argus monitor / hwinfo cause hard reboots and shutdowns

placinta
Level 7
Hi,
Looking for similar experiences and / or advice.

I built a new system around a Z690-I, and at some point after booting it hard reboots with no warning (sometimes after 10 minutes, sometimes after hours).
It happens while totally idle as well as under medium load.

No BSOD, nothing of note in Event Viewer (just the unexpected shutdown message on the next boot), no WHEA errors, just a hard reset. A few times it simply powered off, and pressing the power button did nothing (audio LED lights / ethernet still ON). I had to unplug and replug the power.
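For anyone who wants to pull those unexpected-shutdown entries out of the System log without clicking through Event Viewer, here is a minimal sketch. It assumes the entries in question are Event ID 41 (Kernel-Power) and Event ID 6008 (EventLog), which are what Windows normally logs after an unclean power loss. The function names are made up for illustration; `wevtutil` is the stock Windows command-line tool.

```python
import platform
import subprocess

# Assumed event IDs for an unclean power loss:
# 41   = Kernel-Power, "rebooted without cleanly shutting down first"
# 6008 = EventLog, "the previous system shutdown was unexpected"
UNEXPECTED_SHUTDOWN_IDS = (41, 6008)


def build_query_command(event_ids=UNEXPECTED_SHUTDOWN_IDS, max_events=10):
    """Build a wevtutil command that lists the newest matching System events."""
    id_filter = " or ".join(f"EventID={eid}" for eid in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event ID
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]


def list_unexpected_shutdowns():
    """Run the query (Windows only) and return wevtutil's text output."""
    if platform.system() != "Windows":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query_command(), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    print(list_unexpected_shutdowns())
```

The timestamps only tell you when each reset happened, not why, but they make it easier to line resets up against what was running at the time.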

Things I tried:
Monitored temps, all fine: CPU under 30 °C idle / 60 °C load, GPU ~35 °C idle.
Running all fans at 80% speed to eliminate unreported temp issues.
Running with case open.
Tried removing GPU (using the cpu with just iGPU). Same story.
MemTest86 and TestMem5 both pass just fine.
Stress testing with Cinebench / FurMark / Unigine showed no issues; it never crashed during them.
Reseating RAM / GPU (no riser).
Disabling C-States, fastboot, trying different XMP profiles.

I was /this/ close to sending the motherboard and CPU back, thinking one of them is faulty (maybe they are).

Until I noticed that the problem goes away if I don't leave Argus Monitor or HWinfo running. If neither of them is running, the system runs just fine.
If at least one runs, at some point the system reboots or powers off.
I confirmed this by closing every single application except argus (or hwinfo) and just let the pc idle while running them. Eventually the system resets.
Nothing is controlled via Argus (e.g fan speed), I was just running it for sensor monitoring. Same for hwinfo.

I noticed that both of them install a system driver, and they are only loaded / running if either of the apps run.

Can a faulty driver really cause such an issue?
Even if the driver is the problem, shouldn't the motherboard firmware + Windows handle any such issue more gracefully (at least a BSOD instead of a power reset)?

Specs:
CPU i7 12700K (no manual overclocking done)
MB Asus ROG Strix Z690-I (mITX)
RAM Corsair Vengeance 2x16G DDR5-4800
GPU Asus Tuf RTX 3080
PSU Corsair 750W
SSD Samsung 980 Pro 2TB
CPU Cooler Noctua NH-C14S + case 4 fans
Windows 11 latest drivers installed manually via ROG website, no Armoury Crate
BIOS 0811

Deke06
Level 7
Z690-I owner here. Similar config w/ 12700K (stock), 2x16GB G.Skill DDR5 6000, Founders 3080Ti, 2x2TB Samsung 980 Pros, watercooled with a custom loop in an Ncase M1 v6.1. Using BIOS 1003 for the past week.

0811 seemed a bit problematic for me, but to be fair I was still tuning the RAM at that time. 1003 has been rock stable.

I know there are issues with running two hardware monitoring programs concurrently and this can lead to system instability and unpredictable crashes. It seems you're not running them simultaneously but still having issues, correct?

I run HWinfo all the time, however I use the portable version, so it's not installed per se on the system. The only thing it's not monitoring is the legacy Asus embedded controller. The mouseover tooltip for that controller states that monitoring it can lead to instability in rare cases.

I'd suggest taking a chance on BIOS 1003 and disabling PCI Express Native Power Management in the BIOS under Advanced\Platform Misc Configuration to avoid the WHEA 17 errors that plague the Asus boards right now when this feature is enabled. Additionally, completely uninstall and clean out Argus and HWinfo. Use the portable HWinfo with the Asus EC sensor disabled if you need monitoring. A clean Windows 11 install may give you peace of mind if you want to be sure everything is scrubbed.

Retest stability. My minimum for DDR5 is Karhu with 6400% coverage followed by 20 cycles of TM5 using anta777 extreme profile.

Final nitpicky point: are you using an older PSU from a previous system? The Corsair SF750 had problems: units manufactured between October 2019 and March 2020, with lot codes 194448xx to 201148xx, were recalled. You may want to look at the sticker just to be sure, if applicable. Ref: https://forum.corsair.com/forums/topic/160988-sf-series-voluntary-product-replacement/
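If it helps, the lot code check can be sketched in a few lines. This is only a literal reading of the range quoted above: it assumes the lot code on the sticker is eight digits and compares the first six numerically against 194448 to 201148, ignoring the last two ("xx"). The function name and the sample codes are made up; Corsair's notice is the authority on which units are actually affected.

```python
import re

# Recalled range as quoted in the post: lot codes 194448xx to 201148xx.
RECALL_FIRST = 194448
RECALL_LAST = 201148


def is_in_recalled_range(lot_code: str) -> bool:
    """Return True if the first six digits fall inside the quoted range.

    Assumption: the lot code has eight digits, and the range is compared
    numerically on the first six; the trailing 'xx' digits are ignored.
    """
    digits = re.sub(r"\D", "", lot_code)  # drop spaces, dashes, etc.
    if len(digits) != 8:
        raise ValueError(f"expected an 8-digit lot code, got {lot_code!r}")
    return RECALL_FIRST <= int(digits[:6]) <= RECALL_LAST


if __name__ == "__main__":
    # Hypothetical lot codes, for illustration only.
    for code in ("19444801", "20124801"):
        print(code, is_in_recalled_range(code))
```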

Sounds like a frustrating problem. Hope you get it sorted out.

Hi!

> I know there are issues with running two hardware monitoring programs concurrently and this can lead to system instability and unpredictable crashes. It seems you're not running them simultaneously but still having issues, correct?

Yes, the issue happens with just one of the two running, doesn't matter which one.

> Asus EC sensor disabled

I will try to disable it and see how the system behaves. It was enabled before. I didn't know it was legacy; I thought it was actually something new.

> I'd suggest taking a chance on BIOS 1003 and

I will try the various suggestions with the older BIOS first, so I have more info on what helps.

> disabling PCI Express Native Power Management

Will give it a try.

> Retest stability. My minimum for DDR5 is Karhu with 6400% coverage followed by 20 cycles of TM5 using anta777 extreme profile.

I ran TM5 + anta for 5 cycles. I read elsewhere that more than that is overkill, unless extremely paranoid.

> Are you using an older PSU from a previous system

No, it's a newly bought unit. Serial number doesn't match the defective lot.

Thanks for the suggestions, I'll report back after some more testing.
It's hard to know whether something helps, because with bad luck a reset may not happen until the 2nd or 3rd day.
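To put a rough number on that: below is a back-of-the-envelope sketch, assuming the resets arrive at random with a constant average rate (an exponential model, which is my assumption, not something measured). The function name and the 2-day average are illustrative only.

```python
import math


def chance_of_quiet_run(days_without_reset: float,
                        mean_days_between_resets: float) -> float:
    """Probability of seeing no reset for the given number of days purely
    by luck, if resets occur at random with the given average spacing."""
    if mean_days_between_resets <= 0:
        raise ValueError("mean_days_between_resets must be positive")
    if days_without_reset < 0:
        raise ValueError("days_without_reset must not be negative")
    return math.exp(-days_without_reset / mean_days_between_resets)


if __name__ == "__main__":
    # Assumed average of one reset every 2 days.
    for days in (1, 3, 7):
        print(f"{days} quiet day(s): {chance_of_quiet_run(days, 2.0):.1%}")
```

Under that assumption, a full quiet week would happen by chance only about 3% of the time, so a week without resets is reasonably good evidence that a change helped, while one or two quiet days say very little.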

I had the exact same issue here with a Z690 Hero (latest BIOS 1003).
I did lots of hardware troubleshooting for months because I thought this had to be a hardware issue (no event log recorded).
I also replaced my PSU (Seasonic TX1000), but no luck.

Last week I realized the one thing I had never checked was the monitoring software: I run Argus Monitor to control my fans and AIDA64 to get a monitoring screen, and that could be hardware/firmware (EC) related.

So I tested without either program running, and almost a week has now passed with no random shutdowns.

For now, I just use Armoury Crate to control all the fans, and maybe I will test to run AIDA64 in the background only, to see if I still get this issue w/o Argus monitor.
Also, I may try disabling ASUS EC support in the software because that could be related to the issue.

I've googled a lot, and it seems only you and I are getting this issue. Maybe an individual motherboard issue?

Adrian1983
Level 10
That's rather odd. I'm on an Asus Z690 but not that board; I'm on the Strix Z690 Gaming A and have run both Argus Monitor and HWinfo since the release of this board, across many different BIOS versions, but never had that issue. I am not sure what the issue is there, though.

placinta
Level 7
I can confirm that after disabling the Asus EC sensor in Argus, I have not had a single random reboot.

placinta wrote:
> I can confirm that after disabling the Asus EC sensor in argus, i did not get a single random reboot anymore.

Did you disable both EC support and ATK service or leave one enabled?

placinta
Level 7
Both ATK and EC are disabled.