07-14-2023 04:27 PM - edited 09-27-2023 12:03 PM
Update (Original post at bottom):
There is some weird interplay between LibreHardwareMonitor (which powers FanControl), Aquasuite's service (which controls my pump and flow meter), and the Nuvoton controller integrated on the board. Running LibreHardwareMonitor and the Aquasuite service together overwhelms the Nuvoton in some way that causes certain functions to stop responding, including the CPU temp monitoring. The temperature is monitored elsewhere on the board as well, so the system still functions fine, but that specific reporting channel, which the display on the motherboard uses to show temperature, basically stops functioning.
There is no solution. The developers of the applications would have to work together (good luck) to figure that out, but not running both concurrently is a sufficient workaround.
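For anyone who wants a quick way to check whether they are in the same situation, here is a minimal Python sketch of the workaround: it takes a list of running process names and reports whether a LibreHardwareMonitor-based tool and the Aquasuite service are both present. The process names below are assumptions for illustration, not confirmed values, so check Task Manager for the real ones on your system.

```python
# Hypothetical process names (assumptions); verify them in Task Manager.
LHM_BASED = {"fancontrol.exe", "librehardwaremonitor.exe"}
AQUASUITE = {"aquacomputerservice.exe"}


def conflicting_monitors(running):
    """Return the names that conflict if both tool groups are running, else []."""
    names = {name.lower() for name in running}
    lhm = sorted(names & LHM_BASED)
    aqua = sorted(names & AQUASUITE)
    # Only a problem when one tool from each group is running at the same time.
    if lhm and aqua:
        return lhm + aqua
    return []


if __name__ == "__main__":
    sample = ["explorer.exe", "FanControl.exe", "AquaComputerService.exe"]
    print(conflicting_monitors(sample))
```

An empty list means only one of the two (or neither) is running, which matches the workaround of not running both concurrently.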
Original Post:
Pretty much exactly as the title says...
Running a Strix X670E-E with a 7950X with some light negative CO values and memory at 6200 with fairly aggressive timings (bad cores, good IMC). System is very stable under all scenarios as is (hours, or maybe even days at this point, of TestMem5, Karhu, OCCT, and y-cruncher) so I don't think it's an overall system stability issue. The system is cooled by a custom loop so it will only get hot under stress tests.
It was fine about a month ago when I left for a work trip. I left my cousin to house-sit and de-tuned the system a bit just in case there was any instability I hadn't found - CPU to the higher Eco Mode setting, and RAM to the board 6200MT pre-set, but with lower voltages because even with tighter timings it doesn't need the voltages from the preset. He had no issues that he noticed, but when I spun it up, after a few minutes, FanControl (the software by Rem0o) seemed to suddenly ramp down my fans so I looked over and saw that the Q-Code wasn't showing a temperature, but a solid "23" and not moving. A while later I heard the same and it was now showing "01" on the Q-LED.
Now, without fail, within a few minutes of starting up, the Q-LED temp display will "fail" and show one of these two codes. I was on an earlier beta BIOS (1412), so I cleared the settings and upgraded to the official 1416 release, but there is no difference.
I don't care too much about the temperature display, but my concern is that something more serious may be going on under the hood. Has anyone else ever seen this behavior or have an explanation for what it could mean?
07-15-2023 12:45 AM
Hello,
The Q-LED isn't primarily for temperature display. This is just a subsequent use that can be enabled after POST.
It sounds like the system could be unstable at the applied overclock and falling over so you would need to return to Optimised Defaults, disabling EXPO or any overclocking profiles to confirm whether the behaviour is replicable. Does the system crash when the Q-LED hangs on temperature whilst in the OS?
07-16-2023 01:30 PM
Sorry, I was not clear - this happens even with everything at defaults as well. The only things changed in the BIOS are defaulting to the advanced view and disabling the Armoury Crate auto-install.
07-16-2023 08:29 PM
Haven’t used the fan control software you mention but my first suggestion would be to uninstall it and use Q-FAN.
07-16-2023 09:53 PM
There isn't really an issue with the fan control software; it continues to work fine. I only mentioned it because it will show a momentary fan slowdown on two of the four fan channels in use (using CPU, and system fans 1, 3, and 4 - 3 and 4 are the ones that will slow down momentarily) when the Q-LED glitches out. The fans spin right back up as programmed and everything in the system runs normally.
07-16-2023 09:57 PM - edited 07-16-2023 10:35 PM
You've not really been clear on the behaviour, either.
Does the system or any application actually crash when this happens? Anything in the event logs (Software/System)?
I'm not sure why you'd assume the fan software isn't the issue given the behaviour. Would you not at least try? Given the fact using anything OS side will induce some sort of overhead - what exactly is it that the software does that you aren't able to control via Q-Fan?
07-17-2023 10:52 AM
The behavior is that the Q-LED display will stop displaying temperature and instead display one of 3 codes (01, 17, or 23). A secondary behavior is that fan control through the OS stops for about a second or two, falling back to the BIOS-programmed fan speeds for 2 out of the 4 connected fan headers for that time frame. There are no other behaviors - the rest of the system, including overall system functionality, behaves normally with no indications in any of the Windows Event Viewer logs.
FanControl is a relatively simple open source program that is based on LibreHardwareMonitor and adds highly customizable fan speed control on to it. It interfaces with my water pump, temperature, and flow sensors to control my whole cooling system based on water temperature. It runs no persistent services, so disabling it for testing is simply a matter of shutting the program down. To test with Q-Fan, I have attached a thermal probe to the outside of my radiator, which gives a "close enough" temperature reading to Q-Fan using "T_Probe" as a source. The result is the same - after about 3 minutes with no monitoring or fan control software running, I get the same Q-LED display failure - again, with no other system issues.
My concern is perhaps a bit too technical for the forum, as it probably comes down to some weird embedded controller bug or defect. I'd bet only the internal engineers who actually design the motherboard know what could be causing it, but hey... you never know.
07-17-2023 11:27 AM - edited 07-17-2023 11:35 AM
You can control fan sources based on water temperature from the UEFI. I’m not sure how much more control one needs when running the fans based on water temperature as the ramp times will be quite slow. You can also control the water pump this way. Each to their own, but needless to say it’s necessary for third party control to be removed from the equation. It’s technical in the sense you’re adding additional layers to fan control, the rest is speculative lol.
Are you able to confirm any other reports of this behaviour? If it were a firmware bug as you allude to, it would be replicable.
To me, it sounds like something else is polling the fans and causing a conflict. Do you have HWInfo or any other application polling the system when it occurs?
07-17-2023 11:24 AM
OK - Super weird...
After the last test just a short while ago, I restarted monitoring and fan control, then walked over to my work system to take care of something. Come back just now and the Q-LED is showing temps again. No restart, no changes... it just... came back to life.
I'm gonna say this is just some straight up weirdness at this point.