02-26-2019 03:40 PM - last edited on 03-06-2024 07:42 PM by ROGBot
I fired up my Maximus XI Gene + 9900K to see if I could replicate your behavior.
Core = 4.7G
Cache = 4.4G
P95 29.1 FMA3 Small FFTs 15K
LLC=6, Vcore set = 1.130V, Vcore read = 1.066V: 1 thread failed after 6 minutes
LLC=6, Vcore set = 1.140V, Vcore read = 1.074V: pass 20m+
LLC=8, Vcore set = 1.075V, Vcore read = 1.074V: 1 thread failed after 2 minutes
LLC=8, Vcore set = 1.085V, Vcore read = 1.083V: 1 thread failed after 4 minutes
LLC=8, Vcore set = 1.095V, Vcore read = 1.092V: 1 thread failed after 2 minutes
LLC=8, Vcore set = 1.105V, Vcore read = 1.101V: 1 thread failed after 9 minutes
LLC=8, Vcore set = 1.115V, Vcore read = 1.110V: 1 thread failed after 6 minutes
LLC=8, Vcore set = 1.125V, Vcore read = 1.119V: 1 thread failed after 2 minutes
LLC=8, Vcore set = 1.135V, Vcore read = 1.137V: pass 1h+
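To make the droop in the runs above easier to compare, here is a small Python sketch that tabulates set minus read voltage for each run and picks out the lowest read voltage that passed at each LLC level. The tuples are copied from the list above; the names `runs`, `droop_mv` and `lowest_pass` are just for illustration.

```python
# (llc, vcore_set, vcore_read, outcome) copied from the runs listed above
runs = [
    (6, 1.130, 1.066, "fail 6m"),
    (6, 1.140, 1.074, "pass 20m+"),
    (8, 1.075, 1.074, "fail 2m"),
    (8, 1.085, 1.083, "fail 4m"),
    (8, 1.095, 1.092, "fail 2m"),
    (8, 1.105, 1.101, "fail 9m"),
    (8, 1.115, 1.110, "fail 6m"),
    (8, 1.125, 1.119, "fail 2m"),
    (8, 1.135, 1.137, "pass 1h+"),
]

def droop_mv(vcore_set, vcore_read):
    """Set minus read voltage in millivolts (positive means the rail drooped)."""
    return round((vcore_set - vcore_read) * 1000)

# Lowest read voltage that produced a pass, per LLC level
lowest_pass = {}
for llc, vset, vread, outcome in runs:
    print(f"LLC={llc} set={vset:.3f}V read={vread:.3f}V "
          f"droop={droop_mv(vset, vread):+d}mV {outcome}")
    if outcome.startswith("pass"):
        lowest_pass[llc] = min(vread, lowest_pass.get(llc, vread))

for llc, vread in sorted(lowest_pass.items()):
    print(f"LLC={llc}: lowest passing read voltage = {vread:.3f}V")
```

On these numbers LLC=6 droops by roughly 65mV while LLC=8 stays within a few mV of the set value, yet the passing run at LLC=8 needed a noticeably higher read voltage than the passing run at LLC=6.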
I repeated the LLC=6 run with Vcore set = 1.140V, Vcore read = 1.074V, and 1 thread failed after 14 minutes. Probably 10-20mV extra would pass for 1h+.
This seems even worse than what you reported. One thing to keep in mind is that higher voltages raise temperatures, which worsens stability and in turn requires even more voltage. A direct comparison with the same CPU and cooling but different boards would be interesting. I could possibly try to get an Aorus Master next week. I've got a Maximus XI Apex, but I doubt it would perform much better for this specific test case; the additional VRM components should primarily yield lower VRM temperatures.
02-26-2019 08:37 PM
02-26-2019 09:00 PM
09-16-2019 05:10 PM
Falkentyne wrote:
Hi Shamino (and Raja), I know you're very busy and may not have enough time, but I hope I can borrow you for a few minutes.
You wrote a post about MCE and AVX offsets and guardband voltage range(?) and then mentioned something about transient response and LLC. I was messing around with Ultra Extreme (yuck) LLC (Gigabyte Aorus Master), aka LLC8 on the Asus boards, at extremely low voltages, trying to see where the VMIN was, but I found some pretty unstable stuff happening:
4.7 ghz, 4.4 ring:
LLC5 (high): Bios voltage: 1.235v, load VR VOUT: 1.109v, FMA3 15K fixed prime95: 2H+ stable. (that vdroop though). Max amps=155A
LLC6 (turbo): Bios voltage: 1.195v, load VR VOUT: 1.133v, FMA3 15K fixed prime95: sometimes 2H stable, sometimes thread #6 crashes between 45min-1 hour 40 min (temps are fully stabilized after 15 minutes). Temps and amp draw are higher however. max amps: 162.5A
LLC8 (Ultra Extreme): Bios: 1.165v. Load VR VOUT: 1.165v, 15K FMA3: temps much higher and amps 170A+, eventual BSOD or random thread crashes(?)
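As a rough way to compare the three LLC levels above, the vdroop can be turned into an effective load-line slope in milliohms: (BIOS voltage - load VR VOUT) / current. This is only a sketch under loud assumptions: it treats the BIOS voltage as the zero-load voltage, assumes the "max amps" figure coincides with the load VOUT reading, and uses 170A for LLC8 even though the post only says "170A+". The function name `effective_loadline_mohm` is made up for illustration.

```python
# (label, bios_v, load_vout, max_amps) from the figures quoted above.
# Assumption: LLC8 current is taken as 170 A, the lower bound of "170A+".
readings = [
    ("LLC5 (High)", 1.235, 1.109, 155.0),
    ("LLC6 (Turbo)", 1.195, 1.133, 162.5),
    ("LLC8 (Ultra Extreme)", 1.165, 1.165, 170.0),
]

def effective_loadline_mohm(bios_v, load_vout, amps):
    """Approximate load-line slope in milliohms.

    Assumes the BIOS voltage is the zero-load voltage and that
    load_vout was measured at the given current.
    """
    return (bios_v - load_vout) / amps * 1000

for label, bios_v, load_vout, amps in readings:
    slope = effective_loadline_mohm(bios_v, load_vout, amps)
    droop_mv = (bios_v - load_vout) * 1000
    print(f"{label}: droop {droop_mv:.0f} mV at {amps:.1f} A, "
          f"about {slope:.2f} mOhm")
```

With these inputs LLC5 works out to about 0.81 mOhm, LLC6 to about 0.38 mOhm, and LLC8 to essentially zero, i.e. a flat load line.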
Elmor was interested in my results and did some tests on his 9900K:
I actually did your 'transient ratio switching' test at a setting i knew would pass AVX easily (maybe not FMA3), at 4.7 ghz again:
With 1.195v set in BIOS and LLC Turbo (6), running 15K FFT with AVX this time (fully stable with AVX, if not with FMA3), I used ThrottleStop to change the CPU ratio from 47 to 46. When I went back to 47, multiple random threads would sometimes crash, or I would get a CPU L0 error. One time I got a system service exception. Another time a WHEA uncorrectable error. That was fun.
If I did 14K with AVX and FMA3 disabled (15K is not a valid FFT with AVX disabled), there were no crashes.
I did another test that I already knew in advance would pass 1-2 hours of "AVX 1344K in place fixed FFT's" prime95 at 1.275v, LLC Turbo (6) at 5 ghz (4.7 ghz cache).
Using the brand new prime95 (version 29.6 build 2) that lets you disable AVX and AVX2 in the stress option dialog, as well as improved load testing, I did the following test:
two instances of prime95:
12 threads smallest FFT (starts at 4k) AVX disabled. (This was required: Full on AVX small FFT would be instacrash or 100C if voltage is raised up).
4 threads AVX 512K-8192K, RAM size 5100, time for test:0 minutes (AVX2 disabled, AVX enabled).
An AVX thread literally instacrashed.
Went up 5mv at a time, AVX threads kept instacrashing up to 1.30v, where it actually lasted longer than a few seconds. Temps hit 99C at this point when it made it to AVX pass 2, (then a thread crashed again) so I couldn't go any higher thanks to temps.
So I tried what you said in your sticky:
Steepened (lowered?) the LLC and raised the VID (bios voltage?).
I set Bios voltage to 1.345v and LLC to High (LLC5).
Vdroop was massive, and I forgot the VR VOUT, but I think it (CPU On-die sense, or VR VOUT) was equal to the LLC6+1.285v test, but it actually made it 4 passes without crashing, then I stopped the test because I got too scared my fun would be ruined by a crash.
So I was curious what was happening here for sure.
Is it the transient response, causing the load voltage to keep dropping below VMIN at such high LLC even though the average voltage is higher?
Does PWM switching frequency have any effect on this? I have it maxed out to 500 kHz.
I could find no difference at all with "PWM Phase control".
If I had a good oscilloscope, would I be able to see what is happening with the "VMIN" with these AVX/FMA3 LLC5 and LLC6 tests?
Sorry for the huge wall of text and thank you for your help.