Help! System hangs/freezes when overclocked

Thireus
Level 7
Hi,

I recently got the following configuration setup:

CPU - i9-7980XE
RAM - 8x16GB = 128GB of RAM - G.Skill F4-3200C14Q2-128GTZS (https://www.gskill.com/en/product/f4-3200c14q2-128gtzsw)
MOTHERBOARD - ASUS Rog Rampage VI Extreme
BIOS 1004


I am currently trying to diagnose system freezes/hangs/crashes that seem to occur after several hours of uptime (tested on Windows 10 and on other OSes). I am not able to diagnose exactly where the issue is coming from, but from the BSOD dumps it appears to be memory related. I've tried both leaving the CPU settings on "Auto" and manually setting a multiplier and core voltage in the BIOS, but the same issue still occurs. The only thing I can think of now is XMP.

I am trying to narrow down where the issue could be coming from, so any suggestions would be greatly appreciated.

After running 24h of memtest86+ 5.0.1 (which did one and a half pass), with XMP enabled, no issues were detected.
After running 5-13 hours of Prime95 the system would eventually freeze or crash at some point, and the BSOD dumps are of type MEMORY_CORRUPTION_LARGE and MEMORY_CORRUPTION_ONE_BIT_LARGE.


The BSOD information states that the issue is memory-related, which makes me think that the voltage set by default when XMP is enabled is incorrect. When XMP is enabled in the BIOS, it sets the voltage to 1.350V, which is indeed the correct value that the RAM is supposed to get at 3200MHz. However, the following happens:



As you can see there are two issues:

1. The Voltage reading is incorrect and too high: it reads 1.360V instead of 1.350 --> Why? Is this normal?
2. The second DRAM Voltage value flickers between 1.344V and 1.360V --> Any clue about what's happening there?


I have tried to swap all the RAM from CHC&CHD DIMM slots to CHA&CHB DIMM slots, but the same happens, the CHC&CHD DRAM Voltage fluctuates... Could it be related to my issue?

For now, I have decided to set XMP back to disabled and run Prime95 again for the next 24h (unless the system freezes again) to double check that the issue indeed comes from the XMP settings and not from anything else. I will report back here.

If anyone has any idea about what I should check/set to resolve the issue, I would really appreciate it. Been trying to resolve this for a week now. 😞

Thank you!

Raja
Level 13
What you are seeing is normal. The reported value is tapped off the power plane, so isn't going to be exactly indicative of the voltage at the dram module pads. And the SIO reading does fluctuate due to the way it samples. This is not the cause of instability.

Establish whether your system is stable at stock settings. Load defaults, do not overclock the cpu or memory, and check if it still freezes.

Raja@ASUS wrote:
What you are seeing is normal. The reported value is tapped off the power plane, so isn't going to be exactly indicative of the voltage at the dram module pads. And the SIO reading does fluctuate due to the way it samples. This is not the cause of instability.

Establish whether your system is stable at stock settings. Load defaults, do not overclock the cpu or memory, and check if it still freezes.


Thank you Raja for your quick reply. So far I have now conducted the following testing:

- Various CPU tweaking with XMP enabled:

Prime95 crash after 4 hours, system was never stable past 13h of uptime (that's been happening for a week now, when I first enabled XMP)

- Everything back to Auto with XMP enabled:

Prime95 crash after 4 hours

- Everything Auto without XMP enabled:

Prime95 still running as I write this message, for more than 12 hours now... So it seems to be all good without XMP.

Which seems to confirm that the issue is XMP related. I'm also going to follow xarot's advice and try to reproduce crashes with specific RAM testing tools when XMP is enabled and compare to when XMP is disabled. The current issue I'm facing in troubleshooting this is that it takes several hours for the system to either crash or freeze with the Prime95 testing methodology.

I will also attempt to reproduce the issue with CMOS cleared and everything back on stock speed. But that will require another 2 days of testing at least.

xarot
Level 11
If you want to test RAM, try this new RAM test software instead of Memtest86+. It can detect memory-related errors very fast. Try to reach at least 5000 % coverage. It detected errors in 2 minutes when I tried to get my Corsair Dominator 4000 kit to run on the Rampage VI Extreme using XMP (it couldn't run). 😛

https://www.karhusoftware.com/ramtest/

What version of Prime95 are you running, and are you overclocking? Is AVX on or off? Also, have you tried setting VCCSA a tad higher, say 0.95 - 1.0 V?

Raja
Level 13
Yep, isolate the memory, and tune things from there.

Raja@ASUS wrote:
Yep, isolate the memory, and tune things from there.


Hi Raja,

I would like to give a quick update about my troubleshooting progress...

I think I was able to make the system freeze occur much sooner than before by using a combination of fast Prime95 blend test and memtest, here is what I did:

- Customise Prime95 26.6 with "Time to run each FFT size (in minutes):" to a value of "1" instead of "15", so that all tests run much faster
- Run one instance of memtest


The following happened:

- With XMP enabled I had a system freeze after about 1h30 running
- Without XMP enabled, I had no system freeze at all after more than 15h as previously discussed


I thus decided it was time to reset my BIOS back to a clean 1004. So I re-flashed my 1004 version with the same version, which cleared all the settings for me. Before I did that, I made sure to save my XMP-enabled settings that were causing freezes. I then ONLY enabled XMP with this new cleared BIOS (everything else to stock).

These are the non-stock settings with which the XMP-enabled system would freeze after ~1h30 of custom Prime95 in combination with memtest:

AVX Instruction Core Ratio Negative Offset [15]
AVX-512 Instruction Core Ratio Negative Offset [15]
CPU SVID Support [Disabled]
CPU Load-line Calibration [Level 3]
CPU Current Capability [140%]
CPU Power Phase Control [Extreme]
Autonomous Core C-State [Enabled]
Enhanced Halt State (C1E) [Enabled]
CPU C6 report [Enabled]
Package C State [C6(non Retention) state]
Intel(R) Speed Shift Technology [Enabled]
MFC Mode Override [OS Native Support]
Fast Boot [Disabled]
Boot from Network Devices [Ignore]


After re-flashing the BIOS and ONLY enabling XMP, the settings listed above now read as follows:

AVX Instruction Core Ratio Negative Offset [Auto]
AVX-512 Instruction Core Ratio Negative Offset [Auto]
CPU SVID Support [Auto]
CPU Load-line Calibration [Auto]
CPU Current Capability [Auto]
CPU Power Phase Control [Auto]
Autonomous Core C-State [Auto]
Intel(R) Speed Shift Technology [Auto]
MFC Mode Override [MFC Driver Override]
Fast Boot [Enabled]
Next Boot after AC Power Loss [Normal Boot]
Boot from Network Devices [Legacy only]
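
For anyone wanting to compare two such lists without eyeballing them, here is a minimal Python sketch. It is only an illustration: the values are copied from the two lists above, while the function name `diff_settings` and the `<not listed>` placeholder are made up for this example and are not part of any ASUS tool.

```python
# Settings from the configuration that froze (XMP enabled + manual tweaks).
freezing = {
    "AVX Instruction Core Ratio Negative Offset": "15",
    "AVX-512 Instruction Core Ratio Negative Offset": "15",
    "CPU SVID Support": "Disabled",
    "CPU Load-line Calibration": "Level 3",
    "CPU Current Capability": "140%",
    "CPU Power Phase Control": "Extreme",
    "Autonomous Core C-State": "Enabled",
    "Enhanced Halt State (C1E)": "Enabled",
    "CPU C6 report": "Enabled",
    "Package C State": "C6(non Retention) state",
    "Intel(R) Speed Shift Technology": "Enabled",
    "MFC Mode Override": "OS Native Support",
    "Fast Boot": "Disabled",
    "Boot from Network Devices": "Ignore",
}

# Settings after re-flashing BIOS 1004 and enabling only XMP.
stable = {
    "AVX Instruction Core Ratio Negative Offset": "Auto",
    "AVX-512 Instruction Core Ratio Negative Offset": "Auto",
    "CPU SVID Support": "Auto",
    "CPU Load-line Calibration": "Auto",
    "CPU Current Capability": "Auto",
    "CPU Power Phase Control": "Auto",
    "Autonomous Core C-State": "Auto",
    "Intel(R) Speed Shift Technology": "Auto",
    "MFC Mode Override": "MFC Driver Override",
    "Fast Boot": "Enabled",
    "Next Boot after AC Power Loss": "Normal Boot",
    "Boot from Network Devices": "Legacy only",
}


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ.

    A setting that appears in only one list is reported with the
    placeholder "<not listed>" on the other side.
    """
    missing = "<not listed>"
    return {
        name: (a.get(name, missing), b.get(name, missing))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


changed = diff_settings(freezing, stable)
for name, (before, after) in changed.items():
    print(f"{name}: {before} -> {after}")
```

Every entry printed is a candidate to re-enable one at a time (or in groups) on top of stock + XMP.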


Now on these stock+XMP-enabled settings my stress tests have been running for about 2 hours (with the Prime95 settings I mention above and 2 instances of memtest running) and so far no freezes or crashes…

I will let things run like this for several hours and see if the issue is resolved. But I would still like to be able to reproduce the behaviour I noticed on the non-default BIOS options I initially had that were causing XMP not to function properly.

Could you please have a look at the above setting differences and tell me if you think there is something incompatible with XMP that I had enabled that could be the cause of the freezes? So I can try to reproduce it and will know that I should avoid this setting for future reference.

Thank you!

P.S.: I also found a bug. When you re-flash the BIOS with the same 1004 version, the system asks you to press F1 on the first boot. Once the settings are saved the system reboots, and when selecting a UEFI drive (F8) for the first time on these new settings the system immediately crashes and reboots. This happens on the first boot only; on subsequent reboots UEFI works properly. I think I've seen another member on the forum reporting the same behaviour. Also, do you know where I could post this kind of bug report?

Nixon2992
Level 7
Try changing DRAM Phase Control to Extreme for the A/B and C/D channels, and reduce the DRAM switching frequency to its minimum value in Digi+ VRM. This reduces the voltage fluctuations.
Your system may freeze if the VCCIO rail is undervolted.

Thireus
Level 7
Thank you Nixon2992, I think you're right, there is something about the voltage that isn't set properly. Once I isolate the setting that causes instability I will try what you suggest if it's still relevant.

After countless hours of troubleshooting the issue to isolate which settings are incompatible with XMP, I have successfully isolated a bunch of settings that are responsible for the system freezes:

STOCK +
XMP Enabled
Autonomous Core C-State [Enabled]


These settings will cause the machine to freeze after several hours. With my previous Prime95+memtest stress test methodology I am able to make the system freeze after about 1h and 30 minutes.

I have confirmed that with Autonomous Core C-State set to STOCK default, there is no freeze happening when XMP is enabled.

I'm still trying to narrow it down even further, but if anyone has an idea why these settings are incompatible please let me know.

Is anyone at Asus trying to reproduce the above? Or am I doing all the debugging work? 🙂

Edit: Was able to narrow it even further. LLC set to Auto (default) doesn't resolve freezes.

Edit2: CPU Power Phase Control [Extreme] is not what's causing the system freezes, I have set that one to default.
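
The one-by-one elimination above can also be done by halving the candidate list on each run. The Python sketch below is an illustration only, not the exact procedure used in this thread. It assumes exactly one setting is responsible and that a freeze reliably shows up within the test window. The `freezes` callback is hypothetical: in practice it stands for "apply only these settings on top of stock + XMP, run the stress test for a couple of hours, and report whether the system froze". Here it is simulated.

```python
def find_culprit(candidates, freezes):
    """Bisect `candidates`, assuming exactly one setting causes the freeze.

    `freezes(subset)` is a hypothetical callback that returns True if the
    system froze with only `subset` applied on top of stock + XMP.
    """
    remaining = list(candidates)
    while len(remaining) > 1:
        half = remaining[: len(remaining) // 2]
        # Keep whichever half still contains the culprit.
        remaining = half if freezes(half) else remaining[len(half):]
    return remaining[0]


candidates = [
    "AVX Instruction Core Ratio Negative Offset",
    "AVX-512 Instruction Core Ratio Negative Offset",
    "CPU SVID Support",
    "CPU Load-line Calibration",
    "CPU Current Capability",
    "CPU Power Phase Control",
    "Autonomous Core C-State",
    "Intel(R) Speed Shift Technology",
]

runs = []  # one entry per simulated stress-test run


def simulated_freezes(subset):
    """Stand-in for a real multi-hour stress run (simulation only)."""
    runs.append(list(subset))
    return "Autonomous Core C-State" in subset


culprit = find_culprit(candidates, simulated_freezes)
print(f"culprit: {culprit} (found in {len(runs)} runs)")
```

With 8 candidates this needs 3 runs instead of up to 8. The catch is that an intermittent freeze can make a "no freeze" result misleading, so each run still has to last well past the usual time-to-freeze.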

Thireus wrote:
Thank you Nixon2992, I think you're right, there is something about the voltage that isn't set properly. Once I isolate the setting that causes instability I will try what you suggest if it's still relevant.


That advice isn't accurate. Lowering the switching frequency reduces the transient response of the circuit, resulting in more fluctuation. Even then, the point is moot because you won't be able to monitor such fluctuations via software. The deviations occur in the ns to µs region, which falls outside the capabilities of a standard multimeter, too. As I said before, what you're observing is primarily an error on the monitoring IC and where it senses the voltage from. On top of that, a small amount of transient deviation occurs that a standard SIO IC could not measure even if the built-in ADC had a higher bit-depth. I would stop focusing on the fluctuation aspect of this because it isn't related in the assumed way.
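
To illustrate why a coarse monitoring ADC can make a steady rail look like it is jumping, here is a small Python sketch. The 16 mV step is an assumption, inferred only from the two readings in the first post (1.344V and 1.360V are exactly 16 mV apart). The actual resolution and rounding behaviour of this board's SIO chip are not stated in this thread, and `reported_mv` is a made-up name for the example.

```python
LSB_MV = 16  # assumed ADC step in millivolts; not confirmed for this board


def reported_mv(actual_mv, lsb_mv=LSB_MV):
    """Round an actual rail voltage (in mV) to the nearest ADC step."""
    return (actual_mv + lsb_mv // 2) // lsb_mv * lsb_mv


# A rail sitting close to 1.350 V, moving by only a few millivolts.
for actual in (1348, 1350, 1352, 1354):
    print(f"{actual} mV actual -> {reported_mv(actual)} mV reported")
```

Under that assumption, a rail that wanders by a couple of millivolts around 1.350V is reported as either 1.344V or 1.360V and never as 1.350V, so the flicker says more about the sensor's step size than about the voltage itself.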

Thireus
Level 7
I finally have the answer to the freezing issues I've been having! And I’m quite confident this is an issue with the Asus Rampage VI Extreme motherboard or maybe with the BIOS!

On the Asus Rampage VI Extreme, enabling both XMP and Core C-State will cause the system to freeze or crash after a few hours. When on stock settings, these two options are incompatible, at least on BIOS 1004. Therefore, I would like to request Asus to have a look at this technical issue, which clearly appears to me to be a board instability.

The following BIOS setting is responsible for the system freezes when XMP is enabled:

Autonomous Core C-State [Enabled]

All other settings (except XMP and the above) should be on stock!

To reproduce the issue and cause system freezes that occur as soon as possible after system boot on Windows 10, one can run Prime95 v26.6 with the following Custom Torture Test settings:

Min FFT: 8
Max FFT: 4096
Run FFTs in-place: Checked
Time to run each FFT size (in minutes): 1


Also, running 2 instances of memtest from HCI Design: http://hcidesign.com/memtest/, with the maximum available memory might help trigger the freeze when running Prime95 in parallel. Although, when freezes occur memtest should not find any memory errors! (unless you have bad memory, which is not my case as I've already tested my memory)

Anyway, with Autonomous Core C-State enabled, XMP enabled, and everything else on stock, the system should freeze after a maximum of 2h of stress testing, which should be equivalent to 230% of memtest coverage. However, I have noticed that with other options enabled, such as the ones I mentioned in my earlier posts, these freezes can instead be system crashes (BSOD), in which case the Windows minidump will indicate that the crash is memory related: sometimes MEMORY_CORRUPTION_LARGE, sometimes MEMORY_CORRUPTION_ONE_BIT_LARGE. Crashes/freezes can also occur much sooner! If you wish to test your system stability with XMP and C-States enabled, I recommend putting everything on stock settings first and then enabling only XMP and Autonomous Core C-State in the BIOS; that way you'll only get system freezes!

As for now, I will leave "Autonomous Core C-State" on Auto (which I assume is in reality disabled). I will also write Asus an email about the issue, and hopefully their technical team will be able to fix this bug in a future BIOS release.

I was also curious to know if others had similar issues and decided to Google about it, and I found people who reported in the past that enabling C-States was incompatible with XMP and would cause their system to crash after several hours, which happens to be the issue I have with the Rog Rampage VI Extreme: https://vip.asus.com/forum/view.aspx?board_id=1&model=P7P55D+LE&id=20150520045828228&page=1&SLanguag...

Does anyone understand why, when enabling C-States on my config and with XMP enabled, the system would freeze after several hours? (When XMP is disabled C-States are perfectly working fine)

---------

@Raja, do you know if this is something that can be looked at by the technical team? That appears to be a bug to me, more than just incompatibility. Or at least the BIOS should prevent the use of C-States (or maybe these specific settings) when XMP is enabled.

Also, I would like to add that from what I can remember from my previous build which involved an x99 Prime Deluxe II with an i7-6950X processor I didn’t have any problems having C-States and XMP work together!