Silent_Scone
Super Moderator

Introduction

Many users enter the world of DRAM overclocking with high expectations, especially after purchasing pre-binned kits that promise extreme speeds and tight timings. The logic seems simple: if a manufacturer rates a kit for 8000 MT/s, it should work, right? Unfortunately, reality can occasionally throw a curveball. Variability in BIOS versions, CPU memory controller quality, and overall system compatibility means that even a supposedly "guaranteed" kit might not be fully stable in every system.

It's understandable why some users fall for the misconception that overclocking, especially with XMP and EXPO, comes with ironclad guarantees. These profiles are designed to make higher memory speeds more accessible, but their success still depends on various system factors. Enabling these profiles is still overclocking—running not just the DRAM bus but also various subsystems outside of specifications—which introduces an element of unpredictability due to part-to-part variance. 

 

Understanding DRAM Overclocking

Overclocking DRAM, including the use of XMP (Extreme Memory Profile) and EXPO (Extended Profiles for Overclocking), involves running memory modules beyond their officially rated specifications and outside the JEDEC standard. This means that performance, stability, and reliability can vary from system to system due to differences in silicon quality, motherboard design, and CPU memory controller capability.

Since memory manufacturers bin their chips for specific speeds and timings, even modules rated for the same speed may not behave identically. Furthermore, overclocking is never guaranteed to be completely stable, as pushing components beyond their stock settings increases the likelihood of errors. It's important to understand that no overclock is 100% stable.

Even a stock system that adheres to JEDEC standards is capable of "flipping a bit". This is why ECC (Error Correction Code) memory exists in the commercial space. Once one acknowledges this, the prospect of testing memory stability for 24 hours or more on a gaming system seems somewhat fruitless. For instance, what if the system throws a violation at 25 hours or 32 hours? In reality, we can't account for all permutations, so we test long enough to establish a sufficient stability guardband for our use case.

You can read more regarding EXPO/XMP here: Memory Overclocking - What you may or may not know

 

Leave other Domains at Default

It's crucial not to overclock other clock domains while we focus on establishing memory overclocking stability. This approach ensures that, in the event of a stress test failure, we can quickly pinpoint the source of instability. For instance, overclocking cache/uncore domains increases the data passing through the bus. When we overclock the memory, it further amplifies the data transfer, which can lead to errors. By prioritizing memory stability first, we reduce the risk of introducing additional variables and can more easily isolate any issues that arise during testing.

 
 

Stability Testing for DRAM Overclocking

Unstable DRAM can manifest in various ways, ranging from immediate crashes and failed boots to more subtle issues like application errors, data corruption, or random system freezes. In extreme cases, an unstable memory overclock may lead to blue screen errors (BSODs) in Windows or unexpected reboots under heavy loads. Stability testing helps identify these potential problems before they cause disruptions in everyday use.

 

I've enabled XMP/EXPO and now my system won't boot!

It's also important to distinguish between instability that prevents a system from successfully completing POST (Power-On Self Test) and instability that only emerges once the operating system is loaded. Failure to POST when overclocking is often the result of memory training failure. Training is more difficult to pass than operating system-based tests due to its strict pass/fail criteria. During training, the electrical signals between the memory modules and the memory controller are calibrated to stay within a predefined, programmable margin by conducting a comprehensive number of read-and-write tests. If any signal encroaches upon this margin, the process fails outright.

 

How long does memory training take?

The duration of memory training varies depending on several factors, including the platform, memory IC type, and total installed capacity. On AM5 platforms, it's common for the system to display Q-CODE 15 for an extended period—sometimes five minutes or longer—especially after a first boot or a CMOS clear. This is expected behaviour, so don’t worry if your system seems to be taking its time. Interrupting the process by restarting the system or powering it off will restart the process, so it’s best to let it run its course before assuming something's wrong.

 

I've enabled XMP/EXPO and now my applications/system is crashing!

Stability issues within the operating system, such as crashes under load or corrupted files, indicate that while the memory settings were sufficient to pass training, they lacked the stability required for sustained operation. This is why stress testing is necessary to ensure reliability beyond simply booting into the OS.

Operating system-based tests determine stability by verifying whether data can be written and read correctly over time. These tests do allow for some degree of waveform misalignment, as data may still remain valid for the duration of testing. However, this does not necessarily indicate long-term reliability, as slight timing inconsistencies could manifest under different workloads or system conditions, leading to instability over time. This is why it can be important to incorporate more than one stress test in your routine, as broadening the different data patterns used within different suites will help confirm stability.

Several testing tools are available to determine the stability of an overclocked memory configuration. Each tool has strengths and limitations, and no single test can provide absolute certainty of stability.
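The write-then-verify idea behind operating system-based tests can be illustrated with a minimal Python sketch. This is only a conceptual illustration and not a usable stress test: the function names are hypothetical, the buffer is tiny, and an interpreted loop served largely from CPU cache will not load the DRAM bus in the way the dedicated tools covered in this guide do.

```python
import random

def write_pattern(buf: bytearray, seed: int) -> None:
    """Fill the buffer with a reproducible pseudo-random byte pattern."""
    rng = random.Random(seed)
    buf[:] = rng.randbytes(len(buf))

def verify_pattern(buf: bytearray, seed: int) -> list[int]:
    """Regenerate the same pattern and return the offsets that do not match."""
    rng = random.Random(seed)
    expected = rng.randbytes(len(buf))
    return [i for i, (got, want) in enumerate(zip(buf, expected)) if got != want]

def run_passes(size_bytes: int, passes: int) -> int:
    """Write and verify a different pattern on each pass; return total mismatches."""
    buf = bytearray(size_bytes)
    mismatches = 0
    for seed in range(passes):
        write_pattern(buf, seed)
        mismatches += len(verify_pattern(buf, seed))
    return mismatches
```

Calling run_passes(1 << 20, 4) writes and checks four different patterns over a 1 MiB buffer. Real test suites differ mainly in the patterns they choose and how hard they drive the bus, which is one reason running more than one suite broadens coverage.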

 

Active Cooling

When pushing frequency and stability margins, a fan can be beneficial in keeping DIMM temperatures in check. It's important to remember that, for a gaming system, the DRAM bus won't be loaded nearly as heavily as it is when running stress tests. Stress tests use synthetic data patterns with the key focus of trying to expose violations over time, whilst real-world applications only issue the read and write operations that they need.

A 120mm fan placed over the modules can reduce the temperature reported by the SPD sensor by 10°C. There are some misconceptions about what temperatures are acceptable, including the generalisation that staying under 50°C is a golden rule. Whilst keeping below a certain temperature has benefits, the temperature at which the DRAM is stable largely depends on the overclock and how conditional the stability is. You can read more regarding temperature dependencies and how granular control can help here: DIMM Flex - Granular Control

Use HWiNFO to monitor DRAM SPD temperature.

[Image: DRAM SPD temperature reading, taken at idle]

 

Karhu RAM Test

Karhu RAM Test is a paid-for tool with a simple-to-use interface. The tool automatically assigns the necessary amount of memory and can be run from within the operating system.

Karhu is able to find some violations within 5 minutes that may go undetected for several complete passes of Memtest86+ (45 minutes to over an hour), making it an invaluable tool for stress testing your memory.


Coverage information

Karhu's FAQ gives the following error detection rates by test duration:

Below are the error detection rates by test duration based on over 100,000 test runs and roughly 8 years worth of non-stop RAM Test*:

 

  • ≤ 1 min: 47.42 %
  • ≤ 5 min: 65.05 %
  • ≤ 10 min: 74.89 %
  • ≤ 30 min: 87.62 %
  • ≤ 1 h: 92.97 %
  • ≤ 3 h: 97.84 %
  • ≤ 6 h: 99.30 %
  • ≤ 12 h: 99.91 %
  • ≤ 24 h: 99.99 %

I've personally been using Karhu for over 5 years, and in that time I have never once tested a system for over 12 hours. For a gaming system, 3 to 6 hours should be ample, but it's entirely down to personal preference. There's no fail-safe amount of coverage when overclocking.
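To put the diminishing returns into numbers, here is a small Python sketch built on the figures quoted from Karhu's FAQ above. It assumes the percentages are cumulative (the share of errors found within each duration); the dictionary and function names are purely illustrative.

```python
# Cumulative detection rates quoted from Karhu's FAQ, keyed by test duration
# in minutes. Values are percentages of errors detected within that time.
DETECTION_RATE = {
    1: 47.42, 5: 65.05, 10: 74.89, 30: 87.62, 60: 92.97,
    180: 97.84, 360: 99.30, 720: 99.91, 1440: 99.99,
}

def undetected_after(minutes: int) -> float:
    """Percentage of errors still undetected after the given duration."""
    return round(100.0 - DETECTION_RATE[minutes], 2)

def gain_from_extending(start: int, end: int) -> float:
    """Extra percentage points of coverage gained by testing from start to end."""
    return round(DETECTION_RATE[end] - DETECTION_RATE[start], 2)
```

On these figures, going from 3 to 6 hours adds 1.46 percentage points of coverage, while going from 6 to 24 hours adds only 0.69, which is why a 3 to 6 hour run is a reasonable stopping point for a gaming system.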

https://www.karhusoftware.com/ramtest/

 

Google Stress App Test (Windows)

Google Stress App Test (GSAT), run via Linux Mint (or another compatible Linux distro) or via Windows, is one of the best memory stress tests available. Google used this stress test to evaluate the memory stability of their servers; nothing more needs to be said about how valid that makes this as a stress test tool.

  • Install Bash Terminal (Windows Subsystem for Linux): https://msdn.microsoft.com/en-gb/commandline/wsl/install_guide
  • Install the Google Stress App Test by typing: sudo apt-get install stressapptest
  • Once installed, open “Terminal” and type the following: stressapptest -W -s 3600
  • This will run the stress test for one hour (3600 seconds). The test will log any errors as it runs.
  • You can add the "-M" argument followed by the amount of memory, in megabytes, that you wish to assign to the test (90% of available memory).
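If you would rather not work out the "-M" value by hand, the Python sketch below derives it from /proc/meminfo-style text and assembles the command line from the steps above. The helper names are hypothetical, and it assumes a Linux or WSL environment where /proc/meminfo exposes a MemAvailable line in kB.

```python
def mem_available_mb(meminfo_text: str) -> int:
    """Return MemAvailable from /proc/meminfo-style text, in whole megabytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kilobytes = int(line.split()[1])
            return kilobytes // 1024
    raise ValueError("MemAvailable not found")

def build_gsat_command(available_mb: int, seconds: int = 3600, percent: int = 90) -> list[str]:
    """Build a stressapptest command that tests `percent` of available memory."""
    test_mb = available_mb * percent // 100  # integer maths avoids float rounding
    return ["stressapptest", "-W", "-s", str(seconds), "-M", str(test_mb)]
```

For example, passing the contents of /proc/meminfo to mem_available_mb and the result to build_gsat_command gives a list you can join with spaces and paste into the terminal, or hand to subprocess.run. With 20000 MB available, the command would be stressapptest -W -s 3600 -M 18000.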

 

Google Stress App Test (Linux Mint) (GSAT)

To bring up system info within the Mint Terminal, type: sudo dmidecode --type 17 and scroll to the relevant info.
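Rather than scrolling, the output can be summarised with a short Python sketch. This assumes the usual dmidecode layout of "Memory Device" blocks containing "Key: Value" lines separated by blank lines; the function names are illustrative, and field names such as "Configured Memory Speed" can differ between dmidecode versions.

```python
from typing import Optional

def parse_memory_devices(dmidecode_text: str) -> list[dict[str, str]]:
    """Split `dmidecode --type 17` output into one dict per Memory Device block."""
    devices: list[dict[str, str]] = []
    current: Optional[dict[str, str]] = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}          # start collecting fields for a new slot
            devices.append(current)
        elif not line:
            current = None        # a blank line ends the current block
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return devices

def populated_modules(devices: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep only slots that report an installed module."""
    return [d for d in devices if d.get("Size", "No Module Installed") != "No Module Installed"]
```

Save the output with sudo dmidecode --type 17 > dimms.txt, read the file in, and print fields such as "Locator", "Size", and "Configured Memory Speed" for each populated module to confirm the kit is running at the speed you expect.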

 

HCI Memtest Pro

HCI Memtest Pro is widely adopted as an industry standard by motherboard and memory vendors alike and is a paid-for, easy-to-use tool. There is also a Deluxe version, which includes a bootable option for testing outside of the operating system.

Memtest Pro is also quite good at catching certain cache violations on some platforms, making it an invaluable tool for testing overclocks where multiple subdomains are overclocked.

https://hcidesign.com/memtest/

 

Post your experiences or ask for assistance with any of the tools posted here 👍 Hardware & Build Advice - Memory Stability Thread

 

Related Articles & Links

Memory Kits - Overclocking and What You May Not Know

DIMM Flex: Realtime DRAM Optimisation

DIMM Fit - Final Fine Tuning

CDK, CUDIMM & Memory Gears - What You Need to Know

Software Links

Karhu RAM Test

Google Stress App Test (GSAT)

HCI Memtest Pro

HWiNFO

 

 
