03-30-2024 01:41 AM - edited 03-30-2024 01:44 AM
I thought I'd share some results on Intel 14th gen and the ROG Z790 Apex Encore with memory scaling whilst investigating use cases for DIMM Flex.
As most will know, in the exploration of DDR5 it's crucial to recognise that substantial CPU and memory gains often manifest more prominently at lower resolutions due to what's referred to as CPU bottlenecking. When venturing into the pursuit of the "bleeding edge", it's common for users to gravitate towards higher resolutions. Whilst the impact of DDR5 may be more nuanced at elevated resolutions due to being GPU-bound, it's still interesting to see how pushing DRAM impacts performance.
System & Configuration
ASUS TUF Gaming GeForce RTX™ 4090 24GB
ROG MAXIMUS Z790 APEX ENCORE
Intel i9 14900KF Raptor Lake
GSKILL 8000MHz 2x24GB (F5-8000J4048F24GX2-TZ5RS)
Crucial T700 1TB PCIe Gen5 NVMe M.2 SSD
Cooling
HEATKILLER® IV PRO CPU Waterblock
Thermal Grizzly 13th Gen Contact Frame
EK-Quantum Vector² Strix/TUF RTX 4090 D-RGB Waterblock
EK-DDC Series Pump
EK-Vardar EVO 2400RPM Fans
EK-CoolStream XE 360mm
EK-CoolStream PE 360mm
Koolance QDC3 Nickel Fittings
Setup
Windows 11 Pro 23H2
Nvidia Driver 551.76
ROG Apex Encore UEFI 1001
4800MT to the Moon
Without muddying the water too much, the key comparison is between our tuned 7200MT profile and XMP on the GSKILL 8000MHz 2x24GB memory kit (F5-8000J4048F24GX2-TZ5RS). Much like yearning for 5GHz on Skylake was, 8000MT is often considered the "must have" frequency now that it's been validated for 4 DIMM motherboards as an absolute maximum. Before we even consider tuning beyond XMP, hitting max validation is often tough because there is less signal/overclocking margin, and tuning becomes difficult as the voltage window for stability narrows. If the system is at the very fringe of what a motherboard / CPU is capable of, that window can be as little as 20mV on some of the IO/memory voltages, and it becomes a rather frustrating exercise, only for the stability to be invalidated under certain conditions or stress tests.
It's no surprise that memory timings and frequency are intrinsically related, so finding platform sweet spots should outweigh opting for outright frequency.
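To put a rough number on that relationship: first-word CAS latency in nanoseconds is tCL x 2000 / data rate (MT/s), since DDR transfers twice per clock. Below is a minimal Python sketch. The CL40 figure for the 8000MT kit is an assumption read off the part number (J4048), the 6800 tCAS 32 profile is the one mentioned later in this thread, and the 5600 CL46 entry is purely illustrative rather than the timings the board actually applied.

```python
def cas_latency_ns(data_rate_mts: float, tcl: int) -> float:
    """First-word CAS latency in ns: tCL clock cycles at a clock of data_rate/2 MHz."""
    return tcl * 2000.0 / data_rate_mts


# (data rate in MT/s, tCL). Values are assumptions for illustration only.
profiles = {
    "8000 CL40 (XMP, CL assumed from part number)": (8000, 40),
    "6800 CL32 (profile mentioned later in the thread)": (6800, 32),
    "5600 CL46 (illustrative JEDEC-style bin)": (5600, 46),
}

for name, (rate, tcl) in profiles.items():
    print(f"{name}: {cas_latency_ns(rate, tcl):.2f} ns")
```

tCAS is only one timing among many, so treat this as a first-order comparison rather than a prediction of in-game results.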
4800MT [Manual frequency - All timings board controlled]
5600MT [JEDEC all auto]
7200MT [manually tuned timings]
8000MT [XMP I - Board controlled timings]
Shadow of the Tomb Raider
In terms of the benchmark metrics, the game thread is responsible for executing all gameplay logic. After each frame, it synchronizes the positions and states of objects within the game world with the Render Thread. The Render Thread handles rendering logic, ensuring objects are displayed correctly. Generally, the CPU Game thread serves as the primary thread in this process.
At 1080P High preset, SOTR shows a significant 27% increase in average FPS when transitioning from 4800MT to 8600MT. While XMP I may provide convenience, manual tuning proves superior, with a 4% average framerate uplift from tuning memory latency at 7200MT.
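For anyone wanting to reproduce the percentages from their own runs, the uplift figures quoted here are simple relative changes against a baseline. A minimal Python sketch follows. The FPS values in it are hypothetical placeholders, not numbers taken from the charts.

```python
def uplift_pct(baseline_fps: float, new_fps: float) -> float:
    """Relative change from baseline to new, expressed as a percentage."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (new_fps - baseline_fps) / baseline_fps * 100.0


# Hypothetical average FPS values, NOT results from this thread.
baseline = 200.0
tuned = 254.0
print(f"{uplift_pct(baseline, tuned):.1f}% uplift")
```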
Red Dead Redemption 2
The RDR2 benchmark stands out as one of the most demanding for GPUs. This is evidenced by the results, which reveal subtle performance gains. All benchmarks for RDR2 are conducted using the Vulkan API. The game features cutting-edge rendering technologies such as Global Illumination and Volumetric Fog, which contribute to its stunningly realistic lighting and atmospheric effects. The nature of this benchmark can result in rather inconclusive results as the test runs for roughly 5 minutes, but we can still observe scaling.
Far Cry 6
Far Cry 6 invites players to a lush yet troubled Caribbean island, where rebellion and revolution collide in a fight for freedom and justice. The game incorporates ray tracing for realistic lighting and reflections and dynamic weather systems for immersive atmospheric effects.
Far Cry 6 is quite receptive to memory latency. Moving from JEDEC to our 7200MT profile results in an uplift of 17% in average FPS and 14% to minimums. A further 5% and 4% are found for average and minimum FPS respectively when moving to our 8600MT profile.
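A note on reading stacked figures like these: if the "further" gain is measured relative to the 7200MT result (an assumption, as it could equally be relative to JEDEC), the two steps compound rather than add. A small Python sketch of that arithmetic, using only the percentages quoted above:

```python
def compound_uplift_pct(*step_pcts: float) -> float:
    """Total uplift when each step is measured relative to the previous result."""
    total = 1.0
    for pct in step_pcts:
        total *= 1.0 + pct / 100.0
    return (total - 1.0) * 100.0


# Average FPS: 17% (JEDEC -> 7200MT), then 5% (7200MT -> 8600MT).
print(f"average: {compound_uplift_pct(17, 5):.2f}% over JEDEC")
# Minimum FPS: 14%, then 4%.
print(f"minimum: {compound_uplift_pct(14, 4):.2f}% over JEDEC")
```

Under that assumption the totals come out slightly higher than simply summing the two steps.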
Even at Ultra settings at 1440P, memory scaling can be seen in Far Cry 6, with a 16.6% uplift from JEDEC to 8600MT in our average frame rate and a massive 21.1% increase to our minimums!
Forza Horizon 5
FH5 utilises advanced rendering techniques including Ray Tracing and Variable Rate Shading (VRS) to enhance visual fidelity and performance. Notably, the benchmark results present two metrics: "CPU Simulation" and "CPU Render." While "CPU Simulation" likely accounts for computational tasks related to gameplay mechanics, physics, and AI, "CPU Render" likely measures the processing power required for rendering graphics and visual elements. This distinction provides insights into how the game utilises CPU resources for different aspects of its operation.
Metro Exodus Enhanced Edition
Metro Exodus features advanced rendering technologies such as Ray Tracing, Global Illumination, and DLSS (Deep Learning Super Sampling) for improved visuals and performance, providing players with a more immersive experience in the post-apocalyptic world of Metro Exodus.
Moving from stock to the 8600MT configuration results in increases of approximately 6%, 10% and 8% for average, maximum and minimum FPS respectively.
At low detail, our tighter 7200MT profile lifts the minimum FPS floor by 5.5% over XMP 8000MT. JEDEC to our 8600MT profile results in an uplift of 10.4% to average FPS and 15% to minimums.
Uncore Tests
This is just a glimpse into some of the presets I've been running, but is an interesting window into scaling to the uppermost echelon for what can be achieved on a daily overclock (8800-9000MT to follow).
While core overclocking remains king, it won't come as much of a surprise to some that optimizing memory and cache domains can yield benefits to our minimum FPS floor / 1% lows. Achieving an impressive 8600MT may be challenging and restricted to specific hardware configurations like the ROG MAXIMUS Apex Encore; however, tuning at 7200MT still delivers a substantial improvement in both average and minimum framerates. I've purposefully left out some of the more e-sport and competitive titles here, to follow with some more insight.
04-03-2024 02:02 PM
Awesome job. If I am reading this correctly, the sweet spot is 7200: good scores and cheap. 8000+ can only be handled by an Apex board.
04-05-2024 01:46 AM
Have you thought about testing what Gear 1 can do? With DDR4, lower latency always trumped high MT/s for 0.1% FPS. Would love to see if DDR5 can scale the same way. Haven't seen anyone do any benchmarking of that yet.
04-15-2024 04:13 AM
Gear 1 isn't possible in this case and is reserved for DDR4.
05-03-2024 04:56 AM
Your approach, method, and analysis are outstanding.
It would be easier to grasp the relationships between different settings' results if the x-axis on the charts did not vary so often, from one chart to the next. I understand that legibility is very important, and I also understand that the most important function of each individual chart is to show the differences between the different timing settings when applied to the same game mode and screen resolution. To prioritize those things is a good decision. Yet consistency from chart to chart would also enable the reader to gain valuable insights. For example, when I look at the three Red Dead Redemption 2 charts 1080p (Max Settings DLSS Quality), 1080p (Lowest Quality), and 1440p (Max Settings DLSS Quality), it is difficult to gauge the data when comparing the results on one chart to the results on an adjacent chart, due to the charts' x-axis scales all being different from each other.
Thank you, again, for sharing the results of your great work.
05-07-2024 07:40 AM
If there’s more analysis or any particular games people would like to see scaling on up to 8800MT, please post in the thread and I’ll see what I can do 👍
05-07-2024 12:51 PM - edited 05-07-2024 12:53 PM
Thanks Scone. I would be interested in IL2 Great Battles flight sim although it's admittedly less popular than other titles. Thanks!!
05-07-2024 01:10 PM
Are there any y-cruncher VST or TM5 / Karhu stability tests done on these configs?
A lot of systems freeze when the RAM timings change inside of Windows, i.e. you hit your 40C temp limit on the RAM and it switches to the looser config...
05-07-2024 02:34 PM - edited 05-07-2024 02:36 PM
All configs are Karhu stable. Only 8600MT had a configured DIMM Flex profile, for which performance was tested successfully without crashing at the level 2 gradient. I neglected to post the timing gradient as this will be system specific depending on the capability of the CPU and DRAM modules, but I intend to follow up with a deeper look at the impact of some subtimings (it just takes more time).
06-24-2024 09:18 AM - edited 06-24-2024 09:20 AM
A continuation:
A few more tests were conducted to show 1% lows with frequency/timing adjustment.
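For reference, "1% lows" can be derived from a frame-time capture in more than one way, and the results in this thread don't state which method the capture tool used. The Python sketch below assumes one common definition: the average FPS across the slowest 1% of frames. The frame times are hypothetical.

```python
def fps_summary(frame_times_ms):
    """Return (average FPS, 1% low FPS) from per-frame render times in ms.

    Assumes "1% low" means the average FPS over the slowest 1% of frames;
    other tools may report the 99th-percentile frame time instead.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    avg_fps = 1000.0 * len(frame_times_ms) / sum(frame_times_ms)
    # Longest frame times first; keep the slowest 1% (at least one frame).
    worst_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(worst_first) // 100)
    slowest = worst_first[:count]
    low_fps = 1000.0 * count / sum(slowest)
    return avg_fps, low_fps


# Hypothetical capture: 198 smooth frames at 5 ms plus two 20 ms stutters.
frames = [5.0] * 198 + [20.0] * 2
avg, low = fps_summary(frames)
print(f"average: {avg:.1f} FPS, 1% low: {low:.1f} FPS")
```

A handful of slow frames barely moves the average but dominates the 1% low, which is why the two metrics can scale so differently with memory tuning.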
AC: Valhalla
AC: Valhalla responds surprisingly well to memory latency. Between our JEDEC speeds and 6800 tCAS 32 profile, we see an 11% improvement to our 1% lows and 5% to the average frame rate.
At 720P, increasing frequency to 7600MT and beyond eases the bottleneck, with a 13% uplift in 1% lows from stock frequency and 11% increase to average FPS.
CS2
Counter-Strike 2 at 720P shows us how things are done to induce a CPU bottleneck. Despite this, we still see an 11% improvement to 1% minimums over stock operation and 2.6% to the average frame rate.
At 1080P we see a surprising uplift of 34% to our 1% minimum when moving from stock to XMP 8000, and a peak of 42% at 8400MT.
Borderlands 3
Borderlands 3 memory scaling seems to behave differently to the other games tested: on my test system, 1% lows benefit greatly once exceeding 7600MT, with a whopping 127% increase over stock.
Hitman 2
Hitman 2 shows minimal gains with a small bump to our 1% lows and average framerate, with just 8% moving from 5600MT to our tuned 6000MT profile.
Futuremark: Steel Nomad
Futuremark's new GPU-intensive benchmark pushes the 4090 to the max; however, the test barely hits the CPU or memory domains at all.
Futuremark TimeSpy CPU Test
Futuremark's TimeSpy CPU is Havok physics-based, which is known for being sensitive to memory performance. Here we can see a score improvement of 27.6% over stock operation at 8600MT.
Refresh Interval
Here we test the benefits of increasing our DRAM refresh interval to *decrease* the frequency at which the refresh operation takes place (see DIMM Flex overview).
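As a rough first-order way to see why this helps: the memory is busy refreshing for about tRFC out of every tREFI, so the share of time lost to refresh is approximately tRFC / tREFI. The Python sketch below illustrates that ratio. All timing values in it are hypothetical and are not the settings used for these tests, and the model ignores finer details of how the controller schedules refreshes.

```python
def cycles_to_ns(cycles: int, data_rate_mts: float) -> float:
    """Convert memory clock cycles to ns (memory clock = data rate / 2)."""
    return cycles * 2000.0 / data_rate_mts


def refresh_overhead_pct(trefi_cycles: int, trfc_cycles: int) -> float:
    """Approximate share of time spent in refresh: tRFC out of every tREFI."""
    return trfc_cycles / trefi_cycles * 100.0


# Hypothetical values at 8000 MT/s, for illustration only.
DATA_RATE = 8000
TRFC = 600                      # refresh cycle time, in clocks (assumed)
for trefi in (15600, 65535):    # shorter vs longer refresh interval (assumed)
    interval_us = cycles_to_ns(trefi, DATA_RATE) / 1000.0
    overhead = refresh_overhead_pct(trefi, TRFC)
    print(f"tREFI {trefi}: every {interval_us:.2f} us, ~{overhead:.2f}% in refresh")
```

The trade-off is that a longer interval asks the DRAM cells to hold their charge for longer between refreshes, which gets harder as module temperature rises.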