Exploring the impact of DDR5 in 2024 [Intel 14th Gen]

Silent_Scone
Super Moderator

I thought I'd share some results on Intel 14th gen and the ROG Z790 Apex Encore with memory scaling whilst investigating use cases for DIMM Flex.

As most will know, in the exploration of DDR5 it's crucial to recognise that substantial CPU and memory gains often manifest more prominently at lower resolutions due to what's referred to as CPU bottlenecking. When venturing into the pursuit of the "bleeding edge", it's common for users to gravitate towards higher resolutions. Whilst the impact of DDR5 may be more nuanced at elevated resolutions due to being GPU-bound, it's still interesting to see how pushing DRAM impacts performance.
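To illustrate why gains show up more at lower resolutions, here is a minimal sketch of the bottleneck idea. It assumes a deliberately simplified model in which each frame takes as long as the slower of the CPU and GPU; the function name and all millisecond figures are hypothetical and are not measurements from this system.

```python
def estimated_fps(cpu_ms: float, gpu_ms: float) -> float:
    """Simplified model: the slower of CPU and GPU sets the frame time."""
    return 1000.0 / max(cpu_ms, gpu_ms)


# Hypothetical per-frame CPU cost before and after memory tuning.
cpu_stock_ms = 5.0
cpu_tuned_ms = 4.0

# Hypothetical per-frame GPU cost at a low and a high resolution.
gpu_low_res_ms = 3.0
gpu_high_res_ms = 10.0

low_res_stock = estimated_fps(cpu_stock_ms, gpu_low_res_ms)    # CPU-bound
low_res_tuned = estimated_fps(cpu_tuned_ms, gpu_low_res_ms)    # CPU-bound, faster
high_res_stock = estimated_fps(cpu_stock_ms, gpu_high_res_ms)  # GPU-bound
high_res_tuned = estimated_fps(cpu_tuned_ms, gpu_high_res_ms)  # GPU-bound, unchanged

print(low_res_stock, low_res_tuned, high_res_stock, high_res_tuned)
```

In this toy model the tuned memory only helps when the CPU side is the longer of the two. Real games overlap CPU and GPU work, so treat it as intuition rather than a prediction.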

 

System & Configuration

ASUS TUF Gaming GeForce RTX™ 4090 24GB
ROG MAXIMUS Z790 APEX ENCORE
Intel i9 14900KF Raptor Lake
GSKILL 8000MHz 2x24GB (F5-8000J4048F24GX2-TZ5RS)
Crucial T700 1TB PCIe Gen5 NVMe M.2 SSD

Cooling

HEATKILLER® IV PRO CPU Waterblock
Thermal Grizzly 13th Gen Contact Frame
EK-Quantum Vector² Strix/TUF RTX 4090 D-RGB Waterblock
EK-DDC Series Pump
EK-Vardar EVO 2400RPM Fans
EK-CoolStream XE 360mm
EK-CoolStream PE 360mm
Koolance QDC3 Nickel Fittings

Setup
Windows 11 Pro 23H2
Nvidia Driver 551.76
ROG Apex Encore UEFI 1001

 

 

240316084855.jpg

 

Silent_Scone_0-1711563201839.jpeg

4800MT to the Moon

Without muddying the water too much, the key comparison is between our tuned 7200MT profile and XMP on the GSKILL 8000MHz 2x24GB memory kit (F5-8000J4048F24GX2-TZ5RS). Much like yearning for 5GHz on Skylake was, 8000MT is often considered the "must have" frequency now that it's been validated for 4 DIMM motherboards as an absolute maximum. Before we even consider tuning beyond XMP, hitting max validation is often tough because there is less signal/overclocking margin, and tuning becomes difficult as the voltage window for stability narrows. If the system is at the very fringe of what a motherboard / CPU is capable of, that window can be as little as 20mV on some of the IO/memory voltages, and it becomes a rather frustrating exercise, only for the stability to be invalidated under certain conditions or stress tests.

It's no surprise that memory timings and frequency are intrinsically related, so finding platform sweet spots should outweigh opting for outright frequency. 
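As a rough illustration of how timings and frequency trade off, the sketch below converts CAS latency from clock cycles to nanoseconds and works out theoretical peak bandwidth. It assumes the usual DDR relationship (memory clock is half the transfer rate) and a 128-bit dual-channel bus. The 8000 CL40 entry follows the kit's part number and 6800 CL32 is the profile mentioned later in the thread; the 5600 CL46 entry is an assumption, since the JEDEC timings are only visible in the screenshots.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """CAS latency in ns. Memory clock (MHz) is half the MT/s figure."""
    return cas_cycles * 2000.0 / data_rate_mts


def peak_bandwidth_gbs(data_rate_mts: float, bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (128-bit bus assumed)."""
    return data_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9


# (data rate in MT/s, CAS latency in cycles); the 5600 entry is assumed.
profiles = {
    "5600 CL46 (assumed JEDEC)": (5600, 46),
    "6800 CL32": (6800, 32),
    "8000 CL40 (XMP)": (8000, 40),
}

for name, (rate, cl) in profiles.items():
    latency = cas_latency_ns(rate, cl)
    bandwidth = peak_bandwidth_gbs(rate)
    print(f"{name}: {latency:.2f} ns CAS, {bandwidth:.1f} GB/s peak")
```

CAS is only one timing among many, so this does not predict measured latency, but it does show how a lower frequency with tighter timings can match or beat a higher one on this particular figure.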

4800MT [Manual frequency - All timings board controlled]

4800MT.png

 

5600MT [JEDEC all auto]

JEDEC [Auto].png

7200MT [manually tuned timings]

7200 tweaked.png

 

8000MT [XMP I - Board controlled timings]

8000 XMP.png

 

Shadow of the Tomb Raider

In terms of the benchmark metrics, the game thread is responsible for executing all gameplay logic. After each frame, it synchronizes the positions and states of objects within the game world with the Render Thread. The Render Thread handles rendering logic, ensuring objects are displayed correctly. Generally, the CPU Game thread serves as the primary thread in this process. 

Silent_Scone_1-1711563500668.png

Silent_Scone_2-1711563657982.png

At 1080P High preset, SOTR shows a significant 27% increase in average FPS when transitioning from 4800MT to 8600MT. While XMP I may provide convenience, manual tuning proves superior, with a 4% average framerate uplift from tuning memory latency at 7200MT.
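For anyone wanting to reproduce the percentages quoted throughout, a minimal sketch of the uplift arithmetic follows. The helper name and the FPS figures are hypothetical, chosen only so the result lands on 27%; the real values are in the charts.

```python
def uplift_percent(baseline_fps: float, tuned_fps: float) -> float:
    """Percentage change of tuned_fps relative to baseline_fps."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (tuned_fps - baseline_fps) / baseline_fps * 100.0


# Hypothetical example: a 200 FPS baseline rising to 254 FPS.
print(f"{uplift_percent(200.0, 254.0):.1f}% uplift")
```

Note that the uplift is always relative to the baseline, so a 27% gain from 4800MT is not the same thing as a 27% gain from JEDEC 5600MT.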

Silent_Scone_0-1711692903086.png

Red Dead Redemption 2

The RDR2 benchmark stands out as one of the most demanding for GPUs. This is evidenced by the results, which reveal subtle performance gains. All benchmarks for RDR2 are conducted using the Vulkan API. The game features cutting-edge rendering technologies such as Global Illumination and Volumetric Fog, which contribute to its stunningly realistic lighting and atmospheric effects. The nature of this benchmark can result in rather inconclusive results as the test runs for roughly 5 minutes, but we can still observe scaling. 

Silent_Scone_1-1711569792661.png

Silent_Scone_2-1711569810301.png

Silent_Scone_1-1711692968879.png

 

Far Cry 6

Far Cry 6 invites players to a lush yet troubled Caribbean island, where rebellion and revolution collide in a fight for freedom and justice. The game incorporates ray tracing for realistic lighting and reflections and dynamic weather systems for immersive atmospheric effects.

 

Silent_Scone_3-1711570280273.png

Far Cry 6 is quite receptive to memory latency. Moving from JEDEC to our 7200MT profile results in an uplift of 17% in average FPS and 14% in minimums. Moving to our 8600MT profile adds a further 5% and 4% to average and minimum FPS respectively.

Silent_Scone_5-1711570320047.png

Silent_Scone_2-1711693024239.png

Even at Ultra settings at 1440P, memory scaling can be seen in Far Cry 6, with a 16.6% uplift in average frame rate from JEDEC to 8600MT and a massive 21.1% increase to our minimums!

Forza Horizon 5

FH5 utilises advanced rendering techniques including Ray Tracing and Variable Rate Shading (VRS) to enhance visual fidelity and performance. Notably, the benchmark results present two metrics: "CPU Simulation" and "CPU Render." While "CPU Simulation" likely accounts for computational tasks related to gameplay mechanics, physics, and AI, "CPU Render" likely measures the processing power required for rendering graphics and visual elements. This distinction provides insights into how the game utilises CPU resources for different aspects of its operation.

 

Silent_Scone_1-1711609064070.png

 

Silent_Scone_2-1711609135228.png

 

Silent_Scone_3-1711693090707.png

Metro Exodus Enhanced Ed

Metro Exodus features advanced rendering technologies such as Ray Tracing, Global Illumination, and DLSS (Deep Learning Super Sampling) for improved visuals and performance, providing a more immersive experience in its post-apocalyptic world.

Silent_Scone_4-1711693374556.png

Moving from stock to the 8600MT configuration results in increases of approximately 6%, 10% and 8% for average, maximum and minimum FPS respectively.

Silent_Scone_5-1711693770948.png

At low detail, our tighter 7200MT profile lifts the minimum FPS floor by 5.5% over XMP 8000MT. Moving from JEDEC to our 8600MT profile results in an uplift of 10.4% in average FPS and 15% in minimums.

Silent_Scone_6-1711694118765.png

 

Uncore Tests

Silent_Scone_0-1711694863865.png

Silent_Scone_1-1711694904078.png

Silent_Scone_2-1711694924195.png

Silent_Scone_3-1711694945477.png

Silent_Scone_5-1711694991980.png

 

This is just a glimpse into some of the presets I've been running, but it is an interesting window into scaling towards the uppermost echelon of what can be achieved on a daily overclock (8800-9000MT to follow).

While core overclocking remains king, it won't come as much of a surprise to some that optimizing memory and cache domains can yield benefits to our minimum FPS floor / 1% lows. Achieving an impressive 8600MT may be challenging and restricted to specific hardware configurations like the ROG MAXIMUS Z790 Apex Encore. However, tuning at 7200MT still delivers a substantial improvement in both average and minimum framerates. I've purposefully left out some of the more e-sport and competitive titles here, to follow with some more insight.

 

 

 

 

9800X3D / 6400 CAS 28 / ROG X870 Crosshair / TUF RTX 4090

Shinchan0125
Level 11

Awesome job. If I am reading this correctly, the sweet spot is 7200: good scores and cheap. 8000+ can only be handled by the Apex board.

Taint3dBulge
Level 10

Have you thought about testing what Gear 1 can do? With DDR4, lower latency always trumped high MT/s for 0.1% FPS. Would love to see if DDR5 can scale the same way. Haven't seen anyone do any benchmarking of that yet.

Gear 1 isn't possible in this case and is reserved for DDR4.

9800X3D / 6400 CAS 28 / ROG X870 Crosshair / TUF RTX 4090

catsmoke
Level 9

Your approach, method, and analysis are outstanding.

It would be easier to grasp the relationships between different settings' results if the x-axis on the charts did not vary so often, from one chart to the next. I understand that legibility is very important,  and I also understand that the most important function of each individual chart is to show the differences between the different timing settings when applied to the same game mode and screen resolution. To prioritize those things is a good decision. Yet consistency from chart to chart would also enable the reader to gain valuable insights. For example, when I look at the three Red Dead Redemption 2 charts 1080p (Max Settings DLSS Quality), 1080p (Lowest Quality), and 1440p (Max Settings DLSS Quality), it is difficult to gauge the data when comparing the results on one chart to the results on an adjacent chart, due to the charts' x-axis scales all being different from each other.

Thank you, again, for sharing the results of your great work.

Silent_Scone
Super Moderator

If there's more analysis or any particular games people would like to see scaling on up to 8800MT, please post in the thread and I'll see what I can do 👍

9800X3D / 6400 CAS 28 / ROG X870 Crosshair / TUF RTX 4090

Thanks Scone. I would be interested in IL2 Great Battles flight sim although it's admittedly less popular than other titles. Thanks!!

https://il2sturmovik.com/

MZ790AE Bios 1801, GSkill F5-8400J4052G24GX2-TR5S, 14900KS, EKWB D5 TBE 300, Seasonic Prime TX-1600 ATX 3.1 Noctua Edition, Asus Strix 4090 w/ HK block, Phanteks Enthoo Elite, Asus Claymore 2, Asus Gladius 3, Asus XG349C, Crucial T705, Windows 11 Pro

bass_junkie_xl
Level 12

Are there any VST y-cruncher or TM5 / Karhu stability tests done on these configs?

A lot of systems freeze when the RAM timings change inside of Windows, i.e. you hit your 40C temp limit on the RAM and it switches to the looser config...

Rig # 1 - 14900Ks SP-124 | 90 MC @ 6.0 GHZ | 5.2 R | 4.7 E | DDR5 48GB @ 8,600 c36 | Strix RTX 4090 | PG27AQN 1440P 27" 360 Hz G-Sync ULMB 2

Rig # 2 - 14900Ks-SP-118 | 89 MC @ 5.9 GHZ | 5.2 R | 4.7 E | DDR4 32GB @ 4,533 c16 | Strix RTX 3080 | Aoc 1080P 25" 240 Hz G-Sync

All configs are Karhu stable. Only 8600MT had a configured DIMM Flex profile, for which performance was tested successfully without crashing at the level 2 gradient. I neglected to post the timing gradient as this will be system specific depending on the capability of the CPU and DRAM modules, but I intend to follow up with a deeper look at the impact of some subtimings (it just takes more time).

9800X3D / 6400 CAS 28 / ROG X870 Crosshair / TUF RTX 4090

Silent_Scone
Super Moderator

A continuation:

A few more tests were conducted to show 1% lows with frequency/timing adjustment.
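For readers unfamiliar with the metric, here is a minimal sketch of one common way to derive average FPS and 1% lows from a list of frame times. It assumes the definition "average FPS over the slowest 1% of frames"; capture tools differ (some report the 99th percentile frame time instead), and the post does not say which definition its numbers use. The function names and sample data are hypothetical.

```python
def average_fps(frame_times_ms: list[float]) -> float:
    """Average FPS over the whole capture."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms: list[float]) -> float:
    """Average FPS over the slowest 1% of frames (at least one frame)."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(slowest_first) // 100)
    worst = slowest_first[:count]
    return 1000.0 * len(worst) / sum(worst)


# Hypothetical capture: 198 smooth frames at 5 ms plus 2 stutters at 20 ms.
capture = [5.0] * 198 + [20.0] * 2

print(f"average: {average_fps(capture):.1f} FPS")
print(f"1% low:  {one_percent_low_fps(capture):.1f} FPS")
```

Two stutters barely move the average here but dominate the 1% low, which is why the lows tend to respond more visibly to memory latency than the averages do.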

AC: Valhalla

Silent_Scone_0-1719245848171.png

 

AC: Valhalla responds surprisingly well to memory latency. Between our JEDEC speeds and 6800 tCAS 32 profile, we see an 11% improvement to our 1% lows and 5% to the average frame rate.

 

Silent_Scone_1-1719245848225.png

 

 

At 720P, increasing frequency to 7600MT and beyond eases the bottleneck, with a 13% uplift in 1% lows from stock frequency and 11% increase to average FPS.

 

CS2

Silent_Scone_2-1719245848069.png

 

Counter-Strike 2 at 720P shows us how things are done to induce a CPU bottleneck. Despite this, we still see an 11% improvement to 1% minimums over stock operation and 2.6% to the average frame rate.

Silent_Scone_3-1719245848172.png

 

At 1080P we see a surprising uplift of 34% to our 1% minimums when moving from stock to XMP 8000, and a peak of 42% at 8400MT.

Borderlands 3

Silent_Scone_4-1719245848486.png

 

Silent_Scone_5-1719245847944.png

 

Borderlands 3 memory scaling seems to behave differently to other games tested, with 1% lows benefiting greatly on my test system once exceeding 7600MT, showing a whopping 127% increase over stock.

 

Hitman 2

Silent_Scone_6-1719245848035.png

 

 

Silent_Scone_7-1719245848506.png

 

Hitman 2 shows minimal gains with a small bump to our 1% lows and average framerate, with just 8% moving from 5600MT to our tuned 6000MT profile.

Futuremark: Steel Nomad

Silent_Scone_8-1719245848055.png

 

Futuremark's new GPU-intensive benchmark pushes the 4090 to the max. However, the test barely hits the CPU or memory domains at all.

 

Futuremark TimeSpy CPU Test

Silent_Scone_9-1719245848171.png

 

Futuremark's TimeSpy CPU is Havok physics-based, which is known for being sensitive to memory performance. Here we can see a score improvement of 27.6% over stock operation at 8600MT.

 

Refresh Interval

Here we test the benefits of increasing our DRAM refresh interval to *decrease* the frequency with which the refresh operation takes place (see DIMM Flex overview).
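As a back-of-the-envelope view of why a longer refresh interval helps, the sketch below estimates the share of time the DRAM is busy refreshing as refresh cycle time divided by refresh interval (tRFC / tREFI). The nanosecond values are illustrative assumptions, not the settings used in these tests, and the function name is hypothetical.

```python
def refresh_overhead(trefi_ns: float, trfc_ns: float) -> float:
    """Approximate fraction of time spent in refresh: tRFC / tREFI."""
    if trefi_ns <= 0 or trfc_ns < 0:
        raise ValueError("tREFI must be positive and tRFC non-negative")
    return trfc_ns / trefi_ns


# Illustrative values only (assumed, not taken from the tested profiles).
trfc_ns = 410.0             # time one refresh operation occupies the DRAM
short_interval_ns = 3900.0  # a short refresh interval
long_interval_ns = 15600.0  # the same interval stretched fourfold

for label, trefi in (("short tREFI", short_interval_ns),
                     ("long tREFI", long_interval_ns)):
    share = refresh_overhead(trefi, trfc_ns) * 100.0
    print(f"{label}: {share:.1f}% of time spent refreshing")
```

The catch is that cells have to hold their charge for longer between refreshes, which becomes harder as the modules warm up. That is presumably why the interval is worth tying to DIMM temperature rather than simply setting it as high as possible.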

 

Silent_Scone_10-1719245848055.png

 

Silent_Scone_11-1719245848172.png

 

Silent_Scone_12-1719245848171.png

 

Silent_Scone_13-1719245848384.png

 

Silent_Scone_14-1719245848172.png

 

Silent_Scone_15-1719245848175.png

 

Silent_Scone_16-1719245848367.png

 

 

9800X3D / 6400 CAS 28 / ROG X870 Crosshair / TUF RTX 4090