It's been a while since Skylake launched and a whole new world opened up for DDR4 tweaking. People have posted various results and general guidelines have been set, however the most important things are still hidden from the rest of the world.
Some people might forget that getting results is sometimes only a matter of having a decent methodology and enough time at your disposal, not of being a genius. This has to change, as hiding subtimings and proclaiming yourself the next Messiah of the overclocking world will bring nothing but more people leaving to enjoy other hobbies more exciting than ours.
Rant over. I will try to uncover some of this and help you with various tips, and maybe give you hints on how to gain the last bits of performance from your memory kits.
I've been playing with various Asus boards this generation, so all tests were performed on these.
Memory used:
Kingston 2666 C15 Fury (Hynix MFR)
Samsung D-die: G.Skill 3000 15-15-15 Ripjaws
Kingston A-die 3466 C16 (ES)
Samsung E-die: G.Skill 4000 & 4266 Trident-Z
Motherboards used: Maximus VIII Gene & Impact, also Extreme to confirm the Gene results.
CPU: Intel i7-6700K retail, ambient water cooling.
Testing method: SuperPI 32M, three loops at 4.00 GHz, taking the average of the three.
*Please note that this install of Windows XP is not optimized for performing great in the Low Clock Challenge; rather, I tried to have as precise a measurement tool as I could. For example, running with a D-C Wazza of 2000MB yields a difference of around 0.100 seconds across a few runs, which is good enough in my book for seeing small performance gains.
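The averaging in the testing method can be sketched in a few lines of Python. This is only an illustration: the function name `summarize_runs` and the run times are made up and are not measurements from this article. The spread is worth keeping next to the average, since a setting change only means something if it moves the average by more than the run-to-run noise.

```python
from statistics import mean

def summarize_runs(times_s):
    """Return (average, spread) for a list of 32M run times in seconds."""
    if not times_s:
        raise ValueError("need at least one run time")
    return mean(times_s), max(times_s) - min(times_s)

# Hypothetical run times in seconds, NOT results from this guide.
runs = [400.120, 400.200, 400.160]
avg, spread = summarize_runs(runs)
print(f"average {avg:.3f} s, spread {spread:.3f} s")
```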
So what is this all about? You must have noticed that Skylake has far more options in UEFI for tuning RAM. How all of this works is very hard to understand for a new user of this platform without either relying on built-in profiles or copying settings from other users.
ISSUE # 1
I can't go over DDR4-3600 on Maximus 4-DIMM boards
I encountered this while trying to bench Samsung D-die and E-die with tertiary timings on auto, using the Maximus VIII Gene and Extreme. The culprits are TRDWR_sg, TRDWR_dg, TRDWR_dr and TRDWR_dd. When going higher than 3600 these need to be set to the same value as CAS, otherwise the board will give you a nice POST code 55 to look at. This is not true for the Impact, because the Impact can handle lower values and higher speeds, so it has no problem booting 4133 on AUTO. Websmile was the first to notice this, so I give him credit.
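As a rough sketch, the rule above can be written down like this. The function name and the dictionary layout are hypothetical; only the four timing names, the 3600 threshold and the "same as CAS" rule come from the text, and the rule has only been observed on the Maximus VIII Gene and Extreme.

```python
TRDWR_FIELDS = ("TRDWR_sg", "TRDWR_dg", "TRDWR_dr", "TRDWR_dd")

def trdwr_overrides(ddr_speed, cas, four_dimm_board=True):
    """Suggest manual TRDWR values for a 4-DIMM Maximus board.

    Above DDR4-3600 all four timings are set equal to CAS.
    Otherwise an empty dict is returned, meaning: leave them on AUTO.
    """
    if four_dimm_board and ddr_speed > 3600:
        return {name: cas for name in TRDWR_FIELDS}
    return {}

print(trdwr_overrides(3733, 14))
print(trdwr_overrides(4133, 12, four_dimm_board=False))  # e.g. Impact
```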
ISSUE # 2
I am stuck at POST code 55, 41, 78 or 3E
I encountered this a lot. Usually:
3E means too low Write Recovery / Read to Precharge time
78 means too tight TRCD/TRP
41 means you went too tight on tertiary timings or you pushed too much voltage on Hynix MFR
55 can mean a lot of things: not enough voltage for the given settings, too tight TRFC, or improper tertiary timings.
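The list above is easy to keep at hand as a small lookup table. This is a minimal sketch: the hints are just the observations from this guide on Asus Maximus VIII boards, not official POST code definitions, and the function name is made up.

```python
# Hints taken from the observations in this guide, not from vendor docs.
POST_CODE_HINTS = {
    "3E": "Write Recovery / Read to Precharge time too low",
    "78": "TRCD/TRP too tight",
    "41": "tertiary timings too tight, or too much voltage on Hynix MFR",
    "55": "not enough voltage for the settings, TRFC too tight, "
          "or improper tertiary timings",
}

def explain_post_code(code):
    """Return the likely memory-related cause for a POST code."""
    return POST_CODE_HINTS.get(code.strip().upper(), "not covered in this guide")

for code in ("55", "3e", "00"):
    print(code, "->", explain_post_code(code))
```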
ISSUE # 3
General behavior of various IC
Hynix MFR is the classic of X99 and actually the WORST performer on air. It has many problems on air: it is hard to get both high frequency and tight CAS. It is the least voltage-tolerant IC, which creates problems like this one: you can pass DDR4-3200 12-15-15 at 1.58 V, while 1.54 V fails training with code 55 and 1.62 V gives you 41 because the voltage is too high. The best sticks to look for are the ones that tolerate higher voltages at higher speeds. Good sticks can bench DDR4-3200 12-15-15 under 1.6 V, and with good voltage tolerance they might get you to DDR4-3333 C12+.
Hynix AFR is the improved die from Hynix. It fixes many of MFR's flaws and is a huge improvement. It can tolerate high voltage at high speeds and runs great on X99 and Z170. It will be available very soon on Kingston HyperX memory, and other vendors will surely follow.
Typical benching scenarios are:
DDR4-3600 12-17-17 at 1.65 V
DDR4-3733 12-18-18 at 1.75 V
DDR4-3866 13-18-18 at 1.85 V
DDR4-4000 [...]
Of course voltage might vary, and be aware that they are harder to clock than Samsung. My few sticks can't manage over DDR4-3733 on any board but the Impact (a 1 DIMM per channel board).
Samsung D-Die K4A4G085WD
The first die from Samsung has good voltage tolerance and can be found on lots of modules from different vendors, from G.Skill 3000 and 3200 Ripjaws, early 3466 and 3600 Ripjaws-V and Trident-Z, to the Corsair lineup and so on. It can go up to 2-2.1 V on air and scaling is linear. Typical benching scenarios are at DDR4-3733, from 15-19-19 at 1.8-1.9 V for worse kits to even 13-18-18 for better sticks.
Samsung E-DIE K4A4G085WE
The second revision from Samsung gained huge improvements. Voltage tolerance is great, taking up to 2.1 V at 4200+ speeds, TRCD limits have gone lower, and overall it looks like a very solid IC. Typical benching scenarios are:
DDR4-3600 11-17-17 at 1.9 V and under
DDR4-3866 11-19-19 at 2.05 V and under
DDR4-4000 12-20-20 at 2 V and under
So far it can be found on G.Skill Trident-Z and Ripjaws-V kits from week 38 and later, on Teamgroup's inferior bin of 3866 18-22-22, and more recently on Corsair 4000 C19. The best modules to aim for are the low TRCD ones: generally aim for TRCD/TRP 17 at DDR4-3600 and TRCD/TRP 19 at DDR4-4000. They are the easiest modules to clock and can also go lower in tertiary/secondary timings. A TRFC of 280 at 4200 should not be a problem for good sticks.
ISSUE # 4
2 DIMM vs 4 DIMM
4 DIMM is ALWAYS faster than 2 DIMM at similar clocks/timings. Early results in the XTU benchmark showed that; however, it puts more stress on the IMC and sometimes results in looser timings. The only timing that has to be adjusted is TWRWR_DD, which must be 8, otherwise the platform will not start. Keeping identical secondary/tertiary timings at the same clock speed, 4 DIMM is about 1 second faster in the 4 GHz 32M test at DDR4-3466 C12, which is quite a difference in the 32M world.
Please NOTE that AUTO RTL/IOL in 4-DIMM scenarios will give you very loose IOL. This is because Maximus boards automatically set the IOL latency offset to 15 instead of the default 21 used with 2 DIMMs. The easy way to fix this is to set 21 manually for better performance.
ISSUE # 5
BAD RTL Training
Sometimes memory training goes wrong and you can see a big difference in RTL/IOL: for example, instead of 50/51/7/6 you will get 58/51/14/6 or similar. This WILL hurt performance a lot in 32M and in XTU too. Once you find a proper RTL/IOL combo it is always best to lock it manually by overriding AUTO, so you get the same values every time.
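A quick way to spot a missed training is to compare what the board trained against the combo you know is good. The sketch below assumes the values are written as a slash-separated string like the examples above; the helper names and the tolerance of 1 are assumptions for illustration, not something the board or any tool provides.

```python
def parse_rtl_iol(text):
    """Turn a string like '50/51/7/6' into a tuple of ints."""
    return tuple(int(part) for part in text.split("/"))

def training_drift(expected, actual, tolerance=1):
    """List (position, expected, actual) for values that drifted too far."""
    exp, act = parse_rtl_iol(expected), parse_rtl_iol(actual)
    if len(exp) != len(act):
        raise ValueError("expected and actual must have the same length")
    return [(i, e, a) for i, (e, a) in enumerate(zip(exp, act))
            if abs(a - e) > tolerance]

# The good and bad combos quoted in the text above.
print(training_drift("50/51/7/6", "58/51/14/6"))
# -> [(0, 50, 58), (2, 7, 14)]
```

An empty list means the training matches the known-good combo; anything else is a reason to retrain or to enter the values manually.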
Example : Samsung D 3733 C14 normal RTL vs fail RTL :
The TWRWR_DR and TWRWR_DD timings
Going as low as possible is often advocated, but this is actually a big lie. Testing on all available ICs, I found that going lower actually hurts and that the optimum value for high and medium speeds is 8. Tested on MFR, Samsung E-die and Kingston AFR: 4 4 is worse than 8 8, period.
AFR DDR4-3866 13-18-18, TWRWR_DR and TWRWR_DD at 8 8
TWRRD_DR and TWRRD_DD
I actually ran into this issue when I left this timing on AUTO, went from DDR4-3466 C12 to C11 and got a worse score! After a few tries I found that AUTO gave me 6 6, and after lowering this to 5 5 the performance gain was normal. It can actually go lower, to 4 4 and 3 3, but performance was similar to 5 5 on Hynix AFR and worse on Samsung E-die, and stability was a little worse, so I would recommend 5 5 in all cases, period.
The general rule is that they go tighter as CAS goes lower, and they are linked with Write to Read Delay L and S. Write to Read Delay L and S can be set lower than 6, even down to 1, but that shows no performance increase whatsoever, except when they are lowered through TWRRD_SG and TWRRD_DG.
For example the settings pictured above resulted in showing 7 7 for write to read delay L and S although they were set at 6 in UEFI .
Lowering TWRRD_DG from 22 to 18 resulted in showing 7 and 3 :
The performance boost is little but might matter on
Since we are talking about Write to Read Delay (not L and S): this timing can go as low as 1 and gives a small performance boost.
TRDRD_dr and TRDRD_dd
This one is actually interesting because of conflicting results. While testing ICs from Hynix and Samsung, I noticed that Hynix prefers these two timings at 0 0, gaining almost half a second, while Samsung prefers the other way around, with the best results at 5 6. Even neighbouring values like 4-5, 5-5 and 6-7 gave worse results, so my finding is that on Samsung-based memory these have to be set to 5-6, while Hynix likes them at 0-0.
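Since the preferred values depend only on the IC vendor, they fit in a tiny table. A minimal sketch, with a made-up function name; the values are the ones found in the tests above and may not carry over to other boards or ICs.

```python
# Best-performing TRDRD_dr / TRDRD_dd pairs from the tests in this guide.
TRDRD_BY_VENDOR = {
    "samsung": (5, 6),
    "hynix": (0, 0),
}

def trdrd_for(vendor):
    """Return the TRDRD_dr/TRDRD_dd pair to try first for an IC vendor."""
    key = vendor.strip().lower()
    if key not in TRDRD_BY_VENDOR:
        raise ValueError(f"no tested values for vendor {vendor!r}")
    dr, dd = TRDRD_BY_VENDOR[key]
    return {"TRDRD_dr": dr, "TRDRD_dd": dd}

print(trdrd_for("Samsung"))
print(trdrd_for("Hynix"))
```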
I've found no problem lowering these timings to these values on all ICs, and they give better performance than the default profiles. They are also good to go at 4200+ speeds. Therefore:
Refresh Cycle Time (TRFC): can go as low as 270 for E-die, even at 4200+ speeds. AFR needs 340+ when going over 3800. D-die also needs it higher than E-die, and it has to be tested on your kit. The performance boost is minimal, but for peace of mind it does not hurt to get it as low as you can.
tREFI: set to the maximum (65535) it gives slightly better performance and does not hurt stability, so no problems here.
Read to Precharge (TRTP) + Write Recovery Time (TWR): these come as a pack, since you can't adjust one independently of the other. They can be set as low as 6 and 12, and the latter will actually be reported as 13 in the timing reader. The performance boost is actually decent.
Four Activate Window (tFAW): setting this to 16 usually gained a little performance. It can go lower, but I found no benefit at all, so you should try it on your own kit too.
RAS TO RAS DELAY (L and S): typical values are 7 5. You can get them lower at lower speeds, and also at higher speeds when using a tight CL like C11, by lowering TRDWR_SG and DG.
WRITE TO READ DELAY (L AND S): as explained before, the values shown are connected to TWRRD_sg and dg. Hynix does not like them set lower than 6 in the BIOS, as opposed to Samsung, which can go to 1, but performance is identical, so there is nothing to gain here except lowering them with the help of TWRRD_sg and dg. Too tight might cause problems, so start with 7 7 and try lower after reaching the target speed.
WRITE TO READ DELAY: can usually go to 1 from the default 6 or 7. The performance gain is small but can be seen.
CAS WRITE LATENCY: can be set as low as 8, however the benefit in my tests was zero compared to 9. Going lower can help you tighten TWRRD_sg and dg.
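To compare TRFC values taken at different speeds, it helps to convert cycles to nanoseconds. DDR memory transfers twice per clock, so the memory clock in MHz is half the DDR rating and one cycle lasts 2000 / rating nanoseconds. The sketch below is plain arithmetic; the function name is just for illustration.

```python
def cycles_to_ns(cycles, ddr_rate):
    """Convert a timing in memory clock cycles to nanoseconds.

    ddr_rate is the DDR4 rating (e.g. 4200 for DDR4-4200); the real
    clock is half of that, so one cycle takes 2000 / ddr_rate ns.
    """
    return cycles * 2000.0 / ddr_rate

# TRFC examples mentioned in the list above.
for label, trfc, rate in (("E-die", 270, 4200), ("AFR", 340, 3800)):
    print(f"{label}: TRFC {trfc} at DDR4-{rate} = {cycles_to_ns(trfc, rate):.1f} ns")
```

With these numbers, 270 cycles at DDR4-4200 is about 128.6 ns, while 340 cycles at DDR4-3800 is about 178.9 ns, which shows how much shorter a refresh cycle E-die accepts in absolute terms.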
ISSUE # 11
The sweet spot... So where do we have the best performance?
TRCD/TRP actually do matter in 32M and in bandwidth tests like Aida64, and they will also help you gain a few points in XTU. Hynix ICs under cold will let you lower these values to 16-17 at speeds of 3800+, which coupled with C11 and tight secondary/tertiary timings gives nice performance. On air it is no contest when it comes to low CAS: proper Samsung E-die can go as high as 4133 C11+, but its wicked TRCD/TRP cannot go as low as on Hynix. The bottom line is that the sweet spot seems to be around 3900~4000. I benched at 4240 too, with the same secondary/tertiary timings as at 3960 but of course with higher TRCD, and the performance is worse. This might be a case where the bandwidth is already sufficient and only tighter timings are needed. For example, DDR4-4133 12-20-20 is marginally faster than DDR4-4000 12-20-20 at identical tertiary/RTL, however DDR4-3960 12-19-19 is faster than both. Also, 3866 13-18-18 comes close enough to conclude that TRCD/TRP still play an important role in 32M, so they need to be taken into consideration.
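One way to look at the four settings compared above is to convert CAS and TRCD from cycles to nanoseconds, since a cycle gets shorter as the speed goes up. This is only arithmetic under the usual DDR assumption (real clock is half the DDR rating); it is not a model of 32M performance and the names in the code are made up.

```python
def to_ns(cycles, ddr_rate):
    """Cycles to nanoseconds; one cycle lasts 2000 / ddr_rate ns."""
    return cycles * 2000.0 / ddr_rate

# (DDR rating, CAS, TRCD) for the settings discussed in the text.
SETTINGS = [
    (3866, 13, 18),
    (3960, 12, 19),
    (4000, 12, 20),
    (4133, 12, 20),
]

print("setting              CAS ns   TRCD ns")
for rate, cas, trcd in SETTINGS:
    name = f"DDR4-{rate} {cas}-{trcd}-{trcd}"
    print(f"{name:<20} {to_ns(cas, rate):6.2f}   {to_ns(trcd, rate):7.2f}")
```

The table makes it easy to see that a higher DDR rating does not automatically mean lower absolute TRCD, which fits the observation that TRCD/TRP still need to be taken into account.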
DDR4-3866 13-18-18, tightest and most efficient set of timings for Hynix AFR:
DDR4-4000 12-20-20, tightest and most efficient set of timings for Samsung E-die:
DDR4-3960 12-19-19, best overall performance:
I also made a small graph with the best performance from all ICs and some more combos I tried.
The bottom line is that Skylake is for sure a very interesting platform, and the fact that DDR4 is still in its early days makes it even more interesting. I made these tests to help people who are eager to fine-tune their RAM on Asus platforms but lack the time needed for it. I think my findings should be repeatable on other motherboards, however this needs to be tested. I tried to make this as accurate as I could: I had to repeat tests after changing different parameters, over and over, and having no official guidelines made the search more difficult.
I got G.Skill Ripjaws V 3600 16-16-16-36 memory with Samsung chips marked SEC 549 K4A8G08 5WB BCRC. I want to comment on issue #1. In my case when overclocking over 3600 Maximus VIII Hero tries to set those four timings equal to 10-10-10-11, which is unstable. Setting them equal to CL does not work either. Memory only accepted all even numbers like 12, 14 or 16. Odd numbers = no boot.
Raising voltage really helped with lowering timings, however I still can't get it stable over 3800. Current timings are 15-15-15-32-1T and raising primary timings does not let it go any higher than 3800.