X670E-E Cold Boot Stability issues

Level 8
I'm a computer tech by trade and I helped my son build a new machine.
He's running the X670E-E with the 7950X and 64GB of G.Skill F5-6000J3040G32GX2-TZ5NR using the EXPO profile at 6000 MT/s (3,000 MHz actual clock).
The CPU heatsink is the IceGiant, we underclocked the CPU a bit using PBO to get thermals under control, and MCLK is locked to FCLK.

The system is seemingly stable: it passes every test we throw at it (OCCT CPU stress, OCCT Linpack AMD64, OCCT Memory test, CoreCycler, Karhu mem test), but only AFTER the second reboot.

The problem he's experiencing (and it is reproducible) is that after the initial cold boot the system bluescreens and reboots with the following error:

012923-19890-01.dmp	1/29/2023 4:21:46 AM	DRIVER_POWER_STATE_FAILURE	0x0000009f	00000000`00000003	ffffc78f`a04f1d40	ffffc602`1b11f1d0	ffffc78f`b0e828f0	ntoskrnl.exe	ntoskrnl.exe+3fa1d0					x64	ntoskrnl.exe+3fa1d0					C:\Windows\Minidump\012923-19890-01.dmp	32	15	19041	4,000,444	1/29/2023 4:26:28 AM	
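
For anyone who wants to pull that line apart, here is a minimal Python sketch. It assumes the row is tab-separated and that the first nine columns are dump file, crash time, bug check name, bug check code, four parameters, and the driver, which is how the row above reads. The helper names are made up for illustration. The only thing it decodes is parameter 1: per Microsoft's documentation for bug check 0x9F, a value of 3 means a device object has been blocking an IRP for too long.

```python
# Sketch only: the column order is an assumption read off the row above,
# and parse_row / describe are hypothetical helper names.

# First nine tab-separated columns of the crash row quoted above.
ROW = (
    "012923-19890-01.dmp\t1/29/2023 4:21:46 AM\tDRIVER_POWER_STATE_FAILURE\t"
    "0x0000009f\t00000000`00000003\tffffc78f`a04f1d40\t"
    "ffffc602`1b11f1d0\tffffc78f`b0e828f0\tntoskrnl.exe"
)

# Parameter 1 meanings for bug check 0x9F (deliberately partial table).
PARAM1_9F = {
    0x3: "a device object has been blocking an IRP for too long",
    0x4: "power transition timed out waiting to synchronize with the PnP subsystem",
}


def parse_row(row):
    """Split one crash row into named fields."""
    fields = row.split("\t")
    return {
        "dump_file": fields[0],
        "crash_time": fields[1],
        "bugcheck_name": fields[2],
        "bugcheck_code": int(fields[3], 16),
        # Windows prints 64-bit values with a backtick between the halves.
        "params": [int(f.replace("`", ""), 16) for f in fields[4:8]],
        "caused_by": fields[8],
    }


def describe(parsed):
    """Return a short description of parameter 1 for a 0x9F row."""
    if parsed["bugcheck_code"] != 0x9F:
        return "not a DRIVER_POWER_STATE_FAILURE row"
    return PARAM1_9F.get(parsed["params"][0], "parameter 1 not in this small table")


if __name__ == "__main__":
    print(describe(parse_row(ROW)))
```

A parameter 1 of 3 points at some driver holding up a power transition, though the summary row alone doesn't say which device is responsible.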

After a simple reboot the system is stable and will run fine without incident.

I've tried turning the memory speed down to 4800 MT/s (2,400 MHz actual clock), bumping the SOC voltage to 1.35 V, turning Curve Optimizer off and a few other things I can't recall at the moment, but we can't seem to narrow down where the problem is. The only thing I can figure is that it's a BIOS issue, a driver issue or a hardware issue of some kind. Granted, that could explain 90% of most people's issues, but you get the point.
One thing I pointed out to him to make him a bit more patient with the problem is that he's running cutting edge hardware, something is going to bleed and bleed it is.

Another thing I noticed in his Windows System event log is the following ACPI Event ID 15 error:

The embedded controller (EC) returned data when none was requested. The BIOS might be trying to access the EC without synchronizing with the operating system. This data will be ignored. No further action is necessary; however, you should check with your computer manufacturer for an upgraded BIOS.

At the time of a system reboot, alongside the obvious Kernel-Power event in the event log, the system also gives the following e2fexpress Event ID 27 error:

The description for Event ID 27 from source e2fexpress cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Intel(R) Ethernet Controller (3) I225-V

The message resource is present but the message was not found in the message table

At one point I assumed the NIC driver might be to blame, so I dug around and made sure all the drivers were up to date, but the problem still persists.

Anyone have any idea what's going on here?

chemie wrote:
I traced my issue to the iGPU. If I disable it in Windows or the BIOS, I get an occasional crash when waking from sleep. With it enabled, no crashes. I am not using it, and it has a known issue where Steam takes 1 minute to start up with the iGPU on, but that's better than crashing I guess.

I have not had stability issues booting from a powered-off state, only when waking from sleep.

I did not know the iGPU was causing the Steam load time issue; good to know it has been identified. Hopefully there is a fix.

F1Aussie wrote:
I did not know the iGPU was causing the Steam load time issue; good to know it has been identified. Hopefully there is a fix.

It has been there for 6 months, so I'm not holding my breath.

Level 7

Hi, is there any solution to this? I have the same problem on an ASUS TUF X670E with a 7800X3D: mostly, the first cold boot in the morning doesn't POST. The screens stay black, the fans and LEDs are running, and the Q-LED is solid red on CPU (I have tried reseating the CPU and unplugging and replugging all of the PSU cables).

Each subsequent boot works as expected and the system is completely stable. Nothing is overclocked; the BIOS is all default.

It sounds sh*tty but the only thing I can advise you is to put your PC to sleep instead of shutting it down completely, if possible

Level 7

Yeah, that's sadly the most reasonable solution for now.

But no way am I paying 1000 EUR for CPU + motherboard + RAM to be a beta tester for this POS ASUS board and the AMD Zen 4 platform. If it doesn't get fixed by a BIOS update in the next month, I'm selling it off and getting a 13700K, and never getting another ASUS board again.

Level 11

I can understand you completely. Usually, 6 months is enough for a new platform to mature and clear most of its issues, but that doesn't seem to be the case now, probably because of the introduction of relatively new technologies like DDR5 and X3D chips.

The only reasons I would avoid Intel at the moment are the unreasonably high power requirements (the Zen architecture is much more optimized in this respect) and the need for a powerful cooler once the PL limits are lifted (in many reviews Intel chips easily hit 115 °C in synthetic benchmarks once overclocked). The 13700K, however, seems to be an optimal solution.

Level 7

What BIOS? I was having the same issues, reverted back to 1003, and have no problems at all now.