05-17-2023 10:53 AM
Hello, my router froze up overnight, so I rebooted it, and ever since the logs are showing that it is going OOM about every 10 minutes. First it starts spamming:
May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
Then it looks like the kernel's OOM (out-of-memory) killer kicks in and kills a process called asd:
May 17 10:44:30 kernel: [ 5876.819570] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.219144] CPU: 2 PID: 20987 Comm: mcsd Tainted: P 4.4.60 #1
May 17 10:44:31 kernel: [ 5877.225567] Hardware name: Generic DT based system
May 17 10:44:31 kernel: [ 5877.232709] [<8022001c>] (unwind_backtrace) from [<8021c8c4>] (show_stack+0x10/0x14)
May 17 10:44:31 kernel: [ 5877.237296] [<8021c8c4>] (show_stack) from [<803b8590>] (dump_stack+0x78/0x98)
May 17 10:44:31 kernel: [ 5877.245195] [<803b8590>] (dump_stack) from [<802a7d38>] (dump_header+0x44/0x164)
May 17 10:44:31 kernel: [ 5877.252241] [<802a7d38>] (dump_header) from [<802a825c>] (oom_kill_process+0xcc/0x448)
May 17 10:44:31 kernel: [ 5877.259775] [<802a825c>] (oom_kill_process) from [<802a8924>] (out_of_memory+0x2e4/0x354)
May 17 10:44:31 kernel: [ 5877.267520] [<802a8924>] (out_of_memory) from [<802ac4d4>] (__alloc_pages_nodemask+0x67c/0x738)
May 17 10:44:31 kernel: [ 5877.275752] [<802ac4d4>] (__alloc_pages_nodemask) from [<802ac5a0>] (__get_free_pages+0x10/0x24)
May 17 10:44:31 kernel: [ 5877.284275] [<802ac5a0>] (__get_free_pages) from [<80225d70>] (pgd_alloc+0x18/0x144)
May 17 10:44:31 kernel: [ 5877.293301] [<80225d70>] (pgd_alloc) from [<80228478>] (mm_init+0xcc/0x138)
May 17 10:44:31 kernel: [ 5877.301027] [<80228478>] (mm_init) from [<802e1854>] (do_execveat_common+0x284/0x5f4)
May 17 10:44:31 kernel: [ 5877.307697] [<802e1854>] (do_execveat_common) from [<802e1bf0>] (do_execve+0x2c/0x34)
May 17 10:44:31 kernel: [ 5877.315695] [<802e1bf0>] (do_execve) from [<80209bc0>] (ret_fast_syscall+0x0/0x34)
May 17 10:44:31 kernel: [ 5877.323544] Mem-Info:
May 17 10:44:31 kernel: [ 5877.331211] active_anon:161197 inactive_anon:1485 isolated_anon:0
May 17 10:44:31 kernel: [ 5877.331211] active_file:90 inactive_file:127 isolated_file:7
May 17 10:44:31 kernel: [ 5877.331211] unevictable:0 dirty:52 writeback:1 unstable:0
May 17 10:44:31 kernel: [ 5877.331211] slab_reclaimable:875 slab_unreclaimable:31662
May 17 10:44:31 kernel: [ 5877.331211] mapped:220 shmem:1513 pagetables:602 bounce:0
May 17 10:44:31 kernel: [ 5877.331211] free:4778 free_pcp:4 free_cma:0
May 17 10:44:31 kernel: [ 5877.331557] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.365795] Normal free:18404kB min:3752kB low:4688kB high:5628kB active_anon:644788kB inactive_anon:5940kB active_file:648kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:898048kB managed:881836kB mlocked:0kB dirty:208kB writeback:4kB mapped:1176kB shmem:6052kB slab_reclaimable:3500kB slab_unreclaimable:126648kB kernel_stack:1432kB pagetables:2408kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_
May 17 10:44:31 kernel: [ 5877.395533] lowmem_reserve[]: 0 0 0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.431371] 2010 total pagecache pages
May 17 10:44:31 kernel: [ 5877.432161] 0 pages in swap cache
May 17 10:44:31 kernel: [ 5877.435905] Swap cache stats: add 0, delete 0, find 0/0
May 17 10:44:31 kernel: [ 5877.439280] Free swap = 0kB
May 17 10:44:31 kernel: [ 5877.444308] Total swap = 0kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.450298] 0 pages HighMem/MovableOnly
May 17 10:44:31 kernel: [ 5877.453163] 4053 pages reserved
May 17 10:44:31 kernel: [ 5877.457046] Out of memory: Kill process 9710 (asd) score 684 or sacrifice child
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB
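A rough reading of the dump above, offered as a sketch rather than a diagnosis: the report counts 4 kB pages (the free list is all 4kB blocks), so 224512 pages works out to the 898048 kB shown as 'present', and the killed asd process alone held about 70% of managed memory. The failing request was order=2 (four contiguous pages, 16 kB), and the free list shows no blocks larger than 4 kB, which would explain an OOM kill even with roughly 18 MB nominally free. The variable names below are only for illustration; the numbers are copied from the log.

```shell
# Figures copied from the OOM report above; variable names are illustrative.
page_kb=4                 # page size implied by the "4597*4kB" free list
pages_ram=224512          # "224512 pages RAM"
managed_kb=881836         # "managed:881836kB"
asd_rss_kb=621348         # "anon-rss:621348kB" for the killed asd process
slab_unreclaim_kb=126648  # "slab_unreclaimable:126648kB"

total_kb=$((pages_ram * page_kb))
asd_pct=$((asd_rss_kb * 100 / managed_kb))          # integer percent, rounded down
slab_pct=$((slab_unreclaim_kb * 100 / managed_kb))  # integer percent, rounded down
order2_kb=$((page_kb * 4))   # an order=2 request needs 2^2 contiguous pages

echo "total RAM:            ${total_kb} kB"
echo "asd share of managed: ${asd_pct}%"
echo "unreclaimable slab:   ${slab_pct}%"
echo "order=2 request size: ${order2_kb} kB contiguous"
```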
Looking back at when the router first locked up overnight, it seems it automatically checked for a firmware update around 4am (which is strange because I have auto updates turned off), then started spamming the 'NBUF alloc failed' message:
May 17 03:48:01 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
Any help greatly appreciated!
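To check how often the failures really occur, something like the following could tally the 'NBUF alloc failed' lines per minute. This is only a sketch: the function name count_nbuf_failures is made up, and /tmp/syslog.log as the log location is an assumption about stock Asuswrt, so point it at wherever your log actually lives.

```shell
# count_nbuf_failures: read syslog lines on stdin and print one line per
# minute in the form "Mon DD HH:MM count" for 'NBUF alloc failed' messages.
# The function name is hypothetical; it only reads, it changes nothing.
count_nbuf_failures() {
    awk '/NBUF alloc failed/ { n[$1 " " $2 " " substr($3, 1, 5)]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Usage (log path is an assumption, adjust as needed):
#   count_nbuf_failures < /tmp/syslog.log
# The OOM kills themselves can be listed with plain grep:
#   grep 'Out of memory: Kill process' /tmp/syslog.log
```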
05-18-2023 09:21 AM
I just received this email from Asus support. Looks like they think the issue should be resolved on their end.
Thank you for contacting ASUS Product Support.
My name is Joren A., and I understand that you are experiencing router connection issue. While I am sorry to hear you are experiencing this issue, I assure you it will be my pleasure to help you resolve it!
Based on the error you described, we recommend that you perform the below initial troubleshooting steps if you have not already done so. If the steps below do not help resolve the problem, please reach out to our technical support team here: Chat with Us . This will allow us to gather more details on the description of the issue you are experiencing, especially on more complex and targeted situations.
During routine security maintenance, our technical team discovered an error in the configuration of our server settings file, which could potentially cause an interruption in network connectivity on part of the routers.
Our technical team has urgently addressed the server issue, and impacted routers should be returning to normal operation. If your device is still affected, we recommend the following:
1. Manually reboot your router
2. If rebooting does not resolve the issue, please save the settings file, perform a hard reset (factory default), and then re-upload the settings file (follow the directions in the https://www.asus.com/support/FAQ/1050464)
If there are any further developments around this issue, we will immediately update our users.
We deeply apologize for any inconvenience this incident may have caused and are committed to preventing such an incident from happening again. Thank you for your understanding and support.
05-18-2023 09:34 AM
Is this for new firmware? Can we just download a new version instead of going through all that?
05-18-2023 09:36 AM
Thanks for posting @AllYourBass . Questionable response about misconfigured "backend configuration" files, IMO. I assume there are at least minimal checks for newer firmware that happen even if auto update is off. I am sure there may be more if one has other services turned on (security, cloud drive, etc.). For me, I have NOTHING running - no wifi, no AI/Security stuff, no USB/Disk or 4G backup connection, no VPN service. Just a simple router providing connection to an Arris modem and handing out DHCP leases.
A hard reset and reconfiguring the router will probably resolve this. However, I would like to read/hear from Asus exactly what is causing our routers' asd process to write so much to memory. This feels like an exploit of some type. But in another day or two I will be off the stock ASUS firmware and onto FreshTomato.
05-18-2023 09:23 AM
@Rockstonicko - that should work too. I was going to do that, but I want to know the root cause of this problem and have little faith in ASUS addressing it. Since we don't know what caused this, a factory reset and reconfiguration might just invite the problem back.
Therefore, I am out and getting a router I can put Tomato/Merlin/WT/DD-WRT onto, for the simple reason that the community publishes firmware updates frequently.
05-18-2023 09:48 AM
If this is a bug, then it's not so bad. But if Asus is pushing out changes to your router without you clicking "Upgrade Firmware", that's a major, major problem.
The fact that it is happening to all of us at once means this is not user error. Asus needs to provide an honest and clearly written root cause with transparency exposing the technical details or risk losing future business.
05-18-2023 09:55 AM
If I'm reading this thread correctly, so far Asus has not pushed out a new firmware - I'm on the beta but I don't want to stay on it.
It's pretty wild that Asus hasn't posted an update yet.
I have the RT-AX89X btw.
05-18-2023 10:00 AM
Same here.
05-18-2023 10:14 AM
I think you are reading correctly. For me, I am not holding my breath for an update for my Asus RT-AC1200GE. It is a lower end router in their line. I am on version 3.0.0.4.382.52482, which was made available 1/21/2021. Since it has been 2+ years since an update, I am not optimistic I will get a fix. I like ASUS hardware, just not the stock OS/firmware. This problem has given me a good reason to get another low end ASUS router (RT-AC66U-B1) which will run Tomato OS. I used to use Linksys routers many years ago (until Cisco bought them and fubar'd them) that supported Tomato, and I liked the interface and reporting better. And, like DD-WRT/Merlin, Tomato (FreshTomato branch) has a robust open source community always providing very timely updates and enhancements. I suggest checking any of these out. Disclaimer: using these OSes probably voids the warranty, but in my experience I have never had a problem with ASUS hardware.
05-18-2023 02:03 PM
Same, I used the beta and it's been stable all day so far, but who knows where the ****** beta came from or if we will have a legit one from Asus any time soon. I don't even know what was in the code of the beta, it was just from some rando's storage on the asus storage site lol. Come on Asus, get it together!
05-18-2023 10:09 AM
Hello
For those still having the issue: it does not look related to the firmware version.
It's more like a corrupted definitions file for an antivirus or something like that.
Deleting the file and rebooting the router will force it to download a new one:
cd /jffs/asd/
rm chknvram2023xxxxx
reboot
In my case, it's still OK after 6h.
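Since the exact file name above is masked, it may be safer to list what is actually in /jffs/asd before deleting anything. A minimal sketch, assuming the file name starts with chknvram as in the post; the helper name list_asd_defs is made up, and it only prints candidates, leaving the rm and reboot to you:

```shell
# list_asd_defs [DIR]: print files in DIR whose names start with "chknvram".
# DIR defaults to /jffs/asd (the path from the post). The function name is
# hypothetical, and nothing is deleted here.
list_asd_defs() {
    dir="${1:-/jffs/asd}"
    for f in "$dir"/chknvram*; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Prints nothing if the directory or file does not exist.
list_asd_defs /jffs/asd
```

If exactly one file is printed, that is presumably the one to remove by hand before rebooting.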