05-17-2023 10:53 AM
Hello, my router froze up overnight, so I rebooted it, and ever since the logs are showing that it is going OOM about every 10 minutes. First it starts spamming:
May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
Then it looks like the kernel's OOM killer kicks in and kills a process called asd:
May 17 10:44:30 kernel: [ 5876.819570] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.219144] CPU: 2 PID: 20987 Comm: mcsd Tainted: P 4.4.60 #1
May 17 10:44:31 kernel: [ 5877.225567] Hardware name: Generic DT based system
May 17 10:44:31 kernel: [ 5877.232709] [<8022001c>] (unwind_backtrace) from [<8021c8c4>] (show_stack+0x10/0x14)
May 17 10:44:31 kernel: [ 5877.237296] [<8021c8c4>] (show_stack) from [<803b8590>] (dump_stack+0x78/0x98)
May 17 10:44:31 kernel: [ 5877.245195] [<803b8590>] (dump_stack) from [<802a7d38>] (dump_header+0x44/0x164)
May 17 10:44:31 kernel: [ 5877.252241] [<802a7d38>] (dump_header) from [<802a825c>] (oom_kill_process+0xcc/0x448)
May 17 10:44:31 kernel: [ 5877.259775] [<802a825c>] (oom_kill_process) from [<802a8924>] (out_of_memory+0x2e4/0x354)
May 17 10:44:31 kernel: [ 5877.267520] [<802a8924>] (out_of_memory) from [<802ac4d4>] (__alloc_pages_nodemask+0x67c/0x738)
May 17 10:44:31 kernel: [ 5877.275752] [<802ac4d4>] (__alloc_pages_nodemask) from [<802ac5a0>] (__get_free_pages+0x10/0x24)
May 17 10:44:31 kernel: [ 5877.284275] [<802ac5a0>] (__get_free_pages) from [<80225d70>] (pgd_alloc+0x18/0x144)
May 17 10:44:31 kernel: [ 5877.293301] [<80225d70>] (pgd_alloc) from [<80228478>] (mm_init+0xcc/0x138)
May 17 10:44:31 kernel: [ 5877.301027] [<80228478>] (mm_init) from [<802e1854>] (do_execveat_common+0x284/0x5f4)
May 17 10:44:31 kernel: [ 5877.307697] [<802e1854>] (do_execveat_common) from [<802e1bf0>] (do_execve+0x2c/0x34)
May 17 10:44:31 kernel: [ 5877.315695] [<802e1bf0>] (do_execve) from [<80209bc0>] (ret_fast_syscall+0x0/0x34)
May 17 10:44:31 kernel: [ 5877.323544] Mem-Info:
May 17 10:44:31 kernel: [ 5877.331211] active_anon:161197 inactive_anon:1485 isolated_anon:0
May 17 10:44:31 kernel: [ 5877.331211] active_file:90 inactive_file:127 isolated_file:7
May 17 10:44:31 kernel: [ 5877.331211] unevictable:0 dirty:52 writeback:1 unstable:0
May 17 10:44:31 kernel: [ 5877.331211] slab_reclaimable:875 slab_unreclaimable:31662
May 17 10:44:31 kernel: [ 5877.331211] mapped:220 shmem:1513 pagetables:602 bounce:0
May 17 10:44:31 kernel: [ 5877.331211] free:4778 free_pcp:4 free_cma:0
May 17 10:44:31 kernel: [ 5877.331557] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.365795] Normal free:18404kB min:3752kB low:4688kB high:5628kB active_anon:644788kB inactive_anon:5940kB active_file:648kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:898048kB managed:881836kB mlocked:0kB dirty:208kB writeback:4kB mapped:1176kB shmem:6052kB slab_reclaimable:3500kB slab_unreclaimable:126648kB kernel_stack:1432kB pagetables:2408kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_
May 17 10:44:31 kernel: [ 5877.395533] lowmem_reserve[]: 0 0 0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.431371] 2010 total pagecache pages
May 17 10:44:31 kernel: [ 5877.432161] 0 pages in swap cache
May 17 10:44:31 kernel: [ 5877.435905] Swap cache stats: add 0, delete 0, find 0/0
May 17 10:44:31 kernel: [ 5877.439280] Free swap = 0kB
May 17 10:44:31 kernel: [ 5877.444308] Total swap = 0kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.450298] 0 pages HighMem/MovableOnly
May 17 10:44:31 kernel: [ 5877.453163] 4053 pages reserved
May 17 10:44:31 kernel: [ 5877.457046] Out of memory: Kill process 9710 (asd) score 684 or sacrifice child
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB
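One detail in the dump above is worth spelling out: the oom-killer was invoked for an order=2 request, yet about 18 MB was still free. The sketch below is a rough illustration, not the kernel's actual allocation logic (it ignores watermarks, reclaim, and compaction). It parses the "Normal:" free-list line and checks whether any free block is large enough for a given order. The helper names `parse_buddy_line` and `can_satisfy` are made up for this example, and the 4 kB page size is an assumption based on the log's own figures (224512 pages RAM, present:898048kB).

```python
import re

def parse_buddy_line(line):
    """Return {block_size_kB: count} from a 'Normal: N*4kB ...' free-list line."""
    # Each entry looks like '4597*4kB'; flags such as '(U)' are ignored.
    # The trailing '= 18388kB' total has no '*', so it is not matched.
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def can_satisfy(free_blocks, order, page_kb=4):
    """True if some free block holds at least 2**order contiguous pages."""
    needed_kb = page_kb * (2 ** order)
    return any(count > 0 and size >= needed_kb
               for size, count in free_blocks.items())

line = ("Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
        "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB")

free_blocks = parse_buddy_line(line)
total_kb = sum(size * count for size, count in free_blocks.items())

print(total_kb)                     # 18388, matching the total in the log line
print(can_satisfy(free_blocks, 2))  # False: no free block of 16 kB or larger
```

In other words, all of the free memory was in single 4 kB pages, so a request for four contiguous pages could not be met from the free lists as they stood, which fits the picture of asd having eaten nearly everything else.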
Looking back at when the router first locked up overnight, it seems it checked for a firmware update automatically around 4am (which is strange, because I have auto updates turned off), then started spamming the 'NBUF alloc failed' message:
May 17 03:48:01 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
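The bracketed kernel timestamps in these lines are seconds since boot, so they show how long the router had been up when the failures began and how often the message repeats. Here is a small sketch that pulls them out of the four lines above. The helper names `kernel_seconds` and `format_uptime` are made up for this example, and the regex assumes the syslog layout shown in this post.

```python
import re

# Matches the seconds-since-boot value, e.g. 'kernel: [1382843.922889]'
# or 'kernel: [ 5872.720019]' (with padding after the bracket).
TS = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")

def kernel_seconds(lines):
    """Extract the kernel timestamp (seconds since boot) from each line."""
    return [float(m.group(1)) for m in map(TS.search, lines) if m]

def format_uptime(seconds):
    """Render whole seconds as days, hours, minutes, seconds."""
    days, rest = divmod(int(seconds), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"

log = [
    "May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430",
    "May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430",
    "May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430",
    "May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430",
]

stamps = kernel_seconds(log)
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]

print(format_uptime(stamps[0]))       # 16d 0h 7m 23s
print([round(g, 3) for g in gaps])    # [0.513, 0.577, 0.512]
```

So the router had been up for about 16 days when the first failure was logged, and the message then repeated roughly every half second.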
Any help greatly appreciated!
05-18-2023 06:19 AM
Thank you very much palito007! This seems to have solved the issue for me.
For those who want to try this on Windows 10 like I did, here are the steps:
1. Access the router's web interface to enable Telnet: go to Administration -> System -> Service and click YES for Telnet.
2. Enable Telnet on your Windows install. I searched the Start menu for Telnet and it pointed me to Windows Features; I scrolled down, found Telnet, checked it, and a few seconds later it was installed.
3. Open a Windows Command Prompt (it does NOT have to be admin) and run the command "Telnet 192.168.1.1" (or whatever your router's IP address is).
4. At the prompt, type the command "cd /jffs/asd/" then press <Enter>.
5. Type the command "rm chknvram20230516" to delete the offending file and press <Enter>.
6. Type "exit" and press <Enter>.
7. Reboot the router.
8. Fingers crossed that it fixes your problem like it seems to have fixed mine!
05-18-2023 07:48 AM
Hello,
Just to let you all know that I had the issue on an RT-AX89X (I am in France) and it seems to be solved by deleting the chknvram20230516 file.
I just followed the process described by Warchant.
Thanks to everyone who helped work out this solution.
05-19-2023 12:29 AM - edited 05-19-2023 12:31 AM
First, I'd like to thank @palito007 for pointing out that chknvram20230516 should be deleted.
Secondly, thank you @Warchant for your step-by-step guide through the process, with all the details you provided.
Now my weird issue: I did all of this and found that chknvram20230516 did not exist on my end. Keep in mind that I went through this process after resetting my router and uploading my old saved configuration. I did that yesterday after hours of trying to figure out what the issue was, and I had to do it again this morning because the issue came back.
Now, is this normal? Do I leave the router as it is while it is functional, wait for the issue to return, and then check whether chknvram20230516 has appeared and delete it?
I only found out about this post an hour ago, because after re-uploading my old saved profile I assumed it was fixed, and it stayed that way until 2~3am this morning while I was asleep.
Also, where is this beta version for the RT-AX89X that everyone is speaking about? Am I missing something?
05-19-2023 01:15 AM
You can find the info about the Beta version here https://rog-forum.asus.com/t5/gaming-network-products/rt-ax89x-going-out-of-memory-every-10-minutes-...
However, I'm 20 hours up without issues on the latest FW 47468 after following these steps: https://rog-forum.asus.com/t5/gaming-network-products/rt-ax89x-going-out-of-memory-every-10-minutes-...
I have also plugged in 2 USB HDDs (2 TB and 4 TB), and it is running steady at 75-80% RAM with CPU averaging around 5%, so all is good.
05-19-2023 01:22 AM
Thanks for pointing out the beta link!
05-18-2023 06:17 PM
Running the commands below fixed the issue, for now:
cd /jffs/asd/
rm chknvram20230516
Thanks palito007
Thanks
05-18-2023 08:27 AM
I have a ZenWifi ET8 Mesh and it has had the exact same issue since the night of 5/16, when I woke up to all my mesh routers blinking red or blue. I tried hard resets, soft resets, re-setups, manual firmware updates, etc.
SSHing into the folder and deleting the NVRAM file has fixed it so far for my ET8 AIMesh-driven network. Still monitoring, but it hasn't locked up every 10 minutes since then. It has been running for a couple of hours now.
05-18-2023 04:21 AM - edited 05-18-2023 04:43 AM
Same here! RT-AC85P
May 18 14:38:13 kernel: Start Seq = 00000000
May 18 14:38:13 kernel: infosvr invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
May 18 14:38:13 kernel: CPU: 0 PID: 375 Comm: infosvr Tainted: P O 3.10.14 #1
May 18 14:38:13 kernel: Stack : 804f7cb2 0000003f 00000000 80490000 00000000 000201da 8041d850 804f3c1c
May 18 14:38:13 kernel: 8d2c7ca8 8048fc67 00000000 000201da 8048fd60 0000bcb9 00000002 803aa334
May 18 14:38:13 kernel: 80496750 80027e08 00000000 00000000 8041f5c0 8d2d1b3c 8d2d1b3c 8041d850
May 18 14:38:13 kernel: 00020100 00000000 00000000 00000000 00000000 00000000 00000000 00000000
May 18 14:38:13 kernel: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 8d2d1ad0
May 18 14:38:13 kernel: ...
May 18 14:38:13 kernel: Call Trace:
May 18 14:38:13 kernel: [<80009070>] show_stack+0x64/0x7c
May 18 14:38:13 kernel: [<803aae9c>] dump_header.isra.12+0x80/0x1cc
May 18 14:38:13 kernel: [<80088a44>] oom_kill_process+0x3c0/0x630
May 18 14:38:13 kernel: [<800892f8>] out_of_memory+0x334/0x3ac
May 18 14:38:13 kernel: [<8008cb90>] __alloc_pages_nodemask+0x668/0x67c
May 18 14:38:13 kernel: [<80087874>] filemap_fault+0x1fc/0x550
May 18 14:38:13 kernel: [<800a5a98>] __do_fault+0x88/0x564
May 18 14:38:13 kernel: [<800a9740>] handle_pte_fault+0x124/0x9d4
May 18 14:38:13 kernel: [<800aa0f0>] handle_mm_fault+0x100/0x15c
May 18 14:38:13 kernel: [<800109b4>] do_page_fault+0x114/0x410
May 18 14:38:13 kernel: [<80003a20>] ret_from_exception+0x0/0xc
May 18 14:38:13 kernel: Mem-Info:
May 18 14:38:13 kernel: Normal per-cpu:
May 18 14:38:13 kernel: CPU 0: hi: 90, btch: 15 usd: 0
May 18 14:38:13 kernel: CPU 1: hi: 90, btch: 15 usd: 67
May 18 14:38:13 kernel: CPU 2: hi: 90, btch: 15 usd: 9
May 18 14:38:13 kernel: CPU 3: hi: 90, btch: 15 usd: 36
May 18 14:38:13 kernel: active_anon:52585 inactive_anon:52 isolated_anon:0
May 18 14:38:13 kernel: active_file:4 inactive_file:11 isolated_file:0
May 18 14:38:13 kernel: unevictable:0 dirty:0 writeback:0 unstable:0
May 18 14:38:13 kernel: free:1071 slab_reclaimable:309 slab_unreclaimable:4043
May 18 14:38:13 kernel: mapped:15 shmem:83 pagetables:172 bounce:0
May 18 14:38:13 kernel: free_cma:0
May 18 14:38:13 kernel: Normal free:4132kB min:4096kB low:5120kB high:6144kB active_anon:210340kB inactive_anon:208kB active_file:68kB inactive_file:176kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:262144kB managed:254512kB mlocked:0kB dirty:0kB writeback:0kB mapped:60kB shmem:332kB slab_reclaimable:1236kB slab_unreclaimable:16172kB kernel_stack:720kB pagetables:688kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
May 18 14:38:13 kernel: lowmem_reserve[]: 0 0
May 18 14:38:13 kernel: Normal: 231*4kB (UEM) 41*8kB (UM) 6*16kB (M) 2*32kB (UM) 0*64kB 1*128kB (R) 0*256kB 0*512kB 1*1024kB (R) 1*2048kB (R) 0*4096kB = 4612kB
May 18 14:38:13 kernel: 142 total pagecache pages
May 18 14:38:13 kernel: 0 pages in swap cache
May 18 14:38:13 kernel: Swap cache stats: add 0, delete 0, find 0/0
May 18 14:38:13 kernel: Free swap = 0kB
May 18 14:38:13 kernel: Total swap = 0kB
May 18 14:38:13 kernel: 65536 pages RAM
May 18 14:38:13 kernel: 1866 pages reserved
May 18 14:38:13 kernel: 801715 pages shared
May 18 14:38:13 kernel: 60140 pages non-shared
May 18 14:38:13 kernel: Out of memory: Kill process 262 (asd) score 758 or sacrifice child
May 18 14:38:13 kernel: Killed process 262 (asd) total-vm:204560kB, anon-rss:200592kB, file-rss:0kB
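To put the two "Killed process" lines in this thread side by side, here is a rough sketch that works out how much of each router's RAM asd was holding when it was killed. The function name `rss_share` is made up for this example, the 4 kB page size is an assumption (it is consistent with 65536 pages RAM and present:262144kB above), and the page counts come from the "pages RAM" lines in the two dumps.

```python
import re

KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<vm>\d+)kB, anon-rss:(?P<rss>\d+)kB"
)

def rss_share(kill_line, ram_pages, page_kb=4):
    """Return (process name, anon-rss as a fraction of total RAM)."""
    m = KILLED.search(kill_line)
    if m is None:
        raise ValueError("not a 'Killed process' line")
    ram_kb = ram_pages * page_kb
    return m.group("name"), int(m.group("rss")) / ram_kb

# (label, kill line from the thread, 'pages RAM' from the same dump)
reports = [
    ("first post",
     "Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB",
     224512),
    ("RT-AC85P",
     "Killed process 262 (asd) total-vm:204560kB, anon-rss:200592kB, file-rss:0kB",
     65536),
]

shares = {}
for label, line, pages in reports:
    name, share = rss_share(line, pages)
    shares[label] = share
    print(f"{label}: {name} held {share:.0%} of RAM")
# first post: asd held 69% of RAM
# RT-AC85P: asd held 77% of RAM
```

On both devices the same process had grown to well over two thirds of total memory, which matches the reports here that removing the file asd was working on makes the problem go away.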
05-18-2023 05:01 AM
Same problem, and it started at about the same time, but mine sometimes works for more than an hour and sometimes for less than 10 minutes. ASUS, fix this ASAP.