05-17-2023 10:53 AM
Hello, my router froze up overnight, so I rebooted it, and ever since the logs are showing that it is going OOM about every 10 minutes. First it starts spamming:
May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
Then it looks like the kernel's OOM killer kicks in and kills a process called asd:
May 17 10:44:30 kernel: [ 5876.819570] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.219144] CPU: 2 PID: 20987 Comm: mcsd Tainted: P 4.4.60 #1
May 17 10:44:31 kernel: [ 5877.225567] Hardware name: Generic DT based system
May 17 10:44:31 kernel: [ 5877.232709] [<8022001c>] (unwind_backtrace) from [<8021c8c4>] (show_stack+0x10/0x14)
May 17 10:44:31 kernel: [ 5877.237296] [<8021c8c4>] (show_stack) from [<803b8590>] (dump_stack+0x78/0x98)
May 17 10:44:31 kernel: [ 5877.245195] [<803b8590>] (dump_stack) from [<802a7d38>] (dump_header+0x44/0x164)
May 17 10:44:31 kernel: [ 5877.252241] [<802a7d38>] (dump_header) from [<802a825c>] (oom_kill_process+0xcc/0x448)
May 17 10:44:31 kernel: [ 5877.259775] [<802a825c>] (oom_kill_process) from [<802a8924>] (out_of_memory+0x2e4/0x354)
May 17 10:44:31 kernel: [ 5877.267520] [<802a8924>] (out_of_memory) from [<802ac4d4>] (__alloc_pages_nodemask+0x67c/0x738)
May 17 10:44:31 kernel: [ 5877.275752] [<802ac4d4>] (__alloc_pages_nodemask) from [<802ac5a0>] (__get_free_pages+0x10/0x24)
May 17 10:44:31 kernel: [ 5877.284275] [<802ac5a0>] (__get_free_pages) from [<80225d70>] (pgd_alloc+0x18/0x144)
May 17 10:44:31 kernel: [ 5877.293301] [<80225d70>] (pgd_alloc) from [<80228478>] (mm_init+0xcc/0x138)
May 17 10:44:31 kernel: [ 5877.301027] [<80228478>] (mm_init) from [<802e1854>] (do_execveat_common+0x284/0x5f4)
May 17 10:44:31 kernel: [ 5877.307697] [<802e1854>] (do_execveat_common) from [<802e1bf0>] (do_execve+0x2c/0x34)
May 17 10:44:31 kernel: [ 5877.315695] [<802e1bf0>] (do_execve) from [<80209bc0>] (ret_fast_syscall+0x0/0x34)
May 17 10:44:31 kernel: [ 5877.323544] Mem-Info:
May 17 10:44:31 kernel: [ 5877.331211] active_anon:161197 inactive_anon:1485 isolated_anon:0
May 17 10:44:31 kernel: [ 5877.331211] active_file:90 inactive_file:127 isolated_file:7
May 17 10:44:31 kernel: [ 5877.331211] unevictable:0 dirty:52 writeback:1 unstable:0
May 17 10:44:31 kernel: [ 5877.331211] slab_reclaimable:875 slab_unreclaimable:31662
May 17 10:44:31 kernel: [ 5877.331211] mapped:220 shmem:1513 pagetables:602 bounce:0
May 17 10:44:31 kernel: [ 5877.331211] free:4778 free_pcp:4 free_cma:0
May 17 10:44:31 kernel: [ 5877.331557] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.365795] Normal free:18404kB min:3752kB low:4688kB high:5628kB active_anon:644788kB inactive_anon:5940kB active_file:648kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:898048kB managed:881836kB mlocked:0kB dirty:208kB writeback:4kB mapped:1176kB shmem:6052kB slab_reclaimable:3500kB slab_unreclaimable:126648kB kernel_stack:1432kB pagetables:2408kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_
May 17 10:44:31 kernel: [ 5877.395533] lowmem_reserve[]: 0 0 0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.431371] 2010 total pagecache pages
May 17 10:44:31 kernel: [ 5877.432161] 0 pages in swap cache
May 17 10:44:31 kernel: [ 5877.435905] Swap cache stats: add 0, delete 0, find 0/0
May 17 10:44:31 kernel: [ 5877.439280] Free swap = 0kB
May 17 10:44:31 kernel: [ 5877.444308] Total swap = 0kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.450298] 0 pages HighMem/MovableOnly
May 17 10:44:31 kernel: [ 5877.453163] 4053 pages reserved
May 17 10:44:31 kernel: [ 5877.457046] Out of memory: Kill process 9710 (asd) score 684 or sacrifice child
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB
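As a sanity check on the dump above: the page counts in the Mem-Info block line up with the kB figures on the "Normal" line if a page is 4 kB. That is an assumption, but it is consistent with the `4597*4kB` entry. A minimal sketch in shell; `pages_to_kb` and `percent_of` are made-up helper names, not anything from the router firmware:

```shell
#!/bin/sh
# Assumption: 4 kB pages, consistent with the "4597*4kB" line in the dump.
PAGE_KB=4

# Convert a page count to kB.
pages_to_kb() {
    echo $(( $1 * PAGE_KB ))
}

# Integer percentage of part relative to whole (rounded down).
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

pages_to_kb 161197          # active_anon pages -> 644788
pages_to_kb 224512          # "pages RAM"       -> 898048
percent_of 621348 881836    # asd anon-rss vs managed memory -> 70
```

In other words, asd alone was holding roughly 70% of the managed memory at the moment it was killed.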
Looking back at when the router first locked up overnight, it seems it checked for a firmware update automatically around 4 AM (which is strange, because I have auto updates turned off), then started spamming the 'NBUF alloc failed' message:
May 17 03:48:01 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
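To see when the failures started and how many there are, something like the following could be run over the saved log. This is only a sketch: the function names are made up, and the default path `/tmp/syslog.log` is an assumption, so pass the real path as the first argument if it differs.

```shell
#!/bin/sh
# Count "NBUF alloc failed" lines in a syslog file.
count_nbuf_failures() {
    grep -c 'NBUF alloc failed' "$1"
}

# Print the timestamp (month, day, time) of the first such line.
first_nbuf_failure() {
    grep -m 1 'NBUF alloc failed' "$1" | awk '{print $1, $2, $3}'
}

# Assumption: the log lives at /tmp/syslog.log unless a path is given.
LOG="${1:-/tmp/syslog.log}"
if [ -r "$LOG" ]; then
    echo "failures: $(count_nbuf_failures "$LOG")"
    echo "first at: $(first_nbuf_failure "$LOG")"
fi
```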
Any help greatly appreciated!
05-18-2023 05:57 AM
Very happy to hear I’m not the only one with a bricked RT-AX89X $400 paperweight! Thanks to those who posted a step-by-step guide on how to roll back the FW version as a temp fix until ASUS issues a new FW that solves it. I will say, ASUS support was useless. They wanted me to send it in as an RMA instead of telling me that this issue is known and that they are working on a FW fix.
05-18-2023 06:19 AM
I just created an account to come here and say big thanks to everyone who posted and shared!!
I thought my router was broken. I was freaking out yesterday in the middle of a project deployment. It cost me half a day in which I got nothing done while under deployment pressure.
Luckily I still keep one of my old routers at home, so I switched to that to survive the project.
Earlier this morning I searched the internet for any information about the problem I was experiencing. Very luckily, I found a lot of posts here describing the same issue.
I finally picked the proposal from palito007, i.e. deleting the chknvram20230516 file and rebooting, and so far, I'm not getting the issue anymore.
Again, thanks to everyone for saving my router; it has made my day.
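For anyone wanting to try the same fix, the file first has to be located. The post above does not give its full path, so the sketch below only searches for it and deletes nothing. It assumes the file sits somewhere under /jffs, and `find_flag_file` is a made-up helper name.

```shell
#!/bin/sh
# List files with the given name under a directory, without deleting them.
# Assumption: the file is somewhere under /jffs; the exact path is not given here.
find_flag_file() {
    find "$1" -type f -name "$2" 2>/dev/null
}

find_flag_file /jffs chknvram20230516
# Review the printed path, then remove the file by hand with rm and reboot.
```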
05-18-2023 06:19 AM
I had the same exact problem with my RT-AX89X after I updated the firmware today to version 3.0.0.4.386.47468.
After the update, the CPU load was at least double the normal level and the RAM filled up slowly until the router crashed (every 30 min or so).
The solution was: back up my settings, manually downgrade to the previous firmware version 3.0.0.4.386.47191 (it still didn't work after this), factory reset, and restore the old settings. Now the CPU load is normal and RAM use is steady at 53%.
I've followed more or less what KLRdude and jbasol did:
This bad firmware was released in October last year AND ASUS STILL HAVEN'T FIXED IT. COME ON GUYS, WAKE UP!
05-18-2023 06:52 AM
Me too; I upgraded to today's firmware and had the same problem again.
I had to downgrade again to FW_RT_AX89X_3.0.0.4.386.47191
05-18-2023 06:41 AM - edited 05-18-2023 09:50 PM
By the way guys, I didn't downgrade and it's working well for me (except RAM usage still being 78%).
I am using RT-AX89X on FW 3.0.0.4.386_47468-g73fe1fe
1. I ran the command mentioned earlier in this topic:
kill -SIGSTOP $(ps | grep '[a]sd$' | awk '{print $1}')
which fixed the erratic CPU usage, but the RAM kept spiking and the router kept crashing.
2. I saved my configuration
I erased the NVRAM in telnet and rebooted - this consistently lowered the RAM usage to 60 %
I reimported my configuration
3. I plugged in my USB drives and that put the router under stress again; the CPU stabilized after a minute or so, but RAM is now steady around 78% (in any case, I always had high RAM usage in the past)
4. Monitoring the space used in /jffs, it seems fine
# df -h /jffs
Filesystem Size Used Available Use% Mounted on
/dev/ubi0_5 125.4M 2.0M 123.5M 2% /jffs
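A note on the command in step 1: `grep '[a]sd$'` matches any ps line that ends in "asd", so it could in principle also catch another process whose name ends the same way. Below is a slightly stricter sketch that matches the last column exactly. It assumes BusyBox-style ps output with the PID in the first column, and `asd_pids` and `pause_asd` are made-up names.

```shell
#!/bin/sh
# Read ps output on stdin and print the PID(s) of processes whose
# last column is exactly "asd".
# Assumption: the PID is in the first column, as in BusyBox ps.
asd_pids() {
    awk '$NF == "asd" { print $1 }'
}

# Pause asd if it is running. SIGSTOP only suspends the process;
# it can be resumed later with: kill -CONT <pid>
pause_asd() {
    pids=$(ps | asd_pids)
    if [ -z "$pids" ]; then
        echo "asd is not running" >&2
        return 1
    fi
    kill -STOP $pids
}

# To use it, call: pause_asd
```

Since the process is only suspended, it will be running again after a reboot, so this is a stopgap rather than a fix.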
I'm disabling the daily rebooting for the time being, waiting to see if I can manage without downgrading or else if an update is issued.
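To keep an eye on RAM from the shell rather than the web GUI, a rough sketch like this could be used. It assumes "used" means MemTotal minus MemAvailable as reported in /proc/meminfo; the GUI may well calculate its percentage differently, so the numbers need not match exactly. `mem_used_percent` is a made-up name.

```shell
#!/bin/sh
# Print used memory as an integer percentage from a meminfo-style file.
# Assumption: "used" = MemTotal - MemAvailable; the router GUI may count differently.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%d\n", (total - avail) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}

# Only run against the live system if /proc/meminfo is readable.
[ -r /proc/meminfo ] && mem_used_percent
```

Running it every few minutes and noting the values should show whether usage is flat or still creeping up.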
A couple of days ago an update was released for the RT-AX86U, which I use as a mesh node, and honestly I should apologize to it, because this morning when I saw the mess I immediately accused it of being the culprit 😄
Thanks again to all the knowledgeable guys that shared their knowledge in an easy to understand way and helped us too to get through with some fixes.
05-18-2023 09:55 PM
Hi guys,
Have been stable almost 17 hours now
I've always had quite high RAM usage; I was even surprised to read that someone has it at 50%
Cheers
05-18-2023 07:09 AM
Same issue, same symptoms. Started May 17th.
If ASUS is reading this please know it's impacting my business.
05-18-2023 07:33 AM
I forgot to mention in my earlier post that my routers are RT-AC88U models and they are affected by this. This is obviously a widespread issue impacting a large number of their models.
05-18-2023 07:50 AM
In same boat as you with RT-AC88U
05-18-2023 07:48 AM
Hello All
I'm sorry that you have encountered such a problem, please refer to the following steps to try to troubleshoot.
How to erase all data and recover previous setups?
If you are considering resetting your ASUS router but would still like to keep your previous settings afterwards, please follow the steps below.
Please login to ASUS Web GUI to complete all the setups.
[Wireless Router] How to enter the router setting page(Web GUI) (ASUSWRT) ?
Step 1: Export Config File
1-1. Go to: Administration > Restore/Save/Upload Setting.
1-2. Click “Save setting” to download your current config file.
*The config file should be like this.
Step 2: Restore To Factory Settings
2-1. Go to: Administration > Restore/Save/Upload Setting.
2-2. Check the box of “Initialize all the settings, and clear all the data log for AiProtection, Traffic Analyzer, and Web History”.
Then press “Restore” to restore your router to its factory default settings.
Step 3: Import the config file
3-1. Click “Advanced Settings” on the welcome page.
3-2. Upload the config file you just downloaded.
Notice: The router will automatically reconnect after the config file is uploaded. Log in to the ASUS Web GUI again if you need to change any further settings.
Note: You could also press “Create A New Network” and follow the QIS process to create a new network.