RT-AX89X going out of memory every 10 minutes (asd process?)

shievan
Level 8

Hello, my router froze up overnight, so I rebooted it, and ever since the logs are showing that it is going OOM about every 10 minutes. First it starts spamming:

May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
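
To put a number on how fast these are being logged, the kernel uptime stamps in square brackets can be diffed. Below is a minimal Python sketch meant to be run on a PC against log text copied off the router; the function name `nbuf_intervals` is only illustrative and is not part of the firmware.

```python
import re

# Captures the kernel uptime stamp, e.g. "[ 5872.720019]", on NBUF failure lines only.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s+wlan:.*NBUF alloc failed")

def nbuf_intervals(log_text):
    """Return the gaps (in seconds) between consecutive NBUF alloc failures."""
    matches = (STAMP.search(line) for line in log_text.splitlines())
    stamps = [float(m.group(1)) for m in matches if m]
    return [round(later - earlier, 6) for earlier, later in zip(stamps, stamps[1:])]

# The five lines quoted above, pasted verbatim.
LOG = """\
May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
"""

gaps = nbuf_intervals(LOG)
print(gaps)
```

On these five lines the gaps all come out just over half a second, so the driver is failing this allocation roughly twice per second.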

Then it looks like the kernel's OOM killer kicks in and kills a process called asd:

May 17 10:44:30 kernel: [ 5876.819570] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.219144] CPU: 2 PID: 20987 Comm: mcsd Tainted: P                4.4.60 #1
May 17 10:44:31 kernel: [ 5877.225567] Hardware name: Generic DT based system
May 17 10:44:31 kernel: [ 5877.232709] [<8022001c>] (unwind_backtrace) from [<8021c8c4>] (show_stack+0x10/0x14)
May 17 10:44:31 kernel: [ 5877.237296] [<8021c8c4>] (show_stack) from [<803b8590>] (dump_stack+0x78/0x98)
May 17 10:44:31 kernel: [ 5877.245195] [<803b8590>] (dump_stack) from [<802a7d38>] (dump_header+0x44/0x164)
May 17 10:44:31 kernel: [ 5877.252241] [<802a7d38>] (dump_header) from [<802a825c>] (oom_kill_process+0xcc/0x448)
May 17 10:44:31 kernel: [ 5877.259775] [<802a825c>] (oom_kill_process) from [<802a8924>] (out_of_memory+0x2e4/0x354)
May 17 10:44:31 kernel: [ 5877.267520] [<802a8924>] (out_of_memory) from [<802ac4d4>] (__alloc_pages_nodemask+0x67c/0x738)
May 17 10:44:31 kernel: [ 5877.275752] [<802ac4d4>] (__alloc_pages_nodemask) from [<802ac5a0>] (__get_free_pages+0x10/0x24)
May 17 10:44:31 kernel: [ 5877.284275] [<802ac5a0>] (__get_free_pages) from [<80225d70>] (pgd_alloc+0x18/0x144)
May 17 10:44:31 kernel: [ 5877.293301] [<80225d70>] (pgd_alloc) from [<80228478>] (mm_init+0xcc/0x138)
May 17 10:44:31 kernel: [ 5877.301027] [<80228478>] (mm_init) from [<802e1854>] (do_execveat_common+0x284/0x5f4)
May 17 10:44:31 kernel: [ 5877.307697] [<802e1854>] (do_execveat_common) from [<802e1bf0>] (do_execve+0x2c/0x34)
May 17 10:44:31 kernel: [ 5877.315695] [<802e1bf0>] (do_execve) from [<80209bc0>] (ret_fast_syscall+0x0/0x34)
May 17 10:44:31 kernel: [ 5877.323544] Mem-Info:
May 17 10:44:31 kernel: [ 5877.331211] active_anon:161197 inactive_anon:1485 isolated_anon:0
May 17 10:44:31 kernel: [ 5877.331211]  active_file:90 inactive_file:127 isolated_file:7
May 17 10:44:31 kernel: [ 5877.331211]  unevictable:0 dirty:52 writeback:1 unstable:0
May 17 10:44:31 kernel: [ 5877.331211]  slab_reclaimable:875 slab_unreclaimable:31662
May 17 10:44:31 kernel: [ 5877.331211]  mapped:220 shmem:1513 pagetables:602 bounce:0
May 17 10:44:31 kernel: [ 5877.331211]  free:4778 free_pcp:4 free_cma:0
May 17 10:44:31 kernel: [ 5877.331557] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.365795] Normal free:18404kB min:3752kB low:4688kB high:5628kB active_anon:644788kB inactive_anon:5940kB active_file:648kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:898048kB managed:881836kB mlocked:0kB dirty:208kB writeback:4kB mapped:1176kB shmem:6052kB slab_reclaimable:3500kB slab_unreclaimable:126648kB kernel_stack:1432kB pagetables:2408kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_
May 17 10:44:31 kernel: [ 5877.395533] lowmem_reserve[]: 0 0 0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.431371] 2010 total pagecache pages
May 17 10:44:31 kernel: [ 5877.432161] 0 pages in swap cache
May 17 10:44:31 kernel: [ 5877.435905] Swap cache stats: add 0, delete 0, find 0/0
May 17 10:44:31 kernel: [ 5877.439280] Free swap  = 0kB
May 17 10:44:31 kernel: [ 5877.444308] Total swap = 0kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.450298] 0 pages HighMem/MovableOnly
May 17 10:44:31 kernel: [ 5877.453163] 4053 pages reserved
May 17 10:44:31 kernel: [ 5877.457046] Out of memory: Kill process 9710 (asd) score 684 or sacrifice child
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB
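
For anyone reading along, the useful numbers in that dump can be pulled out with a few regexes. This is a rough Python sketch to run on a PC, not on the router. It assumes 4 kB pages (which matches 224512 pages RAM against present:898048kB), and the name `summarize_oom` is just something made up for illustration.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages (224512 pages RAM * 4 kB = 898048 kB "present")

def summarize_oom(log_text):
    """Extract the victim, its share of RAM, and the free-list picture from an OOM report."""
    pid, name, anon_rss_kb = re.search(
        r"Killed process (\d+) \((\S+?)\) total-vm:\d+kB, anon-rss:(\d+)kB", log_text
    ).groups()
    ram_kb = int(re.search(r"(\d+) pages RAM", log_text).group(1)) * PAGE_KB
    order = int(re.search(r"oom-killer:.*?order=(\d+)", log_text).group(1))
    buddy = re.search(r"Normal: (.*?) = \d+kB", log_text).group(1)

    # An order-N request needs 2**N physically contiguous pages.
    request_kb = PAGE_KB << order
    # Count free blocks that are large enough to satisfy that request.
    big_enough = sum(
        int(count)
        for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", buddy)
        if int(size_kb) >= request_kb
    )
    return {
        "victim": name,
        "pid": int(pid),
        "rss_share_of_ram": int(anon_rss_kb) / ram_kb,
        "request_kb": request_kb,
        "free_blocks_big_enough": big_enough,
    }

# Four lines copied from the dump above.
LOG = """\
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB
"""

summary = summarize_oom(LOG)
print(summary)
```

Read this way, asd was holding about 69% of physical RAM when it was killed. The order=2 request needed a contiguous 16 kB block, and the free list had only single 4 kB pages left, which as far as I can tell is why the allocation failed even with roughly 18 MB nominally free.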

Looking back at when the router first locked up overnight, it seems it tried to check for a firmware update automatically around 4am (which is strange because I have auto updates turned off), then started spamming the 'NBUF alloc failed' message:

May 17 03:48:01 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430

Any help greatly appreciated!

2,568 Views
235 REPLIES

bvegas
Level 7

Mine was still OOM’ing after downgrading to older firmware, multiple resets, and disabling wireless.

Reddit had a link to an SNB Forums post (https://www.snbforums.com/threads/asus-rt-ax89x_9-0-0-4_388_31185-beta.82276/page-6) with a link to a newer beta firmware. The info is here if anyone is interested. It’s beta, so YMMV.

https://tinyurl.com/2p83cmem
This version is: RT-AX89U_9.0.0.4_388_32094-ge476ac0_for_user.trx

published on 10/05/23 on the ASUS storage server.

The ASUS site had the 04/2022 47191 firmware listed as RT-AX89U_XXXXX, so I gave it a try.
I get a log entry with “no need to upgrade firmware” now after the “retrieve firmware information” log entry. Uptime is 15 minutes without an OOM.

I signed up just to say thanks for posting a link to that beta FW. It fixed my RT-AX89X. 👍

LazyTitanNZ
Level 8

My RT-AX89X has settled down. The log is no longer filling with messages, and it's been up for over 30 minutes.

islandsailor
Level 7

I'm having the same issue. Tried downgrading to firmware from fall '22, no luck. Tried a hard boot, no luck. Tried deleting log files, no luck.

If you log in over SSH (port 22) you can see the asd log file, which is in the jffs folder. Enable SSH under Administration --> System, in the Service section.

The only line in there is: 

1684360427[chknvram_action] Invalid string
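
The leading digits on that line look like a Unix timestamp in seconds (that is a guess on my part, nothing documents it). Assuming so, here is a small Python sketch to decode it on a PC; `split_asd_line` is a made-up helper name.

```python
import re
from datetime import datetime, timezone

def split_asd_line(line):
    """Split a leading run of digits (assumed to be Unix epoch seconds) from the message."""
    m = re.match(r"(\d+)(.*)", line)
    if m is None:
        return None, line
    when = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return when, m.group(2)

when, message = split_asd_line("1684360427[chknvram_action] Invalid string")
print(when.isoformat(), message)
```

If the assumption holds, the entry decodes to 2023-05-17 21:53:47 UTC, which puts it on the same day the OOM loop started.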

If you run top you can watch the asd process climb from 1% of memory to running out of memory within 10 minutes. Mine is at 60.6% of memory and climbing.
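
If you would rather log a number than watch top, Linux also exposes a process's resident memory as the `VmRSS` line in `/proc/<pid>/status`. I would not expect Python to be available on the router itself, so the sketch below assumes you copy that text off over SSH and parse it on a PC. The sample values and helper names are hypothetical, not taken from my router.

```python
import re

def vm_rss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None

def percent_of_ram(rss_kb, mem_total_kb):
    """Express resident memory as a percentage of total RAM."""
    return round(100.0 * rss_kb / mem_total_kb, 1)

# Hypothetical sample, shaped like /proc/<pid>/status output.
SAMPLE_STATUS = "Name:\tasd\nState:\tS (sleeping)\nVmRSS:\t  440000 kB\n"
SAMPLE_MEM_TOTAL_KB = 880000  # hypothetical MemTotal from /proc/meminfo

rss = vm_rss_kb(SAMPLE_STATUS)
print(rss, percent_of_ram(rss, SAMPLE_MEM_TOTAL_KB))
```

Taking a reading every minute or so and comparing consecutive values would show the growth rate directly.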

Tried killing the asd process but no luck with that either so far.

 

iSarpedon
Level 7

I, like many others, have been struggling with this all morning. What worked for me was completely disabling IPv6. My memory utilization immediately dropped by 40%, and the router works fine now.

islandsailor
Level 7

If you look in the asd directory within the jffs directory, you will see a file for chknvram. Mine is named chknvram20230516, which makes me think it was updated on its own overnight?

billforbesiii
Level 7

I'm also struggling. Out of an abundance of caution I disabled IPv6 and AiProtection, and I'm now trying to decide between the beta and downgrading the firmware.

(I have two AX89X's connected with ethernet backhaul)

No one is reporting issues with the beta, and given that the last update was in October, I'd wager this is a close-to-release version. But that is all speculation. The only thing I can say for sure is that the memory leak and CPU spiking are no longer happening and everything is working as expected.

I've upgraded both my AX89X's to the beta (9.0.0.4.388_32094-ge476ac0), leaving AiProtection and IPv6 off for now, but so far so good (~30 minutes without issue).

Oddly, after the upgrade my logs no longer show the May 17 series of "NBUF alloc failed" entries (but the May 4 entries are still there). So the beta also makes it look like the problem never existed? 😂

Also, is the "port status" new to the beta? I like the "replace the ethernet cable in LAN3 to Cat5e or better" notice!

Interesting, I still see the logs after the update. As for the port status, yes, that is new, and it is also trying to shame me for my 10 Mbps device, like it is embarrassed to have it plugged in.