cancel
Showing results for 
Search instead for 
Did you mean: 

RT-AX89X going out of memory every 10 minutes (asd process?)

shievan
Level 8

Hello, my router froze up overnight, so I rebooted it, and ever since the logs are showing that it is going OOM about every 10 minutes. First it starts spamming:

May 17 10:44:26 kernel: [ 5872.720019] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:26 kernel: [ 5873.235497] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5873.747543] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:27 kernel: [ 5874.259510] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:28 kernel: [ 5874.771556] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430

Then it looks like a high memory killer kicks in and kills a process called asd:

May 17 10:44:30 kernel: [ 5876.819570] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.219109] mcsd invoked oom-killer: gfp_mask=0x24004c0, order=2, oom_score_adj=0
May 17 10:44:31 kernel: [ 5877.219144] CPU: 2 PID: 20987 Comm: mcsd Tainted: P                4.4.60 #1
May 17 10:44:31 kernel: [ 5877.225567] Hardware name: Generic DT based system
May 17 10:44:31 kernel: [ 5877.232709] [<8022001c>] (unwind_backtrace) from [<8021c8c4>] (show_stack+0x10/0x14)
May 17 10:44:31 kernel: [ 5877.237296] [<8021c8c4>] (show_stack) from [<803b8590>] (dump_stack+0x78/0x98)
May 17 10:44:31 kernel: [ 5877.245195] [<803b8590>] (dump_stack) from [<802a7d38>] (dump_header+0x44/0x164)
May 17 10:44:31 kernel: [ 5877.252241] [<802a7d38>] (dump_header) from [<802a825c>] (oom_kill_process+0xcc/0x448)
May 17 10:44:31 kernel: [ 5877.259775] [<802a825c>] (oom_kill_process) from [<802a8924>] (out_of_memory+0x2e4/0x354)
May 17 10:44:31 kernel: [ 5877.267520] [<802a8924>] (out_of_memory) from [<802ac4d4>] (__alloc_pages_nodemask+0x67c/0x738)
May 17 10:44:31 kernel: [ 5877.275752] [<802ac4d4>] (__alloc_pages_nodemask) from [<802ac5a0>] (__get_free_pages+0x10/0x24)
May 17 10:44:31 kernel: [ 5877.284275] [<802ac5a0>] (__get_free_pages) from [<80225d70>] (pgd_alloc+0x18/0x144)
May 17 10:44:31 kernel: [ 5877.293301] [<80225d70>] (pgd_alloc) from [<80228478>] (mm_init+0xcc/0x138)
May 17 10:44:31 kernel: [ 5877.301027] [<80228478>] (mm_init) from [<802e1854>] (do_execveat_common+0x284/0x5f4)
May 17 10:44:31 kernel: [ 5877.307697] [<802e1854>] (do_execveat_common) from [<802e1bf0>] (do_execve+0x2c/0x34)
May 17 10:44:31 kernel: [ 5877.315695] [<802e1bf0>] (do_execve) from [<80209bc0>] (ret_fast_syscall+0x0/0x34)
May 17 10:44:31 kernel: [ 5877.323544] Mem-Info:
May 17 10:44:31 kernel: [ 5877.331211] active_anon:161197 inactive_anon:1485 isolated_anon:0
May 17 10:44:31 kernel: [ 5877.331211]  active_file:90 inactive_file:127 isolated_file:7
May 17 10:44:31 kernel: [ 5877.331211]  unevictable:0 dirty:52 writeback:1 unstable:0
May 17 10:44:31 kernel: [ 5877.331211]  slab_reclaimable:875 slab_unreclaimable:31662
May 17 10:44:31 kernel: [ 5877.331211]  mapped:220 shmem:1513 pagetables:602 bounce:0
May 17 10:44:31 kernel: [ 5877.331211]  free:4778 free_pcp:4 free_cma:0
May 17 10:44:31 kernel: [ 5877.331557] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 10:44:31 kernel: [ 5877.365795] Normal free:18404kB min:3752kB low:4688kB high:5628kB active_anon:644788kB inactive_anon:5940kB active_file:648kB inactive_file:916kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:898048kB managed:881836kB mlocked:0kB dirty:208kB writeback:4kB mapped:1176kB shmem:6052kB slab_reclaimable:3500kB slab_unreclaimable:126648kB kernel_stack:1432kB pagetables:2408kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_
May 17 10:44:31 kernel: [ 5877.395533] lowmem_reserve[]: 0 0 0
May 17 10:44:31 kernel: [ 5877.421214] Normal: 4597*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18388kB
May 17 10:44:31 kernel: [ 5877.431371] 2010 total pagecache pages
May 17 10:44:31 kernel: [ 5877.432161] 0 pages in swap cache
May 17 10:44:31 kernel: [ 5877.435905] Swap cache stats: add 0, delete 0, find 0/0
May 17 10:44:31 kernel: [ 5877.439280] Free swap  = 0kB
May 17 10:44:31 kernel: [ 5877.444308] Total swap = 0kB
May 17 10:44:31 kernel: [ 5877.447454] 224512 pages RAM
May 17 10:44:31 kernel: [ 5877.450298] 0 pages HighMem/MovableOnly
May 17 10:44:31 kernel: [ 5877.453163] 4053 pages reserved
May 17 10:44:31 kernel: [ 5877.457046] Out of memory: Kill process 9710 (asd) score 684 or sacrifice child
May 17 10:44:31 kernel: [ 5877.459867] Killed process 9710 (asd) total-vm:625440kB, anon-rss:621348kB, file-rss:60kB

 Looking back at when the router locked up initially overnight, it seems like it tried to look for a firmware update automatically around 4am, (which is strange because I have auto updates turned off), then started spamming the 'NBUF alloc failed' message:

May 17 03:48:01 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:48:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:48:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:05 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7654)]fimrware update check once
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7616)]do webs_update
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7634)]retrieve firmware information
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7649)]fimrware update check first time
May 17 03:49:35 WATCHDOG: [FAUPGRADE][auto_firmware_check:(7682)]no need to upgrade firmware
May 17 03:57:55 kernel: [1382843.922889] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:55 kernel: [1382844.435933] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.012988] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430
May 17 03:57:56 kernel: [1382845.525014] wlan: [0:E:QDF] NBUF alloc failed 12107B @ dp_get_ppdu_desc:3430

 Any help greatly appreciated!

72,376 Views
235 REPLIES 235

I'm in the same position. I need a better Uptime SLA. Pretty ******ed Asus hasn't said anything. 

What did you go with? I'm changing now too


@PNW-Bob wrote:

… I need a better Uptime SLA. …


Consumer network equipment generally does not come with any SLA, just a limited warranty.

If you want a SLA with guaranteed uptime, you need to buy enterprise grade network equipment (e.g. from Cisco or HPE) and a service contract.  For people that need it, you can have things like 24x7 coverage, 4 hour response times with an engineer on site, and so on; but it's expensive.

I see the 

kill -SIGSTOP $(ps | grep '[a]sd$' | awk '{print $1}')

work around and will try that until a new firmware is formally released.  Thanks for the help on this forum, you people are lifesavers!

That's not a fix.  It only covers up the issue.  For a time....

wilson89
Level 8

Same thing happening to my RT-AX55! Losing internet every 10 minutes after reset. 

axman
Level 9

This command doesn't survive a reboot:

echo "* * * * * kill -SIGSTOP $(ps | grep '[a]sd$' | awk '{print $1}')" > /tmp/var/spool/cron/crontabs/admin

Now I just need to find a script that runs on boot and is writable...

 

For fellow Linux users with a spare server- I solved it a different way.

In Advanced Settings / Administration / Service

I enabled SSH on port 22, and added my Authorized key.

Then I remotely ssh into the server via a cron that runs every 5 minutes:

ssh admin@192.168.1.1 "echo * * * * * kill -SIGSTOP \$\(ps \| grep \'\[a\]sd\$\' \| awk \'{print \$\1\}\'\) > /tmp/var/spool/cron/crontabs/admin"

This way if the router reboots, the ssh key is still saved, and the command above can be set to run every minute from the spare linux server.

It's a hack, but it will hold me over and survives router reboots indirectly.

 

MadDogStu
Level 7

Here in the UK; i my RT-AX89X is having the same issues, it froze this morning (18/5 03:10 UK time) and i am getting those out of memory errors every 10-15minutes.   The version of firmware i am on - is 3.0.0.4.386_47468-g73fe1fe .

Has anyone been able to get hold of ASUS for this issue and any ETA on a fix?

KLRdude
Level 7

I just followed the same basic process as jbasol noted and it seems to have fixed the issue for my RT-AX89X.

  1. Login to your router using router.asus.com
  2. Click on Administration, Restore/Save/Upload Setting Tab.
  3. Click on Save Setting and it will save a .cfg file.
  4. Download FW_RT_AX89X_300438647191 from  https://dlcdnets.asus.com/pub/ASUS/wireless/RT-AX89X/FW_RT_AX89X_300438647191.zip?model=RT-AX89X
  5. Click on the FW Upgrade tab (also under Administration in the router web interface).
  6. Click on the Upload hyperlink to the right of the manual firmware update text.
  7. Select the firmware you downloaded in step 4  (make sure to unzip it first), and complete the FW update process.
  8. Login to the router web interface again using you normal username and password. 
  9. Click on Administration, Restore/Save/Upload Setting Tab.
  10. Check the box next to "Initialize all the settings, and clear all the data log for AiProtection, Traffic Analyzer, and Web History." and then click the Restore button.
  11. After the restore is complete it will present you with the initial factory setup screen when opening the router web interface.  There should be an option like restore config (or similar).  Select this option and select your previously saved .cfg file. 
  12. The router will reboot after restoring the configuration file and you can login with your normal username and password and all settings should be restored.

Many thanks;

Took a backup of my settings.  I applied firmware 47191, then powered off and on my router with the WPS button pressed it to reset the NVRAM (settings).  then proceeded as per above to restore my settings.  for 15+ minutes now no CPU spikes, memory staying at approx. 49%, and no out of memory errors in my log file.     Not sure if its applicable - but also noticed a signature update - Signature version
2.352 Updated : 2023/05/18 07:29.  (times are UK time).