checkma_a
Explorer

R80.40 4200 appliance cpu 100% issue

Hello all.

My customer is running R80.40 on a 4200 appliance with Jumbo Hotfix Take 94.


The CPU has been behaving abnormally since last week.

Traffic and memory usage have not changed compared with before the anomaly; they are almost identical.

As you know, the 4200 appliance has two CPU cores, and the two alternate in showing 100% CPU usage. At times both are at 100% simultaneously.

The CPU is not constantly at 100%; it spikes to 100% and then drops again.

I used the 'top' command to check which processes were consuming the CPU.

slab_mcd
watchdog/1
kworker/1:0
ksoftirqd/1

The processes above are the ones consuming excessive CPU.

As with the overall CPU usage, they do not hold the CPU continuously: they sit at 100% utilization for a while, drop, and then the pattern repeats.
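
To put numbers on how each core's time splits between user space, kernel (system) time, and softirq handling during a spike, one option is to diff two snapshots of /proc/stat. Below is a minimal sketch in Python, assuming the standard Linux /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, ...). The function names are my own and are not part of any Check Point tool.

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_proc_stat(text):
    """Return {"cpu0": [counters...], ...} for the per-core lines only."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the aggregate "cpu" line and non-CPU lines such as "intr".
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            cores[parts[0]] = [int(v) for v in parts[1:]]
    return cores


def cpu_split(before, after):
    """Percentage of elapsed time per category, per core, between snapshots."""
    result = {}
    for cpu, new in after.items():
        delta = [n - o for n, o in zip(new, before[cpu])]
        total = sum(delta)
        if total <= 0:
            continue  # no ticks elapsed for this core
        result[cpu] = {
            name: round(100.0 * d / total, 1) for name, d in zip(FIELDS, delta)
        }
    return result


def sample_live(interval=1.0):
    """Take two live snapshots `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    return cpu_split(first, second)
```

Calling sample_live() while a core is pegged would show whether the time lands mostly in "system" and "softirq" (kernel-side work) or in "user". The two parsing functions also work on saved copies of /proc/stat, so snapshots can be taken on the gateway and analyzed elsewhere.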

The following messages appear in /var/log/messages:

 

Jun 15 10:42:57 2021 BD_SEC_FW kernel: <IRQ> [<ffffffff810ca5f2>] sched_show_task+0xc2/0x130
Jun 15 10:42:57 2021 BD_SEC_FW kernel: [<ffffffff810ce0d9>] dump_cpu_task+0x39/0x70
Jun 15 10:42:57 2021 BD_SEC_FW kernel: [<ffffffff81136c61>] rcu_dump_cpu_stacks+0x91/0xd0
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff8113b727>] rcu_check_callbacks+0x477/0x780
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff810f2750>] ? tick_sched_do_timer+0x40/0x40
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff8109d456>] update_process_times+0x46/0x80
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff810f20b0>] tick_sched_handle+0x30/0x70
Jun 15 10:43:19 2021 BD_SEC_FW monitord[12172]: Error: Timeout waiting for response from database server.
Jun 15 10:43:19 2021 BD_SEC_FW snmpd: Error: Timeout waiting for response from database server.
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff810f2789>] tick_sched_timer+0x39/0x90
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff810bb0f1>] __hrtimer_run_queues+0xf1/0x270
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff810bb5cf>] hrtimer_interrupt+0xaf/0x1d0
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff81059838>] local_apic_timer_interrupt+0x38/0x60
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff817a3bcd>] smp_apic_timer_interrupt+0x3d/0x50
Jun 15 10:43:19 2021 BD_SEC_FW kernel: [<ffffffff817a0302>] apic_timer_interrupt+0x162/0x170
Jun 15 10:43:19 2021 BD_SEC_FW kernel: <EOI> [<ffffffff91ce23d0>] ? up_unified_log_send_network_chain_log+0x480/0x1640 [fw_0]
Jun 15 10:45:09 2021 BD_SEC_FW kernel: [<ffffffff91d07761>] ? up_rulebase_set_hits+0x481/0x6c0 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91d08c9e>] ? up_rulebase_execute_match+0x12fe/0x15a0 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91cec53e>] ? up_manager_create_net_log+0x7e/0x2c0 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91d0d239>] ? up_manager_first_packet_rulebase_exe+0x339/0x890 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91d260b2>] ? up_manager_fw_handle_first_packet+0x2f2/0x1a40 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff921aaf9e>] ? vpnk_om_antispoofing_applies+0x4e/0x1b0 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff9239eeb7>] ? fw_handle_first_packet.lto_priv.2535+0xa37/0x1c70 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91b4b069>] ? cphwd_tmpl_conn_created+0x69/0x1660 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff817942be>] ? _raw_spin_unlock_bh+0x1e/0x20
Jun 15 10:45:10 2021 BD_SEC_FW monitord[12172]: Error: Timeout waiting for response from database server.
Jun 15 10:45:10 2021 BD_SEC_FW snmpd: Error: Timeout waiting for response from database server.
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff92364caa>] ? fwchain_set_headers+0x2a/0x310 [fw_0]
Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91b9c8d2>] ? fwconn_stats_calculate_conn_load_chain+0x112/0x8d0 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff92c4f0bc>] ? fw_try_to_match_template.constprop.1443+0x3c/0x520 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff923a60e4>] ? fw_filter_chain+0x1244/0x3180 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff8166a8da>] ? kfree_skb+0x3a/0x90
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff929adf1d>] ? asm_stateless_verifier+0x33d/0x23a0 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff9296d2cc>] ? fw_conn_prof_context_exit_chain+0xec/0x580 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff923a99e8>] ? fwchain_do_ex+0x258/0x1a50 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff817942be>] ? _raw_spin_unlock_bh+0x1e/0x20
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff923ab4bf>] ? fw_filter_ip_ex+0x2df/0x11b0 [fw_0]
Jun 15 10:45:11 2021 BD_SEC_FW kernel: [<ffffffff816c5520>] ? inet_del_offload+0x40/0x40
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff923ac449>] ? fw_filter_locked+0xb9/0xa80 [fw_0]
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff92b833f4>] ? fwmultik_process_entry+0x7d4/0x17d0 [fw_0]
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff810e9a2a>] ? __getnstimeofday64+0x3a/0xd0
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff817942be>] ? _raw_spin_unlock_bh+0x1e/0x20
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff92b27582>] ? fw_kfree+0x3b2/0xa00 [fw_0]
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff817942be>] ? _raw_spin_unlock_bh+0x1e/0x20
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff92b27582>] ? fw_kfree+0x3b2/0xa00 [fw_0]
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff92b5daa6>] ? fwmultik_prio_queue_flush_one_to_entry+0xa6/0x3e0 [fw_0]
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff8102e8f6>] ? do_softirq+0x46/0x90
Jun 15 10:45:35 2021 BD_SEC_FW kernel: [<ffffffff92b84670>] ? fwmultik_process_entry_unlocked+0x280/0x280 [fw_0]
Jun 15 10:45:36 2021 BD_SEC_FW kernel: [<ffffffff92b8470d>] ? fwmultik_queue_async_dequeue_cb+0x9d/0x2c0 [fw_0]
Jun 15 10:45:36 2021 BD_SEC_FW xpand[12168]: show_asset CDK: asset_get_proc started.
Jun 15 10:45:36 2021 BD_SEC_FW kernel: [<ffffffff92b328ce>] ? kernel_thread_run+0x3ae/0xf90 [fw_0]
Jun 15 10:45:36 2021 BD_SEC_FW kernel: [<ffffffff810d309e>] ? dequeue_task_fair+0x3de/0x6c0
Jun 15 10:45:36 2021 BD_SEC_FW kernel: [<ffffffff810b7eb0>] ? wake_up_atomic_t+0x30/0x30
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff92af2570>] ? fw_kfree_global+0x10/0x10 [fw_0]
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff92af631e>] ? kiss_kthread_run+0x1e/0x50 [fw_0]
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff92af258b>] ? plat_run_thread+0x1b/0x30 [fw_0]
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff810b6f42>] ? kthread+0xe2/0xf0
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff810b6e60>] ? insert_kthread_work+0x40/0x40
Jun 15 10:47:43 2021 BD_SEC_FW kernel: [<ffffffff8179f15d>] ? ret_from_fork_nospec_begin+0x7/0x21
Jun 15 10:47:43 2021 BD_SEC_FW snmpd: Error: Timeout waiting for response from database server.
Jun 15 10:47:44 2021 BD_SEC_FW kernel: [<ffffffff810b6e60>] ? insert_kthread_work+0x40/0x40
Jun 15 10:47:44 2021 BD_SEC_FW kernel: INFO: rcu_sched self-detected stall on CPU
Jun 15 10:47:44 2021 BD_SEC_FW kernel: 1: (299999 ticks this GP) idle=b49/140000000000001/0 softirq=365278254/365278254
Jun 15 10:47:44 2021 BD_SEC_FW kernel: (t=300000 jiffies g=209973055 c=209973054 q=0)
Jun 15 10:47:44 2021 BD_SEC_FW kernel: Task dump for CPU 1:
Jun 15 10:47:44 2021 BD_SEC_FW kernel: fw_worker_0 R running task 10296 9762 2 0x00000008
Jun 15 10:47:44 2021 BD_SEC_FW kernel: Call Trace:
Jun 15 10:47:44 2021 BD_SEC_FW monitord[12172]: Error: Timeout waiting for response from database server.
Jun 15 10:47:44 2021 BD_SEC_FW kernel: <IRQ> [<ffffffff810ca5f2>] sched_show_task+0xc2/0x130
Jun 15 10:47:44 2021 BD_SEC_FW kernel: [<ffffffff810ce0d9>] dump_cpu_task+0x39/0x70
Jun 15 10:47:44 2021 BD_SEC_FW kernel: [<ffffffff81136c61>] rcu_dump_cpu_stacks+0x91/0xd0
Jun 15 10:47:45 2021 BD_SEC_FW kernel: [<ffffffff8113b727>] rcu_check_callbacks+0x477/0x780
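
For anyone who wants to summarize a trace like the one above instead of reading it line by line, here is a rough Python sketch. It counts the RCU stall reports, the stalled task names, the "Timeout waiting for response" errors per daemon, and how many stack frames belong to each kernel module. The regular expressions are assumptions based only on the line formats visible in this log, and the function name summarize is my own.

```python
import re
from collections import Counter

STALL_RE = re.compile(r"rcu_sched self-detected stall on CPU")
JIFFIES_RE = re.compile(r"\(t=(\d+) jiffies")
TASK_RE = re.compile(r"kernel: (\S+)\s+R\s+running task")
TIMEOUT_RE = re.compile(r"(\w+)(?:\[\d+\])?: Error: Timeout waiting for response")
# Matches "] ? symbol+0x12/0x34 [module]"; the "?" and "[module]" are optional.
FRAME_RE = re.compile(
    r"\]\s+(?:\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def summarize(lines):
    """Tally stall reports, stalled tasks, daemon timeouts and frame owners."""
    summary = {
        "stalls": 0,
        "stall_jiffies": [],
        "tasks": Counter(),
        "timeouts": Counter(),
        "frames_by_module": Counter(),
    }
    for line in lines:
        if STALL_RE.search(line):
            summary["stalls"] += 1
        m = JIFFIES_RE.search(line)
        if m:
            summary["stall_jiffies"].append(int(m.group(1)))
        m = TASK_RE.search(line)
        if m:
            summary["tasks"][m.group(1)] += 1
        m = TIMEOUT_RE.search(line)
        if m:
            summary["timeouts"][m.group(1)] += 1
        m = FRAME_RE.search(line)
        if m:
            # Frames without a "[module]" suffix belong to the core kernel.
            summary["frames_by_module"][m.group(2) or "kernel"] += 1
    return summary


# A few lines copied from the log above, used as a quick self-check.
SAMPLE = [
    "Jun 15 10:47:44 2021 BD_SEC_FW kernel: INFO: rcu_sched self-detected stall on CPU",
    "Jun 15 10:47:44 2021 BD_SEC_FW kernel: (t=300000 jiffies g=209973055 c=209973054 q=0)",
    "Jun 15 10:47:44 2021 BD_SEC_FW kernel: fw_worker_0 R running task 10296 9762 2 0x00000008",
    "Jun 15 10:47:44 2021 BD_SEC_FW monitord[12172]: Error: Timeout waiting for response from database server.",
    "Jun 15 10:47:43 2021 BD_SEC_FW snmpd: Error: Timeout waiting for response from database server.",
    "Jun 15 10:47:44 2021 BD_SEC_FW kernel: [<ffffffff810ce0d9>] dump_cpu_task+0x39/0x70",
    "Jun 15 10:45:10 2021 BD_SEC_FW kernel: [<ffffffff91d08c9e>] ? up_rulebase_execute_match+0x12fe/0x15a0 [fw_0]",
]
```

To run it against the real file, pass an open file object, for example summarize(open("/var/log/messages")). In the excerpt above, the stalled task is fw_worker_0 and most of the frames below the timer interrupt are in the fw_0 module. If the kernel tick rate were 1000 Hz (an assumption, not confirmed in this thread), t=300000 jiffies would correspond to a stall of roughly 300 seconds.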

 


A dump file was also created.

As far as I can tell, packets are not being dropped.

The gateway is very slow when accessed over the serial console:
I have to wait 3-4 minutes after entering a command.
It is not that slow when connecting via SSH.

Has anyone seen this behavior before?

 

2 Replies
Wolfgang
Leader

@checkma_a 

I would not do any more troubleshooting, TAC should be involved.

Timothy_Hall
Champion

Just sounds like your firewall is very busy in kernel space handling traffic, causing the process space timeouts you are seeing in /var/log/messages. Probably caused by a burst of traffic, possibly getting handled in the F2F path. Please provide outputs from the Super Seven commands, ideally taken while CPU is very high:

https://community.checkpoint.com/t5/Scripts/S7PAC-Super-Seven-Performance-Assessment-Commands/m-p/40...

 

"Max Capture: Know Your Packets" Video Series
now available at http://www.maxpowerfirewalls.com