Teddy_Brewski
Advisor

CPU Spikes - help needed

Hello,

We have a pair of new 9300 appliances running VSX R82 Take 91.

We're struggling with occasional 100% CPU spikes that last ~6-12 seconds and cause packet loss while they last.
Of our 4 VSs, only two are affected, and both front publicly exposed authoritative DNS servers, so we suspect short DNS flood attacks (we have a long history of fighting those).

The following DDoS rules were set up quite some time ago:

# fwaccel dos rate get
fwaccel dos rate add -i "<xxx>" -action drop -log regular destination cidr:192.168.100.1 pkt-rate 600 service any 
fwaccel dos rate add -i "<xxx>" -action drop -log regular destination cidr:192.168.100.2 pkt-rate 600 service any 
fwaccel dos rate add -i "<xxx>" -action drop -log regular destination cidr:192.168.100.3 pkt-rate 600 service any 
(3 rules found)

 

I do see "Packets Dropped" increasing occasionally for all three rules, although I haven't yet been able to tie the increases to the exact time of an incident (the incidents happen outside office hours and last only a few seconds).

 

Jun  9 17:54:42 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 12, top consumer: fwk4_0, start time: 09/06/26 17:54:30, spike duration (sec): 12, initial cpu usage: 87, average cpu usage: 73, perf taken: 1
Jun  9 17:58:36 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 3, top consumer: fwk4_0, start time: 09/06/26 17:58:30, spike duration (sec): 5, initial cpu usage: 89, average cpu usage: 89, perf taken: 0
Jun  9 18:00:02 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 8, top consumer: pkt_thread_6, start time: 09/06/26 17:59:55, spike duration (sec): 6, initial cpu usage: 94, average cpu usage: 94, perf taken: 1
Jun  9 18:00:53 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 12, top consumer: fwk4_0, start time: 09/06/26 18:00:47, spike duration (sec): 5, initial cpu usage: 100, average cpu usage: 100, perf taken: 0
Jun  9 18:01:33 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 3, top consumer: fwk4_0, start time: 09/06/26 18:01:21, spike duration (sec): 11, initial cpu usage: 94, average cpu usage: 97, perf taken: 0
Jun  9 18:01:33 2026 FWVSXN01 spike_detective: spike info: type: thread, thread id: 31933, thread name: fwk4_0, start time: 09/06/26 18:01:26, spike duration (sec): 6, initial cpu usage: 99, average cpu usage: 99, perf taken: 1
Jun  9 18:02:13 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 2, top consumer: fwk4_0, start time: 09/06/26 18:02:01, spike duration (sec): 11, initial cpu usage: 83, average cpu usage: 70, perf taken: 1
Jun  9 18:03:38 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 2, top consumer: fwk4_0, start time: 09/06/26 18:03:32, spike duration (sec): 6, initial cpu usage: 89, average cpu usage: 89, perf taken: 0
Jun  9 18:05:15 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 2, top consumer: fwk4_0, start time: 09/06/26 18:05:03, spike duration (sec): 12, initial cpu usage: 90, average cpu usage: 65, perf taken: 1
Jun  9 18:05:15 2026 FWVSXN01 spike_detective: spike info: type: thread, thread id: 31933, thread name: fwk4_0, start time: 09/06/26 18:05:09, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 1

Jun  9 23:06:20 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 10, top consumer: fwk5_0, start time: 09/06/26 23:06:14, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 1
Jun  9 23:06:32 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 12, top consumer: fwk5_0, start time: 09/06/26 23:06:25, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 0
Jun  9 23:06:49 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 14, top consumer: fwk5_0, start time: 09/06/26 23:06:42, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 0
Jun  9 23:07:40 2026 FWVSXN01 spike_detective: spike info: type: thread, thread id: 17787, thread name: fwk5_0, start time: 09/06/26 23:07:34, spike duration (sec): 5, initial cpu usage: 99, average cpu usage: 99, perf taken: 1
Jun  9 23:07:46 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 10, top consumer: fwk5_0, start time: 09/06/26 23:07:28, spike duration (sec): 17, initial cpu usage: 99, average cpu usage: 93, perf taken: 0
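
To see at a glance which threads and cores are hit most often, log lines like the ones above can be summarized with a short script. This is only a minimal sketch: the regular expression is derived solely from the lines pasted in this post, and the names SPIKE_RE, summarize_spikes and SAMPLE are hypothetical helpers, not part of any Check Point tool.

```python
import re
from collections import defaultdict

# Matches both record shapes seen in the post: "type: cpu" (core + top consumer)
# and "type: thread" (thread id + thread name). Format assumed from those lines only.
SPIKE_RE = re.compile(
    r"spike info: type: (?P<type>cpu|thread), "
    r"(?:cpu core: (?P<core>\d+), top consumer: (?P<consumer>[^,]+)"
    r"|thread id: (?P<tid>\d+), thread name: (?P<thread>[^,]+)), "
    r"start time: (?P<start>[^,]+), "
    r"spike duration \(sec\): (?P<duration>\d+), "
    r"initial cpu usage: (?P<initial>\d+), "
    r"average cpu usage: (?P<average>\d+)"
)


def summarize_spikes(lines):
    """Group spike records by the process/thread name reported for the spike."""
    summary = defaultdict(lambda: {"count": 0, "total_sec": 0, "max_avg": 0})
    for line in lines:
        m = SPIKE_RE.search(line)
        if m is None:
            continue  # not a spike_detective record
        name = m.group("consumer") or m.group("thread")
        entry = summary[name]
        entry["count"] += 1
        entry["total_sec"] += int(m.group("duration"))
        entry["max_avg"] = max(entry["max_avg"], int(m.group("average")))
    return dict(summary)


# Three lines copied from the log excerpt above.
SAMPLE = [
    "Jun  9 17:54:42 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 12, top consumer: fwk4_0, start time: 09/06/26 17:54:30, spike duration (sec): 12, initial cpu usage: 87, average cpu usage: 73, perf taken: 1",
    "Jun  9 18:00:02 2026 FWVSXN01 spike_detective: spike info: type: cpu, cpu core: 8, top consumer: pkt_thread_6, start time: 09/06/26 17:59:55, spike duration (sec): 6, initial cpu usage: 94, average cpu usage: 94, perf taken: 1",
    "Jun  9 18:01:33 2026 FWVSXN01 spike_detective: spike info: type: thread, thread id: 31933, thread name: fwk4_0, start time: 09/06/26 18:01:26, spike duration (sec): 6, initial cpu usage: 99, average cpu usage: 99, perf taken: 1",
]

for name, stats in sorted(summarize_spikes(SAMPLE).items()):
    print(name, stats)
```

Feeding the function the full log file instead of SAMPLE (for example via open(path) on whichever file these lines are written to on your gateway) gives the same per-thread totals for the whole period.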

 

cpview shows high CPU during these periods, with no noticeable increase in Concurrent Connections.

 

[Attached screenshots: 20260609-high-cpu01.png, 20260609-high-cpu02.png]

Any tips/hints/ideas would be greatly appreciated!

Thank you.

7 Replies
Timothy_Hall
MVP Gold

Probably an occasional short-lived elephant flow traversing the Medium or Slow path at LAN speeds.  What does fw ctl multik print_heavy_conn show for elephant flows over the last 24 hours?  The spike detective should run this command (along with top_conns) whenever it detects a spike and log the output to /var/log/spike_detective/data_spike_thread_*.  These models also have P-cores and E-cores, and an elephant flow that lands on an E-core will spike it much more easily than it would a P-core.

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization
Teddy_Brewski
Advisor

Thanks for your reply @Timothy_Hall 



[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 61%; Connection instance load: 86%; StartTime: 09/06/26 23:03:38; Duration: 94; IdentificationTime: 09/06/26 23:05:17; Service: 6:80; Total Packets: 1673; Total Bytes: 99192; 
[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 70%; Connection instance load: 96%; StartTime: 09/06/26 23:03:38; Duration: 13; IdentificationTime: 09/06/26 23:07:26; Service: 6:80; Total Packets: 136; Total Bytes: 7501; 
[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 77%; Connection instance load: 95%; StartTime: 09/06/26 23:03:39; Duration: 17; IdentificationTime: 09/06/26 23:08:34; Service: 6:80; Total Packets: 121; Total Bytes: 6860; 
[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 78%; Connection instance load: 92%; StartTime: 09/06/26 23:03:38; Duration: 14; IdentificationTime: 09/06/26 23:07:01; Service: 6:80; Total Packets: 148; Total Bytes: 8254; 
[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 81%; Connection instance load: 91%; StartTime: 09/06/26 23:03:38; Duration: 15; IdentificationTime: 09/06/26 23:08:09; Service: 6:80; Total Packets: 126; Total Bytes: 6996; 
[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 66%; Connection instance load: 95%; StartTime: 09/06/26 23:03:38; Duration: 12; IdentificationTime: 09/06/26 23:07:49; Service: 6:80; Total Packets: 103; Total Bytes: 5670; 

Here 192.168.0.254 is the IP address assigned to the firewall's DMZ interface.

And from the other VS:

[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 60%; Connection instance load: 98%; StartTime: 09/06/26 17:50:15; Duration: 33; IdentificationTime: 09/06/26 18:04:46; Service: 6:80; Total Packets: 100; Total Bytes: 5528; 
[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 65%; Connection instance load: 99%; StartTime: 09/06/26 17:50:15; Duration: 183; IdentificationTime: 09/06/26 17:52:40; Service: 6:80; Total Packets: 1230; Total Bytes: 67600; 
[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 61%; Connection instance load: 99%; StartTime: 09/06/26 17:50:16; Duration: 132; IdentificationTime: 09/06/26 18:00:13; Service: 6:80; Total Packets: 418; Total Bytes: 23003; 
[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 73%; Connection instance load: 99%; StartTime: 09/06/26 17:50:15; Duration: 83; IdentificationTime: 09/06/26 17:50:47; Service: 6:80; Total Packets: 1994; Total Bytes: 110540; 
[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 60%; Connection instance load: 99%; StartTime: 09/06/26 17:50:15; Duration: 148; IdentificationTime: 09/06/26 17:56:10; Service: 6:80; Total Packets: 585; Total Bytes: 32245;
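
One way to sanity-check whether these entries look like bulk transfers is to compute the average packet size per entry. The sketch below assumes the field layout shown in the output above, and assumes that "Total Packets" and "Total Bytes" cover the same interval; HEAVY_RE, parse_heavy_conn and SAMPLE are hypothetical names used only for illustration.

```python
import re

# Field layout assumed from the print_heavy_conn lines pasted above.
HEAVY_RE = re.compile(
    r"Conn: (?P<src>\S+) -> (?P<dst>\S+) IPP (?P<proto>\d+);"
    r".*?Duration: (?P<duration>\d+);"
    r".*?Total Packets: (?P<packets>\d+);"
    r"\s*Total Bytes: (?P<bytes>\d+)"
)


def parse_heavy_conn(line):
    """Return a dict for one heavy-connection line, or None if it does not match."""
    m = HEAVY_RE.search(line)
    if m is None:
        return None
    packets = int(m.group("packets"))
    total_bytes = int(m.group("bytes"))
    return {
        "src": m.group("src"),
        "dst": m.group("dst"),
        "proto": int(m.group("proto")),
        "duration": int(m.group("duration")),
        "packets": packets,
        "bytes": total_bytes,
        # Average bytes per packet; guarded against a zero packet count.
        "avg_pkt_bytes": total_bytes / packets if packets else 0.0,
    }


# One line from each VS, copied from the output above.
SAMPLE = [
    "[fw_0]; Conn: 177.101.43.163:40677 -> 192.168.0.254:80 IPP 6; Instance load: 61%; Connection instance load: 86%; StartTime: 09/06/26 23:03:38; Duration: 94; IdentificationTime: 09/06/26 23:05:17; Service: 6:80; Total Packets: 1673; Total Bytes: 99192; ",
    "[fw_0]; Conn: 78.120.76.69:58714 -> 192.168.0.30:80 IPP 6; Instance load: 60%; Connection instance load: 98%; StartTime: 09/06/26 17:50:15; Duration: 33; IdentificationTime: 09/06/26 18:04:46; Service: 6:80; Total Packets: 100; Total Bytes: 5528; ",
]

for line in SAMPLE:
    rec = parse_heavy_conn(line)
    print(rec["src"], "->", rec["dst"], round(rec["avg_pkt_bytes"], 1), "bytes/packet")
```

If the assumption about the counters holds, averages in the range of a few dozen bytes per packet would point to many small packets rather than a high-volume data transfer, which may be worth keeping in mind when interpreting these entries.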

 

 

Lesley
MVP Gold

If you have a DNS server that is reachable from outside, it acts like a big honeypot. I think the cause is short DDoS attacks that are too short for you to notice in cpview (cpview history is sampled only once per minute). Maybe the traffic logs hold hints as to whether there is indeed an attack during the outages you see.
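
To illustrate why a spike of a few seconds can disappear in a one-minute view, here is a small worked calculation. It is only a sketch: it assumes the one-minute value is a plain average over the window, and the 20% baseline is an invented figure for illustration, not a measurement from this gateway. The function name averaged_usage is hypothetical.

```python
def averaged_usage(spike_sec, spike_pct, baseline_pct, window_sec=60):
    """Average CPU % over a window containing one spike on top of a flat baseline."""
    quiet_sec = window_sec - spike_sec
    return (spike_sec * spike_pct + quiet_sec * baseline_pct) / window_sec


# A 6-second and a 12-second spike at 100% over an assumed 20% baseline.
print(averaged_usage(6, 100, 20))   # 28.0
print(averaged_usage(12, 100, 20))  # 36.0
```

Under these assumptions, even a 12-second spike at 100% shows up as only 36% in the one-minute figure, which is easy to overlook.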

To help protect against this, check whether the relevant IPS protections are enabled. Please note that inspection costs CPU and could itself contribute to the issue.


Check also this SK:

https://support.checkpoint.com/results/sk/sk112241

3. Allocate / configure sufficient Internet bandwidth to Security Gateway.

8. Activate and configure IPS 'Geo Protection' protection / Geo Policy.

13. Block traffic coming from known malicious IP addresses

14. Enable and configure SecureXL Penalty Box <--- This one only works for traffic that is dropped (i.e. not allowed by the rulebase)!

-------
Please press "Accept as Solution" if my post solved it 🙂
Teddy_Brewski
Advisor

Thank you for the links, @Lesley. Judging by the 'fw ctl multik print_heavy_conn' output, a DNS flood doesn't seem to be the cause this time.

Timothy_Hall
MVP Gold

Those remote IP addresses appear to be in Brazil and France. Do you have operations/customers/partners/vendors in those countries?  Given the short duration, it could also be one of the following, although these remote networks do not appear to belong to Check Point:

  • sk183868: Daily short-lived 100% CPU spikes on Quantum Security Gateways
  • sk174347: Software blade updates may cause single CPU spike
CaseyB
Advisor

Upgrade to JHF-103.

I had multiple CPU spike issues on Take 91, plus two production issues. After talking with TAC we decided to upgrade, and I have not seen the same symptoms since.

Lesley
MVP Gold

True, it could still be a bug, so I would indeed update. After that, follow the tips above. They improve security anyway, so even if they do not solve this issue, the time spent on them is still well justified.

