chrominek
Contributor

R81.10 Resuming IPS inspection which was bypassed due to resources shortage at MUP:0 CUP:0

Jump to solution

I have an R81.10 16200 cluster, recently migrated from an older R77.30 cluster.

I see in the logs:

Attack Information:        IPS Bypass Disengaged

Attack Name:               IPS Bypass

Resource Shortage:         Resuming IPS inspection which was bypassed due to resources shortage - CPU utilization / Memory utilization are:

Memory Utilization Percent:0

CPU Utilization Percent:   0

My GW CPU usage is around 10%. I found sk172305; is this message a false positive or a real event? Should I disable IPS bypass, ignore the message, or open a case to get it fixed?

Accepted Solutions
Timothy_Hall
Champion

As Val said you need to track down the source of the spike, probably an elephant flow which you can check with fw ctl multik print_heavy_conn.
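A rough way to see which heavy connections keep recurring is to rank the `fw ctl multik print_heavy_conn` output by how often each connection appears. This is only a sketch: the exact line format varies by version, so the `Conn: ...` field assumed below, and the `rank_heavy_conns` helper name, are illustrative, not official. Verify against your own gateway's output first.

```shell
# Rank heavy (elephant) connections by how many times they show up in
# `fw ctl multik print_heavy_conn` output, most frequent first:
#   fw ctl multik print_heavy_conn | rank_heavy_conns
# ASSUMPTION: each line contains a "Conn: <src> -> <dst>" field
# terminated by a semicolon; check your gateway's actual format.
rank_heavy_conns() {
  grep -o 'Conn: [^;]*' | sort | uniq -c | sort -rn
}
```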

When it comes to the IPS Bypass feature: JUST SAY NO. I've been ripping on this feature for years in my books and classes; here is the latest iteration from my 2021 IPS/AV/ABOT video series:

ipsbypass.png

 

"Max Capture: Know Your Packets" Self-Guided Video Series
available at http://www.maxpowerfirewalls.com

11 Replies
_Val_
Admin

Depending on the actual number of cores, a 10% average utilization may mean 100% on one or several cores while the others are idle. That will trigger IPS bypass nonetheless.

Look into the cpview history, or check top, to see the actual per-core utilization.
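Val's point is easy to sanity-check with arithmetic: on a 48-core box (as in the top output later in this thread), one core pegged at 100% barely moves the average. An illustrative awk calculation, where the 8% background load per core is an assumed figure:

```shell
# One core at 100% plus 47 cores at ~8% still averages under 10%,
# which is why a "10% average" can coexist with IPS bypass triggering.
awk 'BEGIN {
  cores = 48; saturated = 100; background = 8
  avg = (saturated + (cores - 1) * background) / cores
  printf "average utilization: %.1f%%\n", avg
}'
# prints: average utilization: 9.9%
```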

Swordfish
Participant

Do you see a CPU spike in the cpview history (cpview -t) at that moment?

Speaking for R80.40: what are your IPS settings in the cluster object (see attached image)?

chrominek
Contributor

ips bypass stat
IPS Bypass Under Load: Enabled
Currently under load: No
Currently in bypass: No
CPU Usage thresholds: Low: 70, High: 90
Memory Usage thresholds: Low: 70, High: 90
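For monitoring scripts, the state above can be checked non-interactively. A minimal sketch, assuming the `ips bypass stat` output format shown here; the `in_bypass` helper name is mine, not a Check Point command:

```shell
# Returns 0 (success) when the gateway is currently bypassing IPS,
# non-zero otherwise. Feed it `ips bypass stat` output on stdin:
#   ips bypass stat | in_bypass && echo "IPS is currently bypassed!"
in_bypass() {
  grep -q '^Currently in bypass: *Yes'
}
```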

_Val_
Admin

This is only the current state. Check the timestamps in the logs, open the historical view of cpview, and navigate to around the time in question to see the CPU situation.

chrominek
Contributor

Yes, sometimes. In general the load is spread evenly across the cores, but the per-core load changes over time.

top - 10:15:28 up 1 day, 16:34, 2 users, load average: 8.01, 8.40, 7.96
Tasks: 957 total, 1 running, 956 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 20.6 sy, 0.0 ni, 52.0 id, 0.0 wa, 2.9 hi, 24.5 si, 0.0 st
%Cpu1 : 0.0 us, 1.0 sy, 0.0 ni, 74.5 id, 0.0 wa, 2.9 hi, 21.6 si, 0.0 st
%Cpu2 : 8.7 us, 5.8 sy, 0.0 ni, 85.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 9.6 us, 6.7 sy, 0.0 ni, 82.7 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu4 : 12.9 us, 8.9 sy, 0.0 ni, 78.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 15.5 us, 8.7 sy, 0.0 ni, 74.8 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu6 : 7.9 us, 7.9 sy, 0.0 ni, 83.2 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni, 87.1 id, 0.0 wa, 1.0 hi, 11.9 si, 0.0 st
%Cpu8 : 9.6 us, 6.7 sy, 0.0 ni, 83.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 8.8 us, 8.8 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 11.7 us, 11.7 sy, 0.0 ni, 75.7 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu11 : 6.9 us, 6.9 sy, 0.0 ni, 86.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 8.7 us, 7.7 sy, 0.0 ni, 82.7 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.0 us, 1.0 sy, 0.0 ni, 65.7 id, 0.0 wa, 2.9 hi, 30.4 si, 0.0 st
%Cpu14 : 8.7 us, 8.7 sy, 0.0 ni, 80.8 id, 0.0 wa, 1.0 hi, 1.0 si, 0.0 st
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni, 78.0 id, 0.0 wa, 2.0 hi, 20.0 si, 0.0 st
%Cpu16 : 9.0 us, 8.0 sy, 0.0 ni, 83.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 10.6 us, 9.6 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu18 : 18.1 us, 12.4 sy, 0.0 ni, 69.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 7.8 us, 8.7 sy, 0.0 ni, 83.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 8.7 us, 7.8 sy, 0.0 ni, 83.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 12.0 us, 7.0 sy, 0.0 ni, 81.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 10.9 us, 9.9 sy, 0.0 ni, 79.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23 : 7.0 us, 11.0 sy, 0.0 ni, 82.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu24 : 0.0 us, 0.0 sy, 0.0 ni, 82.0 id, 0.0 wa, 2.0 hi, 16.0 si, 0.0 st
%Cpu25 : 0.0 us, 1.0 sy, 0.0 ni, 79.2 id, 0.0 wa, 2.0 hi, 17.8 si, 0.0 st
%Cpu26 : 8.7 us, 4.9 sy, 0.0 ni, 86.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu27 : 7.0 us, 8.0 sy, 0.0 ni, 85.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu28 : 5.0 us, 5.9 sy, 0.0 ni, 89.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu29 : 12.6 us, 7.8 sy, 0.0 ni, 79.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu30 : 7.8 us, 6.9 sy, 0.0 ni, 85.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu31 : 1.0 us, 1.0 sy, 0.0 ni, 85.6 id, 0.0 wa, 1.9 hi, 10.6 si, 0.0 st
%Cpu32 : 8.7 us, 6.8 sy, 0.0 ni, 84.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu33 : 5.9 us, 4.0 sy, 0.0 ni, 89.1 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu34 : 11.0 us, 9.0 sy, 0.0 ni, 80.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu35 : 5.9 us, 5.0 sy, 0.0 ni, 89.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu36 : 12.6 us, 5.8 sy, 0.0 ni, 80.6 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu37 : 0.0 us, 1.0 sy, 0.0 ni, 83.3 id, 0.0 wa, 2.0 hi, 13.7 si, 0.0 st
%Cpu38 : 25.7 us, 10.9 sy, 0.0 ni, 62.4 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu39 : 1.0 us, 1.0 sy, 0.0 ni, 75.7 id, 0.0 wa, 3.9 hi, 18.4 si, 0.0 st
%Cpu40 : 10.8 us, 5.9 sy, 0.0 ni, 83.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu41 : 9.9 us, 16.8 sy, 0.0 ni, 73.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu42 : 11.5 us, 10.6 sy, 0.0 ni, 76.9 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu43 : 7.0 us, 7.0 sy, 0.0 ni, 86.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu44 : 12.6 us, 7.8 sy, 0.0 ni, 79.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu45 : 7.9 us, 6.9 sy, 0.0 ni, 85.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu46 : 30.1 us, 19.4 sy, 0.0 ni, 50.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu47 : 1.0 us, 1.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

...

[11Jan2022 12:25:15] HISTORY. Use [-],[+] to change timestamp |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Overview SysInfo Network CPU I/O Software-blades Hardware-Health Advanced |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Overview Top-Protocols Top-Connections Spikes |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Host |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Overview: |
| |
| CPU type CPUs Avg utilization |
| - - - |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| CPU: |
| |
| CPU Type User System Idle I/O wait Interrupts |
| 0 CoreXL_SND 0% 0% 100% 0% 2,715 |
| 1 CoreXL_SND 0% 0% 100% 0% 2,685 |
| 2 CoreXL_FW 0% 0% 100% 0% 2,684 |
| 3 CoreXL_FW 0% 0% 100% 0% 2,673 |
| 4 CoreXL_FW 0% 0% 100% 0% 2,658 |
| 5 CoreXL_FW 0% 0% 100% 0% 2,603 |
| 6 CoreXL_FW 0% 0% 100% 0% 2,588 |
| 7 CoreXL_FW 0% 9% 91% 0% 2,497 |
| 8 CoreXL_FW 0% 0% 100% 0% 2,457 |
| 9 CoreXL_FW 0% 0% 100% 0% 2,399 |
| 10 CoreXL_FW 0% 0% 100% 0% 2,391 |
| 11 CoreXL_FW 0% 0% 100% 0% 2,425 |
| 12 BOTH 9% 0% 91% 0% 2,371 |
| 13 BOTH 0% 0% 100% 0% 2,408 |
| 14 CoreXL_FW 9% 0% 91% 0% 2,407 |
| 15 CoreXL_FW 0% 0% 100% 0% 2,451 |
| 16 CoreXL_FW 0% 0% 100% 0% 2,466 |
| 17 CoreXL_FW 0% 0% 100% 0% 2,447 |
| 18 CoreXL_FW 0% 100% 0% 0% 2,425 |
| 19 CoreXL_FW 0% 0% 100% 0% 2,452 |
| 20 CoreXL_FW 0% 0% 100% 0% 2,462 |
| 21 CoreXL_FW 0% 0% 100% 0% 2,474 |
| 22 CoreXL_FW 0% 0% 100% 0% 2,456 |
| 23 CoreXL_FW 0% 0% 100% 0% 2,474 |
| 24 CoreXL_SND 0% 0% 100% 0% 2,493 |
| 25 CoreXL_SND 0% 0% 100% 0% 2,465 |
| 26 CoreXL_FW 9% 27% 64% 0% 2,464 |
| 27 CoreXL_FW 0% 9% 91% 0% 2,437 |
| 28 CoreXL_FW 9% 18% 73% 0% 2,396 |
| 29 CoreXL_FW 0% 0% 100% 0% 2,441 |
| 30 CoreXL_FW 0% 0% 100% 0% 2,452 |
|- More info available by scrolling down --------------------------------------------

tail /var/log/messages:

...

Jan 17 10:06:58 2022 fw1 spike_detective: spike info: type: cpu, cpu core: 4, top consumer: fwk0_38, start time: 17/01/22 10:06:09, spike duration (sec): 48, initial cpu usage: 82, average cpu usage: 71, perf taken: 1
Jan 17 10:08:21 2022 fw1 spike_detective: spike info: type: cpu, cpu core: 0, top consumer: system interrupts, start time: 17/01/22 10:03:23, spike duration (sec): 297, initial cpu usage: 92, average cpu usage: 62, perf taken: 1

...
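The spike_detective lines above are regular enough to summarize. A small sketch (the `summarize_spikes` helper name is mine) that counts which "top consumer" appears most often, assuming the line format shown:

```shell
# Count CPU-spike "top consumer" processes from spike_detective entries,
# busiest offender first. Usage:
#   summarize_spikes < /var/log/messages
summarize_spikes() {
  grep 'spike info: type: cpu' \
    | sed 's/.*top consumer: \([^,]*\),.*/\1/' \
    | sort | uniq -c | sort -rn
}
```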

 

_Val_
Admin

Exactly that: 

| 18 CoreXL_FW 0% 100% 0% 0% 2,425 |

That was the trigger.

chrominek
Contributor

In /var/log/spike_detective/spike_detective.log I see roughly a page of spikes from today alone. So the conclusion is "disable IPS bypass"?

By the way, I have a problem with monitoring, but it looks unrelated to the IPS issue.

sessions.png

Based on this picture there is no communication during 16-second periods. It looks like I have very, very tolerant users ;-). What could cause the graph to be presented like this?

_Val_
Admin

Sorry, I have a hard time understanding what you are trying to say. Where do you see a traffic outage?

IPS bypass kicks in when one or several CPUs are over the threshold, to protect your production traffic and to make sure it still flows even when one of the FWK instances is very busy.

If you disable the IPS bypass mechanism, that will only decrease performance during peak times. The proper course of action here is to find the root cause of the spikes and work around it.


chrominek
Contributor

If Accepted Packet Rate == 0 or Bytes Throughput == 0, then no traffic is passing (the gray and blue lines on the screen). In the new console, traffic flows; monitoring via SNMP is also OK. On SmartView Monitor: Carcassonne.

Spikes on big files are normal, I think, because most of the security inspections are single-threaded per file. And a CPU is there to work, not to be left idle, especially if you have 60 cores. So TH is right to fight this feature, but every vendor has it: connectivity over security.

So the final answer is "yes, disable bypass": ips bypass off as the quick fix, and disabling it on the cluster object as the permanent one.

chrominek
Contributor

Thank you very much for this comprehensive answer!
