ips bypass
I started getting IPS bypass alerts after upgrading to R80.40 Take 91. I did not get IPS bypass events on Take 87.
There is almost no traffic in my environment (about 20 concurrent TCP sessions from one host I use for testing and browsing), and the CPU is idle most of the time.
I have 6 cores and 3 workers. Average CPU usage is 2% and occasionally reaches 20%, but in cpview I have noticed spikes that match the IPS bypass alerts (see below).
I am fairly certain the issue has something to do with Take 91, but I was wondering whether there is a way to get more verbose logging to see what is going on when CPU usage goes over the threshold.
I have URL Filtering, Anti-Bot, Anti-Virus and IPS enabled. I disabled HTTPS Inspection recently.
I am getting about 90% of traffic through the slow path.
CPU Spikes
Overview (last minute):
  Total Spikes: 3
  Average Spike Duration (Sec): 11
  Average Spike Usage: 95%
Top Spikes (last minute):
  Start Time          CPU   Spike Duration (Sec)   Average Usage
  18Feb2021 9:07:36   5     25                     100%
  18Feb2021 9:08:41   5     5                      93%
  18Feb2021 9:08:51   2     5                      92%
Hi,
There are several ways to investigate this change. We would appreciate a remote session with you to understand the issue; I will contact you directly to arrange it.
Please open an SR as well.
Thanks.
I would say SecureXL is okay. The traffic that goes through the slow path is expected, right?
[Expert@fw1:0]# fwaccel stats -p
F2F packets:
--------------
  Violation              Packets   Violation              Packets
  --------------------   -------   --------------------   -------
  pkt has IP options     0         ICMP miss conn         0
  TCP-SYN miss conn      0         TCP-other miss conn    50
  UDP miss conn          197       other miss conn        0
  VPN returned F2F       0         uni-directional viol   0
  possible spoof viol    0         TCP state viol         0
  SCTP state affecting   0         out if not def/accl    0
  bridge, src=dst        0         routing decision err   0
  sanity checks failed   0         fwd to non-pivot       0
  broadcast/multicast    0         cluster message        10289
  cluster forward        0         chain forwarding       0
  F2V conn match pkts    0         general reason         0
You mentioned that most of your traffic goes through the slow path. This can trigger the IPS bypass, as it puts a high load on the CPU.
The statistics you showed above mean you don't have any violations in SecureXL, which is good, but they are unrelated to the slow path.
You can see the slow path rate with "fwaccel stats -s".
When my testing VM is down, there is only management traffic (gateway cluster messages, NTP, DNS, SNMP, syslog, HTTP requests to the Check Point cloud through the proxy, etc.), and almost 100% of that traffic is not accelerated. Is this behavior expected? Should any of this traffic be accelerated?
When I browse a bit with my test VM, I see the accelerated packets increase. See below.
Testing VM down:
[Expert@fw1:0]# fwaccel stats -s
Accelerated conns/Total conns : 0/0 (0%)
Accelerated pkts/Total pkts : 0/3302 (0%)
F2Fed pkts/Total pkts : 3302/3302 (100%)
F2V pkts/Total pkts : 0/3302 (0%)
CPASXL pkts/Total pkts : 0/3302 (0%)
PSLXL pkts/Total pkts : 0/3302 (0%)
CPAS pipeline pkts/Total pkts : 0/3302 (0%)
PSL pipeline pkts/Total pkts : 0/3302 (0%)
CPAS inline pkts/Total pkts : 0/3302 (0%)
PSL inline pkts/Total pkts : 0/3302 (0%)
QOS inbound pkts/Total pkts : 0/3302 (0%)
QOS outbound pkts/Total pkts : 0/3302 (0%)
Corrected pkts/Total pkts : 0/3302 (0%)
[Expert@hqfw2b:0]# fwaccel stat
+---------------------------------------------------------------------------------+
|Id|Name |Status |Interfaces |Features |
+---------------------------------------------------------------------------------+
|0 |SND |enabled |eth0,eth2,eth3,eth5,eth6 |Acceleration,Cryptography |
| | | | |Crypto: Tunnel,UDPEncap,MD5, |
| | | | |SHA1,NULL,3DES,DES,AES-128, |
| | | | |AES-256,ESP,LinkSelection, |
| | | | |DynamicVPN,NatTraversal, |
| | | | |AES-XCBC,SHA256,SHA384 |
+---------------------------------------------------------------------------------+
Accept Templates : enabled
Drop Templates : disabled
NAT Templates : enabled
Testing VM up:
[Expert@hqfw2b:0]# fwaccel stats -s
Accelerated conns/Total conns : 0/97 (0%)
Accelerated pkts/Total pkts : 4215/9061 (46%)
F2Fed pkts/Total pkts : 4846/9061 (53%)
F2V pkts/Total pkts : 110/9061 (1%)
CPASXL pkts/Total pkts : 0/9061 (0%)
PSLXL pkts/Total pkts : 4215/9061 (46%)
CPAS pipeline pkts/Total pkts : 0/9061 (0%)
PSL pipeline pkts/Total pkts : 0/9061 (0%)
CPAS inline pkts/Total pkts : 0/9061 (0%)
PSL inline pkts/Total pkts : 0/9061 (0%)
QOS inbound pkts/Total pkts : 0/9061 (0%)
QOS outbound pkts/Total pkts : 0/9061 (0%)
Corrected pkts/Total pkts : 0/9061 (0%)
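For comparing the two captures above without eyeballing them, here is a minimal Python sketch that parses the summary lines into numbers. It assumes the output format is exactly what was pasted in this thread; the function name parse_fwaccel_summary and the SAMPLE text are only illustrative and are not part of any Check Point tool.

```python
import re

# Matches lines such as "F2Fed pkts/Total pkts : 4846/9061 (53%)".
LINE_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<num>\d+)/(?P<den>\d+)\s*\(\d+%\)\s*$"
)


def parse_fwaccel_summary(text):
    """Return {counter name: (count, total, fraction)} for each summary line."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip prompts, blank lines and anything unexpected
        num = int(match.group("num"))
        den = int(match.group("den"))
        # Guard against "0/0" lines such as the idle-gateway capture.
        fraction = num / den if den else 0.0
        counters[match.group("name")] = (num, den, fraction)
    return counters


# A few lines copied from the "testing VM up" capture in this thread.
SAMPLE = """\
Accelerated conns/Total conns : 0/97 (0%)
Accelerated pkts/Total pkts : 4215/9061 (46%)
F2Fed pkts/Total pkts : 4846/9061 (53%)
PSLXL pkts/Total pkts : 4215/9061 (46%)
"""

for name, (num, den, frac) in parse_fwaccel_summary(SAMPLE).items():
    print(f"{name}: {num}/{den} = {frac:.1%}")
```

Feeding it two captures taken a few minutes apart and subtracting the counts would give the slow path share for just that interval, rather than since the counters were last reset.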
I have just been reading sk32578:
"When SecureXL is enabled, all packets should be accelerated, except packets that match the following conditions:
All packets that match a rule, whose source or destination is the Security Gateway itself."
So I guess that in my environment, with only one user establishing connections, the percentage of accelerated traffic is expected to be low.
And if this user is down, then pretty much 100% of the packets should be non-accelerated.
I guess it would still be interesting to double-check it, if that is possible.
When your testing VM is down, the only traffic you have is local connections, which take the slow path by design.
So it looks like you don't have any issue related to SecureXL here, but something is indeed triggering a high load that causes IPS to enter bypass. We will continue offline to analyze it.
Generally, enabling the IPS bypass feature is not a good idea. When the CPUs are monitored, if even one of them hits the CPU % threshold, IPS functions on ALL CPUs are bypassed. This was fine when firewalls only had a few cores, but it is not really appropriate with many cores. Really, the IPS Bypass feature should average the CPU utilization of all the workers when deciding whether to bypass. See here:
sk107334: IPS Bypass is triggered even when CPU utilization is not over the defined threshold
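To make the difference concrete, here is a small Python sketch contrasting the two decision rules described above. It is not Check Point's implementation; the function names, the 90% threshold and the per-core readings are all hypothetical and chosen only to illustrate the point.

```python
def bypass_if_any_core(per_core_usage, threshold):
    """Rule described in this thread: one busy core is enough to bypass."""
    return any(usage >= threshold for usage in per_core_usage)


def bypass_if_average(per_core_usage, threshold):
    """Alternative suggested in this thread: average across the cores."""
    return sum(per_core_usage) / len(per_core_usage) >= threshold


# Hypothetical readings for 6 cores: one saturated, the rest nearly idle.
usage = [2, 2, 2, 2, 2, 100]
THRESHOLD = 90  # hypothetical bypass threshold, in percent

print("any-core rule triggers bypass:", bypass_if_any_core(usage, THRESHOLD))
print("average rule triggers bypass: ", bypass_if_average(usage, THRESHOLD))
```

With a single saturated core the first rule fires while the second does not, which matches the pattern of a mostly idle gateway still reporting bypass events.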
As @shais said it looks like something in T91 is causing occasional high CPU and triggering the IPS bypass; so the IPS bypass activating is just a symptom of your problem but not the cause. Normally the next step is to figure out in what mode the CPU spikes are (kernel vs. process space - us/sy/si/hi in top); you can use sar for that but it looks like the spikes are too short for sar to reliably pick up. You'll have to catch whatever it is "in the act" with top, or look in the spike detective logs here: /var/log/spike_detective/spike_detective.log
CET (Europe) Timezone Course Scheduled for July 1-2
Yeah absolutely.
/var/log/spike_detective/spike_detective.log doesn't say much though, just the duration of the spike and the core ID.
sar seems to capture stats only every 10 minutes, and the spikes last between 10 and 20 seconds.
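Since the spikes are shorter than the sar interval, one option is to sample the per-core counters in /proc/stat every second and print the breakdown only when a core is busy. The Python sketch below assumes a standard Linux /proc/stat layout and that Python is available on the gateway, which I have not verified on Gaia; the function names and the 90% threshold are illustrative. In top's terms, user corresponds to us, system to sy, softirq to si and irq to hi.

```python
import time

# Order of the first eight counters on each "cpuN" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu name: {field: ticks}} for the per-core "cpuN" lines."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and the rest.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        cores[parts[0]] = dict(zip(FIELDS, values))
    return cores


def usage_between(before, after):
    """Return {cpu name: {field: percent}} for the interval between snapshots."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        deltas = {f: new.get(f, 0) - old.get(f, 0) for f in FIELDS}
        total = sum(deltas.values())
        if total <= 0:
            continue  # no ticks elapsed for this core
        result[cpu] = {f: 100.0 * d / total for f, d in deltas.items()}
    return result


def watch(threshold=90.0, interval=1.0):
    """Print the breakdown for any core whose busy share reaches threshold."""
    with open("/proc/stat") as handle:
        previous = parse_proc_stat(handle.read())
    while True:
        time.sleep(interval)
        with open("/proc/stat") as handle:
            current = parse_proc_stat(handle.read())
        for cpu, pct in usage_between(previous, current).items():
            busy = 100.0 - pct["idle"] - pct["iowait"]
            if busy >= threshold:
                print(time.strftime("%H:%M:%S"), cpu,
                      {f: round(p, 1) for f, p in pct.items()})
        previous = current
```

Calling watch() and leaving it running until the next bypass alert should show whether the spike is mostly user, system or softirq time on the affected core. It does not identify the process; for that you would still need top or similar while the spike is happening.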
@Luis_Miguel_Mig I have the same problem. Did you find the root cause?
It happens when the gateway loads the Anti-Bot/Anti-Virus signatures at the times scheduled in the SmartConsole configuration. You can reproduce it with fw load_sigs.
This is expected behavior but it only spikes a single core, so the chances of affecting traffic handling are pretty low: sk174347: Software blade updates may cause single CPU spikes
Yeah, it is a known issue now, and it can affect only the traffic going through the firewall instance/CPU core where the signature loading process is running.
So if you have, for example, 4 cores/fw instances, 25% of the traffic can be affected.
@Timothy_Hall: Thank you for the SK. But the customer's problem is that IPS goes into bypass and traffic is not inspected by IPS for 1 minute because of a small signature update on a 72-core appliance with low network traffic. BTW, the Anti-Bot/Anti-Virus blades are off; only the IPS blade is on (and FW, of course).
Is this expected behavior that we can't change, or can I turn off signature updates somehow?
@Luis_Miguel_Mig: Thank you!!! That was my guess, but I could not find a way to reproduce it by hand. Thanks again 😊
I was wondering if we could use affinity settings to make this process run on a specific CPU core. I have more CPU cores than fw workers.
You could use affinity to do that, but it won't matter to the IPS Bypass feature, as all it takes is one saturated core (regardless of type) for IPS to get disabled. The IPS Bypass feature was a good idea in the days when firewalls only had 1-2 cores, not so much in today's world.
That would be okay with me. I don't mind getting an IPS bypass. I may disable the IPS bypass feature altogether.
But how could I set the affinity of the fw process to a specific core so that fw load_sigs runs on a core free of fw_workers?
