Hi there,
Has anyone had problems with SecureXL after upgrading from R80.10 to R80.20?
At first I thought it might be a problem with NAT Templates, as they are disabled on R80.10 and enabled on R80.20, but it's not. I turned them off and the issue persists.
Frankly speaking, I don't understand what is going on. fw.log shows everything is fine, rules are applied and working, but in practice there is no Internet communication.
And here comes the miracle:
When I turn off SecureXL, everything works as it should. I have already opened a Technical Assistance case, but it looks like they suck more than I do (except for one wonderful woman with whom we found that SecureXL is the issue). So I decided to ask here: have you guys faced such a crazy issue?
Regards
Arek
Hi Arkadiusz,
I was wondering if you could help us in defining the problem in a bit more detail:
1) Can I confirm that the problem involves your internal users not being able to reach the Internet?
2) If the above is true, does the problem affect a subset of users or everybody?
3) Do you have multicast routing configured by any chance?
4) Is your firewall on a multi-core platform?
5) What troubleshooting steps have you taken, just so we don't do any redundant work?
Many thanks.
@Nick_Doropoulos wrote: Hi Arkadiusz,
Answering your questions:
1) Can I confirm that the problem involves your internal users not being able to reach the Internet?
Yes, but it also affects mobile users and IPsec tunnel traffic
2) If the above is true, does the problem affect a subset of users or everybody?
Everybody
3) Do you have multicast routing configured by any chance?
No
4) Is your firewall on a multi-core platform?
No
5) What troubleshooting steps have you taken, just so we don't do any redundant work?
Checked NAT
Checked the firewall by setting an any/any/any rule
Checked routing
Created a ticket and found out that disabling SecureXL solves the problem 🙂
This is a really simple config: one standalone 3000 series, no VLANs, just a plain subnet. The Check Point is the router/firewall for the subnet.
We are a small branch.
Regards
Arek
Hi,
Yes, we have experienced problems with SecureXL since R77.30. We have had multiple cases with Check Point about this. One is now on soft close since we upgraded the gateways to R80.10.
The gateways are 5900 series in a cluster, multi-core.
The difficulty in troubleshooting this is that it happens inconsistently and rather randomly. There can be at least 3 weeks between occurrences. In our experience the Check Point suddenly drops traffic; with ISP Redundancy and clustering, the gateways raised alerts that their ISP gateways were unreachable, although they were up, and began to fail over. The last time we experienced this was 08-03-2019.
We have the following logging ready for when this happens again:
export TODAY=$(/bin/date +%Y%m%d)
mkdir -p "$HOME/$TODAY"
cd "$HOME/$TODAY"
fwaccel conns > fwaccel-conns.txt
fwaccel stats > fwaccel-stats.txt
fwaccel stat > fwaccel-stat.txt
fwaccel tab -t connections > fwaccel-tab-connections.txt
fwaccel tab -t inbound_SAs > fwaccel-tab-inbound_SAs.txt
fwaccel tab -t outbound_SAs > fwaccel-tab-outbound_SAs.txt
fwaccel tab -t drop_templates > fwaccel-tab-drop_templates.txt
fwaccel tab -t vpn_link_selection > fwaccel-tab-vpn_link_selection.txt
fwaccel tab -t vpn_trusted_ifs > fwaccel-tab-vpn_trusted_ifs.txt
fwaccel tab -t invalid_replay_counter > fwaccel-tab-invalid_replay_counter.txt
fwaccel tab -t if_by_name > fwaccel-tab-if_by_name.txt
fwaccel tab -t frag_table > fwaccel-tab-frag_table.txt
fwaccel tab -t reset_table > fwaccel-tab-reset_table.txt
fwaccel dos stats get > fwaccel-dos-stats-get.txt
cp /var/log/messages /home/admin/messages
cpview history export
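The per-table commands above can also be written as a loop, so a table name only has to be typed once. This is a minimal sketch assuming bash on the gateway. The function name `collect_fwaccel_tables` is made up for illustration; the table names are the ones listed above.

```shell
# Dump the SecureXL tables named in the post into one directory.
# collect_fwaccel_tables is an illustrative name, not a Check Point tool.
collect_fwaccel_tables() {
    outdir="${1:-$HOME/$(date +%Y%m%d)}"
    mkdir -p "$outdir" || return 1
    for t in connections inbound_SAs outbound_SAs drop_templates \
             vpn_link_selection vpn_trusted_ifs invalid_replay_counter \
             if_by_name frag_table reset_table; do
        # Keep going if one table cannot be read on this version.
        fwaccel tab -t "$t" > "$outdir/fwaccel-tab-$t.txt" 2>&1 \
            || echo "warning: could not dump table $t" >&2
    done
}
```

Calling `collect_fwaccel_tables` with no argument writes to a dated directory under `$HOME`, the same layout as the commands above.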
Check Point also gave us the following debug collection steps:
# fw ctl debug 0
# fw ctl debug -buf 32000
# fw ctl debug -m fw + conn drop vm xlate xltrc nat
# fwaccel dbg -m general + drop nat del
# fwaccel dbg -m db + tcpstate ant del
# fwaccel dbg -m api + del
# sim dbg -m pkt + drop nat pkt spoof tcpstate
# sim dbg -m db + ant del
# sim dbg -m mgr + del add
# fw ctl kdebug -T -f >& /var/log/debug.ctl &
# fw ctl debug 0
# sim dbg resetall
# fwaccel dbg resetall
Collect CPInfo from both members
But because of the high CPU usage on the gateways, we are not keen to follow this, as it will have a major impact, especially during European working hours. That is also a problem for troubleshooting: we cannot really afford major downtime, and with fwaccel off we have traffic flowing again in a split second.
We have drop optimization on but NAT Templates off.
We also have a site-to-site VPN issue at the moment where, as a workaround, we had to follow the procedure in sk61221 and disable SecureXL at the end. We are not really sure whether SecureXL, in combination with sk61221, is to blame for the site-to-site VPN problems. It may be that Check Point just wants it disabled for other reasons, such as seeing the behaviour without SecureXL.
We are planning the next upgrade of the gateways to R80.20 and have our fingers crossed that it resolves the SecureXL problems.
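Since the problem shows up weeks apart, it may help to at least timestamp each time the workaround is applied. This is a rough sketch, not something from TAC. It only uses `fwaccel stat` and `fwaccel off`; the function name and the log path are assumptions chosen for illustration.

```shell
# Log the SecureXL state, apply the workaround, then log the state again.
# securexl_off_logged and the default log path are illustrative only.
securexl_off_logged() {
    log="${1:-$HOME/securexl-incidents.log}"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') before fwaccel off ==="
        fwaccel stat
        fwaccel off
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') after fwaccel off ==="
        fwaccel stat
    } >> "$log" 2>&1
}
```

This adds only one extra `fwaccel stat` before traffic is restored, so it should cost very little time, and the log gives a timeline to attach to the case.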
What ClusterXL mode are you using? Active / Standby, Active / Active?
Have you seen this sk? It sounds like it could be relevant and seems to apply to R77.30, R80.10 and R80.20. Although, I'm not sure if this hotfix was rolled into an HFA by now.
Are you able to try disabling ISP redundancy but leave SecureXL on?
Do you see accept / allows logged for the connections that aren't working? Do you see any logs at all?
R80.20 and above:
- SecureXL has been significantly revised in R80.20. It now works in user space. This has also led to some changes in "fw monitor".
- There are new fw monitor chain (SecureXL) objects that do not run in the virtual machine.
- The reason for the move to user space: the SecureXL driver takes a certain amount of kernel memory per core, and that was adding up to more kernel memory than Intel/Linux allows.
- SecureXL now supports Async SecureXL with Falcon cards.
- New in the high-level acceleration architecture (SecureXL on an acceleration card): Streaming over SecureXL, Lite Parsers, Scalable SecureXL, Acceleration stickiness.
Solution:
Enter manual hide NAT rules and do not use automatic hide NAT rules in the gateway object. Disable drop templates. I had similar problems and that helped.
Same issue, but my environment is R80.20 with Gaia 3.1 (bridge mode).
Disabling SecureXL should work (as a workaround?).
From Take 5 to Take 11, the issue still exists.
About TAC, I hope Check Point will improve its level of professionalism...
When troubleshooting a firewall-related issue, it normally comes down to one of these three things that you will need to look at while the firewall is in the problematic state. If you can isolate which one of these is the issue, your troubleshooting can be narrowed significantly:
1) Policy - Traffic is being blocked because it is not explicitly allowed in the Firewall/Network policy, has been blocked by an APCL/URLF/Threat Prevention rule, the connection has been expired by the firewall, Gaia "ate it", or in rare cases is not having stateful inspection properly applied to it. For troubleshooting these possibilities see the "roach motel" part of my CPX 2018 presentation and also look at sk101221: TCP state logging.
2) Routing - Traffic is not being forwarded correctly by non-Check Point code such as the Gaia IP Driver or the surrounding network. The usual tools such as ping, traceroute, and tcpdump are used to troubleshoot this.
3) NAT - Traffic is being NATted when it should not be, or traffic is not NATted when it should be. Obviously the firewall traffic logs can help with this. However, keep in mind that when using ISP Redundancy and an ISP failover occurs, all existing connections that were NATted through the old ISP will be killed. The matching NAT rule(s) is/are determined upon receipt of the first packet of a new connection, right after a matching Firewall/Network rule has issued an Accept. The NATting for an existing connection may *not* change once this has been determined.
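For the routing item, a quick first check from the gateway's shell could look like the sketch below. It uses only `ping`, and the function name `check_routing` is made up for illustration. Note that this only tests reachability from the gateway itself, not traffic forwarded through it, so a success here does not rule out a problem on the accelerated path.

```shell
# Quick reachability check toward one destination (item 2 above).
# check_routing is an illustrative name; pass a host that fails for users.
check_routing() {
    dest="$1"
    if ping -c 3 -W 2 "$dest" >/dev/null 2>&1; then
        echo "routing: $dest reachable"
    else
        echo "routing: $dest NOT reachable, check traceroute and tcpdump"
        return 1
    fi
}
```

Run it once with SecureXL on and once with it off against the same destination to see whether the result changes.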
As Heiko mentioned, SecureXL was significantly overhauled in R80.20, so much of that code is new, while the ISP Redundancy code is fairly old. Based on the descriptions so far, it sounds like something is not playing well between the two.
Hi,
We had this same issue when we upgraded our Active/Passive cluster from R80.10 to R80.20.
Straight after the upgrade, traffic through the AWS IPsec VPN stopped working. We stumbled upon disabling SecureXL, which got it working, and have since opened a TAC case for this.
All other VPNs are working fine; only the AWS IPsec traffic forces us to keep SecureXL disabled.
I came across this sk152832.
Sajid
Hello
I am having the same problem. I upgraded my 23500 VSX cluster from R77.30 to R80.20. In my case, only traffic from a certain vendor going through the firewall is impacted when SecureXL is enabled; everything else works fine. When I turn off SecureXL, that vendor's traffic works fine too. Check Point told me to install hotfix Take 118. I am installing it this weekend and hope it fixes the problem.