Enabled SecureXL means no traffic

Hi there,

Has anyone had problems with SecureXL after upgrading from R80.10 to R80.20?

At first I thought it might be a problem with NAT Templates, as they are disabled on R80.10 and enabled on R80.20, but it's not. I've turned them off and the issue persists.

Frankly speaking, I don't understand what is going on. fw.log shows everything is fine: rules are applied and working, but in practice there is no internet communication.

And here comes the miracle:
When I turn off SecureXL, everything works as it should. I have already opened a Technical Assistance Case, but it looks like they suck more than I do (except one wonderful woman with whom we found that SecureXL is the issue). So I decided to ask here: have you guys faced such a crazy issue?

Regards

Arek

 

13 Replies

Hi Arkadiusz,

I was wondering if you could help us in defining the problem in a bit more detail:

1) Can I confirm that the problem involves your internal users not being able to reach the Internet?

2) If the above is true, does the problem affect a subset of users or everybody?

3) Do you have multicast routing configured by any chance?

4) Is your firewall on a multi-core platform?

5) What troubleshooting steps have you taken, just so that we don't do any redundant work?

Many thanks.



Hi @Nick_Doropoulos,

Answering your questions:

1) Can I confirm that the problem involves your internal users not being able to reach the Internet?

Yes, but it also affects mobile users and IPsec tunnel traffic.

 

2) If the above is true, does the problem affect a subset of users or everybody?

Everybody

3) Do you have multicast routing configured by any chance?

No

4) Is your firewall on a multi-core platform?

No

5) What troubleshooting steps have you taken, just so that we don't do any redundant work?

Checked NAT

Checked the firewall by setting an any/any/any rule

Checked routing

Created a ticket and found out that disabling SecureXL solves the problem 🙂

This is a really simple config: one standalone 3000 series, no VLANs, just a plain subnet. The Check Point is the router/firewall for the subnet.

We are a small branch.

Regards

Arek


Hi,

Yes, we have experienced problems with SecureXL since R77.30. We have had multiple cases with Check Point about this. One is now on soft close since the upgrade of the gateways to R80.10.

The gateways are 5900 series in a cluster, multicore.

What makes this hard to troubleshoot is that it happens inconsistently and rather randomly. There can be three weeks or more between occurrences. Our experience is that the Check Point suddenly drops traffic; with ISP redundancy and clustering, the gateways raised alerts that their ISP gateways were unreachable, although they were up, and began to fail over. The last time we experienced this was 08-03-2019.

We have the following logging ready for when this happens again:

export TODAY=`/bin/date +%Y%m%d`
mkdir -p $HOME/$TODAY
cd $HOME/$TODAY
fwaccel conns > fwaccel-conns.txt
fwaccel stats > fwaccel-stats.txt
fwaccel stat > fwaccel-stat.txt
fwaccel tab -t connections > fwaccel-tab-connections.txt
fwaccel tab -t inbound_SAs > fwaccel-tab-inbound_SAs.txt
fwaccel tab -t outbound_SAs > fwaccel-tab-outbound_SAs.txt
fwaccel tab -t drop_templates > fwaccel-tab-drop_templates.txt
fwaccel tab -t vpn_link_selection > fwaccel-tab-vpn_link_selection.txt
fwaccel tab -t vpn_trusted_ifs > fwaccel-tab-vpn_trusted_ifs.txt
fwaccel tab -t invalid_replay_counter > fwaccel-tab-invalid_replay_counter.txt
fwaccel tab -t if_by_name > fwaccel-tab-if_by_name.txt
fwaccel tab -t frag_table > fwaccel-tab-frag_table.txt
fwaccel tab -t reset_table > fwaccel-tab-reset_table.txt
fwaccel dos stats get > fwaccel-dos-stats-get.txt
cp /var/log/messages /home/admin/messages
cpview history export
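
The per-table captures above could also be driven by a loop, so that adding a table is a one-word change. This is only a minimal sketch, assuming bash in expert mode: `OUTDIR`, `TABLES` and `snapshot` are names made up for illustration, and the table names are simply the ones listed above. If a command is not found (for example when trying the script off-box), it records that in the output file instead of failing.

```shell
#!/bin/bash
# Sketch only: OUTDIR, TABLES and snapshot() are illustrative names.
OUTDIR="${OUTDIR:-${HOME:-/tmp}/$(date +%Y%m%d)}"
TABLES="connections inbound_SAs outbound_SAs drop_templates vpn_link_selection
vpn_trusted_ifs invalid_replay_counter if_by_name frag_table reset_table"

# snapshot <output-file> <command...>
# Runs the command and saves stdout and stderr under $OUTDIR.
snapshot() {
    local out="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$OUTDIR/$out" 2>&1
    else
        echo "skipped: $1 not found" > "$OUTDIR/$out"
    fi
}

mkdir -p "$OUTDIR"
snapshot fwaccel-conns.txt fwaccel conns
snapshot fwaccel-stats.txt fwaccel stats
snapshot fwaccel-stat.txt  fwaccel stat
for t in $TABLES; do
    snapshot "fwaccel-tab-$t.txt" fwaccel tab -t "$t"
done
```

Setting `OUTDIR` before running the script redirects the snapshots, e.g. to a partition with more free space.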

Check Point also gave us these debug instructions:

# fw ctl debug 0
# fw ctl debug -buf 32000
# fw ctl debug -m fw + conn drop vm xlate xltrc nat
# fwaccel dbg -m general + drop nat del
# fwaccel dbg -m db + tcpstate ant del
# fwaccel dbg -m api + del
# sim dbg -m pkt + drop nat pkt spoof tcpstate
# sim dbg -m db + ant del
# sim dbg -m mgr + del add

  1. Run the debug:

# fw ctl kdebug -T -f >& /var/log/debug.ctl &

  2. Give the debug a few minutes to run.
  3. Stop the debug using Ctrl+C.
  4. Reset the debug flags:

# fw ctl debug 0
# sim dbg resetall
# fwaccel dbg resetall

Collect CPInfo from both members
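
Since a kernel debug adds load, one option is to bound how long it runs and to make sure the flags are always reset afterwards. The sketch below wraps the commands from the procedure above; it assumes bash, and `DRY_RUN`, `DURATION`, `LOGFILE`, `run`, `reset_flags` and `bounded_debug` are illustrative names, not anything TAC supplied. By default it only prints the commands so it can be reviewed first.

```shell
#!/bin/bash
# Sketch only: all variable and function names here are illustrative.
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, 0 = execute them
DURATION="${DURATION:-120}"    # capture time in seconds (arbitrary example value)
LOGFILE="${LOGFILE:-/var/log/debug.ctl}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

reset_flags() {
    run fw ctl debug 0
    run sim dbg resetall
    run fwaccel dbg resetall
}

# On Ctrl+C or TERM: stop the capture (if any), reset the flags, then exit.
trap 'kill "$KDEBUG_PID" 2>/dev/null; reset_flags; exit 1' INT TERM

bounded_debug() {
    run fw ctl debug 0
    run fw ctl debug -buf 32000
    run fw ctl debug -m fw + conn drop vm xlate xltrc nat
    run fwaccel dbg -m general + drop nat del
    run fwaccel dbg -m db + tcpstate ant del
    run fwaccel dbg -m api + del
    run sim dbg -m pkt + drop nat pkt spoof tcpstate
    run sim dbg -m db + ant del
    run sim dbg -m mgr + del add

    if [ "$DRY_RUN" = "1" ]; then
        echo "+ fw ctl kdebug -T -f > $LOGFILE 2>&1 (for ${DURATION}s)"
    else
        fw ctl kdebug -T -f > "$LOGFILE" 2>&1 &
        KDEBUG_PID=$!
        sleep "$DURATION"
        kill "$KDEBUG_PID" 2>/dev/null
    fi
    reset_flags
}

bounded_debug
```

Running it with `DRY_RUN=0 DURATION=60` would execute the capture for one minute; whether that is long enough to catch the problem is a judgment call for your environment.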

But because of the high CPU usage on the gateways we are not keen to follow this, as it will have a major impact, especially during European working hours. That is also a bit of a problem for troubleshooting: we cannot really afford major downtime, and with fwaccel off we are back up in a split second.

We have drop optimization on but NAT Templates off.

We also have a site-to-site VPN issue at the moment where, as a workaround, we had to follow the procedure in sk61221 and disable SecureXL at the end. We are not really sure whether SecureXL, in combination with sk61221, is to blame for the site-to-site VPN problems. It may be that Check Point just wants it disabled for other reasons, such as seeing the behaviour without SecureXL.

We are planning to upgrade the gateways to R80.20 next and have our fingers crossed that it resolves the SecureXL problems.


What ClusterXL mode are you using? Active / Standby, Active / Active?

Have you seen this sk? It sounds like it could be relevant and seems to apply to R77.30, R80.10 and R80.20. Although, I'm not sure if this hotfix was rolled into an HFA by now.

Are you able to try disabling ISP redundancy but leave SecureXL on? 

Do you see accept / allows logged for the connections that aren't working? Do you see any logs at all?

 

R80 CCSA / CCSE

Hi @Arkadiusz_Szyma 

R80.20 and above:
- SecureXL has been significantly revised in R80.20. It now works in user space. This has also led to some changes in "fw monitor".
- There are new fw monitor chain (SecureXL) objects that do not run in the virtual machine.
- The reason for the move to user space: the SecureXL driver takes a certain amount of kernel memory per core, and that was adding up to more kernel memory than Intel/Linux was allowing.
- SecureXL now supports Async SecureXL with Falcon cards.
- New in the high-level acceleration architecture (SecureXL on Acceleration Card): Streaming over SecureXL, Lite Parsers, Scalable SecureXL, Acceleration stickiness.

Solution:

Use manual hide NAT rules instead of the automatic hide NAT rules in the gateway object, and disable drop templates. I had similar problems and that helped.


Quite useful, thanks! I also recognize the remark about disabling drop templates. Looks like we need to upgrade the cluster gateways to R80.20 ASAP.

Interesting that this issue also occurs on other firewalls.


Same issue here, but my environment is R80.20 with Gaia 3.1 (bridge mode).

Disabling SecureXL should work (as a workaround?).

From Take 5 to Take 11, the issue still exists.

About TAC: I hope Check Point will improve its level of professionalism...

 

Admin

If I were troubleshooting this, I'd start with what connections look like from a known host when things are "working" versus when they are "not."
Based on what you've described, it seems to be an issue with NAT, but since you didn't provide a network diagram and mentioned NAT templates, that's only a guess.

Regardless, if "disabling SecureXL solves the issue" then it's most likely a bug and you'll need to work with TAC.
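
One way to do the "working" versus "not working" comparison for a known host is to diff two saved snapshots of connection output, filtered to that host. This is a rough sketch, assuming bash and two plain-text snapshot files taken in each state; `host_diff` is a made-up helper name, and no particular output format is assumed, so volatile fields such as timers will also show up as differences.

```shell
#!/bin/bash
# Sketch only: host_diff is an illustrative helper, not a Check Point tool.
# Usage: host_diff <host-ip> <working-snapshot> <broken-snapshot>
# Prints lines mentioning the host that differ between the two snapshots.
# Returns 0 if the filtered views are identical, 1 if they differ.
host_diff() {
    local host="$1" good="$2" bad="$3"
    local tmp_good tmp_bad rc
    tmp_good=$(mktemp)
    tmp_bad=$(mktemp)
    # -w avoids matching 10.0.0.50 when looking for 10.0.0.5; -F treats dots literally.
    grep -wF -- "$host" "$good" | sort > "$tmp_good"
    grep -wF -- "$host" "$bad"  | sort > "$tmp_bad"
    diff "$tmp_good" "$tmp_bad"
    rc=$?
    rm -f "$tmp_good" "$tmp_bad"
    return $rc
}

# Example call (file names are placeholders):
#   host_diff 10.0.0.5 conns-working.txt conns-broken.txt
```

Lines prefixed with `<` exist only in the working snapshot, and lines prefixed with `>` only in the broken one.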

When troubleshooting a firewall-related issue, it normally comes down to one of these three things that you will need to look at while the firewall is in the problematic state.  If you can isolate which one of these is the issue, your troubleshooting can be narrowed significantly:

1) Policy - Traffic is being blocked because it is not explicitly allowed in the Firewall/Network policy, has been blocked by an APCL/URLF/Threat Prevention rule, the connection has been expired by the firewall, Gaia "ate it", or in rare cases is not having stateful inspection properly applied to it. For troubleshooting these possibilities, see the "roach motel" part of my CPX 2018 presentation and also look at sk101221: TCP state logging.

2) Routing - Traffic is not being forwarded correctly by non-Check Point code such as the Gaia IP Driver or the surrounding network.  The usual tools such as ping, traceroute, and tcpdump are used to troubleshoot this.

3) NAT - Traffic is being NATted when it should not be, or traffic is not being NATted when it should be. Obviously the firewall traffic logs can help with this. However, keep in mind that when using ISP Redundancy and an ISP failover occurs, all existing connections that were NATted through the old ISP will be killed. The matching NAT rule(s) is/are determined upon receipt of the first packet of a new connection, right after a matching Firewall/Network rule has issued an Accept. The NATting for an existing connection may *not* change once this has been determined.

As Heiko mentioned, SecureXL was significantly overhauled in R80.20, so much of that code is new, while the ISP Redundancy code is fairly old. Based on the descriptions so far, it sounds like something is not playing well between the two.

 

R80.40 addendum for book "Max Power 2020" now available
for free download at http://www.maxpowerfirewalls.com
Nickel

Hi,

 

We had this same issue when we upgraded an Active/Passive cluster from R80.10 to R80.20.

Straight after the upgrade, traffic through the AWS IPsec VPN stopped working. We stumbled upon disabling SecureXL, which got it working, and have since opened a TAC case for this.

All other VPNs are fine and working; it is just the AWS IPsec traffic that forces us to keep SecureXL disabled.

I came across sk152832.

Sajid


Hello

 

I am having the same problem. I upgraded my 23500 VSX cluster from R77.30 to R80.20, but in my case only traffic from a certain vendor going through the firewall is impacted when SecureXL is enabled. Everything else works fine. When I turn off SecureXL, that vendor traffic works fine too. Check Point told me to install hotfix Take 118. I am installing it this weekend and hopefully it fixes the problem.
