Nickel

r80.30 take 196 - fw unable to accept new connections until fwpol is reapplied from mgmt server


Has anyone else experienced issues with the fw not passing traffic after updating to take 196?


Previously we did not have to push policy when updating between jumbo takes (e.g. 155 to 196).

However, we ran into issues last night: after we updated to take 196, new connections were not accepted until we pushed policy post-patch and reboot.

Our assumption was that, since the fwpol is cached on the fw and loaded on reboot, a policy push shouldn't be mandatory post-patch.
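A quick way to sanity-check what the gateway actually loaded after the reboot (a sketch; the local policy path is assumed from a standard Gaia layout, so treat it as an assumption):

```shell
# Show which policy the gateway is currently enforcing and when it was installed
fw stat

# The locally cached policy that the gateway loads at boot should live under
# $FWDIR/state/local/FW1/ (path assumed from a typical Gaia install)
ls -l $FWDIR/state/local/FW1/
```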

 

v/r,

Jon

(We have opened a TAC case for our RFO.)


Accepted Solutions

We saw this on our recent VSX upgrades to take 196 (three separate clusters had the issue). Our upgrades were from 155 to 196.

Also resolved with a policy install. Oddly, it did not impact all VSs, only some.

Fortunately I found a community post at the time, as I was scratching my head over what had happened.

Didn't bother with a TAC case as I had completed all of the required upgrades, but it certainly seems to be an issue for multiple Check Point customers.


15 Replies
It's strange. Were you able to run fw ctl zdebug drop to see the drop reason for new connections?
____________
https://www.linkedin.com/in/federicomeiners/
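For anyone who can afford a short debug window next time this happens, the drop reason can usually be captured with something like this (a sketch; the IP filter is a placeholder, and zdebug adds noticeable load on a busy gateway):

```shell
# Stream kernel drop notifications; Ctrl-C to stop.
# Pipe through grep to narrow output to a test host (192.0.2.55 is a placeholder).
fw ctl zdebug + drop | grep 192.0.2.55
```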
Nickel
No, we were more concerned about uptime than debugging; we have also had some serious impact recently when running captures.
Nickel
Hi,
We have experienced exactly the same issue.
In three or four patch applications, we have seen the issue on the firewall with the higher uptime.

Hi!

We had the same issue on a VSX cluster. We did not have a total outage, but most allowed traffic was blocked by the final cleanup rule. A manual policy installation did not help. We uninstalled take 196 again and opened a TAC case.

Martin


Yesterday we deployed the same take (196) on our R80.30 kernel 2.6 VSX HA cluster of two 23500 appliances.
After installing the JHF on the last member, sync was broken and completely corrupted on that member (Active/Down). So far we haven't managed to uninstall the JHF, since we ran out of time in the maintenance window.

Case opened: 6-0002039948

____________
https://www.linkedin.com/in/federicomeiners/

Hi!

 

Did you get an update from the TAC team? What have they found so far?

 

Best regards

 

Martin

Silver

Is it on 2.6 or 3.10? I've deployed Take 196 in large VSX environments running R80.30 3.10 without apparent issues, in case that makes a difference.
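For anyone unsure which kernel their R80.30 gateway runs, a quick check (a sketch; both commands are standard on Gaia to the best of my knowledge):

```shell
# Kernel version directly from the OS
uname -r                             # e.g. a 2.6.18-* vs 3.10.0-* string

# Gaia clish equivalent
clish -c "show version os kernel"
```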

Nickel

k3.10

firewall-1 ~ # cpinfo -yall

This is Check Point CPinfo Build 914000202 for GAIA
[IDA]
No hotfixes..

[MGMT]
No hotfixes..

[CPFC]
HOTFIX_R80_30_GOGO_JHF_MAIN Take: 155

[FW1]
HOTFIX_MAAS_TUNNEL_AUTOUPDATE
HOTFIX_R80_30_GOGO_JHF_MAIN Take: 155

FW1 build number:
This is Check Point's software version R80.30 - Build 001
kernel: R80.30 - Build 159

[SecurePlatform]
HOTFIX_R80_30_GOGO_JHF_MAIN Take: 155

[PPACK]
HOTFIX_R80_30_GOGO_JHF_MAIN Take: 155

[CPinfo]
No hotfixes..

[CPUpdates]
BUNDLE_MAAS_TUNNEL_AUTOUPDATE Take: 25
BUNDLE_CPINFO Take: 50
BUNDLE_INFRA_AUTOUPDATE Take: 32
BUNDLE_DEP_INSTALLER_AUTOUPDATE Take: 13
BUNDLE_R80_30_JUMBO_HF_MAIN_3_10_GW Take: 155

[AutoUpdater]
No hotfixes..

[DIAG]
No hotfixes..

[CVPN]
No hotfixes..

[CPDepInst]
No hotfixes..

Employee++

Hi All.

My name is Yifat Chen, and I manage R80.30 Jumbo releases at Check Point.

Thanks for all the details you shared here. We have a ticket associated with this issue and will update here ASAP with our findings.

Release Management Group

 


Hi,

are there any updates regarding this issue?

We are planning to update our GW to Jumbo 196.

Thanks and best regards

Tobias

Nickel

TAC:
"When upgrading the jumbo hotfix on a gateway pushing policy is not a required step. The gateway should load the last successfully pushed policy post-reboot.

However, if encountering traffic issues post hot-fix installation one of the first steps recommended would be pushing policy. "
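If you want to make that precautionary policy push part of a scripted upgrade, it can be driven from the R80.x management server with mgmt_cli, roughly like this (a sketch; the policy package and target names are placeholders):

```shell
# Run on the management server; -r true logs in as root without stored credentials
mgmt_cli -r true install-policy policy-package "Standard" \
  access true \
  targets.1 "gw-member-a" targets.2 "gw-member-b"
```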


Hi!

As described above, in our case this did not help.

We pushed the policy after upgrading to jumbo take 196 + reboot, but the policy was not enforced properly.

 

Best regards Martin


I just want to share that we had the same problem on all our non-VSX HA clusters with T191.

So it seems the problem was first introduced in T191 and is not VSX-related.

Details if needed:


Source Version:

R80.30 Gaia 2.6 JHF 155

Target Version:

R80.30 Gaia 2.6 JHF T191

Symptoms exactly as described here:

  1. Update passive cluster member. Reboot it.
  2. Checked that the passive member had loaded the policy, that load had stabilized at about zero, and that sync was working fine. Also checked fw ctl multik stat for an updated connections table and cphaprob syncstat for sync updates.
  3. Switch traffic to updated member by clusterXL_admin down on non-updated member.
  4. Complete outage, because updated member drops all traffic.
  5. Switched traffic back to non-updated member by clusterXL_admin up.
  6. Traffic is working again.
  7. Doing a policy install.
  8. Switch traffic to updated member again by clusterXL_admin down on non-updated member.
  9. This time, everything is working.
  10. The same problem occurred on the second member: it also needed a policy installation after the first boot with T191 before it would accept traffic.

After having the same experience with our first two clusters, we changed the workflow for all following ones to avoid the problem:

  1. Update passive cluster member. Reboot it.
  2. Checked that the passive member had loaded the policy, that load had stabilized at about zero, and that sync was working fine. Also checked fw ctl multik stat for an updated connections table and cphaprob syncstat for sync updates.
  3. Doing a policy install.
  4. Switch traffic to updated member by clusterXL_admin down on non-updated member.
  5. Traffic is working fine.
  6. Update second cluster member. Reboot it.
  7. Checked that the passive member had loaded the policy, that load had stabilized at about zero, and that sync was working fine. Also checked fw ctl multik stat for an updated connections table and cphaprob syncstat for sync updates.
  8. Doing a policy install.
  9. Switch traffic to updated member by clusterXL_admin up on last updated member.

We later tried a reboot to reproduce the problem, but it was not reproducible. So it only occurred on the first boot after the minor version update.

 

 

Nickel
So far it has happened on our 2.6 appliance and our 3.10 open-server FW.
Pretty certain that 196 will drop all new connections until the mgmt server pushes policy.
We tried running a fw fetch, and the gw said it downloaded and applied the policy, but the new-connection issue didn't go away until we installed policy from the mgmt server.
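For anyone comparing notes, the fetch variants discussed here look roughly like this (a sketch; the management IP is a placeholder):

```shell
fw fetch 192.0.2.10     # pull the current policy from the mgmt server
fw fetch localhost      # reload the locally cached policy instead
```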