Timothy_Hall
Legend

fw_runfilter_ex(ctx id 0): function does not exist -1

This weekend I was on-call for a customer performing an in-place R77.30 to R80.10 upgrade of a ClusterXL cluster, and I wanted to share what we found out as it was very difficult to diagnose.  The customer already had a gateway they had upgraded successfully to R80.10 beforehand at their DR site.

The customer called and said that after upgrading one ClusterXL member to R80.10 and failing over traffic to it, nothing would pass through the upgraded gateway even though policy had been installed.  I whipped out the trusty fw ctl zdebug drop command and was greeted with screenfuls of this (which was also being dumped into /var/log/messages at a rapid rate):

Feb 17 08:47:00 2018 XXX kernel: [fw4_0];FW-1: fw_runfilter_ex(ctx id 0): function does not exist -1

Feb 17 08:47:00 2018 XXX kernel: [fw4_2];FW-1: fw_runfilter_ex(ctx id 0): function does not exist -1
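For anyone who wants to gauge how fast this message is piling up, here is a minimal Python sketch that counts hits per firewall instance. The regular expression is built only from the two sample lines above, so other log layouts are an unverified assumption, and the function name `count_runfilter_errors` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the two sample log lines above; it is an assumption
# that every occurrence of the error follows this exact layout.
ERROR_RE = re.compile(
    r"kernel: \[(?P<instance>fw\d+_\d+)\];FW-1: "
    r"fw_runfilter_ex\(ctx id (?P<ctx>\d+)\): function does not exist -1"
)


def count_runfilter_errors(lines):
    """Return a Counter mapping firewall instance (e.g. 'fw4_0') to hit count."""
    hits = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            hits[match.group("instance")] += 1
    return hits
```

The function accepts any iterable of lines, so an open file handle on /var/log/messages (or a copy of it) can be passed in directly.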

A quick search of SecureKnowledge/CheckMates/CPUG yielded a big goose egg on this error message.  The term "filter" appearing in it did seem to imply an issue with the Firewall blade, and disabling other blades such as IPS/TP didn't have any effect on the issue.  I assumed that something went wrong in the upgrade process (even though all the upgrade logs in /opt/CPInstLog looked good) and proceeded to do a fresh load of R80.10 plus Jumbo Take 70 and reload the Gaia configuration.  Pushed policy, everything looked good, customer completed their test plan.  Fresh-loaded the other gateway with R80.10 Jumbo Take 70, everything looked good, failed over and customer completed their test plan.

While trying to clear up some policy installation warnings and give the customer a "clean & green" outcome, suddenly the issue came back (accompanied I might add by a raft of expletives uttered by myself and the customer).  No amount of rebooting (including a simultaneous reboot of both cluster members) could seem to make it go away, then suddenly it stopped right after a policy install and everything started working.  We changed the settings back that made it start working again to "re-break" it and installed policy again.  Still worked.  Hmm.

The customer made some more changes and the issue came back again; backing out the recent changes and reinstalling policy didn't fix it.  I cried uncle at this point and, after involving Check Point support (who were excellent by the way - kudos to Efim Bliacher), we found out that this was a known issue fixed in an ongoing take but not yet documented.  The conditions that lead to this situation are:

1) More than one set of R80.10 gateways/clusters being managed by the same SMS

and

2) The gateways/clusters are using different IPS/TP profiles

and

3) Policy is pushed to 2 or more gateways/clusters in a single operation (in our case to both the cluster and DR firewall)
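The three conditions above can be expressed as a quick pre-push check. This is only a sketch: the `Target` class, its fields, and `push_is_risky` are hypothetical names invented for illustration, and the code assumes every target in the list is managed by the same SMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str      # gateway or cluster object name (hypothetical)
    version: str   # software version, e.g. "R80.10"
    profile: str   # IPS/TP profile assigned to this target


def push_is_risky(targets):
    """Return True when one push to `targets` matches all three conditions.

    Assumes all targets are managed by the same SMS (condition 1).
    """
    r8010 = [t for t in targets if t.version == "R80.10"]
    # Conditions 1 and 3: two or more R80.10 gateways/clusters in a single push.
    if len(r8010) < 2:
        return False
    # Condition 2: the targets do not all share the same IPS/TP profile.
    return len({t.profile for t in r8010}) > 1
```

For example, a push that includes both a cluster and a DR gateway with different profile names would return True, while a push to either one alone would return False.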

So the workaround was to push to only one set of firewalls at a time.  Any time during our testing that we happened to push to both sets of gateways simultaneously, the issue would come back, regardless of any other changes we were making.  Apparently the root cause was the Inspection Settings (which were split out from the IPS blade in R80.10): during a simultaneous policy push, elements of each gateway/cluster's Inspection Settings were used to inappropriately populate Implied Rules on the other gateway, leaving the other gateway with a reference in its Implied Rules to an object/function that didn't exist.  As many of the Implied Rules are always checked first, the error would essentially drop all new connections trying to start (but existing connections would continue).  This effect did have some cosmetic similarities to this SK: sk97704: Security Gateway may stop accepting new IPv4 connections when working with Dynamic Objects ...
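If policy pushes are scripted, the one-set-at-a-time workaround might look like the sketch below. It deliberately contains no Check Point API calls: `install_policy` is a hypothetical callable supplied by the caller that installs policy on exactly the targets it is given and returns True on success.

```python
def push_one_at_a_time(targets, install_policy):
    """Call install_policy once per target instead of once for the whole list.

    install_policy is a caller-supplied (hypothetical) callable that takes a
    list of target names and returns True on success.
    """
    results = {}
    for target in targets:
        ok = install_policy([target])  # never more than one gateway/cluster per push
        results[target] = ok
        if not ok:
            # Stop here so a failed push is investigated before the next target.
            break
    return results
```

Stopping at the first failure is a design choice for this sketch; continuing on to the remaining targets would be equally valid for the workaround itself.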

I had seen situations before where trying to push to more than one gateway/cluster at a time would cause policy installation failures due to IPS Profile conflicts, but never ones that would allow the policy to be installed and immediately cause a critical traffic-handling failure on the gateway/cluster.  Hopefully this writeup wasn't too long, and will help someone else.

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
7 Replies
Vladimir
Champion

Wow, this is a good one.

I am somewhat surprised that I have not yet encountered it in my lab.

Kaspars_Zibarts
Employee

One to note, thanks for sharing!

XBensemhoun
Employee

Good to know! Thanks.

Information Security enthusiast, CISSP, CCSP
Astardzhiev
Contributor

Thanks, Tim, for sharing this experience; indeed very interesting! I haven't been in a situation that requires installing the same policy on three firewalls, which makes this case very difficult to catch.

Timothy_Hall
Legend

Check Point has created an SK for this condition:

sk123040: All traffic dropping with message "dropped by fw_runfilter_ex Reason: function does not ex...

That was quick, thanks!

Garrett_DirSec
Advisor

Hello Tim -- thanks for the write-up and reference to sk123040.  This must have been incredibly frustrating in the heat of conversion.

The new 2nd edition book is great!  Thanks.

Timothy_Hall
Legend

Just to follow up on this thread, the fix for this condition is now available in a GA R80.10 Jumbo HFA, Take 91+.

 

