TAC Fixing Bugs vs Finding Work Arounds

Dale_Lobb · 2019-02-05

Has anyone else had trouble getting Check Point support to acknowledge that an issue needs to go to development or, at the very least, be referred to whoever maintains the documentation and SK articles?

I have three open cases (well, two now, as one was closed this morning) where I have found what I consider to be either bugs in the software or erroneous documentation. But Check Point support seems to be totally focused on finding work-arounds to the issue rather than gathering data and fixing the problem.

Case #1: SK81740 purports to be a method for receiving e-mail notifications when ClusterXL fails over to another node. Except it isn't. The method described in the article sends notifications when a cluster member's ClusterXL status goes up or down, which does not necessarily (and, in fact, rarely does) coincide with failover. I've pointed out in the ticket a number of times that either the SK is wrong and needs to be updated or, if the SK actually reflects what is supposed to happen, the software was implemented incorrectly. But Support is totally focused on providing a work-around, not fixing the problem. They've pointed out SK65923 a number of times, which is a method for providing log events to an SNMP management system, from which you could have the SNMP system issue an alert. Before I go to that level of complexity, I'll write a simple bash script using "cphaprob state" and have it send an e-mail...
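
The check such a script needs is small: remember the local member's last state and alert only when it moves into or out of the active role. Here is a minimal sketch of that logic in Python rather than bash. It is not taken from any SK. It assumes that the "cphaprob state" output marks the local member's row with "(local)" and ends that row with the state name; the sample text in the usage note is likewise an assumed layout, and the function names are hypothetical.

```python
from typing import Optional


def local_state(cphaprob_output: str) -> Optional[str]:
    """Return the local member's state from `cphaprob state` text.

    Assumption: the local member's row contains "(local)" and its last
    whitespace-separated token is the state (for example "Active").
    """
    for line in cphaprob_output.splitlines():
        if "(local)" in line:
            return line.split()[-1].lower()
    return None  # no local row found, so the state is unknown


def is_failover(previous: Optional[str], current: Optional[str]) -> bool:
    """Report a failover only when the member moves into or out of 'active'.

    A repeated state, or a change between two non-active states, is a
    status change but not a failover.
    """
    if previous is None or current is None or previous == current:
        return False
    return "active" in (previous, current)
```

A polling script would run "cphaprob state" every so often, pass the text to local_state(), compare the result with the value saved from the previous run using is_failover(), and send mail only when that returns True. A member that reports "standby" twice in a row produces no alert.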

Case #2: Changes made to the HTTPS inspection policy are saved automatically without prompting if you close the R77.30 SmartDashboard using the Windows close widget (the X in the upper right-hand corner) while the HTTPS policy is displayed. As near as I can tell, all other screens issue a prompt asking whether you want to save your changes in this scenario, even if the changes were made to the HTTPS inspection policy but one is currently on another SmartDashboard screen. Recently, a newbie firewall admin accidentally left a default any-any-internet-inspect rule at the top of the HTTPS policy because he thought that closing the window would be good enough to erase his experimentation while he was learning the SmartDashboard. When the rule finally got pushed out with some other changes a couple of hours later, it resulted in a major issue for a great many users and applications. But Support seems to think that this is a desirable feature, as R80 and above gives one the ability to discard all changes as a failsafe. The work-around here seems to be: upgrade to R80 or above.

Case #3: Apparently there is a nebulous set of circumstances (policy rules) that can cause a gateway to perform pre-probe bypass inspection, even if probe bypass is turned on. We have discovered one of these scenarios, as we are having occasional issues with HTTPS inspection returning a firewall-signed temporary certificate to a client, even though the Tracker logs show HTTPS Bypass as the result of inspection for the session. Support has shown that if we use the built-in applications for the HTTPS inspection bypass rule, rather than a custom application object built from the expected certificate subjects, then probe bypass works as documented. However, I told them that we have had issues using the built-in objects, as they don't always work to prevent HTTPS inspection or allow access in the Application and URL filtering policy. My guess is that they aren't always up to date, as internet services are constantly changing. However, since customers have no visibility into how the built-in apps actually work and what they are looking for, we have chosen to use custom-built applications, especially in cases where the built-in app has failed at least once before. After pointing this out to Support in the case, I was told that if we choose not to use the built-in apps, we will have to live with the situation as it now exists.

Why isn't Support looking to actually fix these issues?

PhoneBoy · 2019-02-08

Please send me the SRs related to the above in a private message and I'll have someone take a look.

Comments about an SK containing incorrect information can (and should) be reported directly on the SK itself.

There's a comment field on the very bottom.

HristoGrigorov · 2019-02-08

I experienced a similar issue lately. As long as I was the only one reporting the problem, TAC insisted that it was a problem in my environment, although it was apparently a bug in the software. Then one day I was suddenly notified that other customers had reported a similar problem and that they were now bringing it to R&D's attention.

KennyManrique · 2019-02-10

Funny issue here... I think...

I reported an issue with some HTTPS sites related to certificate length, where the workaround I found was to set the categorization mode on APC & URLF to background instead of hold. The TAC reviewed all the debugs I sent and, instead of providing a fix for my version, they said it will be solved in the next JHF for R80.20 and I should stay tuned for updates.

First: my customer was on a different version (R80.10). Though I can say to the customer "let's upgrade", it requires more planning.

Second: the issue was reported a month ago and there is still no new version of the R80.20 JHF, nor a release date.

Third: case was closed and marked as solved...WOW!

This is only one of my struggles with TAC.

Regards.

PhoneBoy · 2019-02-10

Can you send me the relevant SR in a private message?

Corporacion_Ame · 2019-02-14

Well, we are not alone.

We have had multiple cases open for about 4 months with no solution and the same bot answer: "install the latest jumbo". Even after installing the jumbo, we still have the same issue...

Yair_Shahar · 2019-02-18

Hi

As for Case #1

Mail Alerts for Cluster Failover do work.

I'm trying to understand whether the SK is not clear enough about the configuration it suggests, or whether something else is going wrong.

For clarification

Having the following example configuration in Global Properties and setting Mail Alert in the cluster object should do the job once the policy is installed on the cluster:

internal_sendmail -s Cluster_Failover_Event -t mail_server_ip -f admin@company.com user@company.com

Dale_Lobb · 2019-03-01

This is the solution from SK81740. Actually, it does not work, at least not the way in which the SK describes it. The reason is that it does not report cluster failover. It reports ClusterXL state changes on individual nodes, which do not necessarily, and actually, rarely, correspond with failover.

For example, the following tracker log entries occurred when we received an e-mail alert today, but the alert did not correspond to a failover event:

1) Product Family:  Network
   Type:            System Alert
   Action:
   Date:            1Mar2019
   Time:            13:18:16
   Number:          11546600
   Origin:          xxxxxxxx
   Information:     cluster_info: (ClusterXL) member 1 (xxx.xxx.xxx.xxx) is up.
   Product:         Security Gateway/Management
   Policy Info:     Policy Name: Standard
                    Created at: Fri Mar 01 13:15:02 2019
                    Installed from: xxxxxxx

2) Product Family:  Network
   Type:            Control
   Action:
   Date:            1Mar2019
   Time:            13:18:16
   Number:          11546601
   Origin:          xxxxxxx
   Information:     cluster_info: (ClusterXL) member 1 (xxx.xxx.xxx.xxx) is standby.
   Product:         Security Gateway/Management
   Policy Info:     Policy Name: Standard
                    Created at: Fri Mar 01 13:15:02 2019
                    Installed from: xxxxxxx

Notice two things:

a) Log entry (2) above shows the node reported that it was in standby mode, but it was already in standby mode, so the e-mail notification alert did not result from a failover event.

b) Log entry (2) above is a type "Control" log entry. These do not get reported by SK81740. Log entry (1) above, of type "System Alert", is what generates the e-mail alert for SK81740. As one can see, log entry (1) above is merely reporting the state of ClusterXL on the node. That information is what shows up in the e-mail alert: the state of ClusterXL, not that a failover event happened.

We get these e-mail alerts about ClusterXL status changes during almost every policy installation. However, as the cluster is under freeze during policy installation, cluster failover cannot, in fact, happen.

SK81740 is titled "How to configure a cluster to send Mail Alert upon failover", but what it actually reports is the state of ClusterXL, not actual cluster failover events.
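
To make the distinction concrete, here is a rough Python sketch that reads "Information" strings like the ones in the log entries above and reports only genuine role changes. It treats "up" and "down" as ClusterXL status rather than role, and ignores a member that reports the role it already had. The function name, the known_roles argument, and the assumption that an active member is logged with the word "active" are mine for illustration; only the "up" and "standby" messages appear in the logs quoted above.

```python
import re

# Matches the "Information" text quoted in the log entries above, e.g.
# "cluster_info: (ClusterXL) member 1 (xxx.xxx.xxx.xxx) is standby."
INFO = re.compile(r"\(ClusterXL\) member (\d+) \([^)]*\) is (\w+)\.")

# Assumption: roles are logged as "active" or "standby"; anything else
# ("up", "down") is treated as ClusterXL status, not a role.
ROLES = {"active", "standby"}


def role_changes(info_lines, known_roles=None):
    """Return (member, old_role, new_role) tuples for real role changes only."""
    roles = dict(known_roles or {})
    changes = []
    for line in info_lines:
        match = INFO.search(line)
        if match is None:
            continue  # not a ClusterXL member message
        member, state = match.group(1), match.group(2).lower()
        if state not in ROLES:
            continue  # status message such as "is up."
        old = roles.get(member)
        if old is not None and old != state:
            changes.append((member, old, state))
        roles[member] = state
    return changes
```

Fed the two entries above with member 1 already known to be standby, for example role_changes(entries, {"1": "standby"}), this returns an empty list: no failover, even though SK81740 sent a mail alert.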
