cmpwat
Participant

Check Point issues even they can’t resolve!

So, I have been advised by Phoneboy to post my issues here with the hope that someone might be able to provide some help on them so here goes..... oh it’s a long one!

We were forced to upgrade from R77.30, like a lot of people, so we arranged for a new gateway and management server to be built on new Dell PowerEdge servers (both of which were checked by Check Point and confirmed as OK; they were on the compatibility list).

Both the management server and gateway were built on R80.30 and, to begin with, all was fine. However, before long we noticed a memory leak: the 32 GB of RAM was being consumed to the point that the gateway was either crashing and rebooting or required a reboot after 12 hours. It was decided that the issue was that my licence hadn’t been cut correctly and was still running an old version, so it needed to be renewed, which I did, but then the issue with CPUs started.

The gateway was using more CPUs than it was licensed for. Cue some more investigations, and I was told that the only way to resolve this was to upgrade to R80.40.

This was planned and carried out, and more issues occurred! Our SecurID VPNs no longer worked and the workaround from CP didn’t work either. So, to stop the whole business from being unable to connect, I rolled back using the CP method, and this failed. We had to reset the database on the gateway and the management server to get the two to talk again, but in doing so the threat management blade screwed up and the only way to resolve it is to rebuild the gateway!

CP provided us with a 5900 to use whilst we had our open server rebuilt to R80.40. The appliance is on R80.40 but we cannot use it, as the SecurID VPNs still do not work. It’s been looked at a number of times and logs have been taken, and the developers still haven’t got a fix.

So I have:

An open server with a potential memory leak if I use my licence. If I use an eval licence, then we have too many CPUs being used. The ‘fix’ for this is to update to R80.40, but this doesn’t work as it stops my users connecting to the VPN because the SDCONF.rec file keeps being overwritten (yes, the SK has been read and doesn’t work).

I have a 5900 that I cannot use as it’s on R80.40 and has the same issue as the open server with the SDCONF.rec file. Not sure about the memory or CPU as it’s not been used in anger!

So I am hoping that R81 will be the saviour in all of this, as it’s been over 6 months now and I still do not have a fully working gateway and management server that doesn’t require some sort of babysitting on a weekly, if not daily, basis.

1 Solution

Accepted Solutions
_Val_
Admin

The issue is resolved.

After certain TAC and R&D efforts, we have identified the root cause and instructed the customer on how to fix it. 

When migrating from the 5900 to an open server, the node secret settings on the SecurID server were not refreshed. As a result, communication between the GW and the authentication server could not work.

After a new node secret was set up, it works.

40 Replies
PhoneBoy
Admin

The sdconf.rec-type integration with SecurID has been around for a long time, but I think the current way to integrate is via RADIUS.
Is there some reason you're not using RADIUS?

If you can send me the SRs in a PM/email, we can look into the memory leaks.

cmpwat
Participant

The sdconf method was set up by my predecessor and works in R80.30 with no problems, so on that basis it should work on R80.40! I’m not about to change the way users authenticate in the middle of a pandemic while most are working from home.

With regard to the SRs, I do not have them, as a third party is liaising with Check Point and has the SRs.

_Val_
Admin

Where do you configure the sdconf.rec files, on the management server or the GW? PhoneBoy has a point: the best practice is to configure RADIUS auth on the management side. If you are doing it per GW, please look into sk166314.

_Val_
Admin

@cmpwat another note.

You are claiming that there are serious issues that (quoting) "even Check Point can't resolve". To address this claim, we need to review your technical support cases. With all due respect, getting those SR numbers and giving us a fair chance to work through them should not be too complicated.

I have pulled your company data, and I do not see a single open support request there. Your latest support calls were closed in June 2020. I am also having a hard time finding anything that would be relevant to the issues you mention in this post.

We encourage honest and transparent communication in this community. Your opinion is important, and also we would like to help as much as we can, and in any way we can, to resolve something that needs to be fixed. If required, we are in a position to ask developers and support experts for assistance.

However, the amount of actionable information in this discussion does not allow us to provide you with meaningful answers. Kindly check if you can get more from your support partner.

Thanks for your assistance,
Val

PhoneBoy
Admin

Also the support partner should be able to provide the Check Point SRs in question.

PhoneBoy
Admin

One major difference between R80.30 and R80.40 is the Linux kernel.
I suspect this has something to do with the sdconf.rec issue you’re having.
Changing to RADIUS should not impact how your users authenticate.

cmpwat
Participant

But the question is why should I change to RADIUS, or to put it another way, why should I be pushed into changing my authentication method by Check Point?

My current method worked for a long time on R77.30. It works on R80.30 with no problem. It's now not working on R80.40. Surely Check Point should be looking to resolve this and not force admins to change methods. It's not a simple change.

_Val_
Admin

Once again, I urge you to provide the relevant SRs so we can look into this.

Thanks

Val

_Val_
Admin

Looking at the SR, I can see that your support specialist provided you with the fix last week. I am looking forward to hearing about the results. Please do inform the community about the outcome.

Kaspars_Zibarts
Authority

The 5900 can run R80.30. Have you considered that option? It's a more mature release by far. R80.40 is still pretty "young". I would not entertain the idea of R81 for a business-critical network. Are you running VSX? Just curious.

cmpwat
Participant

I was told by Check Point to install R80.40 as a fix for the memory leak we are suffering on the open server, so I have not really had a choice about it.

As for not entertaining R81, I agree, but at the moment I do not have a full, working solution on either my open server or the loan 5900.

_Val_
Admin

Respectfully disagree. R80.40 is the recommended version now, with lots of things that are fixed from R80.30

Kaspars_Zibarts
Authority

@_Val_ yes, "recommended" and operational reality do not always go hand in hand 🙂 plus you work for CP hehe. The reality for us was that nearly 100% of my workload a week after the R80.40 deployment was spent on fault investigations and debugging. These were fairly significant issues for production-critical environments. Therefore I would respectfully stay with the comment "R80.30 is more mature than R80.40" 🙂

Let's park it there, it's not the topic of this thread. I respect your opinion! 👍

_Val_
Admin

I respect your opinion. Just a closing note: a release is moved from "all" to "recommended" based on field statistical data and telemetry. We do not do it lightly.

_Val_
Admin

Would you be so kind as to send me your SR numbers in a private message?

Thanks

Val

cmpwat
Participant

Just to keep everyone happy and up to date, Check Point provided a custom patch for my R80.40 appliance which, surprise surprise, didn't work.

I installed it this morning and tested the Endpoint Security client, and it still would not connect. So we are back to square one: using an Open Server running R80.30 with a knackered Threat Prevention blade caused by the Check Point recovery process. The only way to fix it is a rebuild, which I cannot do as this is the live firewall, so I would need something in its place, which is where the 5900 comes in, which is running R80.40 but doesn't allow my users to connect in remotely...... I'm getting dizzy going round and round in circles.

0 Kudos

Man, I understand your frustration, but throwing stones at Check Point without considering their advice is not going to solve your problem at the end of the day. Why don't you just set up a conference call with your support partner and TAC and come up with a plan that works best for you with zero downtime? Sorry, but I do not really see how the community here can help you in this case.

cmpwat
Participant

I think after 6 months of issues, downtime, "fixes", support calls, conference calls, dead ends and out of hours debugging I am entitled to "throw stones" as you put it!

This is not about having a go at Check Point; it's more about asking the community if they have seen similar issues and, if so, how they fixed them. It's about trying to utilize the collective knowledge of the Check Point community, who might think of something to try that hasn't been thought of already.

ADMIN NOTE: this comment has been edited to comply with the community guidance. For any further clarification, reach out to @_Val_
We urge all community members to maintain professional integrity, to be friendly and courteous. 

PhoneBoy
Admin

While I totally understand your frustration here, let's stick to the facts. 
Clearly some customers are still using sdconf.rec (even in R80.40) as I see recent SKs being created related to it.
However, I haven't seen much discussion of this in the community. 

Regardless, we'll get the right resources engaged to get to the bottom of this.

_Val_
Admin

@cmpwat Wayne, R&D is trying to get in touch with you. Could you please answer Ilya's email from Sunday?

Thank you
Val

cmpwat
Participant

@_Val_ I would if I had received said email. I do not get emails direct from Check Point, as they all go via the support partner.

Ilya_Yusupov
Employee

Hi @cmpwat ,

 

I just bumped the email again. Please try to answer the questions in the email so we can try to assist.

 

Thanks,

Ilya 

cmpwat
Participant

I have just responded.

_Val_
Admin

Thanks, @cmpwat Wayne, please follow up with @Ilya_Yusupov and TAC. I am looking forward to hearing it is resolved. If you need any further assistance from the community team, please PM me any time.

I have also called you and left my number on your voicemail, in case you have any questions.

FedericoMeiners
Advisor

Hello,

The sdconf.rec file is generated on the RSA SecurID server. You mentioned that it "keeps overwriting". Have you tried generating this file again?

Do you have the other necessary files in the /var/ace directory of the firewall (such as the secureid file)?

Do you see any events in the event monitor of the RSA console? Is the gateway at least trying to authenticate or communicate with your SecurID server? Keep in mind that sometimes you have to create another file (sdopts.rec) in the /var/ace directory that tells the gateway which IP address to present to the SecurID server. The idea is to verify that the identity agent created in the SecurID console matches that IP. Look into the RSA SecurID documentation to configure this. The identity agent IP and the IP from Check Point should match.

"The sdopts.rec file can be used to override some of the RSA SecurID. The most common setting is the CLIENT_IP= setting. For a complete list of available options please consult with your RSA SecurID administrators guide."
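
As a rough illustration of the sdopts.rec step, the sketch below writes a one-line file containing the `CLIENT_IP=` setting quoted above. The function name `write_sdopts` and the example address 192.0.2.10 (a documentation-only address) are assumptions for illustration, not anything taken from an SK; check the RSA documentation for the exact syntax your version expects.

```shell
#!/bin/sh
# Hypothetical helper: write an sdopts.rec that contains only CLIENT_IP=<address>.
# $1 = directory that holds the RSA files (the post above uses /var/ace)
# $2 = IP address the gateway should present; it must match the agent
#      defined on the RSA side
write_sdopts() {
    dir=$1
    ip=$2
    # Refuse to write anywhere that does not already exist.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    printf 'CLIENT_IP=%s\n' "$ip" > "$dir/sdopts.rec"
}
```

On a gateway this would be called as `write_sdopts /var/ace 192.0.2.10`, with the placeholder replaced by the address configured for the agent on the RSA server.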

Last but not least, it's always a good idea to contact both vendors (Check Point and RSA).

Hope it helps!

____________
https://www.linkedin.com/in/federicomeiners/
cmpwat
Participant

Hi Federico

Thanks for your input.  I have tried generating the file again, but the issue is: https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

We have tried the workaround without any luck, and Check Point are not sure what the issue is. We have a patch to try to see if that works; failing that, it will be an update to R81 to see if that fixes the issues.

CIrimia
Explorer

Hi cmpwat,

I found your post here and can feel how desperate you are. Maybe I can help you a bit further. Sidenote: I too was hoping that Check Point would provide a permanent fix for the sdconf.rec problem described in sk166314. The article is from April and there's been no update since. A shift to RADIUS might be an option, but as long as the classic integration is available in the Check Point product I would expect it to work. Otherwise it must be treated as broken, and it should be marked obsolete and removed. Then everyone knows that this option is not available anymore. But that's off-topic here.

At one of our subsidiaries we had the same issue regarding sdconf.rec in R80.40. The Check Point partner in charge built a workaround and from what I can tell it's working. It's ugly but it works. He sent us some notes about that, credits to Protea Networks. He pointed us to sk77300 which he used. I can't provide you the bash script here publicly since I haven't asked for permission to do so.

In short, what he did:

Use the cron job from sk77300 to recreate what isn't working in sk166314 within R80.40. The good files from /var/ace2/ are copied into /var/ace/. With this workaround the broken files are restored every 5 minutes, so the maximum gap after a policy install is about 5 minutes. After that, users can log in with RSA authentication.
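
A minimal sketch of the restore step described above. This is not the partner's script and not the one from sk77300. The directory names /var/ace2 and /var/ace come from this post; the function name `restore_ace_files`, the script path in the cron comment, and the compare-then-copy logic are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: copy known-good RSA files over the live copies
# whenever the live copy is missing or has been changed.
# $1 = directory holding the good copies (e.g. /var/ace2)
# $2 = live directory used by the gateway (e.g. /var/ace)
restore_ace_files() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue          # skip sub-directories and empty globs
        name=$(basename "$f")
        # Restore only when the live file is absent or differs from the good one.
        if [ ! -f "$dst/$name" ] || ! cmp -s "$f" "$dst/$name"; then
            cp -p "$f" "$dst/$name"
            echo "restored $name"
        fi
    done
}

# A script ending in "restore_ace_files /var/ace2 /var/ace" could then be
# scheduled every 5 minutes (the script path here is hypothetical):
# */5 * * * * /home/admin/restore_ace.sh
```

Comparing before copying keeps the job quiet when nothing has changed, so any "restored" line in the output indicates that something rewrote the live files since the last run.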

Hopefully this gives you a useful path to R80.40.

Best regards

Christoph

Henrik_Noerr1
Contributor

Hey,

Have you tried making the file immutable? I have done this on several files that mgmt / jumbo updates insist on overwriting.

 chattr +i /<path>/<file>
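
If you try this, it helps to confirm that the flag actually took and to remember how to undo it. A minimal sketch, assuming the file sits on a filesystem that supports these attributes and that you are running as root; the helper names `has_immutable_flag` and `is_immutable` are made up for illustration.

```shell
#!/bin/sh
# Returns 0 when an lsattr flag field (the first column of its output)
# contains the immutable flag, a lowercase "i".
has_immutable_flag() {
    case $1 in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Returns 0 when the given file is currently immutable according to lsattr.
is_immutable() {
    flags=$(lsattr -d "$1" 2>/dev/null | awk '{print $1}')
    has_immutable_flag "$flags"
}
```

`chattr +i <file>` sets the flag, `is_immutable <file> && echo locked` checks it, and `chattr -i <file>` clears it again before you legitimately need to replace the file. Whether a policy install copes with a file it cannot overwrite is not something this thread establishes, so test on a non-production gateway first.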

 

/Henrik

 

John_Fleming
Advisor

This was my thought as well. Have there been any updates to the issue? We have a lot of RSA out in the field and this will for sure stop upgrades. 
