Ob1lan
Collaborator

Management HA & CRL : concerns

Hi,

As part of our project to retire our main datacenter and migrate everything to AWS, I had to find a way to migrate the Check Point Management Server (cma).

To make the migration as smooth as possible, I decided to try Management HA. I launched an EC2 instance from the Check Point R81.10 management AMI (BYOL), sized it as needed, and configured Management HA.

So far everything works great: the 'old' management server is in constant sync with the new one in AWS, which I made active, and I receive logs from most gateways (some need a little kick to start sending logs to the new cma). I'm also able to publish and install policies from the new management server.

From there, I wanted to simulate decommissioning the 'old' management server, so I simply ran the cpstop command on it. Then I rebooted some gateways after business hours to see how they would react. The result: the IPsec S2S tunnels between those gateways and the central one got stuck in phase 1, and the logs showed 'Invalid certificate' errors.

So I suspect the CRLs were unreachable, and/or the gateways were still trying to fetch them from the 'old' management server, which was inactive. Once I ran cpstart on the 'old' management server, everything started working again and the tunnels came up.

Checking the CRL distribution point (CRL DP) in the certificate, I noticed it uses a URL with an unresolvable name: http://mgmt.company.tld:18264 (redacted).
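For anyone who wants to reproduce this check, here is a minimal Python sketch that pulls the host and port out of a CRL DP URL and tests whether the name resolves. The function names `crl_dp_endpoint` and `can_resolve` are made up for illustration, and the URL is the redacted one from above, not a real endpoint.

```python
import socket
from urllib.parse import urlsplit


def crl_dp_endpoint(url):
    """Return (host, port) for a CRL distribution point URL."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host found in {url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if the resolver can look up the host name."""
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    host, port = crl_dp_endpoint("http://mgmt.company.tld:18264")
    print(host, port, "resolvable" if can_resolve(host) else "NOT resolvable")
```

The `resolver` parameter is only there so the lookup can be swapped out when testing without DNS access.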

  • How does this work at all if I can't resolve that name internally?
  • How can I smoothly decommission the 'old' management server without impacting our numerous (40+) IPsec S2S tunnels from our remote gateways?

Thanks in advance for your help, much appreciated!

2 Replies
PhoneBoy
Admin

I'm assuming this is because the gateways "know" that the ICA is the management server.
After migrating the management server into AWS, what is its main IP?
Is it an Elastic IP or one of the private VPC IPs?

Tomer_Noy
Employee

I know this is an old thread, but I recently came across a similar issue and may be able to help with the solution. If not for your case, then perhaps for future readers who come across this post.

Gateways fetch CRLs periodically, and if they cannot fetch one for more than 24 hours, they stop accepting the certificate.
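To make the timing concrete, here is a small Python sketch of that rule. It is a simplified model of the behavior described above, not the gateway's actual implementation, and the names `CRL_GRACE` and `certificate_accepted` are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the 24-hour window is taken from the description above;
# the real gateway logic may differ in detail.
CRL_GRACE = timedelta(hours=24)


def certificate_accepted(last_successful_fetch, now, grace=CRL_GRACE):
    """Return True while the last successful CRL fetch is within the grace window."""
    return now - last_successful_fetch <= grace


if __name__ == "__main__":
    last_ok = datetime(2024, 1, 1, 8, 0)
    # 12 hours after the last successful fetch: still inside the window.
    print(certificate_accepted(last_ok, datetime(2024, 1, 1, 20, 0)))
    # 25 hours after the last successful fetch: outside the window.
    print(certificate_accepted(last_ok, datetime(2024, 1, 2, 9, 0)))
```

This also suggests why stopping the primary may not break tunnels immediately: in this model, a gateway that fetched a CRL recently keeps accepting the certificate until the window runs out.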

The CRL fetching is done according to the "masters" file on the gateway, which tells it which management machines should control it. This list also determines who to fetch policy from after reboot or upgrade.

By default, the list contains the primary management server. Although it's possible to manually alter this file, it's not recommended or necessary in this case. You can add additional servers per gateway by modifying the list in the "Fetch Policy" page in the gateway editor.

In your case, when you took down the primary, the gateway failed to fetch the CRL and dropped the VPN after 24 hours. If you had added the other server to the list and pushed policy to the gateway, the gateway would also have tried the other server for the CRLs.
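The effect of having a second server in the list can be sketched in Python as a simple ordered fallback. This only illustrates the idea; the server names, the `fetch` callable, and `fetch_crl_with_fallback` are hypothetical and do not reflect how the gateway is actually implemented.

```python
def fetch_crl_with_fallback(servers, fetch):
    """Try each server in order and return (server, crl) for the first success.

    `fetch` is a hypothetical callable that takes a server name and returns
    the CRL bytes, raising OSError when the server cannot be reached.
    """
    errors = {}
    for server in servers:
        try:
            return server, fetch(server)
        except OSError as exc:
            # Remember why this server failed, then move on to the next one.
            errors[server] = exc
    raise ConnectionError(f"no server in the list returned a CRL: {errors}")


def demo_fetch(server):
    """Stand-in fetcher: the 'old' management server is down, the new one answers."""
    if server == "old-mgmt":
        raise OSError("connection refused")
    return b"crl-bytes"


if __name__ == "__main__":
    print(fetch_crl_with_fallback(["old-mgmt", "aws-mgmt"], demo_fetch))
```

With only `old-mgmt` in the list, the call raises `ConnectionError`, which corresponds to the situation you hit once the primary was stopped.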

The following SK gives more background and instructions on how to configure this:
https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut... 
