Julius_Kaiser
Participant

CP1400 Cluster "move" WAN-IP on failover?

I don't really get the "anatomy" of a CP1400 series cluster. Two questions for you guys:

  • Can I achieve this setup with a 1400 series cluster config:
    (the active CP cluster member holds the public WAN IP; on cluster failover, the IP is taken over by the newly active, formerly standby, member)



  • Is there more comprehensive information on clustering the CP 1400 series than the 1 1/2 pages in the Check Point 1430/1450 Appliance Locally Managed Administration Guide?

As always: thanks in advance for your feedback!

10 Replies
Pedro_Espindola
Advisor

Hello Julius,

A cluster with SMB appliances is not very different from other Check Point appliances.

You need 3 static IP addresses on the same subnet for every cluster interface in order to make failover possible: one for each member (the physical IP addresses) and a virtual IP (VIP) address, which changes owner upon failover.

This means you would need a range of public IP addresses to connect your WAN interfaces.

What I have managed to do for customers that had a single public IP address was to set the cluster behind the ISP router. The WAN interface would be in a private IP range (such as 192.168.0.0/24). Then I set the router to forward everything from the Internet to the VIP address of the cluster. Not ideal, but it works.
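For illustration, that forwarding rule could look like this on a Linux-based router standing in for the ISP box (a real ISP router would use its own port-forward UI; the interface name and VIP address here are hypothetical):

```shell
# Sketch of the single-public-IP workaround, assuming a Linux router.
# eth0 = public WAN side; 192.168.0.55 = cluster VIP on the private subnet.
iptables -t nat -A PREROUTING -i eth0 -j DNAT --to-destination 192.168.0.55
# Translate outbound traffic from the private range back to the public IP:
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```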

Julius_Kaiser
Participant

Hello Pedro,

This is exactly what I assumed and what I'm doing:

I have a range of public IP addresses (a /24 public range, in fact); every physical CP appliance (cluster member) gets one of these IPs, plus one VIP from the same range, e.g.:

physical address 1: x.x.x.53

physical address 2: x.x.x.54

virtual cluster ip: x.x.x.55

I'm monitoring the WAN interfaces for failover.

What happens is that the virtual IP answers for ~20 pings, then stops responding. A failover (unplugging one WAN link) does not make the VIP respond again.

I will try to add some output here (pings, arp), but I will only manage to do so in the evening.

Thanks for your feedback so far!

Pedro_Espindola
Advisor

Then it seems you are having a real problem, and it might be best to open a ticket with support.

You might want to check the HA page in the WebUI to see whether all required interfaces are set for HA, and run the diagnostics to spot any problems.

These commands might also help you:

cphaprob stat      (overall cluster state and member roles)

cphaprob -a if     (cluster interfaces and their states)

cphaprob -l list   (registered critical devices / pnotes)

Share the results with us if you wish.

Maarten_Sjouw
Champion

Is this a locally managed cluster or are they managed by a management server?

When it is managed by a management server, this should just work fine, as long as you set it up as a Small Business Cluster.

I have no experience with a locally managed cluster but am working on getting 8 centrally managed clusters going.

Regards, Maarten
HristoGrigorov

Hi,

What you want to achieve is basically what an HA cluster does, so no problem there; it is supported on 1400 appliances.

There is the ClusterXL admin guide (just google it) that describes in great detail how this clustering technology works. For SMB devices there are some known limitations (check sk105380).

Btw, if you plan to use VLANs on the WAN interface, you will likely run into the problem I described earlier in this section.

From my personal experience, configuring ClusterXL on a 1400 with the WebUI is not very robust or stable, but it works in the end.

Good luck with your setup.

Julius_Kaiser
Participant

Hello guys,

thank you very much for your input on this.

@Maarten Sjouw

Is this a locally managed cluster or are they managed by a management server?

It is a locally managed cluster.

@Pedro Espindola

You might want to check the HA page on WebUI to see if all required interfaces are set for HA and click the diagnostics to see problems.

 

These commands might also help you:

cphaprob stat

cphaprob -a if

cphaprob -l list

I will consult the ClusterXL Guide, lab the setup and provide further information.

Anyway, up to this point I have never had a satisfying experience with clustering the 1400 series. I manage one production 1400 cluster that does not fail over well (it was initially configured by a former colleague, though I wouldn't know what to do differently!), plus my own clustering attempts on this platform. The wizard and further configuration seem pretty straightforward, not really leaving a question about what to enter where (and I manage lots of firewalls from many different vendors, so I usually have an idea of what to set up).

@Hristo Grigorov

From my personal experience configuring ClusterXL on 1400 with WebUI is not very robust and stable but it works at the end. 

This is exactly what it feels like; not at all like configuring e.g. HSRP/VRRP and voilà. I always have the feeling it may work if I do things in the right order and with the right timing, but the first failover or firmware upgrade will definitely blow things up. On the other hand, there are just a few web form fields, sync interfaces, and even a default heartbeat config shipping with the box.

OK, I will be glad to RTFM and get back to you guys with actual configuration and debug output.

Julius_Kaiser
Participant

Hello folks,

I finally found some time to lab this again. Once the cluster is formed, it's pretty obvious what's happening:

10.161.91.251   primary node, physical
10.161.91.252   secondary node, physical
10.161.91.254   clusterxl virtual IP

Assuming 10.161.91.252 holds the active role (the master node that presents the configuration etc.):

  • 10.161.91.251 always responds to ping
  • 10.161.91.252 responds to ping for approx. 30 s, then stops responding
  • 10.161.91.254 starts responding, responds for approx. 30 s, then stops
  • 10.161.91.252 starts responding again, and so on

These 30 seconds suggest that this has something to do with MAC aging on the switch (the LAN interfaces terminate on a switch in a VLAN). However, setting the MAC aging time to 10 s on the switch does not shorten the interval at which the responding address "flaps".
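For reference, the aging-time change mentioned above looks like this in IOS syntax (illustrative; the dynamic aging default on a Catalyst is 300 s, which matches the suspicion, although the observed interval here is ~30 s):

```
! Cisco IOS config fragment: shorten dynamic MAC aging in VLAN 1 to 10 s
mac address-table aging-time 10 vlan 1
```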

I have already tried to statically pin all the MAC addresses I can see to both ports connecting to the primary and secondary members:

mac address-table static 0100.5e7f.fffa vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0100.5e00.0016 vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 001c.7f7e.6f78 vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 001c.7f7c.e19d vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0100.5e00.00fb vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0100.5e00.00fc vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0000.0000.fe00 vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0000.0000.fe01 vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0100.5e5A.0A64 vlan 1 interface FastEthernet0/2 FastEthernet0/1
mac address-table static 0100.5e28.0A64 vlan 1 interface FastEthernet0/2 FastEthernet0/1

I also disabled IGMP snooping globally on the switch and set CCP to unicast instead of multicast. Neither changed this behaviour. There are currently no other layer 3 devices involved except the Check Points and my client host.

The switch has layer 3 capabilities (a Catalyst 2960, for testing), but no IP interfaces in VLAN 1, where the LAN interfaces of the cluster are placed.

Is this a lead to follow? Do you have any ideas on this?

As always, thanks in advance for your time and feedback!

/edit: The sync interfaces are directly connected with a straight patch cable.

HristoGrigorov

I find it really strange that .252 stops responding. The local IPs must always respond; only the cluster IP should be flapping. To me it looks like there is some problem between the switch and the appliances.

Btw, do you ping them from a machine in the same VLAN?

Is it possible to try another switch? Preferably one that is not very intelligent.

Maarten_Sjouw
Champion

001c.7f7e.6f78 and 001c.7f7c.e19d are the physical addresses; 0000.0000.fe00 and 0000.0000.fe01 are the source MACs for CCP packets. However, there should only be one source MAC per cluster as far as I know, see sk121953 (I know it is for Gaia, not Embedded).

sk25977 describes the MAC addresses used and also the problems seen with them when running more than one cluster on a single switch.

The 0100 MAC addresses are described under Destination Multicast MAC Addresses in that same SK.

In centrally managed solutions, clusters can sometimes be set to forward traffic for a member to the cluster IP.

On your switch you will probably need to disable all types of port security; Nexus switches in VM environments have similar problems when you try to set up a vSEC cluster. These security settings prevent the use of more than one MAC on a port from the same device.

Regards, Maarten
Nick_Finney
Explorer

Julius

As discussed, the issue was found to be the "fw_allow_simultaneous_ping" setting not being set; you were seeing the following messages in fw ctl zdebug drop:

;[cpu_0];[fw4_0];fw_log_drop_ex: Packet proto=1 x.x.x.x:2048 -> x.x.x.x:9672 dropped by fw_handle_first_packet Reason: fwconn_key_init_links (INBOUND) failed;

;[cpu_0];[fw4_0];fw_log_drop_conn: Packet <dir 1, x.x.x.x:0 -> x.x.x.x:2 IPP 1>, dropped by handle_outbound_pac, Reason: connection not found;

Implementing sk26874 resolved the issue.
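For anyone hitting the same drops: on a standard Gaia gateway, a firewall kernel parameter like this is usually applied with fw ctl, as sketched below. The persistence path shown is the usual Gaia convention, not confirmed for a locally managed 1400 (Embedded Gaia), so follow sk26874 itself for the authoritative procedure on this platform.

```shell
# Sketch following the usual Gaia convention (verify against sk26874):
fw ctl set int fw_allow_simultaneous_ping 1   # apply at runtime
fw ctl get int fw_allow_simultaneous_ping     # verify the current value
# To survive a reboot on Gaia, add this line to $FWDIR/boot/modules/fwkern.conf:
#   fw_allow_simultaneous_ping=1
```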

