
What are ClusterXL and VRRP?

Hi Team, 

What is the difference between High Availability and Load Sharing in cluster mode?

What is the difference between ClusterXL and VRRP in high availability?
What is the difference between Multicast and Unicast in load sharing?
What is the best method to use in a Check Point cluster environment?

BR


Accepted Solutions

Danny's links are great, however here are the answers to your specific questions.

 

> What is the difference between High Availability and Load Sharing in cluster mode?

 

High Availability is active/standby, while with ClusterXL Load Sharing all members are active.  Generally I'm not a fan of Load Sharing, but it has its uses in certain cases.  Edit: A new ClusterXL mode called Active/Active was introduced in R80.40 which is distinct and separate from ClusterXL Load Sharing.

 

> What is the difference between ClusterXL and VRRP in high availability?

 

They both can perform active/standby quite well, but ClusterXL is considerably easier to set up and manage.  VRRP is more prone to misconfiguration that causes cluster split-brains or routing black holes.  I recommend ClusterXL over VRRP unless one has the rare need to present more than one Cluster IP (VIP) on a single interface (which VRRP can do but ClusterXL can't), or there is some external load balancing algorithm in use (like OSPF) controlling the traffic distribution with load sharing via VRRP.  Edit: The new Active/Active ClusterXL mode introduced on R80.40 can be used to work with an external load balancing mechanism.


> What is the difference between Multicast and Unicast in load sharing?

 

The difference is the type of MAC address provided in ARP replies to systems that are trying to traverse an active/active firewall cluster. If the low-order bit of the first byte of a MAC address is 1 (i.e. the byte is odd: 01, 03, 05), the MAC address is multicast; if the low-order bit is 0 (i.e. the byte is even: 02, 04, 06), it is unicast.  Some switches and routers have issues properly handling multicast MAC addresses, and that is putting it mildly.
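
For anyone who wants to check a MAC address by hand, here is a minimal Python sketch of the bit test described above. The sample addresses are made up for illustration and are not taken from any real cluster.

```python
def is_multicast_mac(mac: str) -> bool:
    """Return True when the low-order bit of the first byte is 1 (multicast)."""
    # Accept both 01:00:5e:... and 01-00-5e-... notations.
    first_byte = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_byte & 0x01)


# Hypothetical sample addresses, for illustration only.
for mac in ("01:00:5e:00:00:12", "02:00:00:00:00:01", "00:1c:7f:00:00:fe"):
    kind = "multicast" if is_multicast_mac(mac) else "unicast"
    print(f"{mac} -> {kind}")
```

Only the first byte matters for this test; the other five bytes do not influence whether a switch treats the address as multicast.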


> What is the best method to use in a Check Point cluster environment?

 

Just my opinion but ClusterXL wins hands down, although VRRP has its adherents (and I'm sure we'll be hearing from them shortly).

 

--
My book "Max Power: Check Point Firewall Performance Optimization"
now available via http://maxpowerfirewalls.com.

Book "Max Power 2020: Check Point Firewall Performance Optimization" Third Edition
Now Available at www.maxpowerfirewalls.com


24 Replies
Pearl

Introduction to ClusterXL

YouTube: Understanding ClusterXL

YouTube: Understanding VRRP

YouTube: Troubleshooting ClusterXL

High Availability and Load Sharing in ClusterXL

Load Sharing Multicast Mode

Load Sharing Unicast Mode

How to configure VRRP on Gaia

VRRP FAQ

Recommended configuration for ClusterXL

ClusterXL is Check Point's own clustering protocol and therefore the default clustering protocol when setting up Check Point clusters. Check Point sees VRRP as a 3rd party cluster protocol. Check Point applications, such as SmartView Monitor, might not always show correct values when using 3rd party solutions. Also be aware that there are many SKs covering issues that come up when using 3rd party solutions (e.g. sk36544, sk43321, sk98698 and so on).

Highlighted

Thank you Danny

0 Kudos
Highlighted

Thank you Tim

0 Kudos
Pearl

VRRP:

 

Cons:

  1. The decision is per interface: "Am I master or backup?" is answered one interface at a time, so there is potential for split brain.
  2. No health checking of the cluster peer(s).
  3. If the same VRRP ID is used on all interfaces, there is potential to confuse the switch when multiple firewall interfaces are connected to the same switch, because multiple VLANs then use the same VRRP MAC.
  4. The default VRRP MAC is still affected by IGMP, same as ClusterXL CCP multicast mode. VRRP hello packets are transmitted using the VRRP MAC as the destination.
  5. Only the master node transmits hello packets. There is no status of the backup cluster member; VRRP interfaces must be monitored individually to discern whether a layer 2 connectivity problem exists on one or more interfaces.
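
To illustrate point 3: in standard VRRP for IPv4 (RFC 3768 / RFC 5798) the virtual router MAC is 00-00-5E-00-01-{VRID}, so it depends only on the VRID. The sketch below assumes that standards-derived MAC is the one in use; the interface names and VRID values are hypothetical.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Build the standard IPv4 VRRP virtual MAC: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return f"00:00:5e:00:01:{vrid:02x}"


# Hypothetical layout: three interfaces/VLANs that all reuse VRID 10.
interface_vrids = {"eth1": 10, "eth2.100": 10, "eth2.200": 10}
macs = {name: vrrp_virtual_mac(vrid) for name, vrid in interface_vrids.items()}

for name, mac in macs.items():
    print(f"{name}: {mac}")

# One distinct MAC across several VLANs is what can confuse a shared switch.
print("distinct MACs:", len(set(macs.values())))
```

Giving each interface its own VRID yields distinct MACs, at the cost of the per-VRID failover behaviour discussed elsewhere in this thread.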

 

 

ClusterXL:

 

Pros:

  1. Health checks peer on every physical interface
  2. Unified interface failover; no chance of split brain
  3. Monitors policy, daemons etc.

 

ClusterXL is more robust than VRRP in its monitoring of peer nodes and failover.

Admin

I am afraid your answer is misleading. VRRP allows elaborate health checks as part of redundancy, FW status included. If configured correctly, a virtual router fails over properly without causing a split brain. 

VRRP is a de facto standard and the classic redundancy solution. It does not allow load sharing, but you do not want to use load sharing with Check Point anyway, unless it is VSLS.

ClusterXL is proprietary, with complex and sometimes rather questionable implementation details. It uses so-called magic_mac as cluster ID, and it is even called just that in R77.30 and R80.X installation wizards. 

I do agree with your recommendation to use ClusterXL, but I know many people with enough experience that do not share this opinion.

Pearl

Valeri,

The "Magic MAC" or "Virtual MAC" is actually optional in ClusterXL, at least in 77.30.

From my field experience, which is by no means as extensive as yours, I've seen issues caused by VRRP/HSRP combinations that Virtual MAC can actually address better.

Similarly, in older networks with complex (or downright bad) STP implementations, Virtual MAC was a better option.

In the days of Nokia appliances, when ClusterXL was not mature, I would have weighed the options of which one to implement.

Now, with vSECs in the picture, for consistency purposes it kind-of makes sense to stick with technology that is supported across all deployment scenarios.

Cheers,

Vladimir

0 Kudos
Admin

No sir, you are mixing up magic_mac with the VMAC functionality in ClusterXL. These are two different things.

VMAC is described here: How to enable ClusterXL Virtual MAC (VMAC) mode. It is a feature that lets cluster members use a virtual MAC when answering ARP requests for VIP addresses. By default, physical MAC addresses are used instead, which means gratuitous ARP has to be sent in case of failover.

So-called magic_mac and magic_mac_forwarding are internal parameters of CCP used to form a ClusterXL entity. CCP is a funny protocol, and at layer 2 it uses magic_mac as the source MAC of a CCP frame.  In specific scenarios magic_mac can create issues with adjacent networking devices. For example, if CCP is running in multicast mode and IGMP snooping is enabled on an adjacent switch, it may cause false positive IGMP issues leading to flapping interfaces on the cluster. More details about CCP are here: http://dl3.checkpoint.com/paid/44/Cluster_Control_Protocol_Reference.pdf?HashKey=1510074387_eab82a4d...
  

0 Kudos
Pearl

Thank you for clarifying the difference.

I guess this is the primary reason why one of the first things TAC tries when troubleshooting ClusterXL issues is switching the CCP mode to broadcast.

Would you mind further explaining which of these two is affected by the "Cluster ID" parameter?

0 Kudos
Admin

Of course. As said above, CCP uses the last octet of the frame's source MAC as the cluster ID, to distinguish between different CXL entities in the same broadcast domain.

That parameter is hardcoded into the ClusterXL settings at installation time but can be changed by adding a certain FW kernel parameter, called by ClusterXL during boot and registered in the fwkern.conf file.

In version R77.30 (and up) Check Point developers decided to call this parameter ClusterID, to acknowledge its role in the ClusterXL solution. It is now part of the First Time Wizard if you are configuring a cluster member in CXL mode. It can also be changed later on with a CLISH command. Yet the nature is still the same: it is the value in the last octet of the source MAC in CCP frames. The rest of the octets are all zeros.
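
Taking the description above at face value (last octet carries the cluster ID, all other octets are zero), the mapping can be sketched in a few lines of Python. This is only an illustration of what this post says, not Check Point's documented frame layout; the function names are made up and real CCP frames may encode more than this.

```python
def ccp_source_mac(cluster_id: int) -> str:
    """Source MAC as described in this post: zeros, then the cluster ID octet."""
    if not 0 <= cluster_id <= 255:
        raise ValueError("cluster ID must fit in one octet (0-255)")
    return f"00:00:00:00:00:{cluster_id:02x}"


def cluster_id_from_mac(mac: str) -> int:
    """Read the cluster ID back from the last octet of a source MAC."""
    return int(mac.split(":")[-1], 16)


# Two hypothetical clusters sharing one broadcast domain.
for cluster_id in (254, 253):
    mac = ccp_source_mac(cluster_id)
    print(cluster_id, mac, cluster_id_from_mac(mac))
```

Two clusters in the same broadcast domain with the same ID would produce the same source MAC under this model, which is why the value has to be unique per cluster.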

Pearl

OK. And how is the value of the VMAC determined and differentiated between multiple clusters?

0 Kudos
Admin

VMAC uses magic_mac as one of its parameters, once more for the last octet. It also takes the VSID into account and has a non-zero prefix. In brief, VMAC depends on magic_mac but is not equal to it.

Once more, this link explains it perfectly and even has charts and examples: How to enable ClusterXL Virtual MAC (VMAC) mode 

Pearl

I see that you've removed the "so different interfaces would have different MAC addresses available for the cluster" part, which is no longer applicable in later releases, as the same VMAC is used on all interfaces.

Can you tell me if this change has any negative implications besides causing occasional consternation for network administrators?

0 Kudos
Admin

Yes, sorry, my original response was not 100% accurate. AFAIK, the main issue with VMAC is having more than one interface with the same MAC address for the VIP.

If more than one segment is attached to the same networking device, some additional effort may be required to tackle this.

Highlighted

As a side note: if you still run IPSO, you only have VRRP as an option.

But be aware that even if you run VRRP as the HA protocol, you are in fact using ClusterXL to take care of a lot of things for you, like keeping connection tables in sync.

In theory you could run 2 firewalls with VRRP and not use them as a cluster but as 2 individual firewalls. But I would not recommend it unless the customer signs a waiver stating that the design is extremely limited in regard to failover.

Highlighted

Thank you Hugo

0 Kudos
Highlighted

To follow on from an above post if you are running the following:

IPSO - VRRP Only

SPLAT - ClusterXL Only

Gaia - ClusterXL or VRRP

If you think about it, VRRP is an open redundancy standard that you can run on Linux, Cisco, Fortinet etc etc ... The protocol itself is concerned with the ability to maintain successful routing paths through a single IP address (VIP) per network that you require in the event of hardware/software failure of a particular device node. 

It does not have native checks for the core PNotes that Check Point ClusterXL defines, nor does it have the ability to customise your own PNotes. It has no method for connection and NAT table synchronisation.

At this stage you need to implement ClusterXL on top of VRRP anyway, as has already been highlighted, to achieve state synchronisation and cluster health checks.

However, with this being said, VRRP works very well if configured correctly. Try not to use multiple VRIDs for your interfaces, as this means that a single interface can fail over to the secondary cluster member on its own, splitting traffic across cluster members. Additionally, each VRID has to be failed over manually, which can take upward of a few minutes on a device with many interfaces, as opposed to a single VRID failover being "instant".
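
The split-traffic situation described above is easy to model. The sketch below is only a toy check, not a Gaia or VRRP API: it takes a hypothetical snapshot of which member is currently master on each interface and reports whether mastership is spread over more than one member. Interface and member names are invented.

```python
from collections import defaultdict


def masters_by_member(vrid_masters: dict[str, str]) -> dict[str, list[str]]:
    """Group interfaces by the cluster member that is currently master on them."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for interface, member in vrid_masters.items():
        grouped[member].append(interface)
    return dict(grouped)


def is_split(vrid_masters: dict[str, str]) -> bool:
    """True when more than one member is master on at least one interface."""
    return len(set(vrid_masters.values())) > 1


# Hypothetical state after only eth3's VRID failed over to the second member.
state = {"eth1": "fw-a", "eth2": "fw-a", "eth3": "fw-b"}
print("split:", is_split(state))
print(masters_by_member(state))
```

With a single VRID covering all interfaces, every entry in such a snapshot moves together, so the check can never report a split.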

I typically see this deployed when:

1) The customer upgraded from IPSO way back when and has simply followed the standard in-place upgrades since

2) Issues with upstream device.

According to sk44898 - RFC 1812 states: 

"A router MUST not believe any ARP reply that claims that the Link Layer address of another host or router is a broadcast or multicast address."

The gratuitous ARP used by ClusterXL maps a unicast IP to a multicast MAC destination, which breaks this RFC requirement.
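
To make the RFC 1812 rule concrete, here is a minimal Python sketch of the check a strictly compliant router would apply to an ARP reply. The function names and the sample addresses are made up for illustration; real routers implement this in their ARP code and often expose a knob to relax it.

```python
def is_broadcast_or_multicast(mac: str) -> bool:
    """True for multicast MACs and for broadcast (ff:ff:ff:ff:ff:ff)."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    # The group bit (low-order bit of the first octet) is set in both cases.
    return bool(octets[0] & 0x01)


def accept_arp_reply(sender_ip: str, sender_mac: str) -> bool:
    """Apply the RFC 1812 rule quoted above: never believe such a reply."""
    return not is_broadcast_or_multicast(sender_mac)


# Hypothetical replies for the same cluster IP (documentation address range).
print(accept_arp_reply("192.0.2.1", "01:00:5e:00:02:01"))  # multicast MAC
print(accept_arp_reply("192.0.2.1", "00:1c:7f:00:00:fe"))  # unicast MAC
```

A router enforcing this rule drops the first reply and never learns the cluster IP, which is the upstream device issue referred to in point 2.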

Rather than fight with ClusterXL, simply drop in a VRRP config and it usually just works. I know you can use broadcast mode for CCP instead, but I have heard concerns from customers over the additional network resource use given the nature of broadcast traffic, although that particular example was on a large subnet with many thousands of hosts. From a security mindset, there is also a theoretical concern over broadcasting cluster data to every host on the subnet; again, this is a real-world example I am citing from a customer. Personally I am not so sure I share those thoughts, but then it wasn't my network to make that decision for :)

It has already been established that you can run multiple clusters on a subnet by using the Magic_MAC, made much easier since R77.30 I believe, where it is part of the WebUI setup. One point that has not yet been mentioned is the VRRP counterpart: simple authentication, i.e. a password that the cluster member uses to authenticate its peer. Even if there is no other Check Point cluster on the segment, I would recommend setting this, as it's very possible that other non-Check Point devices are using VRRP in the same network segment, and they can and will cause you a headache upon deployment. (Speaking from experience.)

On a final note, I have to say I find pure ClusterXL clusters easier to manage and troubleshoot, as almost all config is done from Dashboard and the cphaprob set of commands usually gets the job done very efficiently. I also prefer the failover CLI options to the clish VRRP options, but this is personal preference rather than hard fact.

Highlighted

Hi John,

Thanks for your reply John

Do we still have IPSO these days?

Regards 

Prashan

0 Kudos
Highlighted

No, IPSO was only available on Nokia IP appliances, which have long since been out of production.

I still see some IP appliances in the wild, but I'm pretty sure that the last renewal dates for the largest of these boxes are 2018/19, so there's not much life left in them.

Also note that IP appliances are upgradeable to Gaia OS instead of IPSO assuming that the hardware is capable. There are no 64bit IP appliances and at best you will get a dual core 4gig RAM maximum. 

Not really suitable in modern networking for anything other than firewalling and VPN.

With all that being said, there's a reason that IPSO is still around today. It's a very reliable OS and can be considered stable for sure. 

There are no IPSO updates anymore. The last one I remember was the bash Shell Shock patch. Maybe some other members can confirm that ??

One last point: please never ever use the Flash-based models. You're asking for a headache and a long weekend.

Highlighted

Thanx John, 

0 Kudos
Employee+

Why are we still talking about VRRP in the 21st century, on a Check Point blog? Cisco figured out it is obsolete and came up with HSRP, which sucks even more than VRRP. ClusterXL works in a Check Point environment, WAS DESIGNED FOR IT, as long as HSRP and VRRP lovers configure their side correctly. I thought this blog was for Check Point clustering and not for some wannabe security/routing clowns. VRRP sucks. I have had quite a bit of experience with different "vendors" and "OS" implementations in my "short" career. Sorry to be harsh, but no one but CP so far follows protocols as they were designed and develops their code according to RFCs. Study, don't just 'read', Tim's book; you will find a lot there.

0 Kudos
Admin

The reason it's relevant here is because Gaia supports both ClusterXL and VRRP (mostly because IPSO supported VRRP).

It's also worth noting that people from Nokia wrote the initial RFC for VRRP--people I used to work with back in the day :)

Highlighted

Hi mates,

In SecurePlatform there are two modes for high availability:

  1. New
  2. Legacy

In the Gaia platform there are two modes for high availability:

  1. ClusterXL
  2. VRRP

Are New and Legacy the same as ClusterXL and VRRP, or different?

BR

0 Kudos
Admin

New and Legacy are different modes of ClusterXL that are supported in SecurePlatform.

According to the ATRG for ClusterXL:

  • In High Availability Legacy (Traditional) Mode, there are no Virtual IP addresses - the cluster members share identical IP and MAC addresses, so that the Active cluster member receives from a hub or switch all the packets that were sent to the cluster IP address.

Gaia does not support ClusterXL in Legacy mode, as noted here: Member of ClusterXL HA Legacy Mode is Down after upgrading from SecurePlatform OS to Gaia OS 

However, it does support VRRP, which was how HA configurations were done on IP Appliances running IPSO.