Contributor

ClusterXL synchronization network bandwidth

Hello everyone.

I would like to ask whether, on a Check Point ClusterXL (2 Gaia R80.10 gateways) working in HA (active/standby) mode, the synchronization interface/network can have less bandwidth than the "production" interfaces. In the documentation I found that only distance (delay) matters for sync. For example, can I use 10 Gbit links for the DMZ, internal and external networks and "only" 1 Gbit for the sync interface? Or should I use 4x1 Gbit (bond) if a 1 Gbit link is insufficient?

 

Thanks for your help.

13 Replies
Contributor

I have also found this (https://sc1.checkpoint.com/documents/R80.10/WebAdminGuides/EN/CP_R80.10_ClusterXL_AdminGuide/html_fr...):

"Note - There is no requirement for throughput of Sync interface to be identical to, or larger than throughput of traffic interfaces (although, to prevent a possible bottle neck, a good practice for throughput of Sync interface is to be at least identical to throughput of traffic interfaces)."

So there could be a bottleneck in my link configuration, I think especially during full sync transfers :-(.

Admin

@Mirosław_Zimny, as you have already quoted, there are two types of synchronisation: full sync and delta sync. Although full sync is extensive, it is not proportional to the passing traffic; it just transfers all kernel tables as-is from one member to the other. It can also lag a bit, delaying full functionality of the cluster without affecting production traffic. Delta sync, however, is a direct function of production traffic, and it is time-sensitive.

There is no exact formula to calculate the required bandwidth, but it is assumed that you might need between 10 and 30% of your production bandwidth. You have limited control over delta sync by disabling or delaying sync for specific services, but that does not give you much flexibility anyway.

In your specific case I would advise using a 10 Gbps interface for sync to be on the safe side. You may try bonded interfaces, but as stated in a different comment to this post, the nature of LACP does not allow you to multiply bandwidth in this specific case.
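The 10 to 30% figure above can be turned into a quick back-of-the-envelope estimate. This is only a sketch of that arithmetic, not an official sizing formula: the function name and the reading of "production bandwidth" as peak traffic actually passing through the cluster are assumptions.

```python
def estimate_sync_bandwidth_gbps(production_gbps, low=0.10, high=0.30):
    """Return a (low, high) estimate of delta sync bandwidth in Gbit/s.

    The 10% and 30% defaults come from the rule of thumb quoted in this
    thread; they are not an exact formula.
    """
    if production_gbps < 0:
        raise ValueError("production_gbps must be non-negative")
    return production_gbps * low, production_gbps * high


# Example: 10 Gbit/s of production traffic (hypothetical figure).
sync_low, sync_high = estimate_sync_bandwidth_gbps(10.0)
print(f"Estimated delta sync need: {sync_low:.1f} to {sync_high:.1f} Gbit/s")
```

With these assumptions, a single 1 Gbit link sits at the low end of the estimated range for 10 Gbit/s of production traffic, which is why a 10 Gbps sync interface is the conservative choice.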

 

Admin

2x1 Gbit for sync should suffice.
Admin

No, it will not.

 

LACP will not give you 2x1=2 Gbps, because the balancing is per pair of IPs, and the IPs are always the same. You will have 1 Gbps available for sync there.
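To illustrate why a fixed pair of IPs always lands on the same bond member, here is a deliberately simplified model of address-based link selection. The XOR-and-modulo hash and all IP addresses below are assumptions for illustration only; real bonding hash policies differ in detail, but they share the property that the same address pair always maps to the same link.

```python
import ipaddress


def pick_link(src_ip, dst_ip, n_links):
    """Simplified model: XOR both addresses, then take modulo the link count."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links


# Sync traffic: always the same two cluster members (hypothetical addresses).
sync_link = pick_link("192.168.1.1", "192.168.1.2", 2)
print("sync traffic always uses link", sync_link)

# Production traffic: many different clients, so both links get used.
production_links = {pick_link(f"10.0.0.{host}", "10.0.1.1", 2)
                    for host in range(1, 21)}
print("production traffic uses links", sorted(production_links))
```

In this model the bond adds redundancy for sync, but the sync flow itself never gets more than one member link's worth of bandwidth.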


I completely agree with @_Val_  here.

LACP is only used with the sync interface to make the sync fail-safe. Originally you could define several sync interfaces; that is no longer possible.

If you want to be 100% safe, you have to use two 10 Gbit/s interfaces as a bond for the sync.

 

Leader

I never saw more than 800 Mbit/s on a sync link. That value was only seen with IPSO clustering in forwarding mode and four fully utilized 10 Gbit interfaces, and only during a full sync.

We had some clusters running with heavily utilized 10 Gbit links and a 2x 1 Gbit active/passive bond as sync. The highest sync utilization ever seen was 450 Mbit/s.

That's my experience, but maybe someone can show us higher throughput on a sync interface in production.

Wolfgang


I agree with you @Wolfgang.

In practice, I haven't seen a firewall that has generated more than 1 Gbit/s of sync traffic.

But if you want to be on the safe side, you have to use two 10 Gbit/s interfaces as a bond.

PS: I also have several firewalls running with two 1 Gbit/s sync interfaces as an LACP bond :-)


It would be interesting to know what R&D recommends here :-)

It's just an idea. Maybe you can calculate this from the connections in the state table. Is there a rule of thumb here?

Contributor

Hello, 

Sorry, I have a question. How can you find out the utilization of the sync interface in VSX?

Is there a command? I tried to use SmartMonitor but I could not find it there.

Regards, 

Julian S. 

Admin

The rule of thumb is to calculate it based on the overall bandwidth utilization of your VSX cluster. We are talking about up to 5% of overall bandwidth. If you want to be on the safe side, take up to 10%.
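To compare real sync traffic against this rule of thumb, the rate on the sync interface has to be measured first. The sketch below is a generic Linux approach, not a Check Point tool: it samples the byte counters in /proc/net/dev twice and converts the difference to Mbit/s. The interface name "Sync", the sample counter values and the helper names are assumptions, and on VSX the counters may need to be read in the context of the relevant virtual system.

```python
import time


def read_bytes(proc_net_dev_text, ifname):
    """Return (rx_bytes, tx_bytes) for ifname from /proc/net/dev content."""
    for line in proc_net_dev_text.splitlines():
        name, sep, counters = line.partition(":")
        if sep and name.strip() == ifname:
            fields = counters.split()
            # Field 0 is received bytes, field 8 is transmitted bytes.
            return int(fields[0]), int(fields[8])
    raise KeyError(ifname)


def rates_mbps(before, after, seconds):
    """Return (rx, tx) rates in Mbit/s between two (rx, tx) byte samples."""
    rx = (after[0] - before[0]) * 8 / seconds / 1_000_000
    tx = (after[1] - before[1]) * 8 / seconds / 1_000_000
    return rx, tx


def sample_interface(ifname, interval=5.0, path="/proc/net/dev"):
    """Sample the live counters twice, 'interval' seconds apart."""
    with open(path) as f:
        first = read_bytes(f.read(), ifname)
    time.sleep(interval)
    with open(path) as f:
        second = read_bytes(f.read(), ifname)
    return rates_mbps(first, second, interval)


# Offline demo with made-up counter values for a hypothetical "Sync" port.
SAMPLE_T0 = "  Sync: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0\n"
SAMPLE_T1 = "  Sync: 1251000 900 0 0 0 0 0 0 2502000 1800 0 0 0 0 0 0\n"
rx_mbps, tx_mbps = rates_mbps(read_bytes(SAMPLE_T0, "Sync"),
                              read_bytes(SAMPLE_T1, "Sync"), 10)
print(f"rx {rx_mbps:.1f} Mbit/s, tx {tx_mbps:.1f} Mbit/s")
```

On a live system, `sample_interface("Sync")` would return the two rates, which can then be set against 5 to 10% of the cluster's overall bandwidth.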

Champion

In my experience 1 Gbit is sufficient for cluster state table sync unless the cluster has an extremely high new connection rate passing through it.  Loss and re-transmits on the sync interface as reported by cphaprob syncstat are typically caused by overall high CPU load on the cluster members, not by a lack of raw bandwidth on the sync interface.  High CPU load can be mitigated with CoreXL/SecureXL tuning as described in my "Max Power" book, as long as the firewall hardware was sized appropriately.  By selectively disabling synchronization for services such as DNS, HTTP, and HTTPS the amount of sync traffic (and associated CPU utilization) can be reduced significantly.

 

R80.40 addendum for book "Max Power 2020" now available
for free download at http://www.maxpowerfirewalls.com
Contributor

What a great discussion we have here. I think I'll follow Check Point's recommendation anyway.

By the way, it would be very interesting to test this in a CP lab.

 

Thank you all

Admin

I am glad you guys have never seen an issue with a sync interface being a bottleneck. I did, in a couple of very special VSX-related cases. That does not mean you are safe with a physical cluster.
