
WAN Failover when BGP Peer becomes unavailable in a stretched cluster environment.

Hi all,

Looking for some advice on the best way to deal with a fault scenario in a stretched two-site environment. It will be a stretched VMware/vSAN/NSX environment. I'm currently waiting on the interconnect between the two sites to be completed, but I have tested the basics here (including being able to advertise any of my BGP routes from either site) using independent clusters at each site. When the interconnect is done, I plan to move all Checkpoints to be managed by the same MGMT server plane.

I will have 2 sites with an independent Fiber Layer 2 interconnect provided by the ISP:
* Each site has an Ethernet direct routing peer expecting BGP from me
* I am advertising a /29
* Each site has a different BGP gateway

On my side:
* Each site has 2 VIRTUAL checkpoints
* Each site has its own WAN route
* Sites are interconnected to each other on a Layer 2 interconnection

Assumptions I'm making:
* It must be a 4-node cluster (1 logical cluster, 2 members at each site, with cluster sync and all VIPs stretched across the interconnect). If I have a VM whose default gateway is the MGMT VIP, I need that VIP to be able to move between the sites so the VM can still get out no matter which side the VM or the Checkpoint is on. If the VM ends up living on a site where WAN is down, the stretch should carry its traffic over to the other side. I don't really see a way around this. If the Checkpoints participate in a VLAN that I need stretched, I can't make 2 clusters (2 logical clusters, 2 members at each site), since they cannot both hold the VIP that is the default gateway on that VLAN. Please correct me if I'm clearly missing something with this assumption.

Here is my drawing:

[Image: 4 Node 1 Logical Cluster VF Networking WebSafe Draft.png]

Imgur link for high rez:


These Checkpoints are virtual. The only way there is a "hard failure" or a "link down" is if VMware/vSphere itself fails. In that case the cluster will fail over to the other Checkpoint on site, on a different ESX host, as expected. In a total power loss scenario, the only 2 remaining Checkpoints would be at the opposite site. I'm OK with this. Both sites advertise the same BGP route, and whichever site has been failed over to becomes the primary route, so my ISP core will route incoming traffic to me via the opposite site. Outgoing traffic will still use the internal VIP (MGMT in this case) and would also work.

But what happens if my BGP peer goes down? For instance, the BGP gateway goes down (i.e. I'm unable to ping it, but all Checkpoints are powered on and still see the interface as up), or for whatever reason BGP cannot be established with the peer. How can I ensure that if Site 1 cannot reach its WAN peer, the 4-node cluster attempts a failover? Without a failover, the LAN-side cluster MGMT VIP (used as the default gateway for the VM) stays with the Checkpoint that has no WAN peer.

There is no BFD. Is ping detection enough (on the default route or the BGP peer)? Or should I be using clusterXL_monitor_ips script as seen here: ?
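
To make the question concrete, here is a rough Python sketch of the detection logic I have in mind. It is not the clusterXL_monitor_ips script and not anything Check Point ships. The PeerMonitor class, its thresholds, and the "ok"/"problem" state names are all made up for illustration. It only shows the idea: declare the WAN peer down after several consecutive failed probes, and only declare it healthy again after several consecutive good ones, so a single lost ping doesn't bounce the VIPs between sites.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Hypothetical helper: turns a stream of probe results into a stable state."""
    fail_threshold: int = 3   # consecutive failed probes before reporting "problem"
    ok_threshold: int = 5     # consecutive good probes before reporting "ok" again
    state: str = "ok"
    _fails: int = 0
    _oks: int = 0

    def observe(self, probe_ok: bool) -> str:
        """Record one probe result (e.g. one ping to the BGP peer) and return the state."""
        if probe_ok:
            self._oks += 1
            self._fails = 0
            if self.state == "problem" and self._oks >= self.ok_threshold:
                self.state = "ok"
        else:
            self._fails += 1
            self._oks = 0
            if self.state == "ok" and self._fails >= self.fail_threshold:
                self.state = "problem"
        return self.state


if __name__ == "__main__":
    monitor = PeerMonitor(fail_threshold=3, ok_threshold=2)
    # True = peer answered, False = probe failed
    for result in [True, False, False, False, True, True]:
        print(result, monitor.observe(result))
```

How the "problem" state would actually be reported to ClusterXL so that the member gives up the VIPs is deliberately left out, since that is the part I'm asking about.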

To put it more simply: if WAN becomes unreachable at Site 1, I want both the ClusterXL WAN VIP and the ClusterXL MGMT VIP to move to Site 2.

Appreciate any advice, including "you're dumb, do it this way". Thanks for reading.
