Gratuitous ARP not sent on VLAN

Hi All,

At a customer site, I have created an R80.30 ClusterXL cluster with Jumbo Hotfix Take 155, which is working fine. Everything looks OK when checking the cluster with 'cphaprob stat', 'cphaprob -l list' and 'cphaprob -a if'. The connection table is also synced, so the cluster seems OK.

But when we perform a fail-over with 'clusterXL_admin down' on the active member, we lose connections on one specific VLAN. On the other interfaces and VLANs, no problems are reported when we perform a fail-over.

Our first impression was that the layer 3 devices in that network do not act on the gratuitous ARP being sent. But when I manually send a G-ARP into the network, all connections via that VLAN are restored. I used the following to send the G-ARP:

echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind   # 0=off, 1=on
arping -c 4 -A -I eth3

We have a computer with Wireshark in that VLAN and when we perform a fail-over with 'clusterXL_admin down', we do not see the G-ARP packets. When we manually send the G-ARP, we can see these packets in Wireshark.
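A capture taken on the cluster member itself can help separate "the firewall never sends the G-ARP" from "the switch drops it before it reaches the Wireshark PC". The following is only a minimal sketch: the interface name eth3.100 is a hypothetical VLAN sub-interface (the thread does not state the VLAN ID), and the filter relies on the fact that in a gratuitous ARP the sender IP equals the target IP.

```shell
#!/bin/bash
# Sketch: watch for gratuitous ARP leaving a cluster member during fail-over.
# IFACE is a hypothetical VLAN sub-interface name; replace it with the affected one.
IFACE="${IFACE:-eth3.100}"

# In a gratuitous ARP the sender IP (ARP header bytes 14-17) equals the
# target IP (bytes 24-27), so this pcap filter matches G-ARP only.
GARP_FILTER='arp and arp[14:4] = arp[24:4]'

capture_garp() {
    # -nn: no name resolution, -e: print MAC addresses, -c: stop after N packets
    tcpdump -nn -e -i "$IFACE" -c "${1:-10}" "$GARP_FILTER"
}
```

Run `capture_garp` on the standby member, then trigger 'clusterXL_admin down' on the active one. If nothing is printed for the cluster IP on that VLAN, the member is not emitting the G-ARP at all.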

I have checked for known issues with ARP or G-ARP in the jumbo hotfixes, but I cannot find anything.

Has anyone seen this before? It is very strange because it happens on one VLAN only.




This is the very reason vMAC is available: to prevent this type of problem. We have seen similar issues with proxy ARPs that were lost on an internet router with the default 4-hour ARP cache. Switching the cluster back and forth would make it lose the G-ARP for the second switch.
This is one of those little advantages VRRP has as well: there is always a vMAC with VRRP.
Regards, Maarten
If you enable VMAC, just make sure that all switchports attached to the firewall are set to "portfast" mode to avoid possibly honking off STP on some switches during a failover.
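
For reference, a minimal sketch of switching VMAC on from the CLI, assuming expert mode on each cluster member. The kernel parameter fwha_vmac_global_param_enabled is the one described in the ClusterXL documentation; the run and enable_vmac helpers and the DRY_RUN variable are hypothetical names added here so the commands can be previewed before touching a production cluster.

```shell
#!/bin/bash
# Sketch: enable ClusterXL VMAC mode from expert mode on EACH cluster member.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

enable_vmac() {
    # Turn VMAC on in the running kernel (this does not survive a reboot).
    run fw ctl set int fwha_vmac_global_param_enabled 1
    # Read the value back to confirm it was set.
    run fw ctl get int fwha_vmac_global_param_enabled
    # With VMAC active, the interface list should show a virtual MAC
    # for each cluster interface.
    run cphaprob -a if
}
```

`DRY_RUN=1 enable_vmac` only prints the three commands; `enable_vmac` runs them. For a permanent setting, the "Use Virtual MAC" option on the cluster object in SmartConsole is the usual route, followed by a policy install.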

