Kaspars_Zibarts
Employee

Bond interfaces go down after cluster member enters STANDBY / ACTIVE after reboot

Hi! Just wondering if anyone else has seen this very strange cluster behaviour after reboot.

We are running R80.40 T120 on regular, non-VSX gateways.

In short, the sequence is as follows:

  • reboot a cluster member (e.g. FW1, which is STANDBY)
  • FW1 recovers and synchronises connections
  • FW1 cluster state enters STANDBY
  • a few seconds later, all bond members report lost link
  • FW1 cluster state goes DOWN
  • the interface driver seems to be reloaded and the interfaces become available again
  • FW1 cluster state enters STANDBY again

Below is the full messages log with my comments. It seems that interface settings are modified after the cluster member has entered the STANDBY state, and that causes all bond members to go down. Regular (non-bond) interfaces seem to survive without a link-down event.
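For anyone wanting to compare bond member state before and after the bounce, here is a minimal Python sketch. It assumes the gateway exposes the standard Linux bonding status text (normally under /proc/net/bonding/<bond name>); the sample text, interface names and counters below are made up for illustration and are not taken from our systems.

```python
# Sketch: parse Linux bonding status text and report per-member link state.
# SAMPLE is illustrative only; on a live box you would read the bond's file
# under /proc/net/bonding/ instead (assumed to be available on the gateway).
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth1
MII Status: up
Link Failure Count: 1

Slave Interface: eth2
MII Status: down
Link Failure Count: 2
"""


def parse_bond_members(text):
    """Return {member: {"mii_status": str, "link_failures": int}}."""
    members = {}
    current = None  # name of the member section being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            members[current] = {"mii_status": None, "link_failures": None}
        elif current is not None and key == "MII Status":
            # Only record MII status lines that follow a member header,
            # so the bond-level "MII Status" line is ignored.
            members[current]["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            members[current]["link_failures"] = int(value)
    return members


if __name__ == "__main__":
    for name, info in parse_bond_members(SAMPLE).items():
        print(name, info["mii_status"], info["link_failures"])
```

Running it once right after the member first enters STANDBY and again after it recovers would show whether the link failure counters increased on every bond member.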

The problem isn't that big if you keep the cluster state as is after recovery. But if you use "switch to higher priority member", you may end up in a situation where FW1 goes ACTIVE > DOWN > ACTIVE after a reboot, which has caused heaps of problems in our production network.
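The bounce described above (STANDBY > DOWN > STANDBY, or ACTIVE > DOWN > ACTIVE) can be spotted mechanically once the state changes are written down with timestamps. The following is a rough Python sketch only: the event list is hand-entered, hypothetical data, and `find_flaps` and its `window` parameter are names invented for this example rather than anything provided by the product.

```python
from typing import List, Tuple

# (seconds since reboot, cluster state). Hypothetical values for illustration.
Event = Tuple[float, str]


def find_flaps(events: List[Event], window: float = 120.0) -> List[Tuple[float, float, str]]:
    """Find STATE -> DOWN -> STATE bounces.

    Returns one (went_down_at, recovered_at, state) tuple for each bounce in
    which the member re-enters the same state within `window` seconds of
    first entering it.
    """
    flaps = []
    for i in range(len(events) - 2):
        (t0, s0), (t1, s1), (t2, s2) = events[i], events[i + 1], events[i + 2]
        same_state = s0 == s2 and s0 != "DOWN"
        if s1 == "DOWN" and same_state and (t2 - t0) <= window:
            flaps.append((t1, t2, s0))
    return flaps


if __name__ == "__main__":
    # Member enters STANDBY, loses its bond links, then recovers.
    standby_case = [(300.0, "STANDBY"), (305.0, "DOWN"), (330.0, "STANDBY")]
    print(find_flaps(standby_case))
```

With "switch to higher priority member" the same check would flag the ACTIVE > DOWN > ACTIVE case, which is the one that actually interrupts traffic.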

This is present on all our clusters and with different bond member types (i.e. the 10 Gbps drivers ixgbe and i40e, or the 1 Gbps driver igb).

image.png

 
