Downtime during failover

Hi everyone.

I'm currently deploying a single-site, dual-SSM cluster in a lab environment. So far I've been able to set up both MHOs and a single security group with 2 SGMs.

I've been performing some failover tests (taking down 1 SSM, taking down 1 SGM), and I've noticed that when the failover occurs, established sessions running through the security group can fail for about a second. However, when the SSM or SGM comes back online, I lose connections for much longer (around 30 seconds).

Is that an expected downtime when a Maestro asset comes back from the dead? I may be missing something, so I'm open to any advice on where to start my troubleshooting.

Best regards

Starting with some of the basics:

 - What version and JHF are the components installed with? 

 - Are the corresponding switchports on the network side configured with portfast?



Hi Chris, 

Thank you for the really prompt reply. Following your comment, I've updated all assets to Take 94 (the initial install was R81.10sp T338, followed by an update to T81), although I'm sure T81 was fine. Portfast was already set up on the network side, but I checked my configuration again to make sure everything was OK. It turns out that the uplinks used for the management bond of the security group were set up incorrectly: the two uplinks were not part of a port-channel on the network side, which messed up the ARP tables whenever a failover occurred. It also showed me that the traffic I was monitoring was wrongly routed through the management bond (but that's another problem).


Right now the failover is working fine: I'm seeing at most one lost ping during the failover and during the recovery.
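
For anyone repeating this kind of test, here is a minimal sketch of how the ping loss can be turned into outage durations. It assumes one probe per second and that you have already collected the replies as a list of booleans (True = reply received); the function name `outage_windows` and the sample trace are made up for illustration and only mirror the symptoms described above, they are not real measurements.

```python
from typing import List, Sequence


def outage_windows(results: Sequence[bool], interval_s: float = 1.0) -> List[float]:
    """Return the approximate duration (seconds) of each run of lost probes.

    results    -- one entry per probe, True if a reply came back (assumed input)
    interval_s -- time between probes; 1 second is an assumption
    """
    windows: List[float] = []
    run = 0  # number of consecutive lost probes in the current outage
    for ok in results:
        if ok:
            if run:
                windows.append(run * interval_s)
            run = 0
        else:
            run += 1
    if run:  # trace ended while probes were still being lost
        windows.append(run * interval_s)
    return windows


# Synthetic trace: a short blip at failover, a long gap when the member returns.
trace = [True] * 5 + [False] + [True] * 10 + [False] * 30 + [True] * 5
print(outage_windows(trace))
```

Running the same check before and after a change (such as fixing the port-channel) makes it easy to compare the failover blip with the gap when the member comes back.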

Anyway, your guidance helped me fix the issue by leading me to the right troubleshooting steps to take, so thank you very much. 


Have a nice day!



Does not sound right


