Avoid a failover during the VSX rebuild process

sjni01 · ‎2026-02-08

Hi Community,

Some time ago I went through a rebuild where I replaced a firewall with new hardware and ran vsx_util reconfigure from the SMS in expert mode. The goal was to keep the new box from taking the active role (which would push the old member to standby), but while the SMS is performing the rebuild, the new box can try to take over. These few lines ensure that it doesn't take the active role from the old member. This applies to VSX Cluster architectures.

Run them on the new box before running vsx_util reconfigure from the SMS:

[Expert@HostName]# cphastop
[Expert@HostName]# cphaconf fini
[Expert@HostName]# touch /dev/shm/during_vsx_reconfigure
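
If you would rather script the three steps than type them, here is a minimal sketch. The wrapper name prepare_new_member, the run helper, and the DRY_RUN and FLAG_FILE variables are my own hypothetical names, not Check Point tooling; only the three commands themselves come from the steps above. It stops at the first step that fails, and with DRY_RUN=1 it only prints what it would do.

```shell
#!/bin/bash
# Hypothetical wrapper around the three commands above (not a Check Point tool).
# FLAG_FILE defaults to the flag file path used in the steps above.
FLAG_FILE="${FLAG_FILE:-/dev/shm/during_vsx_reconfigure}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Execute the steps in order and stop at the first failure.
prepare_new_member() {
    run cphastop || return 1
    run cphaconf fini || return 1
    run touch "$FLAG_FILE" || return 1
}
```

Calling `DRY_RUN=1 prepare_new_member` first lets you review the sequence before the change window; calling `prepare_new_member` on the new box runs it for real.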

I'm sharing this because it happened to me in a real-world deployment. Without any intervention, and given that specific architecture, the new appliance took the active role and brought down the network, because the reconfiguration wasn't complete: the reboot was still pending, policies hadn't been installed on each VS, and so on. Even so, always try to have Check Point TAC supervising and the logistics in place. I also recommend reading the VSX guide.

Regards,

the_rock · ‎2026-02-10

Excellent advice, thanks for that @sjni01

Best,
Andy
"Have a great day and if its not, change it"

JozkoMrkvicka · ‎2026-02-10

What were the versions of the SMS and VSX?

Do you use VSLS or HA?

If VSLS, did you check in vsx_util vsls how the VSs were distributed? Maybe you had some VSs marked as active on the replaced node, which might then cause management to make it active once it is up.

Kind regards,
Jozko Mrkvicka

Chris_Atkinson · ‎2026-02-10

The other aspect that is sometimes overlooked is Active Up vs. Primary Up.

CCSM R77/R80/ELITE

sjni01 · ‎2026-02-11

Hi Jozko, good point: this is for a VSX environment without VSLS enabled, and the version that showed this behavior was R82. I was able to avoid it with these commands.

emmap · ‎2026-02-10

Typically I advise customers to shut the switch ports to all bar the management (VS0) and Sync interfaces until we're ready for it to go live. That way there's no chance of it taking over the active role in the cluster because it always has an IAC pnote.
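
One way to confirm during the window that the member really is staying out of the active role is to look at the local row of cphaprob state. A minimal sketch, assuming the usual table layout in which the local row is tagged "(local)" and the state is the fifth whitespace-separated column; local_state is a hypothetical helper name, and the layout should be checked against your version's output before relying on it.

```shell
# Hypothetical helper: reads `cphaprob state` output on stdin and prints
# the state column of the row marked "(local)".
# Assumed row layout: "1 (local)  <address>  <load>  <STATE>  <name>".
local_state() {
    awk '/\(local\)/ { print $5; exit }'
}

# Usage on the gateway (not run here):
#   cphaprob state | local_state
```

If the assumption about the layout holds, anything other than an active state on the local row means the member has not taken over.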

sjni01 · ‎2026-02-11

It's a good idea. However, if you don't have networking staff available and need to make the change as soon as possible, the commands I posted could be the best approach. And remember, this is for environments that do not have VSLS enabled.

JozkoMrkvicka · ‎2026-02-12

Yeah, if VSLS is in play, you need to make sure that each VS can see every other member over at least one monitored interface. If not, split-brain will happen.

Kind regards,
Jozko Mrkvicka

emmap · ‎2026-02-12

VSs can see each other over Sync, so you won't get split-brain if Sync is up. I've done this many, many times. Keep management and Sync up through the whole process and there are no clustering problems.

JozkoMrkvicka · ‎2026-02-12

If Sync is up then sure, VSs can handle it.

Kind regards,
Jozko Mrkvicka

emmap · ‎2026-02-12

Oh, another fun tip for VSLS upgrades: if you do 'clusterXL_admin down -p' on the VSs before you upgrade the member, that admin down state survives the upgrade procedure. Just make sure that when you bring them back up you use -p (for 'permanent') again.