sjni01
MVP Diamond

Avoid a failover during the VSX rebuild process

Hi Community,

Some time ago I rebuilt a VSX cluster member, replacing a firewall with new hardware and running vsx_util reconfigure from the SMS in expert mode. While the SMS performs the rebuild, the new box can try to take over the active role from the existing member, pushing that member into standby. The few lines below make sure it doesn't. This applies to VSX Cluster architectures.

Apply these on the new box before running vsx_util reconfigure from the SMS:

[Expert@HostName]# cphastop
[Expert@HostName]# cphaconf fini
[Expert@HostName]# touch /dev/shm/during_vsx_reconfigure
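The three steps above can be wrapped in a small helper so they always run in the same order. This is only a sketch under assumptions: the function names and the DRY_RUN switch are hypothetical and not part of any Check Point tooling, and the commands and flag file path are copied from the post as-is. It defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Sketch: run the pre-reconfigure steps from the post on the NEW box.
# DRY_RUN (hypothetical switch) defaults to 1, so nothing is executed
# unless it is explicitly set to 0.
DRY_RUN="${DRY_RUN:-1}"

# Flag file path taken verbatim from the post.
FLAG_FILE="/dev/shm/during_vsx_reconfigure"

run_step() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_new_member() {
    # Same order as in the post; stop at the first failing step.
    run_step cphastop || return 1
    run_step cphaconf fini || return 1
    run_step touch "$FLAG_FILE" || return 1
}
```

Calling prepare_new_member with the default settings prints the three commands for review. Setting DRY_RUN=0 in expert mode on the new box would actually run them.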

I'm sharing this because it happened to me in a real-world deployment. Without any intervention, and given that specific architecture, the new appliance took over as active and brought down the network, because the reconfiguration was not yet complete: the reboot was still pending, policies had not been installed on each VS, and so on. In any case, try to have Check Point TAC supervising and the logistics in place. I also recommend reading the VSX guide.


Regards,

10 Replies
the_rock
MVP Diamond

Excellent advice, thanks for that @sjni01 

Best,
Andy
"Have a great day and if its not, change it"
JozkoMrkvicka
Authority

What were the versions of the SMS and VSX?

Do you use VSLS or HA?

If VSLS, did you check in vsx_util vsls how the VSs were distributed? Maybe some VSs were marked as active on the replaced node, which could then force management to make it active once it is up.

Kind regards,
Jozko Mrkvicka
Chris_Atkinson
MVP Platinum CHKP

The other aspect that is sometimes overlooked is Active Up vs. Primary Up.

CCSM R77/R80/ELITE
sjni01
MVP Diamond

Hi Jozko, good point. This was a VSX environment without VSLS enabled, and the version that showed this behavior was R82. These commands let me avoid it.

emmap
MVP Gold CHKP

Typically I advise customers to shut the switch ports to all bar the management (VS0) and Sync interfaces until we're ready for it to go live. That way there's no chance of it taking over the active role in the cluster because it always has an IAC pnote.

sjni01
MVP Diamond

It's a good idea. However, if you don't have networking staff available and need to make the change as soon as possible, the commands from my post could be the best approach. And remember, this is for environments that do not have VSLS enabled.

JozkoMrkvicka
Authority

Yeah, if VSLS is in play, then you need to make sure that each VS can see every other member over at least one monitored interface. If not, split-brain will happen.

Kind regards,
Jozko Mrkvicka
emmap
MVP Gold CHKP

VSs can see each other over Sync, so you won't get split-brain if Sync is up. I've done this many, many times. Keep management and Sync up through the whole process and there are no clustering problems.

JozkoMrkvicka
Authority

If Sync is up then sure, VSs can handle it.

Kind regards,
Jozko Mrkvicka
emmap
MVP Gold CHKP

Oh, another fun tip for VSLS upgrades: if you run 'clusterXL_admin down -p' on the VSs before you upgrade, that admin-down state survives the upgrade procedure. Just make sure that when you set them back to up you use -p (for 'permanent') again.
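For several Virtual Systems, the tip above could be looped roughly as follows. This is a sketch, not a tested procedure: the function names and DRY_RUN switch are hypothetical, the VS IDs are supplied by the operator, and it assumes the vsenv context switch is available in the shell (in expert mode on a VSX gateway). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: put a list of Virtual Systems into permanent admin-down
# before an upgrade. DRY_RUN (hypothetical switch) defaults to 1.
DRY_RUN="${DRY_RUN:-1}"

admin_down_vs() {
    # $1 = VS ID supplied by the operator.
    echo "+ vsenv $1"
    echo "+ clusterXL_admin down -p"
    if [ "$DRY_RUN" = "0" ]; then
        # Assumption: vsenv is defined in the current shell.
        vsenv "$1" && clusterXL_admin down -p
    fi
}

admin_down_all() {
    # Usage: admin_down_all 1 2 3
    for vsid in "$@"; do
        case "$vsid" in
            ''|*[!0-9]*)
                # Reject anything that is not a plain number.
                echo "skipping invalid VS ID: $vsid" >&2
                continue
                ;;
        esac
        admin_down_vs "$vsid" || return 1
    done
}
```

Running admin_down_all 1 2 in dry-run mode lists the context switch and the admin-down command for each VS. After the upgrade, the matching step would be 'clusterXL_admin up -p' per VS, as the reply notes.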
