Follow the procedure below to upgrade a VSX cluster from R77.XX to R80.20 (VSLS | in-place upgrade with zero downtime).
Things to discuss
A) Management Server
i) Pre-Start Tasks
ii) Operation vsx_util upgrade
B) VSX Upgrade Standby Member
i) Pre-Start Tasks
ii) Upgrade (& Install JHF - optional)
iii) Verification
iv) Connectivity Upgrade
C) VSX Upgrade Active Member
i) Pre-Start Tasks
ii) Upgrade (& Install JHF - optional)
iii) Verification
D) Recovery Plan
A) Management Server
i) Pre-Start Tasks
1) Ensure there are no locks on objects relevant to the VSX upgrade. To list all locked objects in an R80.20 database, open PostgreSQL on the MDS command line:
# $MDS_TEMPLATE/bin/psql_client cpm postgres
2) To see current locks, run:
# select objid, name, dlesession, cpmitable, subquery1.lockingsessionid, subquery1.operation FROM dleobjectderef_data, (SELECT lockedobjid, lockingsessionid, operation FROM locknonos) subquery1 WHERE subquery1.lockedobjid = objid and not deleted and dlesession >=0;
3) To exit out of PostgreSQL: (mandatory!)
# \q
4) To remove current locks from SmartConsole, go to Manage and Settings, view Sessions, locate the sessions where the "Locks" and "Changes" columns are not 0, and publish or discard each session as required.
5) Take snapshots and backups of the MDS and the firewalls before proceeding with the operation below.
6) Ensure Serial Console and/or LOM access is available to cluster members during operations.
ii) Operation vsx_util Upgrade
1) SSH to the Primary MDS and elevate to expert mode
2) Run mdsenv x.x.x.x (switches to the context of the VSX-Master Domain Server)
3) Run # vsx_util upgrade, enter x.x.x.x for the Management Server IP address, and enter admin credentials when prompted
4) Select Desired VSX Cluster Object Name in numerical list to upgrade
5) Select yes and the desired version to upgrade to and wait for operations to complete on management (all associated virtual objects will be updated in all associated Domains managing virtual objects tied to this VSX cluster)
B) VSX Upgrade Standby Member
i) Pre-start Tasks (along with installing a Jumbo Hotfix)
1) Make sure the CPUSE build is up to date, see: sk92449
2) Upload the image to folder /var/log/tmp
3) Upload the Jumbo Hotfix Take_xx to the same or a different directory.
4) Compare the MD5sum of packages
5) To import the file to CPUSE repository:
- > installer import local /var/log/tmp/<>.tgz
- > installer import local /var/log/tmp/<JHF>.tgz
- > quit (exits clish)
6) Ensure that the VSLS status reflects all VSs in Standby state before proceeding with the standby member upgrade (# vsx_util vsls)
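The MD5 comparison in step 4 can be scripted from expert mode. The following is a minimal sketch, assuming the expected checksum has been copied from the download page of the package; the function name verify_md5 is hypothetical and not part of any Check Point tooling.

```shell
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: returns 0 only when the MD5 of FILE equals
# EXPECTED_MD5, 1 on a mismatch, and 2 when FILE does not exist.
verify_md5() {
    md5_file="$1"
    md5_expected="$2"
    if [ ! -f "$md5_file" ]; then
        echo "MISSING: $md5_file"
        return 2
    fi
    # md5sum prints "<hash>  <file>"; keep only the hash field
    md5_actual=$(md5sum "$md5_file" | awk '{print $1}')
    # normalise the expected value to lower case before comparing
    md5_expected=$(printf '%s' "$md5_expected" | tr 'A-F' 'a-f')
    if [ "$md5_actual" = "$md5_expected" ]; then
        echo "OK: $md5_file"
        return 0
    fi
    echo "MISMATCH: $md5_file (got $md5_actual)"
    return 1
}
```

Usage would be along the lines of verify_md5 /var/log/tmp/<JHF>.tgz <expected MD5>, run once per uploaded package before the installer import step.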
ii) Upgrade
1) Run cphaprob state to ensure this member is standby and the peer is active
2) On the ssh session to Standby Member
- Log in to clish
- Run installer upgrade <image number>
- Gateway will reboot when complete!
Jumbo Install (optional)
- On the ssh session to Standby Member
- Log in to clish.
- Run installer verify <# of Take xx>.
- (Verification should come clean with no conflicts. If not, fix any issues and then re-run this step)
- Run installer install <# of Take xx>
- The gateway will automatically reboot when finished.
iii) Verification
- On the ssh session to Upgraded Standby Member
- Run cphaprob state (should show cluster state as "Ready")
- After a minute or two (depending on database size), policy should be installed automatically.
- SSH to the Primary Member (not yet upgraded)
- Run cphaprob state (should show "Active" or "Active Attention" and upgraded peer as "Down")
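The state checks above can be scripted rather than read by eye. The following is a minimal sketch, assuming that the row for the local member in the cphaprob state output contains the marker "(local)" and that the state appears as a word starting with Active, Standby, Ready or Down in any letter case; the function name local_cluster_state is hypothetical.

```shell
# local_cluster_state: reads `cphaprob state` output on stdin and prints the
# state of the local member in upper case.
# Assumptions: the local row contains "(local)", and the state is the first
# field on that row starting with ACTIVE, STANDBY, READY or DOWN.
local_cluster_state() {
    awk '/\(local\)/ {
        for (i = 1; i <= NF; i++) {
            s = toupper($i)
            if (s ~ /^(ACTIVE|STANDBY|READY|DOWN)/) { print s; exit }
        }
    }'
}
```

On the upgraded member this would be used as cphaprob state | local_cluster_state, with the result compared against the state expected at that point in the procedure. If the function prints nothing, the output format differs from what is assumed here and the state should be checked manually.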
iv) Commence Connectivity Upgrade Script (Will sync connections for all VSs)
Turn off SecureXL
- On the ssh session to Primary non-upgraded Active Member
- Elevate to Expert Mode
- Run vsenv 0 (to ensure you are in the main VSX GW context)
- Run fwaccel off -a (This disables SecureXL and Templates so that delayed connections are synchronized with the peer)
- Run fwaccel stat -a (to verify SecureXL is disabled)
- On the ssh session to Standby
- Go to Expert Mode
- Run cphacu start
- cphacu will show connection sync status and inform if ready for failover
C) VSX Upgrade Active Member
i) Pre-start Tasks
1) Make sure the CPUSE build is up to date, see: sk92449
2) Upload the image to folder /var/log/tmp
3) Upload the Jumbo Hotfix Take_xx to the same directory.
4) Compare the MD5sum of packages
5) To import the file to CPUSE repository:
- > installer import local /var/log/tmp/<>.tgz
- > installer import local /var/log/tmp/<JHF>.tgz
- > quit (exits clish)
6) Ensure that the VSLS status reflects all VSs in Active state before proceeding with the active member upgrade (# vsx_util vsls)
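For the MD5 comparison in step 4, one option is to check all uploaded packages in a single pass against a checksum list. The following is a minimal sketch, assuming a manifest in md5sum format ("<hash>  <filename>") has been created beforehand from known-good copies of the packages; the function name check_manifest and the manifest name packages.md5 are hypothetical.

```shell
# check_manifest DIR MANIFEST
# Hypothetical helper: verifies every file listed in MANIFEST against the
# files in DIR. MANIFEST must be an absolute path or relative to DIR, and
# could be produced from known-good copies with: md5sum *.tgz > packages.md5
check_manifest() {
    manifest_dir="$1"
    manifest_file="$2"
    # run in a subshell so the caller's working directory is unchanged;
    # md5sum -c returns non-zero if any listed file fails the check
    ( cd "$manifest_dir" && md5sum -c "$manifest_file" )
}
```

Usage would be along the lines of check_manifest /var/log/tmp packages.md5; a non-zero exit status means at least one package should be uploaded again before the installer import step.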
ii) Upgrade
1) Turn off SecureXL
Run fwaccel stat -a (to verify SecureXL is disabled)
2) Failover connections to Standby Upgraded Member – R80.20
- On the ssh session to Primary non-upgraded Active Member
- Run cpstop
- SSH to standby and run cphaprob state (should show its cluster state as active)
- Run cphacu stat on standby (For connectivity Upgrade status, should show handling connections)
- Run cphacu stop on standby (to halt the connectivity upgrade process)
3) On the ssh session to Primary Member
- get into clish
- Run installer upgrade <Image number>
- Gateway will reboot when complete
4) Jumbo install (optional)
- On the ssh session to Primary Member (the member just upgraded)
- Get into clish.
- Run installer verify <# of Take xx>.
- (Verification should come clean with no conflicts. If not, fix any issues and then re-run this step)
- Run installer install <# of Take xx>
- The gateway will automatically reboot when finished.
iii) Verification
- The state should now show up as Active/Standby
- We do not expect to see any traffic drops.
- Ensure that SecureXL is turned on on both nodes
D) Recovery Plan
1) Restore the snapshots on all servers in question.
Alternatively,
2) Management Server: Run mds_restore
3) VSX Servers:
- Fresh install
- First-time wizard
- Run vsx_util reconfigure from MDS