Kaspars_Zibarts
Employee

R77.30 VSX appliance upgrade to R80.10

Hey! 

We just started lab testing these and hit a roadblock right from the start.

Firstly, the CP documentation is very ambiguous - the same R80.10 installation and upgrade document contradicts itself:

  • CPUSE upgrade is OK (see Upgrading Security Management Servers and Security Gateways, Upgrading a VSX Gateway section). It says that after the vsx_util upgrade part you can skip vsx_util reconfigure (= fresh install) if you used CPUSE. This is confirmed by the actual CPUSE verifier on the R77.30 gateway via CLI - it reports that the upgrade is allowed.
  • Clean install only (see Upgrading ClusterXL Deployments, Connectivity Upgrade, Upgrading VSX High Availability Cluster): it says "Upgrade the Standby cluster member with a clean install."

Now I have tried both using CPUSE and both attempts failed. The reason is that the interface naming script changes from the appliance-specific one (in our case a 5900 - /etc/udev/rules.d/00-PL-40-00.rules) to the generic open-server one, /etc/udev/rules.d/00-OS-XX.rules! So the extension slot interfaces are not called eth1-0x anymore!
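For anyone who wants to check this on their own box before and after the upgrade, something along these lines from expert mode should show which rules file is in place and what the interfaces actually got called (the grep patterns are just illustrative):

[Expert@HostName:0]# ls /etc/udev/rules.d/
[Expert@HostName:0]# grep -i 'eth1-0' /etc/udev/rules.d/*.rules
[Expert@HostName:0]# ip link show | grep -o 'eth[0-9a-z-]*' | sort -u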

Very odd!

I reverted the same appliance to its pre-VSX state on R77.30 and ran a clean install of R80.10 using CPUSE - that all worked OK, with interface names assigned correctly by the appliance-specific script.

This basically means that CPUSE is useless for VSX upgrades and we go back to the old-school method of a full re-install from ISO? Am I correct? Has anyone done VSX appliance upgrades, and what was your approach?

Will be testing the same with open servers later today.

29 Replies
Kaspars_Zibarts
Employee

Interestingly enough, the same VSX cluster replicated in ESX VMs seems to work - CPUSE (CLI) upgraded with no issues, and after running vsx_util upgrade on the cluster object all VSes came up! And the interfaces in /etc/udev/rules.d/00-OS-XX.rules were copied over to R80.10!

Looks like a genuine bug with CPUSE and appliance upgrades! Going to raise an SR.

PhoneBoy
Admin

Most of the configuration (especially with VSX) is on the management side.

A fresh install of the gateway has always made more sense than an in-place upgrade, but that's just my take.

Markus_Malits
Participant

Hi!

Do you have any news on your case yet? I am doing a 23800 VSX cluster upgrade from R77.30 to R80.10 today - I will check and report if we run into issues as well.

Markus

Izhar_Shoshani_
Employee

The 5900 CPUSE upgrade (as well as other appliances) was tested and we didn't find any issues.

It may be a unique issue in your environment.

Please let me know the SR number, and we will further check the issue.

Markus_Malits
Participant

Hi!

In our 23800 cluster upgrade we did the procedure that support suggested in SR 1-9718669681, which is the fresh install including Connectivity Upgrade (CU), and it worked like a charm.

So after all, would you suggest trying the in-place upgrade (CPUSE via CLI) first next time?

One other thing: are there any plans to re-enable the WebUI in VSX? It's a bit strange to me and to customers to have CLI-only in VSX in 2017.

BR
Markus

Izhar_Shoshani_
Employee

Hi,

Yes, next time you can do it with CPUSE via CLI.

Currently, there are no concrete plans to enable the WebUI in VSX mode.

We are considering other ways to configure the OS part in VSX mode.

Regards,

Izhar

Markus_Malits
Participant

ok great!

In regards to the lockdown of the WebUI and CLI: I guess this has been done to make inconsistencies between the VSX config and the CLI config impossible.

Please consider a way to shut interfaces: it is, for example, not possible to do this in VSX (also not via CLI), so a completely prepared (replacement) setup which you want to swap in at a certain time has to be disconnected physically (or shut on the switch side) until go-live.

It would be neat if you could shut all interfaces apart from sync and mgmt...

BR

Markus

Kaspars_Zibarts
Employee

Sorry it took a while to reply - crazy busy scheduling and testing upgrades. Actually I have not raised an SR, as we simply don't have time right now for different reasons :( We will go ahead with the clean install and vsx_util reconfigure as in the old days - at least we know that works 100%!

Kaspars_Zibarts
Employee

Final update. With the production 13800 VSX cluster, CPUSE CLI worked like a charm. I must admit we made a small mistake (due to time limitations during testing) and upgraded the cluster object in Mgmt after we had upgraded the first member, so that one needed a gateway reset and vsx_util reconfigure, as it failed to fetch config and policies during first boot (doh). But the actual upgrade process completed initially, so I'm 100% confident it would have been a total success if we had done the Mgmt VSX object upgrade first. The second cluster member worked seamlessly!

So I must admit, I've been using VSX since 2005 and remember how painful upgrades used to be. This is a major step forward - well done, Check Point team!

Would be nice if this was actually documented in a bit less confusing way :)

If I have time, I will check the lab 5900 boxes again, as I'm still curious what went wrong there.

Djelo_Arnautali
Participant

Can you just clarify your steps please?

1. In-place upgrade via CPUSE CLI of the standby member

2. vsx_util upgrade, and vsx_util reconfigure the upgraded member???

3. cphacu start on the upgraded member

4. On the active member, do the cpstop

5. In-place upgrade of the remaining box

6. vsx_util reconfigure the remaining box

7. Push the policy

Regards,

Eyal_Rashelbach
Employee

Hi Kaspars,

I noted that the above answers helped you succeed with your migration to R80.10.
We would appreciate it if you shared your experience with us in our R80.10 Survey.

If needed, I will be happy to assist you with personal attention.

Eyal Rashelbach


R80 Desk Manager, Solution Center | Check Point Software Technologies

DR_74
Collaborator

Hello Eyal,

Is it possible for Check Point to release a clear step-by-step guide on how to upgrade a VSX cluster from R77.30 to R80.10 with CPUSE? The upgrade guide provided on the support site is really confusing...

Thanks

Romain

Kaspars_Zibarts
Employee

Hi Romain! I'm not Check Point but will try to help - it's not a straightforward case, as it depends on many factors, like how many cluster members you have or whether you use dynamic routing. I will just assume the simple case: two cluster members, no dynamic routing, and all VSes active on one box.

Ideally, test all your upgrades in a VM lab if that's an option for you. I don't have exact screenshots as we did this a long time ago, but hopefully it helps.

Make sure your gateway has connectivity to Check Point (check manually, or use the script from sk83520: How to check connectivity to Check Point).
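For a quick manual check from expert mode, something like this will do (the host name here is just the usual suspect - sk83520 has the authoritative list):

[Expert@HostName:0]# curl_cli -v https://updates.checkpoint.com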

The CPUSE reference guide is here: Check Point Upgrade Service Engine (CPUSE) - Gaia Deployment Agent

1. Back up all your systems - gateways and management - so you have a solid rollback.
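For what it's worth, a rough sketch of that in Gaia CLISH (the names are just examples - a snapshot gives you the fastest full rollback, a backup is the lighter-weight option):

HostName:0> add snapshot pre_R80_10_upgrade desc "before R77.30 to R80.10"
HostName:0> show snapshots
HostName:0> add backup local
HostName:0> show backup status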

2. Update the VSX cluster object version on the management server using the vsx_util upgrade command (a short session sketch follows the prompt list below):

When prompted, enter this information:
a) Security Gateway or main Domain Management Server IP address
b) Administrator name and password
c) Cluster name (if the VSX Gateway is a cluster member)
d) Version to upgrade to: R80.10
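From expert mode on the management server it is simply:

[Expert@mgmt:0]# vsx_util upgrade

and then you answer the four prompts above (management IP, admin credentials, cluster name, and R80.10 as the target version).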

3. Log into the standby VSX member's CLISH and perform the CPUSE upgrade:

HostName:0> show installer packages available-for-download

HostName:0> installer download <Package_Number for R80.10>

HostName:0> installer verify [press Space, then Tab to list the package numbers]
HostName:0> installer verify <Package_Number for R80.10>

You should see something like this:

Result: Verifier results: Clean Install: Installation is allowed. Upgrade: Upgrade is allowed.
Status: Available for Install

HostName:0> installer upgrade [press Space, then Tab to list the package numbers]
HostName:0> installer upgrade <Package_Number>

The appliance should reboot itself at the end.

Wait until VSX is up again and verify that you are running R80.10.
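Something like this covers the check - show version all in CLISH for the Gaia build, then fw ver and vsx stat -v from expert mode to confirm the firewall version and that all VSes are up with their policies:

HostName:0> show version all
[Expert@HostName:0]# fw ver
[Expert@HostName:0]# vsx stat -v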

4. Cutover from R77 to R80 (reference: Connectivity Upgrade R77.x and R80.x Versions Best Practices)

On the upgraded member, run: cphaprob state

Make sure that this cluster member is in the Ready state.

On the cluster member that still runs the previous version, run: cphaprob state

Make sure that this cluster member is in Active or Active Attention state, and that the upgraded member is in Down state.

On the upgraded member, run: cphacu start no_dr

Make sure that the Connectivity Upgrade is complete.
On the cluster member that still runs the previous version, run:
vsenv 0
fwaccel off -a
fwaccel stat -a
Make sure SecureXL is disabled (off) on all VSes. This is required to synchronize delayed connections.

Still on the old member, run: cpstop (this is the cutover!)
On the upgraded cluster member, run: cphaprob state
Make sure this cluster member is in Active state.

cphacu stat
Make sure this cluster member handles the traffic.

cphacu stop
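A convenient way to keep an eye on the whole of step 4 is a second terminal on the upgraded member just polling state - nothing Check Point specific, plain watch from expert mode:

[Expert@HostName:0]# watch -d -n 2 'cphaprob state; echo; cphacu stat'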

5. Do your testing on the newly upgraded member.

6. Upgrade the remaining cluster member just like in step 3. Once the upgrade is complete and all VSes are up, traffic should automatically fail back to it.

Hope it helps!

Kaspars

Muazzam_Saeed
Participant

For step 4 (cutover), was there any downtime?

Kaspars_Zibarts
Employee

Nope - that sequence will synchronise connections from the old member to the new one, so as long as your regular failovers work, it will be no different.

Technically it's an upgrade, so things can go wrong, of course - but it worked for us without issues.

Alex_Lam1
Contributor

It's in the R80.20 / R80.x documentation as the zero-downtime upgrade.

Daniel_Taney
Advisor

So just to be 100% clear... if you upgrade via CPUSE, you do not have to run vsx_util reconfigure?

_Val_
Admin

On the contrary - you need to run it on the MGMT side before upgrading the enforcement points with CPUSE.

Daniel_Taney
Advisor

Ok. Because the documentation says to run vsx_util upgrade, and to skip vsx_util reconfigure if you upgraded VSX via CPUSE.

_Val_
Admin

Well, technically vsx_util upgrade is the first part of vsx_util reconfigure. I see your point, and it seems I have misread your question. Follow the upgrade guide: vsx_util reconfigure is required when you are installing from scratch; otherwise, it is vsx_util upgrade.
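In other words, both are run on the management server; as a one-line summary of each (not an exhaustive description):

[Expert@mgmt:0]# vsx_util upgrade        # CPUSE in-place path: bumps the cluster object version in the database
[Expert@mgmt:0]# vsx_util reconfigure    # clean-install path: re-pushes the full VSX configuration to a freshly installed member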

Daniel_Taney
Advisor

Ok, perfect! Thanks for the clarification!

Vladimir
Champion

Just did this for one of my clients.

Thank you for the awesome guide!

A few things of note: we were doing a fresh install on both cluster members, not a CPUSE upgrade, as TAC for some reason stated that the CPUSE upgrade is unsupported.

After step 6 in your guide, the last upgraded member became "Active" and VS0 on the first upgraded member was in "Standby", but the two VSes running on it were in the "Down" state, complaining about the "routed" pnote.

Executing "cpstop;cpstart" on it revived the VSes and placed them in "Standby" mode.

All VSes were participating in OSPF routing, so that may have been a factor.
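For anyone hitting the same pnote, a rough way to inspect it per VS from expert mode (the VS ID here is just an example):

[Expert@member:0]# vsenv 2
[Expert@member:2]# cphaprob -l list
[Expert@member:2]# cpstop; cpstart

cphaprob -l list shows the registered pnotes and which one is reporting a problem; the cpstop;cpstart at the end is the restart that cleared it in this case.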

Kaspars_Zibarts
Employee

Thanks for taking time to add useful notes!

Daniel_Taney
Advisor

I just wanted to take a moment to thank everyone who contributed to this thread! Based on everyone's feedback, my VSX R77.30 to R80.10 migration went off without a hitch! The process was about as smooth and textbook as I could have hoped for! So, thanks again for everyone's input. It definitely helped ensure a clean migration!

Mark_Tremblay
Explorer

What options are available after step 5 if testing on the newly upgraded member goes badly? Are you able to fail back to the R77.30 member, or do we need to restore from backup? Just looking for back-out options.

Thanks,

Mark

Kaspars_Zibarts
Employee

Yep - just cpstart the "old" member and cpstop the "upgraded" one. It will be a hard failover, as in you will lose the connections table. As for reverting the upgraded member - I would take a snapshot before the upgrade and use that to revert, just to be 100% sure that you revert to the full original state.
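If memory serves, the revert itself is just this from Gaia CLISH (the snapshot name being whatever you created before the upgrade; the box reboots back into the old image):

HostName:0> show snapshots
HostName:0> set snapshot revert pre_R80_10_upgrade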

Mark_Tremblay
Explorer

Thanks for the quick reply! What about the management server? Would we need to revert it to a snapshot as well, due to the vsx_util upgrade command in step 2?

Thanks again.

_Val_
Admin

Yes, you need to revert management to the pre-upgrade snapshot. Same for both VSX cluster members. Alternatively, you can re-install the upgraded VSX cluster member and perform vsx_util reconfigure.

Kaspars_Zibarts
Employee

True, true - my bad. I just assumed the question was about how quickly you can be running on the previous release. Then the steps I gave would be sufficient, but not a full recovery, as the cluster object version in management would be incorrect and SIC would have to be re-established, as Valeri pointed out.

I would say as long as you revert the snapshot on Mgmt and on the upgraded gateway you should be OK. The gateway that was not upgraded was not really touched, so it should work as-is.
