T_L
Contributor

1450 Firmware Upgrade Failure

Good Afternoon -

We are in the process of upgrading fifty 1450 SMB gateways from R77.20.87_990172947 to R77.20.87_990173083.

We are working remotely, using the Web Interface and the manual upgrade process. Forty of the gateways accepted the upload, rebooted, and upgraded.

Ten of the gateways accepted the upload, rebooted, and simply didn't upgrade; they stayed on the same version.

Has anyone seen this behavior? Unfortunately, we currently do not have the ability to upgrade these locally. We will open a case with support too; I was just hoping someone might recognize it!

 

Thanks!

T

21 Replies
the_rock
Leader

Hm, that sounds strange. If all 10 gateways failed, it sounds like it could be something with the firmware itself. Are there any specific errors or logs about the upgrade failure?

T_L
Contributor

The firmware image worked successfully on a number of the other gateways. In many cases we downloaded fresh images in case something was wrong with the file. We also went back to the gateways that did not upgrade and tried several different images.

T_L
Contributor

We compared the upgrade log files from a successful upgrade and a failed one. There are very few differences apart from the last dozen or so lines: a failed upgrade simply stops and reloads without upgrading, while a successful upgrade moves through to the validation, reloads, and upgrades.

FAILURE

2021-Apr-27-21:50:01: Executing command: '/bin/sync'
2021-Apr-27-21:50:01: Executing command: '/bin/umount /mnt/inactive'
umount: can't unmount /mnt/inactive: Device or resource busy
2021-Apr-27-21:50:01: Executing command: '/bin/rm -rf /storage/revert_image.img'
2021-Apr-27-21:50:02: Image upgrade process completed successfully.
2021-Apr-27-21:50:02: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set rebootTime = '60''
2021-Apr-27-21:50:02: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set reboot = '1''
2021-Apr-27-21:50:02: Executing command: '/bin/rm -rf /fwtmp/upgradeRevertInProgress'
2021-Apr-27-21:50:02: Exiting with error code 0.

END

 

SUCCESSFUL

2021-Apr-27-01:35:45: Executing command: '/bin/sync'
2021-Apr-27-01:35:45: Executing command: '/bin/umount /mnt/inactive'
umount: can't unmount /mnt/inactive: Device or resource busy
2021-Apr-27-01:35:45: Executing command: '/bin/rm -rf /storage/revert_image.img'
2021-Apr-27-01:35:45: Image upgrade process completed successfully.
2021-Apr-27-01:35:45: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set rebootTime = '60''
2021-Apr-27-01:35:46: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set reboot = '1''
2021-Apr-27-01:35:46: Executing command: '/bin/rm -rf /fwtmp/upgradeRevertInProgress'
2021-Apr-27-01:35:46: Exiting with error code 0.
2021-Apr-27-01:40:44 - upgrade_sanity_checks_interval is missing or wrong in /opt/fw1/conf/upgrade_sanity_checks_props.sh
2021-Apr-27-01:40:44 - upgrade_sanity_checks_period is missing missing or wrong in /opt/fw1/conf/upgrade_sanity_checks_props.sh
2021-Apr-27-01:40:44 - upgradeSanityChecks is missing in /opt/fw1/conf/upgrade_sanity_checks_props.sh
2021-Apr-27-01:40:44 - upgrade_sanity_checks_interval=2
2021-Apr-27-01:40:44 - upgrade_sanity_checks_period=30
2021-Apr-27-01:40:44 - upgradeSanityChecks=mgmt_connectivity_and_policy_revert_verification
2021-Apr-27-01:40:44 - isPreUpgradePolicyNonDefault=true (relevant after upgrade)
2021-Apr-27-01:40:44 - preUpgradeConnectivityOK=true
2021-Apr-27-01:40:44 - isPreUpgradePolicyNameLocal=false
2021-Apr-27-01:40:44 - Pre-upgrade connectivity to management was OK. Now testing after upgrade
2021-Apr-27-01:40:44 - Running Connectivity Validations.
2021-Apr-27-01:40:45 - Upgrade validation. Policy installed successfully.
2021-Apr-27-01:40:45 - Upgrade validation is OK
2021-Apr-27-01:40:45: Disabling watchdog.
2021-Apr-27-01:40:45: Post-boot completed successfully.
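
For anyone who wants to repeat the comparison above on more gateways, here is a minimal Python sketch. It assumes the upgrade logs have been copied off the appliances as plain text; the function names `normalize` and `missing_tail` are made up for illustration and are not part of any Check Point tooling. It strips the timestamps so two runs line up, then returns whatever the successful log contains after the point where the failed log stops.

```python
import re

# Leading timestamp in either "2021-Apr-27-21:50:01: " or "2021-Apr-27-01:40:44 - " form.
TIMESTAMP = re.compile(r"^\d{4}-[A-Za-z]{3}-\d{2}-\d{2}:\d{2}:\d{2}(?::| -)\s*")


def normalize(log_text):
    """Drop blank lines and leading timestamps so two runs can be compared."""
    lines = []
    for raw in log_text.splitlines():
        raw = raw.strip()
        if raw:
            lines.append(TIMESTAMP.sub("", raw))
    return lines


def missing_tail(failed_text, successful_text):
    """Return the successful log's lines after the point where the two logs stop matching."""
    failed = normalize(failed_text)
    successful = normalize(successful_text)
    shared = 0
    while (shared < min(len(failed), len(successful))
           and failed[shared] == successful[shared]):
        shared += 1
    return successful[shared:]
```

Run against the two excerpts above, the returned tail should start at the first `upgrade_sanity_checks` line, which is the post-reboot stage the failed gateways never reach.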

 

Tom_Hinoue
Collaborator

Is there any chance that /fwtmp, or any other directory, has filled up on the 10 gateways?
It should be cleared during the upgrade process, though.

[Expert@]# df -h
Filesystem   Size     Used     Available  Use%  Mounted on
tmpfs        20.0M    6.0M     14.0M      30%   /tmp
tmpfs        60.0M    22.6M    37.4M      38%   /fwtmp
ubi2_0       890.9M   59.2M    827.0M     7%    /logs
ubi3_0       278.1M   174.1M   99.2M      64%   /storage
ubi0_0       159.4M   131.9M   27.5M      83%   /pfrm2.0
tmpfs        14.0M    9.0M     5.0M       65%   /tmp/log/local
tmpfs        500.0M   0        500.0M     0%    /tetmp

If it is /fwtmp, try following the instructions in ["fwtmp" folder is filling up, causing crashes on SMB appliances] to free some disk space, and then test the upgrade again. (Note: some report statistics may be cleared by this operation.)
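
If you are checking many gateways, the `df -h` output can be screened with a small script on your workstation. The following is only a sketch in Python: it assumes you have already captured the `df -h` text from each gateway, and the function name `full_filesystems` and the 80% threshold are arbitrary choices for illustration, not Check Point guidance.

```python
def full_filesystems(df_output, threshold=80):
    """Return (mount point, use %) pairs at or above threshold from `df -h` text."""
    flagged = []
    for line in df_output.splitlines():
        fields = line.split()
        # Data rows have six fields, the fifth being a number followed by '%'.
        # The header row splits into seven fields and is skipped.
        if len(fields) == 6 and fields[4].endswith("%") and fields[4][:-1].isdigit():
            use = int(fields[4][:-1])
            if use >= threshold:
                flagged.append((fields[5], use))
    return flagged
```

With the sample output above and the default threshold, only /pfrm2.0 (83%) is flagged.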

T_L
Contributor

A space issue is one of our avenues of investigation. The /fwtmp directories on our gateways are sized at 60M and are all running at approximately 30%. The /pfrm2.0 directory always runs nearly full, at 85-95%, but that seems to be consistent across all of our 1450s, and we are not sure what specifically to target for removal.

We did notice something odd tonight: all of the gateways that accepted the upload and rebooted, but did not actually upgrade, have the CORRECT firmware image in the 'Previous Image Name' field when doing < show diag >.

Current system info
-----------------------------------
Current image name: R77_990172947_20_87
Current image version: 947
Previous image name: R77_990173083_20_87
Previous image version: 083

I actually tried to revert one of them but it crashed the gateway. 
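
To spot this symptom across a fleet without reading every < show diag > by hand, something like the following Python sketch could be used. It assumes the "Current system info" block has been captured as text in the format shown above; `parse_image_info` and `landed_in_previous_slot` are hypothetical helper names, and the build number is passed in by the caller.

```python
def parse_image_info(diag_text):
    """Collect the 'key: value' lines of the 'Current system info' block into a dict."""
    info = {}
    for line in diag_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip().lower()] = value.strip()
    return info


def landed_in_previous_slot(diag_text, target_build):
    """True when the target build is listed as the previous image but not the current one."""
    info = parse_image_info(diag_text)
    current = info.get("current image name", "")
    previous = info.get("previous image name", "")
    return target_build in previous and target_build not in current
```

For the output above, `landed_in_previous_slot(text, "990173083")` returns True, which matches what we see on every gateway that failed to upgrade.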

Tom_Hinoue
Collaborator

Hmm, I'm not sure about build 990172947, but so far we are not aware of any issues upgrading from 990172960 or 990173004 to 990173072/990173083. (FYI)

Can you check whether it resolves if you upgrade to the same image build, 990173083? (Only if possible.)
You should be able to switch between builds, or upgrade to the same image, as long as it is on the same R77.20.87 line.

Otherwise, you might want to engage TAC, given the number of gateways you have with the same issue.
Hope it's not a database issue...

T_L
Contributor

That was a good idea, but ultimately no dice! I tried to upgrade one of the issue gateways to build 990173004, intending to upgrade to 990173083 after that. It did the same thing: it took the upload, reloaded, and failed to upgrade, but put the image in the 'Previous Image Name' field.

The upgrade logs for each failure show that the databases are successfully migrated to the new version each time.

The upgrade logs also show that the appliance was upgraded successfully and that the reboot was initiated, but then no validation is completed and the log just ends.

2021-Apr-28-06:02:16: Executing command: '/opt/fw1/bin/cp_write_syslog.sh [System Operations] Successfully upgraded the appliance software version'
2021-Apr-28-06:02:16: Executing command: '/bin/sync'
2021-Apr-28-06:02:16: Executing command: '/bin/umount /mnt/inactive'
umount: can't unmount /mnt/inactive: Device or resource busy
2021-Apr-28-06:02:16: Executing command: '/bin/rm -rf /storage/revert_image.img'
2021-Apr-28-06:02:16: Image upgrade process completed successfully.
2021-Apr-28-06:02:16: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set rebootTime = '60''
2021-Apr-28-06:02:17: Executing command: '/pfrm2.0/bin/sqlcmd update reboot set reboot = '1''
2021-Apr-28-06:02:17: Executing command: '/bin/rm -rf /fwtmp/upgradeRevertInProgress'
2021-Apr-28-06:02:17: Exiting with error code 0.

END
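
A rough way to triage a batch of collected upgrade logs is to look for the two marker strings that appear in the excerpts in this thread. This is a Python sketch under that assumption only: the markers are copied from the logs posted here, the category labels and the function name `classify_upgrade_log` are invented for illustration, and other firmware builds may word these lines differently.

```python
# Marker strings copied from the log excerpts in this thread.
IMAGE_WRITTEN = "Image upgrade process completed successfully."
POST_BOOT_DONE = "Post-boot completed successfully."


def classify_upgrade_log(log_text):
    """Label a log by how far the upgrade got, based on the two markers above."""
    if POST_BOOT_DONE in log_text:
        return "validated"
    if IMAGE_WRITTEN in log_text:
        # The image was written and the reboot requested, but nothing was logged after boot.
        return "no post-boot validation"
    return "image not written"
```

The log above would come back as "no post-boot validation", the same category as the earlier FAILURE excerpt.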

T_L
Contributor

Current system info
-----------------------------------
Current image name: R77_990172947_20_87
Current image version: 947
Previous image name: R77_990173004_20_87
Previous image version: 004

G_W_Albrecht
Legend

I would also suggest TAC - installing as the previous image is new even to me 😎

Greg_Harbers
Participant

Hi T_L,

Did you get to the bottom of this?

I am having exactly the same issue, except on some of our 1590 appliances. We have around 20 1550/1590 appliances, of which I have upgraded about 10 to R80.20.25 build 2077. On two gateways, the upgrade appears to proceed normally; however, after the reboot they are still at the original level.

After reading your reply, I have just run show diag from clish, and it reports the previous image as the new version, e.g.:

Current system info
-----------------------------------
Current image name: R80_992001682_20_15
Current image version: 682
Previous image name: R80_992002077_20_25
Previous image version: 077

Thanks

Greg

T_L
Contributor

Interesting.

TAC/support had us try a couple of things (empty some of the logs, upgrade to a specific earlier firmware version and then attempt our final upgrade), but nothing has worked and our case seems to have stalled with them.

I will post updates as we work through everything.

T_L
Contributor

Check Point Support provided a new firmware, firmware_fw1_sx_dep_R77_990173042_20, which they wanted us to install on one of the issue gateways first before attempting the upgrade to the current firmware. It ended up crashing the gateway, which now needs to be rebuilt locally. I wouldn't recommend trying this!

G_W_Albrecht
Legend

Did you also try revert to previous-image ?

T_L
Contributor

Yes sir, on two of the gateways that were having the issue; I noted it in the thread above. I was hoping it would do the trick, but it 'crashes' the gateways: we lose all SSH/Web connectivity. When I was able to connect locally, the configs were gone and < show diag > indicated the images were the same as pre-revert.

As of today, we have not received any further response from support.

G_W_Albrecht
Legend

Yes, I read that in your initial post; I was asking about Greg_Harbers' 1550/1590 issue.

jace1
Explorer

When a fresh image is installed on the appliance (only 80.20.25), the situation is the same.
After manually configuring the appliance (no configuration restore) and rebooting,
the system does not start properly.

T_L
Contributor

Still have not received any additional support or ideas from Check Point.

the_rock
Leader

I know this may seem like a long shot, but I recall someone having a similar issue, and it was license related.

T_L
Contributor

Just double checked them all - good to go until 9/2021.

But I will take a shot and re-license one of the issue gateways from our User Center account to see if we get a different result.

T_L
Contributor

Brought down a new license file - removed the existing license from the target issue gateway - applied the new license. 

Same result. 

Worth the shot though! Thanks!

the_rock
Leader

Sorry brother, at this point, I got nothing else, apologies.
