Steve-Pearson
Contributor

R81.10 to R81.20 upgrade failing

I'm trying to upgrade a management server from R81.10 to R81.20 but it's failing right at the end during the import process.

The upgrade report shows:

Failed: Upgrade of "SMC User".
For more details see upgrade logs below

User Data 36 sec.
Objects 11728
Upgrade Rate 318 Objects Per Second
Import Time 11:07:53 - 11:08:29

This is very similar to the post by Blason_R a couple of weeks ago, but he at least got an error message, whereas I'm getting nothing useful at all!

CPM_Doc doesn't show anything that could help either, there is only one issue which is "Application Control Data Overload". This requires a hotfix from TAC to resolve it, but as the application data is upgrading correctly I've ruled this out as a cause.

I have a case open with TAC, but if anyone has any ideas of what causes this or what additional logs I can look at for more info, it would be appreciated!

Thanks,

Steve

0 Kudos
1 Solution

Accepted Solutions
Steve-Pearson
Contributor

Hi Andy,

Just wanted to let you know that we've finally got to the bottom of this issue!

After submitting all of my troubleshooting findings to TAC, they released a new version of the upgrade tools to address the issue. Tests so far are good, it appears to have resolved the problem!

Thanks for your help!

Regards,

Steve


19 Replies
the_rock
Legend

Can you send a screenshot? I ask because I remember this happening once with a customer; we just gave it some time and the import worked fine. I was able to log in and install policy.

Can you run below commands from ssh and send?

api status

watch -d $FWDIR/scripts/./cpm_status.sh (ctrl+c to stop)

cpwd_admin list
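A minimal sketch for capturing the output of the three commands above in one file, so it can be attached to a post or a TAC case. The helper name collect_diag and the report path are illustrative assumptions, and cpm_status.sh is run once here rather than under watch. It also assumes $FWDIR is exported in the shell, as it normally is in expert mode.

```shell
#!/bin/bash
# Sketch: run each diagnostic command and append its output to one report file.
# collect_diag and the report path are hypothetical names, not Check Point tools.

collect_diag() {
    local report="$1"; shift
    local cmd
    for cmd in "$@"; do
        {
            echo "===== $cmd ====="
            # bash -c lets the string carry arguments and variables such as $FWDIR
            bash -c "$cmd" 2>&1 || echo "(command failed or not available: $cmd)"
            echo
        } >> "$report"
    done
}
```

Usage on the management server might look like:
collect_diag /var/tmp/upgrade_diag.txt 'api status' '$FWDIR/scripts/cpm_status.sh' 'cpwd_admin list'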

Best,

Andy

0 Kudos
Steve-Pearson
Contributor

Hi Andy,

I've attached a screenshot of the upgrade report. It gets to this point, then after 30 seconds or so it automatically reverts!

This is the output from the commands:


API Settings:
---------------------
Accessibility: Require local
Require ip 172.XX.XX.6 172.XX.XX.0/255.255.XXX.0
Automatic Start: Enabled

Processes:

Name      State     PID     More Information
-------------------------------------------------
API       Started   20039
CPM       Started   20039   Check Point Security Management Server is running and ready
FWM       Started   19552
APACHE    Started   18516

Port Details:
-------------------
JETTY Internal Port: 57457
JETTY Documentation Internal Port: 60976
APACHE Gaia Port: 4434 (a non-default port)
When running mgmt_cli commands add '--port 4434'
When using web-services, add port 4434 to the URL

Profile:
-------------------
Machine profile: 24800-35800 without SME
CPM heap size: 3072m

Apache port retrieved from: httpd-ssl.conf


--------------------------------------------
Overall API Status: Started
--------------------------------------------

API readiness test SUCCESSFUL. The server is up and ready to receive connections

Notes:
------------
To collect troubleshooting data, please run 'api status -s <comment>'

***************************************************************************************************************************************

Every 2.0s: /opt/CPsuite-R81.10/fw1/scripts/./cpm_status.sh Tue May 14 14:33:56 2024

Check Point Security Management Server is running and ready

***************************************************************************************************************************************

APP              PID    STAT  #START  START_TIME            MON  COMMAND
CPVIEWD          19191  E     1       [12:45:28] 14/5/2024  N    cpviewd
CPVIEWS          19217  E     1       [12:45:28] 14/5/2024  N    cpview_services
CVIEWAPIS        19224  E     1       [12:45:28] 14/5/2024  N    cpview_api_service
CPD              19238  E     1       [12:45:28] 14/5/2024  Y    cpd
FWD              19549  E     1       [12:45:32] 14/5/2024  N    fwd -n
FWM              19552  E     1       [12:45:33] 14/5/2024  N    fwm
FWMHA            19561  E     1       [12:45:33] 14/5/2024  N    fwmha -H
STPR             19608  E     1       [12:45:33] 14/5/2024  N    status_proxy
CLOUDGUARD       19693  E     1       [12:45:34] 14/5/2024  N    vsec_controller_start
CPM              20039  E     1       [12:45:37] 14/5/2024  N    /opt/CPsuite-R81.10/fw1/scripts/cpm.sh -s
SOLR             20284  E     1       [12:45:39] 14/5/2024  N    java_solr
RFL              20602  E     1       [12:45:43] 14/5/2024  N    LogCore
SMARTVIEW        21151  E     1       [12:45:51] 14/5/2024  N    SmartView
INDEXER          21581  E     1       [12:45:55] 14/5/2024  N    /opt/CPrt-R81.10/log_indexer/log_indexer
SMARTLOG_SERVER  21641  E     1       [12:45:57] 14/5/2024  N    /opt/CPSmartLog-R81.10/smartlog_server
CP3DLOGD         21788  E     1       [12:45:59] 14/5/2024  N    cp3dlogd
SICTUNNEL        21919  E     1       [12:46:02] 14/5/2024  N    /opt/CPshrd-R81.10/bin/cptnl -c "/opt/CPuepm-R81.10/engine/conf/cptnl_srv.conf"
EPM              21930  E     1       [12:46:02] 14/5/2024  N    startEngine
REPMAN           22173  E     1       [12:46:07] 14/5/2024  N    java_repository_manager
DASERVICE        22242  E     1       [12:46:08] 14/5/2024  N    DAService_script
AUTOUPDATER      22290  E     1       [12:46:10] 14/5/2024  N    AutoUpdaterService.sh
CPSM             32171  E     1       [12:48:38] 14/5/2024  N    cpstat_monitor
LPD              21154  E     1       [12:49:59] 14/5/2024  N    lpd
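For anyone reading a listing like this one, a small sketch that flags any watchdog entry whose STAT is not E or whose #START count is above 1 (i.e. something that is not running or has been restarted). The function name check_cpwd is hypothetical, and the column positions are assumed from the output in this post.

```shell
#!/bin/bash
# Sketch: read "cpwd_admin list" output on stdin and report suspicious rows.
# Assumes columns as shown in this thread: APP PID STAT #START ...

check_cpwd() {
    awk '
        # Skip the header row and any short or blank lines
        $1 != "APP" && NF >= 4 && ($3 != "E" || $4 > 1) {
            print $1 ": STAT=" $3 ", #START=" $4
            bad = 1
        }
        # Exit status 1 if anything was flagged, 0 otherwise
        END { exit bad + 0 }
    '
}
```

Usage might be: cpwd_admin list | check_cpwd
With the listing above it would print nothing, since every process shows E and a start count of 1.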

 

 

0 Kudos
Steve-Pearson
Contributor

TAC are asking me to run the following to change a variable, but it's not something I've come across before:

$FWDIR/scripts/override_server_setting.sh -e ENABLE_DEADLOCK_CHECK false

Can't even think how this could help!

0 Kudos
the_rock
Legend

I have no clue in the world what that option does, sorry. I don't see how it would help, but you can ask them. Btw, just curious: if you are allowed to do a remote session, I would love to help you out. If so, message me directly.

Also, does the web UI show the upgrade as in progress or completed? Can you log in to SmartConsole?

Andy

0 Kudos
Steve-Pearson
Contributor

I have asked them but no response yet, and I'm with you, don't see how it can help!

Can't do remote I'm afraid, but thanks for the offer!

The WebUI shows it's in progress until it fails, then it auto-reverts (so there's no time to try logging in).

Support have come back and said (very casually) that this is a known issue (despite there being no SKs about it!) and there is a fix, but they have to request it and it can take several days!

Steve

0 Kudos
the_rock
Legend

I disagree 100% with the statement that it's a known issue. I have done management upgrades who knows how many times and have only seen that happen once, and it all worked fine in the end; it just took 3 hours. The customer's HDD was 2 TB, so I assumed that's why.

Other than that, I've never seen it before. No worries mate, understood that a remote session isn't possible, no offence taken : - )

Anywho, let me do some "digging" and see if we can help.

Give me some time.

Andy

0 Kudos
the_rock
Legend

Not sure if this was the post you referred to initially?

https://community.checkpoint.com/t5/General-Topics/Unable-to-upgrade-from-R80-40-to-R81-10/td-p/1795...

Maybe run that script, see what it shows.

Andy

[Expert@CP-management:0]# cd $FWDIR/scripts
[Expert@CP-management:0]# ./run
run_cpmdoc.sh run_packages_cb.sh
run_groovy_script.sh runfwconfig
[Expert@CP-management:0]# ./run_cpmdoc.sh

 

0 Kudos
Steve-Pearson
Contributor

Yes, that's the post I was referring to.

For the fun of it I've changed that setting and am trying again (but not expecting it to work!)

I'll try these scripts after and let you know what they say, although I have already run the last one, cpmdoc. It shows one issue, which is a known problem requiring a fix from TAC. It's related to "Application Control Data Overload" (sk181332), but as the application control data upgrades perfectly, that can't be related!

the_rock
Legend

That's a totally logical explanation, I agree with that. Sucks you can't do remote, but understood. Let me know if anything comes out of that script.

Andy

0 Kudos
the_rock
Legend

Hey Steve,

Any luck with this issue?

Best,

Andy

0 Kudos
Steve-Pearson
Contributor

Morning Andy,

Sorry, it got a bit chaotic yesterday afternoon, going to run those scripts now.

However, I tried a couple more things last night. Firstly, a migrate_server export to version R81.10 and then a migrate_server import to read it back in, as I've seen this process fix database issues before, but in this case it failed again in the same way.

Secondly, I took a migrate_server export -v R81.20, then spun up a new R81.20 management server under VMware Workstation and ran a migrate_server import. This also failed, BUT this time I got more information. I've attached the upgrade_report here; I'm thinking that there may be a problem with an object in the database.
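For readers wanting to repeat this export-then-import test on a clean VM, a rough sketch of the two command lines follows. It only prints them rather than running anything. The archive path is a hypothetical example, and the assumption that migrate_server sits under $FWDIR/scripts and takes the archive as its last argument should be checked against the installed upgrade tools.

```shell
#!/bin/bash
# Sketch of the export -> import test described above.
# Assumptions: the archive path is hypothetical; migrate_server is expected
# under $FWDIR/scripts on both machines.

TARGET_VERSION="R81.20"
ARCHIVE="/var/log/mgmt_export.tgz"   # hypothetical file name

# Print the migrate_server command line for a given action (export or import).
migrate_cmd() {
    local action="$1"
    echo "\$FWDIR/scripts/migrate_server $action -v $TARGET_VERSION $ARCHIVE"
}

migrate_cmd export   # run the printed command on the R81.10 source server
migrate_cmd import   # run the printed command on the clean R81.20 server
```

Printing the commands first keeps the sketch harmless; copy the archive between the two machines before running the import line.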

Let me know if you have any ideas from this report.

Thanks,

Steve

0 Kudos
Steve-Pearson
Contributor

Ok, I've tried the scripts you suggested, here are the results:

run_packages_cb.sh

This returned nothing

run_groovy_script.sh runfwconfig

This returned:

[Expert@FWMGMT:0]# ./run_groovy_script.sh runfwconfig
Loading groovy script engine ...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/CPsuite-R81.10/fw1/cpm-server/activemq-all-5.9.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/CPsuite-R81.10/fw1/cpm-server/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
09:28:20,618 INFO com.checkpoint.management.cpm.Cpm.enableLocalSic:81 [main] - Enabling local sic. Setting cp.ssl_local.certificate.check=local
09:28:20,655 INFO com.checkpoint.management.cpm.configuration.Utils.setTdLogConfigFile:69 [main] - Starting to configure logging options
[Expert@FWMGMT:0]#

 

./run_cpmdoc.sh

I've attached a screenshot of the report summary

 

Thanks,

Steve

0 Kudos
the_rock
Legend

Do you have details on the error at the bottom?

Andy

0 Kudos
Steve-Pearson
Contributor

I assume you're referring to the "Failed to read json object" error?

No, I don't have any further info on this; I'm looking into it at the moment.

It refers to St_Stephens which would probably be a name used for an object, but there is no object with that name, which concerns me!
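One generic way to hunt for a name that does not show up as an object is to search text files for it, for example the upgrade logs or an unpacked copy of the export. A minimal sketch, where the function name find_refs and the directory to search are assumptions:

```shell
#!/bin/bash
# Sketch: list every text file under a directory that mentions a given name.
# find_refs is a hypothetical helper; point it at whatever directory holds
# the logs or unpacked export you want to inspect.

find_refs() {
    local name="$1" dir="$2"
    # -r recurse, -l print file names only, -I skip binary files, -F fixed string
    grep -rlIF -- "$name" "$dir" 2>/dev/null
}
```

Usage might be: find_refs St_Stephens /path/to/unpacked/export
Each file it lists can then be opened to see what kind of record carries the name.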

0 Kudos
the_rock
Legend

Okay... a suggestion that came to my mind. I once had a similar issue and below is how I got around it.

Andy

 

https://community.checkpoint.com/t5/Management/Migrate-server-issue-on-Azure-CP-management-server/m-...

It's the ignore warnings flag. I'm not 100% certain it would work for you, but it's worth a try.

0 Kudos
Steve-Pearson
Contributor

This is on the export command I assume?

I did run migrate_export verify first and it did not report any issues or warnings, but I'll export again and try it.

Support have suggested installing the latest jumbo on my R81.20 test VM to see if it will then import, so I'm currently running that.

 

0 Kudos
the_rock
Legend

Yes, the export command, right. I can't say whether the latest jumbo will help, but it's worth a try.

Andy

0 Kudos
Steve-Pearson
Contributor

Hi Andy,

Just wanted to let you know that we've finally got to the bottom of this issue!

After submitting all of my troubleshooting findings to TAC, they released a new version of the upgrade tools to address the issue. Tests so far are good, it appears to have resolved the problem!

Thanks for your help!

Regards,

Steve

the_rock
Legend

Fantastic!

0 Kudos
