Participant

Secondary MDS upgrade fails "Upgrading Products" at 58% using CPUSE

The upgrade of the secondary MDS from R80.10 to R80.20 fails at 58% ("Upgrading Products"). The upgrade of the primary MDS using CPUSE was successful.


Did the machine revert to R80.10, or is it simply hanging at 58% of the upgrade process? In my lab, it took around 3 hours to upgrade a Secondary MDS with 5 CMAs to R80.20. The upgrade process generates detailed logs, which should tell you what went wrong. Check them by running the "show installer package <NUM>" command.

Also, make sure that there is connectivity with the Primary MDS during the upgrade.

Kind regards,
Jozko Mrkvicka

Contributor

Hi,

I have the same issue.

I am upgrading an MDS from R77.30 to R80.30 via CPUSE.

The CPUSE upgrade of the primary MDS is OK.

The CPUSE upgrade of the secondary MDS fails. This is the log output:

cat /opt/CPInstLog/import_mds.4May2020-073710.log
Reading configuration of imported Multi-Domain Server.

*** Verifying source version...
Export tool version matches import tool version. Proceeding.
./cp.license
./mdsdb/mdss.C

*** There Ip is valid
./cp.license
./mdsdb/mdss.C

*** Connectivity to server : succeeded.
./cp.license
./mdsdb/mdss.C

*** There is valid license on this server

Your Multi-Domain Server should NOT be running while you import.
mds_import.sh will now stop the Multi-Domain Server.
Waiting for CPM server...
Waiting for CPM server...
Waiting for CPM server...
Waiting for CPM server...
CPM server started
----------------------------------------
--- Starting Import Procedure ---
----------------------------------------

Importing Multi-Domain Server data
Upgrading Databases:
Importing Multi-Domain Server Databases.

Error: Failed to upgrade Multi-Domain Server Databases

Error: Importing of Multi-Domain Server Databases failed.
DONE.

Summary of Upgrade operation:

=====================================================================

Import operation started at: Mon May 4 07:37:10 CEST 2020

Multi-Domain Server databases - Failure

=====================================================================

--------------------------------------------------------------------------------
Import operation Failed.
Please see log file /opt/CPInstLog/import_mds.4May2020-073710.log for details
--------------------------------------------------------------------------------
DONE.

 

Contributor

Here is some detail that I found in the log file:

[Secondary_MDS]# tail -f /opt/CPshrd-R80.30/log/migrate-2020.05.04_07.40.00.log


[4 May 7:40:07] [runShellCommand] Executing command: '/opt/CPmds-R80.30/bin/upgrade_standby_server -migrate -mdsdb'
[4 May 9:40:48] [runShellCommand] Execution result: 1, exit code: 5
[4 May 9:40:48] [DbUpgrader::ExecuteCpdb] ERR: cpdb completed with error code '5'
[4 May 9:40:48] ...<-- DbUpgrader::ExecuteCpdb
[4 May 9:40:48] ..<-- DbUpgrader::RunCustomCpdbCommand
[4 May 9:40:48] [DbUpgrader::exec] ERR: Failed to execute custom upgrade command
[4 May 9:40:48] .<-- DbUpgrader::exec
[4 May 9:40:48] <-- ConditionalExecutor::exec
[4 May 9:40:48] [ActivitiesManager::exec] ERR: Activity 'ConditionalExecutor' failed
[4 May 9:40:48] [ActivitiesManager::exec] WRN: Activities execution finished with errors
[4 May 9:40:48] [ActivitiesManager::exec] WRN: Activities 'ConditionalExecutor' have failed
[4 May 9:40:48] [ActivitiesManager::exec] Designated exit code is 1
[4 May 9:40:48] --> CleanupManager::Instance
[4 May 9:40:48] <-- CleanupManager::Instance
[4 May 9:40:48] --> CleanupManager::DoCleanup
[4 May 9:40:48] [CleanupManager::DoCleanup] Starting to perform cleanup
[4 May 9:40:48] .--> DirCleaner::exec
[4 May 9:40:48] [DirCleaner::exec] Going to remove directory '/opt/CPmds-R80.30/tmp/migrate/'
[4 May 9:40:48] .<-- DirCleaner::exec
[4 May 9:40:48] .--> ImportFailureMarker::exec
[4 May 9:40:48] [ImportFailureMarker::exec] Checking if cleaner is active
[4 May 9:40:48] [ImportFailureMarker::exec] Cleaner is not active, nothing to do
[4 May 9:40:48] .<-- ImportFailureMarker::exec
[4 May 9:40:48] [CleanupManager::DoCleanup] Completed the cleanup
[4 May 9:40:48] <-- CleanupManager::DoCleanup
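
The ERR and WRN entries are easy to miss among the trace lines. A small Python sketch like the one below could pull them out. It assumes the line layout seen in the excerpt above (`[timestamp] [source] LEVEL: message`); the names `LINE_RE` and `find_problems` are illustrative and not part of any Check Point tool.

```python
import re

# Matches lines such as:
# [4 May 9:40:48] [DbUpgrader::exec] ERR: Failed to execute custom upgrade command
LINE_RE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\]\s+\[(?P<source>[^\]]+)\]\s+"
    r"(?P<level>ERR|WRN):\s+(?P<message>.*)$"
)


def find_problems(log_text):
    """Return (stamp, source, level, message) tuples for ERR and WRN lines."""
    problems = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            problems.append(
                (match["stamp"], match["source"], match["level"], match["message"])
            )
    return problems


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = """\
[4 May 9:40:48] [runShellCommand] Execution result: 1, exit code: 5
[4 May 9:40:48] [DbUpgrader::ExecuteCpdb] ERR: cpdb completed with error code '5'
[4 May 9:40:48] ...<-- DbUpgrader::ExecuteCpdb
[4 May 9:40:48] [DbUpgrader::exec] ERR: Failed to execute custom upgrade command
[4 May 9:40:48] [ActivitiesManager::exec] WRN: Activities execution finished with errors
"""

for stamp, source, level, message in find_problems(SAMPLE):
    print(f"{stamp} {level} {source}: {message}")
```

To run it against the real file, the sample string could be replaced with the contents of the migrate log read from disk.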

Employee+

Hi @nabs_nabs 

I believe this is the same issue as in sk146933, which is already resolved.

Can you please confirm that you are running the latest DA version?

Thanks,

Miri

 

Contributor

Hi @Miri_Ofir

What is a "DA" ?

Contributor

Hi @Miri_Ofir 

The Deployment Agent (DA) version is: 1889 | R77.30 take 5

 

Employee+

OK, the issue from sk146933 is not relevant to an upgrade from R77.30.
Please open an SR and share the ticket number with me.
I'm from R&D and responsible for supporting management upgrades; I will monitor the case and make sure it's prioritized.
Thanks

Hi @nabs_nabs ,

Can you please zip the "install_Major*" directory from /opt/CPInstLog/ and attach it here?

 

Thanks,

Timur Khairulin

Contributor

Hi @Timur_Khairulin ,

See the attached file.

Thx for help.

 

Contributor

Hi @Miri_Ofir , Hi @Timur_Khairulin ,

 

In the detailed log I found this:

[4 May 14:12:50] [DbUpgrader::ExecuteCpdb] Executing cpdb using the following arguments: /opt/CPmds-R80.30/bin/upgrade_standby_server -migrate -mdsdb
[4 May 14:12:50] [runShellCommand] Executing command: '/opt/CPmds-R80.30/bin/upgrade_standby_server -migrate -mdsdb'
[4 May 16:14:42] [runShellCommand] Execution result: 1, exit code: 5
[4 May 16:14:42] [DbUpgrader::ExecuteCpdb] ERR: cpdb completed with error code '5'
[4 May 16:14:42] ...<-- DbUpgrader::ExecuteCpdb
[4 May 16:14:42] ..<-- DbUpgrader::RunCustomCpdbCommand
[4 May 16:14:42] [DbUpgrader::exec] ERR: Failed to execute custom upgrade command

 

It seems that the standby MDS fails to execute the command "/opt/CPmds-R80.30/bin/upgrade_standby_server -migrate -mdsdb", which exits with error code 5.
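
The timestamps in the excerpt are also worth a look: the command starts at 14:12:50 and returns at 16:14:42. A minimal Python sketch for measuring that gap is shown here. It assumes the year (the log lines omit it; 2020 is taken from the import log) and English month abbreviations, and the helper names `parse_stamp` and `command_duration` are hypothetical.

```python
import re
from datetime import datetime

# Leading stamp such as "[4 May 14:12:50]"
STAMP_RE = re.compile(r"^\[(?P<stamp>\d{1,2} \w{3} \d{1,2}:\d{2}:\d{2})\]")


def parse_stamp(line, year=2020):
    """Parse the leading timestamp; the year is supplied because the log omits it."""
    match = STAMP_RE.match(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(f"{year} {match['stamp']}", "%Y %d %b %H:%M:%S")


def command_duration(log_lines, year=2020):
    """Time between the first 'Executing command' line and the next 'Execution result' line."""
    started = None
    for line in log_lines:
        if "Executing command:" in line and started is None:
            started = parse_stamp(line, year)
        elif "Execution result:" in line and started is not None:
            return parse_stamp(line, year) - started
    return None  # no complete start/result pair found


SAMPLE = [
    "[4 May 14:12:50] [runShellCommand] Executing command: "
    "'/opt/CPmds-R80.30/bin/upgrade_standby_server -migrate -mdsdb'",
    "[4 May 16:14:42] [runShellCommand] Execution result: 1, exit code: 5",
]

print(command_duration(SAMPLE))  # 2:01:52
```

The earlier attempt (7:40:07 to 9:40:48) also ran for about two hours. That may point to a wait that timed out rather than an immediate failure, although the logs shown here do not confirm it.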

Unfortunately, I did not find anything in the Check Point knowledge base regarding this error.

Any ideas?

 

Thank you in advance for your help.

 

Contributor

In custom_upgrade_log-TheDateOfTheUpgrade.elg I found this:
[30263 4087416640]@acprovider1-02[4 May 16:14:42] Full sync has failed for this Multi-Domain Management Server. Please make sure that proper license is installed.

Is there a link between the command that can't be executed and the full sync?
The licenses on the primary and secondary MDS are OK.

Hi @nabs_nabs ,

We are reviewing it now and will update you with our conclusions ASAP.

Thanks,

Timur


Hi @nabs_nabs ,

Can you please check two additional things:

  1. Verify that there is network connectivity between the Primary and Secondary MDS servers (check it with ping / ssh).
  2. Verify that the clocks on the Primary and Secondary servers are synchronized.

Please also check the following on both the primary and secondary machines:

  1. FWM is up.
  2. FWM is listening on HA port 18221. You can use these commands:

For MDSs run: "netstat -a | grep 18221"

For CMAs run: "netstat -nap | grep 18221 | grep <cma ip>"

  3. Check that there are no drops on port 18221 if there is a gateway between the MDSs.
  4. Check that SIC is established and communicating between the two MDSs. You can use this command:
    cpstat -h <otherMdsIP> mg
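
As a rough complement to the netstat checks above, a TCP connection attempt from one MDS to the other can show whether port 18221 is reachable at all. The following Python sketch is only an illustration: the function name `port_is_reachable` is made up, and it tests plain TCP reachability, not SIC or the state of FWM.

```python
import socket

HA_SYNC_PORT = 18221  # HA port named in the checklist above


def port_is_reachable(host, port=HA_SYNC_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling `port_is_reachable("<otherMdsIP>")` on the secondary would return False if a gateway in between drops traffic to port 18221 or if nothing is listening on it.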

 

Please collect cpm doctor from both machines according to sk117219.

Thanks,

Timur Khairulin

Contributor

Hi,

 

We had a similar issue upgrading an MDS from R80.20 to R80.30 with CPUSE, and we managed to fix it with TAC/R&D.

During the export, one of the scripts has an issue. TAC provided an updated script, so ask them for it if your case is similar.

Steps taken to make this work in our case (not possible via CPUSE; you need to follow the migration procedure with a fresh install):

 

    1. Fresh install a new R80.30 MDS.
    2. # cp /opt/CPsuite-R80.30/fw1/sql_scripts_R80_20/enable_ip_search_for_groups.sql /opt/CPsuite-R80.30/fw1/sql_scripts_R80_20/enable_ip_search_for_groups.sql.ORIG
    3. Download the new enable_ip_search_for_groups.sql provided by TAC, and place it in  /opt/CPsuite-R80.30/fw1/sql_scripts_R80_20/
    4. Export the R80.20 MDS, and import into the new server.
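
Steps 2 and 3 boil down to "keep a copy of the original file, then drop in the replacement". A hedged Python sketch of that pattern follows. `backup_then_replace` is a hypothetical helper, and the paths passed to it would be the ones from the steps above.

```python
import shutil
from pathlib import Path


def backup_then_replace(target, replacement, suffix=".ORIG"):
    """Copy target to target+suffix (only if no backup exists yet), then overwrite target."""
    target = Path(target)
    backup = target.with_name(target.name + suffix)
    if not backup.exists():
        # Preserve the untouched original, including its timestamps.
        shutil.copy2(target, backup)
    shutil.copy2(replacement, target)
    return backup
```

Skipping the backup when one already exists means that running the helper twice does not overwrite the original copy with the already replaced file.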

Contributor

Hi @Khalid_Aftas ,

My issue seems to occur during the database import, not during the export as in your case.

Can you please confirm that your problem occurred during the upgrade of the secondary MDS?

You said that one of the scripts has an issue during the export and that TAC provided an updated script.

Which script is it? Is it a script for a secondary MDS upgrade?

As a reminder, the upgrade of the primary MDS from R77.30 to R80.30 is ok.

"Only" the upgrade of my secondary MDS failed.

 

Regards.

Contributor

Sorry, a small correction: the issue occurs during the import of the MDS database after CPUSE has installed R80.30.

The export must be done with the tweaked script for the import to work.

At least in our case.

Contributor

@Khalid_Aftas ,

Thank you for sharing this.

Do you know what exactly this script does?

Contributor

Hi @Timur_Khairulin ,

I have run all the checks you asked for, and these are the results:

- The network is OK between both MDSs, and there are no drops between them

- The clocks are the same on both MDSs

- MDS FWM is up

- During the full sync step of the upgrade, a connection is established from the secondary, but the secondary then aborts the communication. After that, nothing happens for 2 hours, and then the upgrade fails.

- SIC is OK during the upgrade (on both MDSs)

I don't know where to find the reason why the secondary MDS aborts the full sync.

I only found this in custom_upgrade_log-TheDateOfTheUpgrade.elg:

" Full sync has failed for this Multi-Domain Management Server. Please make sure that proper license is installed."

I don't understand what this could mean, because my licenses were valid before the upgrade.

NB: I can't access sk117219

Hi @nabs_nabs ,

To continue investigating this issue, we will need a replication of your setup.

Please open an SR and share the ticket number with me.

Thanks,

Timur

Contributor

Hi @Miri_Ofir 

The case is open:  6-0001991829

Thank you for the help

Regards

Employee+

Thanks. @Timur will be in contact with the TAC engineer to expedite and assist with the investigation.

Contributor

Good morning,

For information, yesterday I installed the latest Jumbo HF (T191) on the primary MDS and ran the CPUSE upgrade on the secondary MDS, but it failed at the same step of the upgrade (database import failed + full sync with peer failed).

In the cpm.elg log on the primary MDS, I can see this message:

" Failed_to_synchronize__The_Security_Management_Servers_contain_different_Hotfixes"

 


Try to uninstall the Jumbo from the Primary MDS and then upgrade the Secondary MDS.

The point is that sync cannot be performed because the Secondary MDS has no Jumbo yet while the Primary does, which creates a conflict.

The steps in the R80.30 CPUSE upgrade guide are not 100% correct.

Kind regards,
Jozko Mrkvicka

Contributor

Hi guys,

The upgrade of the secondary MDS from R77.30 to R80.30 finally works.

My issue was that the VM where the secondary MDS is installed had a datastore problem.

So my issue was not with Check Point but with VMware!

Thanks a lot for your help, guys.

Note: if you install the Jumbo HF on the primary MDS before upgrading the secondary MDS, the upgrade of the secondary MDS will fail. So you have to install the Jumbo HF after both MDSs have been upgraded.

Regards.

Employee+

Thank you for this update. I also received it from the team that handled your case, and I'm happy it was completed successfully 🙂
And your note is correct: you must keep the same patch level on all MDS (or management) machines, for the sake of HA synchronization between them.