JH_Ranger
Participant

SGM fails to join cluster after upgrade to Take130 due to CPSSH config file mismatch.

I have recently upgraded my Maestro setup from R81.10 Take110 to Take130. As per the Maestro documentation, I always upgrade one SGM first, and then the other (two SGMs, single site).

After Take130 was installed on the first member, it rebooted but failed to join the cluster. It then rebooted 5 times, after which it stayed in the DOWN state. According to the log files, the member failed to obtain an SSH DPI config file (even though we are not running SSH deep packet inspection).

Looking at the log files on the problematic member in the DOWN state:

# cat /var/log/pull_config_report.log
Report of "apply all":
| 192.0.2.1] Configuration mismatch (refer to /var/log/configuration_reboot_reason.log)
reboot retry left: 0/5. Reboot is aborted.

# cat /var/log/configuration_reboot_reason.log
Reboot was performed due to configuration mismatch:
- /opt/CPsuite-R81.10/fw1/conf/cpssh/settings.fwset
Remote file '/opt/CPsuite-R81.10/fw1/conf/cpssh/settings.fwset' does not exist

So the upgraded member failed to get the /cpssh/settings.fwset file from the ACTIVE member (even though SSH DPI is not turned on, this file is apparently necessary). Note that these files were not present on either member prior to the upgrade.
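
For anyone who wants to pull just the mismatched paths out of the reboot-reason log, here is a minimal sketch. It assumes the log keeps the format shown above (each mismatched file on its own line starting with "- "). The function name list_mismatched_files is made up for illustration and is not a Check Point tool.

```shell
#!/bin/sh
# Hypothetical helper (not a Check Point tool): print the file paths listed
# as mismatched in a configuration_reboot_reason.log-style file.
# Assumption: each mismatched file appears on its own line as "- /path/to/file".
list_mismatched_files() {
    log_file="${1:-/var/log/configuration_reboot_reason.log}"
    # Keep only lines that begin with "- /" and strip the leading "- ".
    sed -n 's|^- \(/.*\)$|\1|p' "$log_file"
}
```

Calling list_mismatched_files with no argument reads the default log path from the output above; passing a path as the first argument reads that file instead.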

To work around this, I ran the following commands on the ACTIVE member:

# cpssh_config

# cpssh_config istatus

After running the above, the relevant files were created on the ACTIVE member, and after rebooting the DOWN member, it joined the cluster and started handling traffic.
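
Before rebooting the DOWN member, it may be worth confirming on the ACTIVE member that the file named in the mismatch log now exists. The following is only a sketch using standard shell tests. The function name check_cpssh_settings is hypothetical, and the default path is simply the one from the log output in this post.

```shell
#!/bin/sh
# Hypothetical pre-reboot check (not a Check Point tool): confirm that the
# file named in the mismatch log exists as a regular file on this member.
check_cpssh_settings() {
    settings_file="${1:-/opt/CPsuite-R81.10/fw1/conf/cpssh/settings.fwset}"
    if [ -f "$settings_file" ]; then
        echo "present: $settings_file"
    else
        echo "missing: $settings_file"
        return 1
    fi
}
```

A non-zero return means the file is still absent, in which case rebooting the DOWN member would presumably hit the same mismatch again.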

Hope this helps.

 
