Thomas_Eichelbu
Advisor

How does image auto-clone really work? How often does an SGM need to reboot?

Hello fellow CheckMates,

I received an interesting question from a Maestro customer: "How often does a newly added SGM really need to reboot?"

We saw the following:
An SGM was added after an RMA, and it booted 4 times! We collected the LOM output and saw all the steps.
It took over 30 minutes for the SGM to become active and start forwarding traffic,
which is far from the Maestro marketing slides, which promise something below 10 minutes 🙂 Yes, it's marketing.

The system was Dual Site, VSX/VSLS with 2 VSs, running R81.10 Take 78.

The original installer details on the SMO:

show installer packages
Check_Point_R81_10_JHF_T66_155_MAIN_Bundle_T1_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T66_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T78_FULL.tgz Installed
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T79_FULL.tgz Available for Download
SecurePlatform_HOTFIX_R81_10_JHF_T78_330_MAIN_GA_FULL.tgz Installed
fw1_wrapper_HOTFIX_R81_10_JHF_T78_859_MAIN_GA_FULL.tgz Installed

The SGM reports:
show installer packages imported
** ************************************************************************* **
** Hotfixes **
** ************************************************************************* **
Display name Type
Check_Point_R81_10_JHF_T66_155_MAIN_Bundle_T1_FULL.tgz Hotfix
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T78_FULL.tgz Hotfix
SecurePlatform_HOTFIX_R81_10_JHF_T78_330_MAIN_GA_FULL.tgz Package
fw1_wrapper_HOTFIX_R81_10_JHF_T78_859_MAIN_GA_FULL.tgz Package

show installer packages
** ************************************************************************* **
** Hotfixes **
** ************************************************************************* **
Display name Status
Check_Point_R81_10_JHF_T66_155_MAIN_Bundle_T1_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T66_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T78_FULL.tgz Installed
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T79_FULL.tgz Available for Download
SecurePlatform_HOTFIX_R81_10_JHF_T78_330_MAIN_GA_FULL.tgz Installed
fw1_wrapper_HOTFIX_R81_10_JHF_T78_859_MAIN_GA_FULL.tgz Installed
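
To check whether the SGM ended up with the same hotfix state as the SMO, the two "show installer packages" listings can be compared mechanically. A minimal Python sketch, assuming the output has been saved as text and that each package line starts with a .tgz file name followed by its status. The function names are hypothetical and not part of any Check Point tool.

```python
def parse_packages(text: str) -> dict[str, str]:
    """Map package file name -> status from saved 'show installer packages' output."""
    packages = {}
    for line in text.splitlines():
        name, _, status = line.strip().partition(" ")
        if name.endswith(".tgz"):  # skip banners, headers and blank lines
            packages[name] = status.strip()
    return packages


def diff_packages(smo: dict[str, str], sgm: dict[str, str]) -> list[str]:
    """List packages whose status differs between SMO and SGM (or that are missing)."""
    names = sorted(set(smo) | set(sgm))
    return [
        f"{n}: SMO={smo.get(n, 'missing')!r} SGM={sgm.get(n, 'missing')!r}"
        for n in names
        if smo.get(n) != sgm.get(n)
    ]


# Shortened excerpts of the two listings quoted in this post.
smo_out = """show installer packages
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T66_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T78_FULL.tgz Installed
"""
sgm_out = """** Hotfixes **
Display name Status
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T66_FULL.tgz Installed as part of
Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T78_FULL.tgz Installed
"""
print(diff_packages(parse_packages(smo_out), parse_packages(sgm_out)) or "no differences")
```

An empty difference list only shows that CPUSE reports the same state on both members. It does not show how that state got there.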

So what is the SGM really doing after it has been added to the Security Group and "apply" has been pressed?
We saw it "wait" for 8 minutes, during which an "rsync" operation was running.
Does that mean all imported hotfixes from the CPUSE repository are sent to the new SGM and then installed one after another?
Where does the "imaging" happen with the "set smo image auto-clone state on" command?


Since we have VSX and a few VSs on top, we can accept a longer boot time ...
but 30 minutes? Is this normal?
Does the SGM really install all of the HFAs/custom hotfixes one after another?
What if we have many, many hotfixes and custom hotfixes installed? Does it really reboot for every hotfix?
Does the "apply" button in the Security Group initiate a factory reset to wipe all unwanted configuration off the appliance?
So two reboots seem to be the minimum then?

And if CPUSE says
Installed as part of
Installed as part of
Installed as part of

does it really install all those previous hotfixes too?
Or is this required to be able to install hotfixes, even when they came via "image auto-clone"?

Who has a good technical explanation of this?

 

best regards
Thomas

Timothy_Hall
Legend

Curious to hear this as well. When running the new Check Point Certified Maestro Expert R81.10 class (which prominently features @Danny's ccc tool), new SGM provisioning did take a very long time, even with a pretty beefy 16200. I had several attendee questions in this same vein, exacerbated by the fact that there is no easy way to monitor the progress of the provisioning member other than running asg monitor on the SMO Master. Even watching this command's output, the path to ACTIVE was a long and tortured one, with many transitions between INIT/LOST/DETACHED/DOWN before finally settling at ACTIVE.

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Thomas_Eichelbu
Advisor

Hello,
yes, you're right. At this customer we also used a bunch of 16600HS SGMs.
After adding one new SGM to the Security Group and pressing apply, it took 8 minutes for the first reboot to be initiated.
It then did three more reboots until it was ready for service.

And the question is:
Does the new SGM install all of the HFAs and all of the custom hotfixes that are installed on the SMO step by step, or not?
Does it ONLY install the very latest HFA installed on the SMO, or the whole battery of previously installed HFAs?

How is the imaging process affected if I have ten custom hotfixes (hopefully never required) installed?
Does it reboot once for FCD and assignment to the Security Group, once for the main HFA, and 10 times for custom hotfixes, so 12 times?

Best regards

Danny
Champion

In addition to @Timothy_Hall's recommendation of asg monitor, I also recommend monitoring the serial console of the SGM that is being added to the SG via image cloning. This should be fairly easy, as most SGM appliances have an integrated LOM and it's generally recommended to use Maestro together with a serial terminal console server.

One downside of asg monitor that I encountered is that it doesn't show the duration of the SGM's statuses.
Therefore this one-liner might serve as an alternative:

watch "echo;g_all cphaprob stat|grep 'local\|time';echo"
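
The time spent in each state can also be derived afterwards from timestamped samples, for example ones captured while polling the command above. A minimal Python sketch, assuming the samples have already been reduced to (timestamp, state) pairs. The function name and the sample values are made up for illustration.

```python
from datetime import datetime


def state_durations(samples: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Collapse (HH:MM:SS, state) samples into (state, seconds spent) runs.

    Assumes all samples are from the same day and in chronological order.
    """
    runs = []
    start_time, current = None, None
    for stamp, state in samples:
        t = datetime.strptime(stamp, "%H:%M:%S")
        if state != current:
            if current is not None:
                runs.append((current, (t - start_time).total_seconds()))
            start_time, current = t, state
    return runs  # the last, still-open state is not reported


# Invented samples using the state names seen in asg monitor.
samples = [
    ("10:00:00", "INIT"),
    ("10:00:30", "INIT"),
    ("10:02:00", "LOST"),
    ("10:03:00", "DOWN"),
    ("10:10:00", "ACTIVE"),
]
for state, seconds in state_durations(samples):
    print(f"{state}: {seconds:.0f}s")
```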

Timothy_Hall
Legend

Nice little one-liner there, and the recommendation to have out-of-band console access and/or a LOM present was included in the Maestro course as well.  Especially important now with so many working from home, as one can't typically just run down the hall to the computer room and jump on a physical serial console anymore.  Lack of remote console access has been the cause of many a frantic late night drive into the office/data center that is best avoided...

Thomas_Eichelbu
Advisor

Hello,

yes, we watched the newly added SGM over the LOM/serial port output; that's how we realized it had rebooted 4 times ...
The output is much too long to post here ...
Perhaps I have to crawl through all the logs again or try the same operation in a lab environment.
But it would be good to hear from Check Point staff how this image cloning is really supposed to work!

 

Alexander_Wilke
Advisor

Hello,
can someone tell me what triggers "auto clone" to start?
A reboot of 1_01 or 2_01 does not seem to trigger these SGMs to fetch the image from 1_02 (which is the SMO after 1_01 has restarted).

R81.10 JHFA Take 79 on a 64k Scalable Platform

Is there a list of files that are checked for differences between the local SGM and the SMO, and that triggers "image auto-clone"? Or is there a way to force an SGM to auto-clone (without removing it from the Security Group or resetting it to factory defaults)?

 

 

If I run "asg_provision" I see issues like:

+-------------------------------------------------------------------------------------------------------+
|Installed Hotfixes |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|Package |Name |Take |Result |Comments |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_CPOTELCOL_AUTOUPDATE |22 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_CPSDC_AUTOUPDATE |21 |Failed |1_01: Not exists |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_CPVIEWEXPORTER_AUTOUPDATE |25 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_ENDER_V17_AUTOUPDATE |15 |Failed |1_01: Not exists |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_GOT_TPCONF_AUTOUPDATE |107 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_HCP_AUTOUPDATE |58 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_PUBLIC_CLOUD_CA_BUNDLE_AUTOU|19 |Failed |1_01: Not exists |
| |PDATE | | | |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_R80_40_MAAS_TUNNEL_AUTOUPDAT|47 |Failed |1_01: Not exists |
| |E | | | |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_R81_10_JHF_T79_640_MAIN |2 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPotelcol |HOTFIX_OTLP_GA |- |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPviewExporter |HOTFIX_OTLP_GA |- |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CVPN |HOTFIX_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_GOT_TPCONF_AUTOUPDATE |- |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_PUBLIC_CLOUD_CA_BUNDLE_AUTOU|- |Failed |1_01: Not exists |
| |PDATE | | | |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_R80_40_MAAS_TUNNEL_AUTOUPDAT|- |Failed |1_01: Not exists |
| |E | | | |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_R81_10_JHF_T79_640_MAIN |2 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_R81_10_JHF_T79_640_V2_003_MA|- |Pass | |
| |IN | | | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|FW1 |HOTFIX_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|PPACK |HOTFIX_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|SMO |HOTFIX_R81_10_JHF_T79_640_MAIN |2 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|SMO |HOTFIX_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|SecurePlatform |HOTFIX_ENDER_V17_AUTOUPDATE |- |Failed |1_01: Not exists |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|SecurePlatform |HOTFIX_R81_10_JHF_T79_640_MAIN |2 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|SecurePlatform |HOTFIX_R81_10_JUMBO_HF_MAIN |79 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|cpsdc_wrapper |HOTFIX_CPSDC_AUTOUPDATE |- |Failed |1_01: Not exists |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|hcp_wrapper |HOTFIX_HCP_AUTOUPDATE |- |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+


Summary
=======
1. Software verifier passed successfully.
2. Module verifier passed successfully.
3. Hotfix verifier failed.
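
With a table this long, the failed rows are easy to miss. A minimal Python sketch that pulls them out of saved asg_provision output, assuming the layout shown above: records separated by "+---" lines, five "|"-delimited columns, and long names wrapped onto continuation lines. The function name is hypothetical.

```python
def failed_hotfixes(table: str) -> list[tuple[str, str, list[str]]]:
    """Return (package, name, comments) for every record whose Result is 'Failed'."""
    failures, record = [], []
    for line in table.splitlines() + ["+"]:  # trailing "+" flushes the last record
        line = line.strip()
        if line.startswith("+"):
            rows = [r for r in record if len(r) == 5]  # drops the one-cell title row
            record = []
            if rows and rows[0][3] == "Failed":
                name = "".join(r[1] for r in rows)  # re-join wrapped names
                comments = [r[4] for r in rows if r[4]]
                failures.append((rows[0][0], name, comments))
        elif line.startswith("|"):
            cells = line.removeprefix("|").removesuffix("|").split("|")
            record.append([cell.strip() for cell in cells])
    return failures


# Three records copied from the output above.
sample = """\
+--------------------+-----------------------------------+---------------+---------+--------------------+
|Package |Name |Take |Result |Comments |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_HCP_AUTOUPDATE |58 |Pass | |
+--------------------+-----------------------------------+---------------+---------+--------------------+
|CPUpdates |BUNDLE_PUBLIC_CLOUD_CA_BUNDLE_AUTOU|19 |Failed |1_01: Not exists |
| |PDATE | | | |
| | | | |2_01: Not exists |
+--------------------+-----------------------------------+---------------+---------+--------------------+
"""
for package, name, comments in failed_hotfixes(sample):
    print(package, name, comments)
```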

 

Lari_Luoma
Ambassador

Hi,

Let's first understand the difference between auto-cloning and a regular configuration pull.

In Maestro all SGMs must have the same configuration, so effectively they are all clones of the SMO when it comes to configuration and installed software/Jumbos.

When an SGM is added to the SG, it pulls the configuration from the SMO, regardless of whether auto-cloning is enabled or not.

When auto-cloning is enabled, the pull also includes the binaries, i.e. the installed hotfixes. Auto-cloning does not cover the base operating system version, so that must be the same on all SGMs.

Next, let's look at the process when adding a new SGM to the Security Group.
The SMO constantly sends Security Group CCP messages every few seconds, and each SGM receives them regardless of whether it is in the SG or not. Once an SGM has been added to the SG, it recognizes this from the CCP header and reboots in order to add itself to the SG.

Upon reboot the newly added SGM pulls the configuration from the SMO. Because some of the configuration it pulls requires a reboot in order to be applied, it reboots again. If you have auto-cloning enabled, it typically reboots a third time to apply the Jumbo.

I'm not sure why it rebooted four times in your case. This would require deeper analysis and troubleshooting. Sometimes, if the SMO has problems, the SGMs can reboot several times or even indefinitely because they are not able to get the configuration from the SMO. In this kind of case, rebooting the SMO typically fixes the issue.
Note that it always takes longer to bring up a VSX environment than a regular SGW, especially if you have a lot of Virtual Systems.
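
The reboot sequence described above can be written down as a small checklist to compare against what is seen on the console. This Python sketch only restates the steps from this post. The function names are hypothetical, and the count is the typical case rather than a guarantee.

```python
def expected_reboots(auto_clone_enabled: bool) -> list[str]:
    """Reboot reasons for a newly added SGM, in the order described in this post."""
    reasons = [
        "join the Security Group (membership seen in the CCP header)",
        "apply pulled configuration that requires a reboot",
    ]
    if auto_clone_enabled:
        reasons.append("apply the cloned Jumbo")
    return reasons


def unexplained_reboots(observed: int, auto_clone_enabled: bool) -> int:
    """How many observed reboots exceed the typical sequence."""
    return max(0, observed - len(expected_reboots(auto_clone_enabled)))


# Four reboots were observed with auto-clone enabled in the original question.
print(unexplained_reboots(4, auto_clone_enabled=True))
```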

Dario_Perez
Employee

If the Deployment Agent was updated, or you are using Take 335 instead of Take 338, or the system itself is corrupted or lacks disk space, you might have issues like that with auto-clone.

Thomas_Eichelbu
Advisor

Hello Lari,

yes, all true. Some investigation with TAC also produced some answers,
so I'll allow myself to post TAC's original response:

Image Auto-Clone pulls binaries (from the /bin directory) in addition to the text configuration files.
This includes all installed JHF, private HF, and OS modules.

Some files are excluded from Image Auto-clone.
The list of excluded files can be found here (per partition):
/etc/exclude_file_root.conf
/etc/exclude_file_var.conf
/etc/exclude_file_log.conf

Basically, to answer your question - the SGM will neither become a 1:1 replica nor will just transfer the hotfixes. It will copy repositories and will exclude the rest of the files according to the excluded files given.
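
The principle TAC describes (copy the directory trees, minus whatever the exclude lists name) can be illustrated with a short Python sketch. It is only a model of the idea: the format of the exclude_file_*.conf files is not given in this thread, so the sketch assumes one shell-style pattern per line, and every path and pattern below is invented for the example.

```python
from fnmatch import fnmatchcase


def load_patterns(conf_text: str) -> list[str]:
    """Read one pattern per line, ignoring blanks and '#' comments (assumed format)."""
    lines = (line.strip() for line in conf_text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def files_to_clone(paths: list[str], patterns: list[str]) -> list[str]:
    """Keep every path that matches none of the exclude patterns."""
    return [p for p in paths if not any(fnmatchcase(p, pat) for pat in patterns)]


# Hypothetical exclude list and file listing, for illustration only.
excludes = load_patterns("# example\n/var/log/*\n/etc/hostname_example\n")
candidates = ["/bin/example_tool", "/var/log/messages", "/etc/hostname_example"]
print(files_to_clone(candidates, excludes))
```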

My basic question was more like: "Does a newly added SGM pull hotfixes and install them, or does it just copy entire directory structures?"
So it does copy entire directory structures.
Mystery solved!

As you correctly say, when the newly added SGM reboots many times, or if it is stuck in a boot loop, I have to investigate further, since this is not normal.

Lari_Luoma
Ambassador

TAC’s answer nailed it. We don’t install the HFAs, but copy the directories.

Alexander_Wilke
Advisor

It is still not clear to me what triggers the image auto-clone feature. If image auto-clone is enabled and there are only changes in gclish, I think it just pulls the gclish config. But if packages are missing, it should do the auto-clone, and to do that it needs, in my opinion, to check whether there is really a difference in the file system.

 

Thanks.

Lari_Luoma
Ambassador

You will have to enable auto-cloning in gclish with "set smo image auto-clone state on"

When you add an SGM into the security group with the feature enabled, it will copy the binaries in addition to the configuration. Configurations are copied even if auto-cloning is disabled.

Alexander_Wilke
Advisor

This is not what I was asking. E.g. in the past it was possible to have "image auto-clone" enabled and do a Jumbo installation: you installed the Jumbo on chassis2, and after the reboot chassis2 reverted to the old state because auto-clone was enabled and chassis2 was cloned from chassis1. So something other than just adding an SGM to a Security Group triggers the auto-clone feature, and I would like to know what triggers it, or whether there is a "force" switch that works without removing an SGM from the Security Group and adding it back.

 

Regards

Lari_Luoma
Ambassador

Auto-cloning is not a feature that gets triggered. It's a feature that is enabled. If you have it constantly enabled, auto-cloning takes place every time you reboot the gateways. You can use it to install Jumbos, but in a production environment it's not the recommended way. Consider the following scenario:
You have four SGMs in your SG and you want to install a new Jumbo on two at a time.
First you install it on SGMs 3 and 4. After the installation they reboot automatically. Once they come up, you check the Jumbo version, but notice it's still the old one. What just happened? Auto-cloning was enabled, and it cloned the HFAs from the SMO, which still had the old version installed. So in this kind of upgrade scenario you should keep auto-cloning disabled.

Alexander_Wilke
Advisor

Hello @Lari_Luoma

thanks for taking the time to explain that feature, but I think you are wrong. You can have auto-clone enabled all the time, and if there is no difference between the SGMs it will not "start" at boot time. After the SGM starts, it checks for the SMO and waits for the cluster "to stabilize". Then it checks (I don't know how) whether the versions are the same, pulls the config and, if needed, clones the SGM. If cloning was needed, it reboots. If cloning was not needed, it starts.

So the feature can be on all the time, but it will not trigger every time. That's my experience.

So there must be something that triggers the starting SGM to start the clone process.

 

Lari_Luoma
Ambassador

Checking with R&D and will get back to you. 

Lari_Luoma
Ambassador

OK, checked.
Auto-cloning verifies whether the version is the same (same JHFA etc.). If it's different, the image will be cloned. Verification happens at reboot and, I think, also at admin up events. This means that the image will be cloned when an SGM is added to the SG for the first time.
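
The behaviour described here (compare versions at reboot, clone only on a mismatch) can be modelled in a few lines. The Python sketch below illustrates that decision logic only and is not Check Point's implementation. How the real comparison is done is not stated in this thread, so the fingerprint over the installed hotfix list is an assumption made for the example, as are all names.

```python
import hashlib


def fingerprint(hotfixes: list[str]) -> str:
    """Order-independent digest of an installed-hotfix list (assumed comparison key)."""
    return hashlib.sha256("\n".join(sorted(hotfixes)).encode()).hexdigest()


def boot(sgm_hotfixes: list[str], smo_hotfixes: list[str], auto_clone: bool) -> list[str]:
    """Return the hotfix list the SGM runs after a reboot under this model."""
    if auto_clone and fingerprint(sgm_hotfixes) != fingerprint(smo_hotfixes):
        return list(smo_hotfixes)  # mismatch: the SGM is cloned from the SMO
    return list(sgm_hotfixes)      # same version, or feature disabled: no clone


smo = ["JUMBO_T78"]
upgraded_sgm = ["JUMBO_T79"]
print(boot(upgraded_sgm, smo, auto_clone=True))   # reverts to the SMO's Jumbo
print(boot(upgraded_sgm, smo, auto_clone=False))  # keeps the newly installed Jumbo
print(boot(["JUMBO_T78"], smo, auto_clone=True))  # versions match, nothing to clone
```

Under this model the upgrade scenario from earlier in the thread falls out directly: an SGM that received a newer Jumbo while auto-clone is on differs from the SMO at its next reboot and is cloned back.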


If any other info comes up, I will update this thread.

Zolo
Contributor

Hi ,

@Thomas_Eichelbu

You can check /var/log/reboot.log

Maybe you can find the reason for the reboots there.

When I performed auto-clone the SGM rebooted 3 times (as Lari said) - after a manual reboot of course 😀

20:01:09 Reason: Unknown caller : Manual reboot/shutdown by system_reboot/shutdown_start Type: manual
20:08:08 Reason: Reboot called by image_clone Type: image_clone
20:35:15 Reason:Reboot after JHF installation. Type:configuration
20:41:11 Reason: reboot_with_log : Rebooting local blade (global context database was modified) Type: configuration
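
To see where the time goes, the gaps between these entries can be computed. A minimal Python sketch, assuming lines shaped exactly like the excerpt above (time, "Reason:", "Type:"). The real file may carry a date or other prefix that was trimmed here, in which case the pattern needs adjusting. The function names are hypothetical.

```python
import re
from datetime import datetime

LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+Reason:\s*(.*?)\s*Type:\s*(\S+)\s*$")


def parse_reboots(log_text: str) -> list[tuple[datetime, str, str]]:
    """Extract (time, reason, type) from lines shaped like the excerpt above."""
    events = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            events.append((stamp, m.group(2), m.group(3)))
    return events


def gaps_in_seconds(events: list[tuple[datetime, str, str]]) -> list[int]:
    """Seconds between consecutive reboots (assumes they all fall on the same day)."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(events, events[1:])]


log = """20:01:09 Reason: Unknown caller : Manual reboot/shutdown by system_reboot/shutdown_start Type: manual
20:08:08 Reason: Reboot called by image_clone Type: image_clone
20:35:15 Reason:Reboot after JHF installation. Type:configuration
20:41:11 Reason: reboot_with_log : Rebooting local blade (global context database was modified) Type: configuration
"""
events = parse_reboots(log)
print([event[2] for event in events])
print(gaps_in_seconds(events))
```

For the four entries above the gaps are 419, 1627 and 356 seconds, so the longest wait (about 27 minutes) lies between the image_clone reboot and the reboot after JHF installation.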

@Alexander_Wilke

You can also check /var/log/image_clone.log.dbg

"... if there is a "force" switch without removing an SGM from security group and add it back"

Maybe a manual change in the /etc/sysconfig/image.md5 file can do the trick, but I do not suggest doing that in production 🤔

Alexander_Wilke
Advisor

Hello,

just want to let you know about a situation in which auto-clone will fail.

We had a 64k Scalable Platform installed with R80.20SP (using the ext3 file system).
We followed the recommended and supported upgrade method to R81.10, which worked fine.
We got an RMA SGM and replaced the old one. The RMA SGM had the R81.10 factory image installed (xfs file system), and auto-clone failed.

So you may have the same major version but different file systems, and then this feature will not work. Keep this in mind if you follow Check Point's "recommended" upgrade paths.
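
A mismatch like this can be spotted before relying on auto-clone by comparing the root file system type on both members. A minimal Python sketch, assuming the contents of /proc/mounts (a standard Linux file) have been collected from the SMO and from the replacement SGM by some means. The device names in the samples are placeholders and the function names are hypothetical.

```python
from typing import Optional


def root_fs_type(mounts_text: str) -> Optional[str]:
    """Return the file system type mounted at '/' from /proc/mounts-style text."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()  # device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == "/":
            fs_type = fields[2]  # a later entry for '/' overrides an earlier one
    return fs_type


def same_root_fs(smo_mounts: str, sgm_mounts: str) -> bool:
    """True only if both members report the same, known root file system type."""
    smo_type, sgm_type = root_fs_type(smo_mounts), root_fs_type(sgm_mounts)
    return smo_type is not None and smo_type == sgm_type


# Placeholder samples: an upgraded member on ext3 and a freshly imaged one on xfs.
smo_mounts = "/dev/example_root / ext3 rw 0 0\nproc /proc proc rw 0 0\n"
sgm_mounts = "/dev/example_root / xfs rw 0 0\nproc /proc proc rw 0 0\n"
print(root_fs_type(smo_mounts), root_fs_type(sgm_mounts), same_root_fs(smo_mounts, sgm_mounts))
```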

AkosBakos
Leader

Hi Alexander_Wilke,

 

I'm just wondering, what was the solution for this issue?

Was the configuration copied between the SGMs, or not?

BR

Akos

----------------
\m/_(>_<)_\m/
Alexander_Wilke
Advisor

Hello @AkosBakos

we did a fresh install of all of our SGMs from a USB stick using R81.10.

The fresh install set up the XFS file system.
If I remember correctly, before we did this we disabled MDPS (Management Data Plane Separation), because it is not supported in R81.10 without a Jumbo.

With R81.10, Jumbo Take 110 and MDPS enabled, it is possible to do image auto-clone to a freshly installed SGM.

Works for Maestro, too.

AkosBakos
Leader

Hi @Alexander_Wilke,

Thanks for the answer. May I ask one more question? When you put the new SGM with the new file system into the Security Group, was the configuration of the existing Security Group copied to the new SGM?

In short: even though image cloning was not working, was the configuration distributed to the new SGM?

BR

Akos

Alexander_Wilke
Advisor

Yes,

the configuration was synced between the SGMs. Because of that we had to disable MDPS beforehand.

 
