Bob_Zimmerman
Authority

Long-Lived Lab API Target

Background

In my software development work, I found myself wanting an API target which met a few requirements.

  1. Minimal ongoing cost. I don't have $10k to spend on a license and $1k per year on a software subscription (guessing at prices, since there's apparently an unwritten rule on the forum to not discuss the prices available to everyone at catalog.checkpoint.com).
  2. Stays up with minimal manual work. Eval licenses are okay, but they only last 30 days, at which point you have to manually click around a website to issue yourself a new one, click around again to license it to the IP, get the license command, and run it on the management. Not great. Buying old boxes for their licenses is risky, and support gets cut off quickly, leading to higher ongoing cost. Plus, old boxes are mostly loud and unpleasant.
  3. Repeatable to let me test different management versions and configurations.

That's pretty much it. I don't need config changes I make to stick around. If anything, it's better if they don't, because that lets me build more reliable tests. I may eventually need some firewalls to let me test VSX provisioning and the like, but the solution I arrived at should be extensible enough to allow that later.

 

Solution

I have built a small service VM which handles provisioning the CloudGuard images I instantiate behind it. A script on my hypervisor then runs periodically to provision a new VM behind it and clean up any old VMs. By repeating this often enough, I always have a VM behind it running on the 15-day plug-and-play eval (which doesn't need human interaction). The only cost is my VM host. Each service VM has its own totally isolated network behind it for the functional VMs, so I can build a bunch of service VMs all running different configurations.

My Specific Environment

My VM host happens to be an old Antsle, but I don't use their software. To me, it's simply a silent whitebox server. It has an Intel Atom C3758 (Supermicro A2SDi-H-TF), 128 GB of RAM, a 500 GB NVMe drive for the OS and base image repository, and four 1 TB SATA SSDs for storing the VMs. Antsle uses Akasa cases, which are fanless. I personally dislike Linux, so I decided to try running my VMs on Hyper-V, and got an inexpensive Windows Server license for this. The four VM SSDs are in a Storage Spaces pool which provides one volume, which I have mounted as D. I keep VM disks in D:\VMs\ and base images in C:\ISOs\.

For the service VMs, I use OpenBSD (specifically, you want the amd64 version of the "install" ISO image, which includes the file sets). It happens to be very small, and it includes all of the software we need in a base install. There's no need to pull packages from the Internet. Here is a simple diagram showing how it works:

[Image: API Target Diagram.png]
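
If you don't already have the OpenBSD install ISO on the Hyper-V host, something like the following fetches it into the path used in the provisioning script below. This is just a sketch: the mirror URL and the 7.5 release are assumptions, so adjust them to whichever current release you want.

New-Item -ItemType Directory -Path "C:\ISOs\OpenBSD\" -Force | Out-Null
Invoke-WebRequest -Uri "https://cdn.openbsd.org/pub/OpenBSD/7.5/amd64/install75.iso" `
-OutFile "C:\ISOs\OpenBSD\install75.iso"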

For this example, I will build a rotating set of SmartCenters. The process to build an equivalent MDS is very similar. First, we need to build the VM.

Building a Service VM

First, we need to build the associated environment plus the service VM itself. Run this in PowerShell on the host:

 

$functionalVmNamePrefix = "SmartCenter"
$vmStoragePath = "D:\VMs\"

$openBsdImagePath = "C:\ISOs\OpenBSD\install75.iso"
$metadataVmName = "${functionalVmNamePrefix} Metadata Service"
$metadataVhdPath = "${vmStoragePath}\${metadataVmName}.vhdx"
$outsideSwitch = Get-VMSwitch -Name "bridgePort3"

# Reuse the internal switch and the VHD if they already exist; otherwise create them.
$insideSwitch = Try { Get-VMSwitch -Name "$metadataVmName Internal" -ErrorAction Stop } `
Catch { New-VMSwitch -Name "$metadataVmName Internal" -SwitchType "Private" }
$metadataDrive = Try { Get-VHD -Path "$metadataVhdPath" -ErrorAction Stop } `
Catch { New-VHD -Dynamic -Path "$metadataVhdPath" -SizeBytes 20GB }
$metadataVM = New-VM -Generation 1 `
-Name "$metadataVmName" `
-MemoryStartupBytes 256MB `
-VHDPath $metadataDrive.path `
-SwitchName $outsideSwitch.name `
-BootDevice VHD
Add-VMNetworkAdapter -VMName $metadataVM.name `
-SwitchName $insideSwitch.name
Set-VMDvdDrive -VMName $metadataVM.name `
-Path $openBsdImagePath
Start-VM $metadataVM
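
Optionally, a quick host-side check (a sketch using the variables above) confirms the new VM is running with both network adapters, and prints the hostname string referenced in the installer answers below:

Get-VM -Name $metadataVmName | Select-Object Name, State
Get-VMNetworkAdapter -VMName $metadataVmName | Select-Object Name, SwitchName
# The literal value to give the OpenBSD installer for "System Hostname".
$metadataVmName.Replace(" ","-")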

 

This creates and starts the VM, but we still need to install the OS. Note that the "System Hostname" answer below is actually a PowerShell expression, so it won't work directly as written; the host-side check above prints the literal value, and you can name the VM whatever you want. I'm still working on auto-installing the OpenBSD VM. Here are the answers I use in the installer:

 

Install
Choose your keyboard layout = default (default)
System Hostname = $metadataVmName.replace(" ","-")
Network interface to configure = hvn0 (default)
IPv4 address for hvn0 = autoconf (default)
IPv6 address for hvn0 = autoconf
Network interface to configure = done (default)
Password for root account = 1qaz!QAZ
Start sshd(8) by default = yes (default)
Do you expect to run the X Window System = no
Change the default console to com0 = no (default)
Setup a user = no (default)
Allow root ssh login = yes
What timezone are you in = UTC
Which disk is the root disk = sd0 (default)
Encrypt the root disk with a passphrase = no (default)
Use (W)hole disk MBR, whole disk (G)PT, or (E)dit = whole (default)
Use (A)uto layout, (E)dit layout, or create (C)ustom layout = c
a a
offset = 64 (default)
size = 41110240
FS type = 4.2BSD (default)
mount point = /
a b
offset = 41110304 (default)
size = 832736 (default)
FS type = swap (default)
w
q
Location of sets = cd0 (default)
Pathname to the sets = 7.5/amd64 (default)
Set name(s) = -g*
Set name(s) = -x*
Set name(s) = done
Directory does not contain SHA256.sig. Continue without verification = yes
Location of sets = done (default)
Time appears wrong = yes (default)
Exit to (S)hell, (H)alt or (R)eboot = reboot (default)

 

Then back in PowerShell on Windows, we remove the install image from the emulated optical drive:

 

Set-VMDvdDrive -VMName $metadataVM.name -Path ""

 

Configuring the Service VM

Now that we have the service VM, we need to set up the services it will run for us. Specifically, it acts as a DHCP server for the network with the functional VMs, an HTTP server to hand the functional VMs their configuration, and a load balancer which terminates TLS to let our client software use a consistent key to talk to them. First, we set up the services and generate a stable certificate. All of these commands are run inside the service VM:

 

rcctl enable dhcpd httpd relayd slowcgi
rcctl set dhcpd flags hvn1
cp /bin/ksh /var/www/bin/
cp /bin/ls /var/www/bin/
mkdir -p /var/www/htdocs/metadata/openstack/2015-10-15
mkdir /var/www/htdocs/metadata/jumbo
echo 'net.inet.ip.forwarding=1' >> /etc/sysctl.conf
echo 'net.inet6.ip6.forwarding=1' >> /etc/sysctl.conf
openssl req -x509 -days 365 -newkey rsa:2048 -passout pass:'1qaz2wsx' \
-subj "$(printf "/CN=metadata.standingsmartcenter.mylab.local"
printf "/C=ZZ/ST=Empty/L=Nowhereville"
printf "/O=Zimmie's Lab/OU=Standing SmartCenter")" \
-keyout /etc/ssl/private/encrypted.key -out /etc/ssl/selfSigned.crt
openssl rsa -in /etc/ssl/private/encrypted.key -passin pass:'1qaz2wsx' \
-passout pass: -out /etc/ssl/private/selfSigned.key

 

Then we set up various files. I typically use vi for this:

 

### Contents of /etc/dhcpd.conf
subnet 169.254.0.0 netmask 255.255.0.0 {
	option routers 169.254.169.254;
	range 169.254.0.1 169.254.0.3;
	default-lease-time 1800;
	max-lease-time 1800;
}


### Contents of /etc/hostname.hvn0
inet autoconf
inet6 autoconf


### Contents of /etc/hostname.hvn1
inet 169.254.169.254 255.255.0.0
inet6 fe80::a9fe:a9fe 64


### Contents of /etc/httpd.conf
server "metadata.standingsmartcenter.mylab.local" {
	listen on * port 80
	root "/htdocs/metadata"
	directory index index.txt
	location "/jumbo/" {
		request rewrite "/jumbo/index.cgi"
	}
	location "/jumbo/index.cgi" {
		fastcgi
	}
}


### Contents of /etc/pf.conf
set skip on lo
block return log
block in quick from any to {224.0.0.251 ff02::fb}
pass in quick on hvn1 proto udp from port bootpc to port bootps
pass out quick on hvn1 proto udp from port bootps to port bootpc
pass out on hvn0 from hvn0
pass out on hvn1 from hvn1
pass in on hvn1 from hvn1:network to any
pass out on hvn0 from hvn1:network to any nat-to hvn0
anchor "relayd/*"
pass in on hvn0 proto tcp from any to hvn0 port {ssh www https}
block return out log proto {tcp udp} user _pbuild


### Contents of /etc/relayd.conf
table <insideServers> {169.254.0.1 169.254.0.2 169.254.0.3}
http protocol selfSignedCert {
	tls keypair "selfSigned"
}
relay "mgmtApi" {
	listen on hvn0 port https tls
	protocol selfSignedCert
	forward with tls to <insideServers> port https check https "/web_api/" code 401
}


### Contents of /root/.ssh/config
Host 169.254.0.?
	StrictHostKeyChecking no
	UserKnownHostsFile /dev/null
	User admin


### Contents of /var/www/htdocs/metadata/openstack/2015-10-15/index.txt
user_data


### Contents of /var/www/htdocs/metadata/jumbo/index.cgi
#!/bin/ksh
echo "Content-type: application/octet-stream"
echo ""
# Print the newest CPUSE (Deployment Agent) build in this directory, if any.
cpuseName=""
for line in $(ls -1tr DeploymentAgent*);do
	cpuseName="$line"
done
[ -n "$cpuseName" ] && echo "$cpuseName"
# Print the newest jumbo in this directory, if any.
jumboName=""
for line in $(ls -1tr *_JUMBO_*);do
	jumboName="$line"
done
[ -n "$jumboName" ] && echo "$jumboName"


### Contents of /var/www/htdocs/metadata/jumbo/jumboScript.sh
#!/usr/bin/env bash
# Wait for the management API to be up and running.
false;while [ $? -ne 0 ];do
	sleep 60
	mgmt_cli -r true show hosts limit 1
done
# Ask the service VM which CPUSE build and jumbo to install, then fetch and install them.
fileNames=$(curl_cli http://169.254.169.254/jumbo/)
for fileName in $fileNames;do
	curl_cli "http://169.254.169.254/jumbo/${fileName}" -o "/var/log/${fileName}"
	if [[ "$fileName" == DeploymentAgent* ]];then
		clish -c "lock database override"
		clish -c "installer agent install /var/log/${fileName}"
		rm "/var/log/${fileName}"
	elif [[ "$fileName" == *_JUMBO_* ]];then
		clish -c "lock database override"
		clish -c "installer import local /var/log/${fileName}"
		rm "/var/log/${fileName}"
		echo "y" | clish -c "installer install ${fileName}"
	fi
done


### Contents of /var/www/htdocs/metadata/openstack/2015-10-15/user_data
#cloud-config
password: "1qaz!QAZ"
hostname: "TestSC"
timezone: "Etc/GMT"
clish:
  - set user admin shell /bin/bash
config_system:
  maintenance_hash: "grub.pbkdf2.sha512.10000.614DE3DFE72E72D7D72139355DF35F9AF6335B16BA6487B40ED0F9F2A261014BB3AD8CE1310732696485B4FF43BF503D339FCAD0D2608AAC951DA437CF63DB94.F6C791782BEF20DED8BCCE548176E09A44DC34F20140200DDF75F106BB57EB9039A8C8AA50C34CBB78584B2F93FEAF1AE2A038E5143F3DF9582CC4BC905C4016"
  domainname: "standingsmartcenter.mylab.local"
  install_security_managment: true
  install_mgmt_primary: true
  mgmt_gui_clients_radio: "any"
  mgmt_admin_radio: "gaia_admin"
  download_info: true
  upload_info: true
  upload_crash_data: true
runcmd:
  - curl_cli http://169.254.169.254/jumbo/jumboScript.sh -o /home/admin/jumboScript.sh
  - chmod u+x /home/admin/jumboScript.sh
  - bash -c "nohup /home/admin/jumboScript.sh &"
  - curl_cli http://169.254.169.254/postBootBuild.sh -o /home/admin/postBootBuild.sh
  - chmod u+x /home/admin/postBootBuild.sh
  - ln -s /home/admin/postBootBuild.sh /etc/rc.d/rc3.d/S99zzzPostBootBuild

 

The maintenance password in that cloud-config file is '1qaz!QAZ'. Next, we run a few final commands in ksh:

chown -R www:www /var/www/htdocs/metadata
chmod 544 /var/www/htdocs/metadata/jumbo/index.cgi
chmod 644 /var/www/htdocs/metadata/openstack/2015-10-15/index.txt
chmod 644 /var/www/htdocs/metadata/openstack/2015-10-15/user_data
syspatch
reboot

With that, our service VM is ready! When we boot a CloudGuard image behind it, the new CloudGuard VM gets a DHCP address from the service VM. It then downloads the user_data (cloud-config) file and starts provisioning itself. At the end of the cloud-config, we have some commands to download jumboScript.sh and postBootBuild.sh, then schedule both for execution.
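
To sanity-check the metadata service without waiting for a CloudGuard boot, you can ask the service VM to fetch its own user_data the same way a functional VM would. This is a hedged sketch run from the Windows host: $serviceVmIp is a placeholder for whatever address hvn0 picked up over DHCP, and it leans on OpenBSD's base ftp(1) being able to fetch HTTP URLs.

$serviceVmIp = "192.0.2.10"   # Placeholder for the service VM's external (hvn0) address.
ssh "root@$serviceVmIp" "rcctl check dhcpd httpd relayd slowcgi"
ssh "root@$serviceVmIp" "ftp -o - http://169.254.169.254/openstack/2015-10-15/user_data"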

jumboScript.sh checks the web service's /jumbo/ directory for a CPUSE build and installs it if it finds one. Next, it looks for a jumbo and installs that if it finds one. All you have to do is put the CPUSE file and the jumbo in /var/www/htdocs/metadata/jumbo/ on the service VM.
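
For example, staging those files from the Windows host might look like this. It's a sketch: the file names and $serviceVmIp are placeholders, and it assumes the built-in OpenSSH client on Windows.

$serviceVmIp = "192.0.2.10"   # Placeholder for the service VM's external (hvn0) address.
scp "C:\ISOs\DeploymentAgent_<build>.tgz" "root@${serviceVmIp}:/var/www/htdocs/metadata/jumbo/"
scp "C:\ISOs\Check_Point_R81_20_JUMBO_HF_MAIN_<take>.tgz" "root@${serviceVmIp}:/var/www/htdocs/metadata/jumbo/"
# index.cgi globs on DeploymentAgent* and *_JUMBO_*, and httpd runs as the www user.
ssh "root@$serviceVmIp" "chown www:www /var/www/htdocs/metadata/jumbo/*.tgz"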

For the postBootBuild.sh script, take a look at my BASH framework for management API commands. I end my SmartCenter's postBootBuild.sh like this:

# Set up the API to allow connections from remote clients.
login "System Data"
mgmtCmd set api-settings accepted-api-calls-from "all ip addresses that can be used for gui clients"
logout
api restart

# Remove the script from the runlevel 3 pool so it only runs once.
rm /etc/rc.d/rc3.d/S99zzzPostBootBuild

Once the management API is set up to accept calls from all client IPs, the health check we set up in /etc/relayd.conf starts passing. Now an HTTPS call to the service VM's external interface has its TLS terminated by relayd (giving you a TLS certificate which doesn't change every time you build a new functional VM), and the request is then forwarded to the functional VM. The effect is that, to a management API client, the service VM looks like a Check Point management server!
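
From an API client's perspective, a call through the relay looks like this. It's a hedged example: it needs PowerShell 7 or later for -SkipCertificateCheck, and $serviceVmIp is again a placeholder for the service VM's external address.

$serviceVmIp = "192.0.2.10"
$body = @{ user = "admin"; password = "1qaz!QAZ" } | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri "https://$serviceVmIp/web_api/login" `
-ContentType "application/json" -Body $body -SkipCertificateCheck

A successful login returns a session ID from whichever functional VM currently sits behind the relay.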

Creating Functional VMs

Finally, we create the functional VM. I'm using an R81.20 CloudGuard image meant for OpenStack. I converted the image from qcow2 to Hyper-V's preferred vhdx format and put it in C:\ISOs\ on the hypervisor. The particular image I'm using is named ivory_main-631-991001402_unsecured.vhdx.
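
The conversion itself can be done with qemu-img. This is a sketch: it assumes qemu-img for Windows is on the PATH, and the qcow2 file name is a guess based on the vhdx name.

qemu-img convert -f qcow2 -O vhdx `
"C:\ISOs\ivory_main-631-991001402_unsecured.qcow2" `
"C:\ISOs\ivory_main-631-991001402_unsecured.vhdx"

With the vhdx in place, this PowerShell script (I named it CycleSmartCenter.ps1), run on the hypervisor, builds a suitable VM from it: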

$functionalVmNamePrefix = "SmartCenter"
$vmStoragePath = "D:\VMs\"
$functionalBaseImagePath = "C:\ISOs\ivory_main-631-991001402_unsecured.vhdx"

$functionalVmName = "${functionalVmNamePrefix}-$(Get-Date -DisplayHint Date -Format FileDate)"
$functionalVhdPath = "${vmStoragePath}\${functionalVmName}-$(New-Guid).vhdx"
$functionalDrive = New-VHD -Differencing `
-ParentPath "${functionalBaseImagePath}" `
-Path "${functionalVhdPath}"

$metadataVmName = "${functionalVmNamePrefix} Metadata Service"
$insideSwitch = Get-VMSwitch -Name "$metadataVmName Internal"
$functionalVM = New-VM -Generation 1 `
-Name "$functionalVmName" `
-MemoryStartupBytes 16GB `
-VHDPath $functionalDrive.path `
-SwitchName $insideSwitch.name `
-BootDevice VHD
Set-VMProcessor -VM $functionalVM -Count 2
Start-VM -VM $functionalVM

$action = New-ScheduledTaskAction -WorkingDirectory $vmStoragePath -Execute 'powershell.exe' `
-Argument "-NonInteractive -NoLogo -NoProfile -File `"$(Get-Location)\Remove-OldVms.ps1`" `"${functionalVmNamePrefix}`""
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date).AddHours(2)
$principal = New-ScheduledTaskPrincipal -UserID "NT AUTHORITY\SYSTEM" -LogonType ServiceAccount -RunLevel Highest
$task = New-ScheduledTask -Action $action -Principal $principal -Trigger $trigger
Register-ScheduledTask "Cleanup ${functionalVmNamePrefix}" -InputObject $task

The section at the end defines a scheduled task which runs two hours after the VM-creation script and removes the old VMs. Here is the cleanup script (Remove-OldVms.ps1):

param([Parameter(Position=0,mandatory=$true)][String]$functionalVmNamePrefix)

Start-Transcript -path "Cleanup ${functionalVmNamePrefix}.log" -append
Write-Output "About to clean up ${functionalVmNamePrefix} VMs ..."
Write-Output "Cleanup starting at $(Get-Date)."
$oldDate = (Get-Date).AddDays(-2)
(Get-VM | Where-Object {
        $_.Name -Match "${functionalVmNamePrefix}-[0-9]+" `
        -And $_.CreationTime -lt $oldDate }) `
| ForEach-Object {
        Stop-VM -VM $_ -TurnOff
        $_.HardDrives | ForEach-Object {
                Remove-Item -Path "$($_.Path)"
        }
        Remove-VM -VM $_ -Force
}
Write-Output "Cleanup finished at $(Get-Date)."
Unregister-ScheduledTask -Confirm:$false "Cleanup ${functionalVmNamePrefix}"
Stop-Transcript

Now every time you run the CycleSmartCenter.ps1 script, it creates a whole new VM, the service VM provisions it as a SmartCenter and hands it the jumbo and the postBootBuild script, and the hypervisor deletes the old related VMs two hours later.

Schedule the execution of CycleSmartCenter.ps1, and you will always have a SmartCenter VM to try API calls against. Every time it cycles, you get a new VM with exactly the config you specify in the postBootBuild script. This is great for automated integration testing for API client software. There's a very limited window for accumulated state to change the test results.
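
Scheduling it can use the same pattern as the cleanup task. This is a hedged example: the path to CycleSmartCenter.ps1 and the daily 03:00 trigger are assumptions, so pick whatever cadence keeps a VM well inside its 15-day eval window.

# Keep CycleSmartCenter.ps1 and Remove-OldVms.ps1 together in the working directory.
$action = New-ScheduledTaskAction -WorkingDirectory "D:\VMs\" -Execute 'powershell.exe' `
-Argument "-NonInteractive -NoLogo -NoProfile -File `"D:\VMs\CycleSmartCenter.ps1`""
$trigger = New-ScheduledTaskTrigger -Daily -At 3am
$principal = New-ScheduledTaskPrincipal -UserID "NT AUTHORITY\SYSTEM" -LogonType ServiceAccount -RunLevel Highest
Register-ScheduledTask "Cycle SmartCenter" -Action $action -Trigger $trigger -Principal $principal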

Notes for an MDS

  • Change the 'functionalVmNamePrefix'
  • Modify the cloud-config file (user_data) with 'install_mds_primary: true' and 'install_mds_interface: eth0'
  • Bump the functional VM creation command to at least four cores
  • Change your postBootBuild.sh to build an MDS. Generally this involves building some domains, then building your objects in one of the domains.
6 Replies
PhoneBoy
Admin

Well done!

the_rock
Legend

Fantastic, job well done @Bob_Zimmerman 👍👌

Duane_Toler
Advisor

Interesting process.  You've put a lot of work into this.  Some items you may find helpful:

  • Consider using the "da_cli" commands instead of calls to "clish -c .." for installing your JHFs. It returns JSON from each command. You'll have to follow the ActionID value that is returned from each command and sometimes monitor the Message attribute. You can do all the commands you'd expect for a package: download, add_private_package, import, verify, install, upgrade, uninstall, package_info. You can get info on all packages with packages_info. You can add a reboot_delay after install, uninstall, or upgrade. The 'get_status_of_action' command will show you the progress (same as you'd see in the WebUI). Run "da_cli |sort" to get a sorted list of the commands available. I think you'll like this very much for some deterministic results.
  • da_cli da_status will show you the agent build and any of its internal operations, such as when the machine boots or after you run "da_cli check_for_updates".
  • With da_cli, become friendly with 'jq', if you aren't already, so you can filter, parse, and select specific JSON values as you require. You can also reformat the output into another JSON document if you need to carry values forward to another script.
  • All the various Check Point packages are actually indexed and stored by the "packageKey" attribute from a package's info (as returned by "da_cli package_info package=Check_Point_....JUMBO...tgz"). Depending on what you want to do, you may want to key off this value.
  • For updating Deployment Agent, you might be better served doing the updates with rpm (after extracting it from the tarball) and running dastop;dastart (and checking with 'da_cli da_status'). At some point, you'll have the inevitable backward-compatibility break where an older DA from a fresh install can't update itself to a new DA you download. RPM will be the only way.

Good work!

the_rock
Legend

I see what you mean about da_cli. I always found it super helpful/useful.

Andy

Duane_Toler
Advisor

Yeah it is!  If you have the CPUSE identifier of a package (an older JHF, for example), you can tell da_cli to import that older package from the Check Point download server, but this is a 2-step operation with da_cli.  If you have a custom hotfix, you can copy that and do a local import of it (same as CLISH "installer import").

I've written a large Ansible role that handles the entire sequence of installing Jumbo HFAs, hotfixes, and full version upgrades.  It also monitors the progress of each operation and the reboots.  The process seems obvious and simple at first, but... hahaha... yeah right!  I encountered a bunch of odd cases that had to be handled.

I would have gone insane trying to do it all with shell scripts and brute force.  Ansible was a huge help for the skeletal structure, communications, management plane, and control plane handling.  Thankfully, da_cli spits out JSON, too, so that was an easy query.  

The one thing that is still troublesome, tho, is that there are some absolute timers at play, which I still don't like.  I debated using async and poll for the install/upgrade routines and still may switch to that.  While I *could* run the playbooks targeting 20 hosts, there would be a LOT of variance in the post-install/upgrade reboots and having to monitor those.  I wish there were an obvious way to suppress the reboot.  Unfortunately, da_cli doesn't have an option for that; it only has a reboot_delay, but again, that's an absolute timer. 😕  So far, tho, running it on a single customer with 2 or 3 hosts of similar capabilities seems to work well.

This routine definitely had its "day in the sun" when the CVE hotfix patch came down and when JHFs got updated!  This took a multi-hour process down to minutes (of my time, attention, and mental effort; the operations still took their normal time on the hosts).  I didn't have to watch, manage, or monitor anything other than the Ansible status outputs.  It was nice to do a customer update at midnight, and know that it was going to be OK.

 

the_rock
Legend

100%. I have not used it much in the past, maybe a few times, but so far I have not had any issues with it at all.

Best,

Andy

