Bob_Zimmerman
Authority

Cloud-Init Troubleshooting Logs?

I'm trying to build a VM from the R81.20 management/standalone qcow2 images. I've tried both the OpenStack and the generic KVM images. I built a config drive image using the directions in sk180452, and neither qcow2 appears to even try mounting the emulated optical drive.

I don't see any of the normal cloud-init logs in the places I expect. Does anybody have any information about how to troubleshoot this?

13 Replies
Bob_Zimmerman
Authority

Forgot to mention. I also found sk165476. The ISO image I'm using has the label config-2, it's iso9660, and it has both user-data and user_data with the same contents in each.

➜  CheckPointCloudInit lsblk -f
NAME        FSTYPE  FSVER LABEL    UUID                                 FSAVAIL FSUSE% MOUNTPOINT
loop0       iso9660       config-2 2023-09-10-21-36-08-00                     0   100% /mnt/cidata
...

➜  CheckPointCloudInit tree file_structure 
file_structure
└── openstack
    └── latest
        ├── meta_data.json
        ├── user-data
        ├── user_data
        └── vendor-data

2 directories, 4 files

➜  CheckPointCloudInit cat file_structure/openstack/latest/user-data 
#cloud-config
password: letmein

➜  CheckPointCloudInit cat file_structure/openstack/latest/user_data 
#cloud-config
password: letmein

As for the contents of user[-_]data, I have tried the exact contents in sk180452 (which you can see above) and a slightly modified version of the management YAML from sk179752. I have tried it with and without meta_data.json and with and without vendor-data. None of this has changed the behavior of the system in the slightest.
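
A minimal sketch of staging that layout before building the ISO. The function name and its arguments are hypothetical, not something from the SKs, and the version directory is left as a parameter because it is not clear at this point which one the image looks for. It writes the same contents under both spellings, matching the listing above.

```shell
# Stage an OpenStack-style config drive tree under a target directory.
# stage_config_drive is a hypothetical helper, not part of any Check Point tooling.
# Usage: stage_config_drive <target_dir> <version_dir> <user_data_file> [meta_data_json]
stage_config_drive() (
    target="$1"
    version="$2"          # e.g. "latest" or a dated directory
    userdata="$3"
    metadata="${4:-}"     # optional; skipped when not given

    dest="$target/openstack/$version"
    mkdir -p "$dest" || exit 1

    # Write both spellings, since it is unclear which one the guest reads.
    cp "$userdata" "$dest/user_data" || exit 1
    cp "$userdata" "$dest/user-data" || exit 1

    # vendor-data is an empty file in the listing above.
    : > "$dest/vendor-data"

    if [ -n "$metadata" ]; then
        cp "$metadata" "$dest/meta_data.json" || exit 1
    fi
)
```

For example, `stage_config_drive file_structure latest ./user_data` produces the tree shown above without meta_data.json. The staged directory still has to be turned into an iso9660 image labelled config-2 as a separate step.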

Jeff_Engel
Employee

Hi @Bob_Zimmerman 

sk165476 is not relevant for R81.20.  Make sure to also review https://support.checkpoint.com/results/sk/sk180452

Let us know if you still have issues.

BR!

Jeff

Bob_Zimmerman
Authority

I don't see how sk165476 wouldn't be relevant to R81.20. Does R81.20 include a new version of cloud-init which fundamentally changes how it looks for a config drive?

I mentioned that I followed sk180452 in both of my posts and provided output in the second mirroring what is in that article. In that output, I used /openstack/latest rather than /openstack/<specific date>, but I tried it with a specific date as well.

At this point, I think I need the log to find out what cloud-init is trying to do. What device nodes it is examining, what files it is looking for, and so on.

Jeff_Engel
Employee

R81.20's implementation of cloud-init is completely new.

Bob_Zimmerman
Authority

Okay. Can you point me to the logs showing what it tries to do?

Following the directions in sk180452 exactly didn't work. Modifying them based on all the other SKs I mentioned didn't work. Modifying them based on standard cloud-init config drive datasource operation (e.g., /openstack/latest instead of specific dates) also didn't work. Clearly I'm missing something, but I'm not able to find what without logs showing what is being attempted.

Jeff_Engel
Employee

The log is located at $CGEDIR/log/boot.log

Can you also share your current user_data?

Bob_Zimmerman
Authority

I realized my terminal prompt doesn't always come across in unstyled text, so I added a > to the end of the prompt to better show the commands I'm running.

 

➜  CheckPointCloudInit> ls -lh
total 364K
-rw-r--r-- 1 zimmie zimmie 360K Sep 10 21:36 config-2.iso
drwxr-xr-x 3 zimmie zimmie 4.0K Sep 10 19:20 file_structure

➜  CheckPointCloudInit> mkdir config-2

➜  CheckPointCloudInit> sudo mount config-2.iso ./config-2
mount: /home/zimmie/CheckPointCloudInit/config-2: WARNING: source write-protected, mounted read-only.

➜  CheckPointCloudInit> lsblk -f
NAME        FSTYPE  FSVER LABEL    UUID                                 FSAVAIL FSUSE% MOUNTPOINT
loop0       iso9660       config-2 2023-09-10-21-36-08-00                     0   100% /home/zimmie/CheckPointCloudInit/config-2
...

➜  CheckPointCloudInit> tree config-2/
config-2
└── openstack
    └── latest
        ├── meta_data.json
        ├── user-data
        ├── user_data
        └── vendor-data

2 directories, 4 files

➜  CheckPointCloudInit> cat ./config-2/openstack/latest/meta_data.json
{
    "availability_zone": "nova",
    "files": [],
    "hostname": "test.novalocal",
    "launch_index": 0,
    "name": "test",
    "meta": {
        "role": "webservers",
        "essential": "false"
    },
    "public_keys": {
        "mykey": "ecdsa-sha2-nistp521 AAAAE2VjZHNhLXNoYTItbmlzdHA1MjEAAAAIbmlzdHA1MjEAAACFBAFqI7nPeqi3AxrFrN6JGHsHM1vWBAjb8kHrBmweQvQ5s9qRRpHgbZTUtKrzukbqnAK/VJLSojMZMc7yFDyXc6lNQQEy9CuTkgjFLW/hvGAbvUbZjb8GmIVfwtkxOpdvck9k1DD5tlpV2Re09kmSu07NuoymMv2Ja9e7crvMcfvvLCSJYw==\n"
    },
    "uuid": "83679162-1378-4288-a2d4-70e13ec132aa"
}

➜  CheckPointCloudInit> cat ./config-2/openstack/latest/user_data
#cloud-config
password: letmein

➜  CheckPointCloudInit> cat ./config-2/openstack/latest/user-data
#cloud-config
password: letmein

➜  CheckPointCloudInit> cat ./config-2/openstack/latest/vendor-data

➜  CheckPointCloudInit>

 

George_Rubin
Employee

Hi Bob,

Please follow this SK: "How to provide user data in KVM with Configuration Drive" (checkpoint.com)

And make sure you've used the latest image from "CloudGuard Network for Private Cloud images" (checkpoint.com), as they were updated a few days ago.

If you're not using OpenStack as hypervisor, make sure you download the generic KVM image.

Bob_Zimmerman
Authority

Every single one of my posts except the most recent one has said I followed the directions in sk180452. I didn't mention the SK explicitly in the most recent one, but I gave CLI output from the machine where I'm building the ISO image showing that I followed the directions. I used a different filename for my ISO image, but that should never be exposed to the guest. The ISO mounts just fine on all systems I've tried, including several R81.20 VMs which didn't get the password in it.

I think I downloaded the images on Sunday morning. I will confirm the ones I have are the latest available.

I'm not using the OpenStack userspace (I don't need ~95% of what OpenStack does, so I'm trying to build my own userspace tools to manage my VMs). I am using Hyper-V, and Hyper-V is a supported hypervisor for OpenStack. I have tried with both the OpenStack image and the generic KVM image.

 

My goal is to automate the creation of management VMs to let me write automated integration tests for some management API software I'm writing. It's harder to guarantee I interact correctly with the API when testing against a management server which retains state. Throwaway managements are far more predictable. If I can get this working, I'll just add a step in my integration testing workflow to spawn a new VM for the tests.

George_Rubin
Employee

Hi Bob, 

The tree you've provided looks incompatible: the directory under openstack should be the specific date from the SK rather than latest. Make sure you follow the SK.

The images were updated on Sunday, so make sure you have the latest ones, as the old ones had an issue with generic environments.

Regarding the use of OpenStack with Hyper-V, are you suggesting you're using the qcow2 image with Hyper-V?

You can contact me directly at georger@checkpoint.com and we can go through the process to make sure everything works for you.

George

Bob_Zimmerman
Authority

Just confirmed I was using 991001243_ (trailing underscore in original filename) and 991001243_unsecured. Hyper-V doesn't directly support qcow2, so I'm converting the images using qemu-img, just like OpenStack does when running with a Hyper-V hypervisor:

qemu-img convert ./ivory_main-631-991001243_unsecured.qcow2 -O vhdx -o subformat=dynamic ./ivory_main-631-991001243_unsecured.vhdx
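
With several builds to convert, the same command can be wrapped so the output name is derived from the input. This is only a sketch: `vhdx_name` and `convert_to_vhdx` are hypothetical helper names, and the qemu-img options are exactly the ones in the command above.

```shell
# Derive the .vhdx path from a .qcow2 path (string handling only, no files touched).
vhdx_name() {
    printf '%s\n' "${1%.qcow2}.vhdx"
}

# Convert one qcow2 image to a dynamic VHDX next to the original.
# Hypothetical wrapper; it runs the same qemu-img invocation as above.
convert_to_vhdx() {
    qemu-img convert "$1" -O vhdx -o subformat=dynamic "$(vhdx_name "$1")"
}

# Example (not run here):
#   for img in ./*.qcow2; do convert_to_vhdx "$img"; done
```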

I'm downloading 991001385 and 991001385_unsecured now. I should be able to try them this evening.

I tried it with the exact date mentioned in the SK previously, then switched to /openstack/latest based on some OpenStack documentation. I'll switch back to the exact date in the SK for the upcoming tests with 991001385. I'll also grab the boot.log to see what I see.

Bob_Zimmerman
Authority

Just tried with 991001385_unsecured. No dice. It left the password as 'admin'. First, the creation of my ISO image:

➜  CheckPointCloudInit> tree file_structure/
file_structure/
└── openstack
    └── 2015-10-15
        ├── meta_data.json
        ├── user-data
        ├── user_data
        └── vendor-data

2 directories, 4 files

➜  CheckPointCloudInit> cat file_structure/openstack/2015-10-15/user_data
#cloud-config
password: letmein

➜  CheckPointCloudInit> mkisofs -r -V config-2 -o ./config_drive.iso file_structure/
I: -input-charset not specified, using utf-8 (detected in locale settings)
Using USER_000.;1 for  file_structure/openstack/2015-10-15/user-data (user_data)
Total translation table size: 0
Total rockridge attributes bytes: 931
Total directory bytes: 4096
Path table size(bytes): 42
Max brk space used 22000
180 extents written (0 MB)

➜  CheckPointCloudInit> 

Then information from the VM after I booted it and let it sit for ~15 minutes.

[Expert@gw-15a623:0]# cd $CGEDIR/log/

[Expert@gw-15a623:0]# ls -AFlh
total 8.0K
-rw-r--r-- 1 admin root 5.0K Sep 14 22:23 boot.log
-rw-r--r-- 1 admin root    0 Sep 14 22:23 user_data

[Expert@gw-15a623:0]# cat boot.log 
++ source /opt/CPcge/scripts/include/version.sh
++++ echo /opt/CPshrd-R81.20
++++ cut -dR -f2-
+++ version=81.20
++++ echo '81.20 * 100'
++++ bc -l
++++ cut -c 1-4
+++ version_number=8120
++ source /opt/CPcge/scripts/include/print.sh
++ ACTIVE_DB=/config/active
++ CLOUD_DB=/tmp/config_cloud
++ CLOUD_DB_DONE=/tmp/config_cloud_done
++ BASIC_CONF_YAML=/opt/CPcge/boot/basic.yaml
++ PLATFORM_CONF_YAML=/opt/CPcge/boot/platform.yaml
++ USERDATA_CONF_YAML=/opt/CPcge/boot/userdata.yaml
++ CONF_YAML=/opt/CPcge/boot/config.yaml
++ PREBOOT_BASH=/opt/CPcge/boot/preboot.sh
++ POSTBOOT_BASH=/opt/CPcge/boot/postboot.sh
++ POSTBOOT_CLISH=/opt/CPcge/boot/postboot.clish
++ USERDATA_RAW=/opt/CPcge/boot/userdata.raw
++ USERDATA_BASH=/opt/CPcge/boot/userdata.sh
++ JSON_SCHEMA=/opt/CPcge/bin/schema.json
++ CLOUD_USER_LOG_OLD=/var/log/cloud-user-data
++ AIO_FILE=/etc/in-aio
++ PAYG_IMAGE_FLAG=/etc/payg
++ VERSION=8120
++ USER_LOG=/opt/CPcge/log/user_data
++ cloud_platform_ex
+ PLATFORM=aio
+ '[' -z aio ']'
+ '[' -f /etc/in-aio ']'
+ '[' '!' -f /config/active ']'
++ cat /etc/in-aio
+ IMAGE=ivory_main-631-991001385.raw
+ rm -rf /etc/in-aio
++ cloud_platform_ex
+ PLATFORM=hyperv
+ echo ivory_main-631-991001385.raw
+ echo hyperv
+ /opt/CPcge/aio/hyperv.sh
modify /etc/appliance_config.xml - set platform name and icon
platform icon hyperv.png not found - setting virtual_old.png as the icon
copying public-cloud.crt to /var/opt/CPshrd-R81.20/conf/public-cloud/631-991001385.crt
200+1 records in
200+1 records out
102748 bytes (103 kB) copied, 0.0935326 s, 1.1 MB/s
chmoding /var/opt/CPshrd-R81.20/conf/public-cloud/631-991001385.crt to 440
Writing (overwrite mode) to /etc/cloud-version
Writing (overwrite mode) to /etc/cloud-version.json
sh: cannot set terminal process group (5448): Inappropriate ioctl for device
sh: no job control in this shell
sh-4.4# cat /opt/CPcge/version | jq '.projects.cloud.branch = "cloud_prod"' | jq '.projects.cloud.build = "991001385"' > /opt/CPcge/version.new
sh-4.4# exit
cloud-version content
release: R81.20
take: 631
build: 991001385
platform: hyperv
license: byol
deployment_method: ftw
cloud-version.json content
{
    "projects": {
        "cge": {
            "branch": "master",
            "commit": "a113638fc317a2c1d30eab95121d2efe623a9c12"
        },
        "cloud-connectors": {
            "branch": "master",
            "commit": "ba9d31c063e45e73972f1a25f10cbfb0f1bcb787"
        },
        "cloud-devops": {
            "branch": "master",
            "commit": "e4c52b1e0fabff0e1172a134e60a39cb7dbc4b91"
        }
    },
    "release": "R81.20",
    "take": "631",
    "build": "991001385",
    "platform": "hyperv",
    "license": "byol",
    "deployment_method": "ftw"
}
version
{
  "projects": {
    "cloud_builder": {
      "branch": "cloud_qa",
      "build": "991000212"
    },
    "cloud": {
      "branch": "cloud_prod",
      "build": "991001385"
    }
  }
}
#ca::ctrlaltdel:/sbin/shutdown -t3 -r now
enable CTRL-ALT-DEL reboot
set time zone to UTC...
Modifying pre-fw1boot to enable usermode
fixing sudoers
disabling icmp redirect sending...
remove /usr/share/syslinux...
Writing (append at end mode) to /etc/ppk.boot/conf/simkern.conf
Writing (append at end mode) to /var/opt/fw.boot/modules/fwkern.conf
testing for SSH keys...
Image modification finished successfully
End of commands
Image creation succeeded
Cloud: hyperv platform detected
Cloud: got execution script from user
Cloud: executing pre-boot script  [SUCCESS]
Cloud: instance initialized with BYOL license
sshd: ############################################################
sshd: -----BEGIN SSH HOST KEY FINGERPRINTS-----
sshd: 1024 MD5:54:3e:49:e6:2b:cb:2d:c8:6b:a8:eb:ac:d4:95:66:54 admin@gw-15a623 (DSA)
sshd: 1024 SHA256:cLNdxho1bmE58CRU47L3k6PPlJWgjwGntMs6K2Ze4zE admin@gw-15a623 (DSA)
sshd: 256 MD5:c9:b3:5b:bc:d9:b3:e8:f2:40:68:85:5c:27:c7:c0:aa admin@gw-15a623 (ECDSA)
sshd: 256 SHA256:VMT9LX7NavW3KmtN0szSws/rFi0ToTEU2Mj7UIleQak admin@gw-15a623 (ECDSA)
sshd: 256 MD5:ef:9d:b5:d5:6d:8e:85:3a:61:63:d3:cf:60:69:cd:61 admin@gw-15a623 (ED25519)
sshd: 256 SHA256:0ImyJY1a08P5f1NQyWByupzpavIQuD9sjDbHM5dfN2s admin@gw-15a623 (ED25519)
sshd: 2048 MD5:34:1a:73:24:9e:d5:53:f2:47:6b:13:35:fa:24:5c:00 admin@gw-15a623 (RSA)
sshd: 2048 SHA256:Cy8KP3gwbi6aGHFNKCHXyYZCLnGs+cyQK7rUjvwnxzs admin@gw-15a623 (RSA)
sshd: -----END SSH HOST KEY FINGERPRINTS-----
sshd: ############################################################
Updates state changed to on for component CME
AutoUpdater going down
Running installation command: "/opt/AutoUpdater/latest/bin/autoupdatercli install /var/log/upload/Check_Point_CME_AUTOUPDATE_Bundle_T246_AutoUpdate.tar"

	Install request of component CME version 246 handled. To see installation status, see logs: /opt/AutoUpdater/AutoUpdater.log and /opt/CPInstLog/AutoUpdateLogs/CME

Installation started for CME version 246
Starting installation monitor
Installation succeeded
exit 0
********** Done Installing CME Package **********
Cloud: executing startup script  [SUCCESS]
Cloud: executing post-boot script  [SUCCESS]

[Expert@gw-15a623:0]# mount  
/dev/mapper/vg_splat-lv_current on / type xfs (rw,inode32)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda2 on /boot type ext3 (rw)
/dev/mapper/vg_splat-lv_log on /var/log type xfs (rw,inode32)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda2 on /mnt/BlinkPlugAndPlay_usb type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

[Expert@gw-15a623:0]# ls -AFlh /mnt    
total 1.0K
dr-xr-xr-x 5 admin root 1.0K Sep 14 22:21 BlinkPlugAndPlay_usb/

[Expert@gw-15a623:0]# mkdir /mnt/config-2

[Expert@gw-15a623:0]# mount /dev/cdrom-sr0 /mnt/config-2/
mount: /dev/scd0 is write-protected, mounting read-only

[Expert@gw-15a623:0]# ls -AFRl /mnt/config-2/
/mnt/config-2/:
total 2.0K
dr-xr-xr-x 3 admin root 2.0K Sep 14 22:07 openstack/

/mnt/config-2/openstack:
total 2.0K
dr-xr-xr-x 2 admin root 2.0K Sep 10 19:58 2015-10-15/

/mnt/config-2/openstack/2015-10-15:
total 2.0K
-r--r--r-- 1 admin root 564 Sep 10 21:05 meta_data.json
-r--r--r-- 1 admin root  32 Sep 10 19:33 user-data
-r--r--r-- 1 admin root  32 Sep 10 19:33 user_data
-r--r--r-- 1 admin root   0 Sep 10 19:12 vendor-data

[Expert@gw-15a623:0]# cat /mnt/config-2/openstack/2015-10-15/user_data
#cloud-config
password: letmein

[Expert@gw-15a623:0]# ls -l /dev/cdrom*
lrwxrwxrwx 1 admin root 4 Sep 14 22:20 /dev/cdrom -> scd0
lrwxrwxrwx 1 admin root 4 Sep 14 22:20 /dev/cdrom-sr0 -> scd0

[Expert@gw-15a623:0]# ls -l /dev/scd*  
brw-r----- 1 admin disk 11, 0 Sep 14 22:20 /dev/scd0

[Expert@gw-15a623:0]# readlink /sys/dev/block/11\:0/device/driver 
../../../../../../../bus/scsi/drivers/sr

At a guess, maybe it isn't trying to mount /dev/cdrom-sr0.
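
One way to separate "the drive is not visible under the expected label" from "the files are not where the image looks" is to check both explicitly. The sketch below is an assumption-heavy illustration: the function names are hypothetical, /dev/disk/by-label is the usual udev location on Linux and may not exist on Gaia, and the file names are simply the ones from the listing above, not a confirmed list of what the R81.20 implementation reads.

```shell
# Print the by-label path if a volume with the given label is visible.
# find_labeled_volume is a hypothetical helper; the default directory is an assumption.
find_labeled_volume() (
    label="${1:-config-2}"
    dir="${2:-/dev/disk/by-label}"
    [ -e "$dir/$label" ] || exit 1
    printf '%s\n' "$dir/$label"
)

# Report which of the files from the listing above exist under a mounted drive.
# Returns non-zero if any of them is missing.
check_config_drive() (
    mnt="$1"
    version="$2"
    missing=0
    for name in user_data user-data meta_data.json vendor-data; do
        if [ -f "$mnt/openstack/$version/$name" ]; then
            echo "found   openstack/$version/$name"
        else
            echo "missing openstack/$version/$name"
            missing=1
        fi
    done
    exit "$missing"
)
```

For the mount above, that would be `find_labeled_volume config-2` followed by `check_config_drive /mnt/config-2 2015-10-15`.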

Bob_Zimmerman
Authority

After working offline with George for a while, we were able to find a small bug in the new cloud-init implementation. Got a new image to test (991001402). While I still haven't been able to get the Config Drive method working, I was able to make my own HTTP metadata service, which works well enough.

  • The OpenStack version in the path must be 2015-10-15. This string is the only version it will check right now.
  • It looks like it will only attempt the IPv4 metadata service address for now.
  • The HTTP metadata client makes a call to http://169.254.169.254/openstack/2015-10-15/, and GAiA doesn't know where to route APIPA addresses, so the request goes to the default gateway. If the metadata service is running on the default gateway (either the stock one GAiA uses or the gateway address handed out by DHCP), this isn't a problem. If the HTTP metadata server is somewhere else, you need to add a route on the default gateway to point 169.254.169.254 to it.
  • At /openstack/2015-10-15/, it expects to get a newline-separated list of filenames. It then tries to download each file in the list. I'm serving the file list with the MIME type of application/octet-stream. I currently only list one file in this index, and I'm not sure how multiple files are combined.
  • The MIME type of the subsequent files does not appear to matter. I'm using the file name user_data with no extension, which my web server hands out as an application/octet-stream. The "#cloud-config" line is required in at least one file, probably the first. Might need to be in all files.
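
To check a metadata service from another machine on the same network, the request sequence described in the list above can be imitated with curl. This is a minimal sketch, not the image's actual client: `fetch_metadata` is a hypothetical name, and it only mirrors the two steps noted above (fetch the index, then fetch each listed file).

```shell
# Imitate the observed request sequence: fetch the index, then every file it lists.
# fetch_metadata is a hypothetical helper; the default base URL is the one noted above.
fetch_metadata() (
    base="${1:-http://169.254.169.254/openstack/2015-10-15}"

    # -f makes curl fail on HTTP errors, -sS hides progress but keeps error messages.
    index=$(curl -fsS "$base/") || {
        echo "index request failed: $base/" >&2
        exit 1
    }

    # The index is a newline-separated list of file names.
    printf '%s\n' "$index" | while IFS= read -r name; do
        [ -n "$name" ] || continue
        echo "== $name"
        curl -fsS "$base/$name" || echo "!! failed to fetch $name" >&2
    done
)
```

Running `fetch_metadata` with no arguments uses the link-local address, so it also exercises the route to 169.254.169.254 from wherever it is run.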

I'm using OpenBSD for my metadata service right now. My relevant files:

MetadataService# cat /etc/hostname.hvn0                                                                      
inet autoconf
inet alias 169.254.169.254 255.255.0.0
inet6 autoconf
inet6 alias fe80::a9fe:a9fe 64

MetadataService# cat /etc/httpd.conf                                                                         
server "metadata.zimmie.local" {
	listen on * port 80
	root "/htdocs/metadata"
	directory index index.txt
}

MetadataService# cat /etc/rc.conf.local                                                                      
httpd_flags=

MetadataService# cat /var/www/htdocs/metadata/openstack/2015-10-15/index.txt 
user_data

MetadataService# cat /var/www/htdocs/metadata/openstack/2015-10-15/user_data 
#cloud-config
password: letmein
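
The document root above can be populated with a small helper. This is a sketch under stated assumptions: `make_metadata_root` is a hypothetical name, it lists a single file in the index as in the setup above, and it treats "#cloud-config" as a required first line, which is stricter than what has actually been confirmed.

```shell
# Populate a document root with the index and user data files shown above.
# make_metadata_root is a hypothetical helper.
# Usage: make_metadata_root <docroot> <user_data_file>
make_metadata_root() (
    root="$1"
    userdata="$2"
    version="2015-10-15"   # the only version string the image checks, per the notes above

    # Assumption: require #cloud-config as the first line of the user data.
    first_line=$(head -n 1 "$userdata") || exit 1
    if [ "$first_line" != "#cloud-config" ]; then
        echo "user data must start with #cloud-config" >&2
        exit 1
    fi

    dest="$root/openstack/$version"
    mkdir -p "$dest" || exit 1
    cp "$userdata" "$dest/user_data" || exit 1

    # The index is a newline-separated list of file names; only one is listed here.
    printf 'user_data\n' > "$dest/index.txt"
)
```

With the httpd.conf above, the call would be `make_metadata_root /var/www/htdocs/metadata ./user_data`.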
