After a lot of back and forth and pulling my hair out, I reached a conclusion 🙂
There is an issue with CloudGuard and ECDSA-based CAs.
I created my own CA and cert, all with RSA, imported the CA as an OPSEC Trusted CA, disabled CRL -> everything worked and IPsec came up.
I created my own CA and cert, all with ECDSA (prime256v1), imported the CA as an OPSEC Trusted CA, disabled CRL -> the vpnd process keeps restarting.
Initially I had a trust chain on Aviatrix (Root CA, Intermediate). When I built my custom ones I went directly Root CA -> client cert, with no Intermediate, to eliminate that variable as well.
In the log dir ($FWDIR/log) 4 files see changes constantly:
-rw-rw---- 1 admin root 118557 May 17 13:56 fwd.elg
-rw-rw---- 1 admin root 1563209 May 17 13:56 vpnd.elg
-rw-r--r-- 1 admin root 58000 May 17 13:56 core_uploader.elg --> has just one bash.11759.core (I see Checkpoint runs separate bash shells to start its own processes)
-rw-rw---- 1 admin root 7034033 May 17 13:56 sxl_statd.elg
vpnd.elg logs:
Unable to open '/dev/fw6v0': No such file or directory
fw_get_kernel_instance_num: Invalid instance num 0 - return 0
Unable to open '/dev/fw6v0': No such file or directory
fwd.elg logs:
[17 May 14:16:37] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:16:46] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:16:57] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:17:06] fwd: restarting vpnd
restarting in 4 seconds
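The restart cadence in the fwd.elg excerpt above can be checked with a short script. This is only a minimal Python sketch, not a Check Point tool: the function name `restart_gaps` is made up for illustration, and the year 2023 is an assumption taken from the tp_events.elg timestamps, since fwd.elg does not log the year.

```python
import re
from datetime import datetime

# Sample copied from the fwd.elg excerpt above.
FWD_LOG = """\
[17 May 14:16:37] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:16:46] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:16:57] fwd: restarting vpnd
restarting in 4 seconds
[17 May 14:17:06] fwd: restarting vpnd
restarting in 4 seconds
"""

# Matches lines such as "[17 May 14:16:37] fwd: restarting vpnd".
RESTART_RE = re.compile(
    r"^\[(\d{1,2} \w{3} \d{1,2}:\d{2}:\d{2})\] fwd: restarting vpnd"
)


def restart_gaps(log_text, year=2023):
    """Return the seconds between consecutive 'restarting vpnd' entries.

    The year is an assumption (hypothetical default), because the log
    lines only carry day, month and time.
    """
    stamps = []
    for line in log_text.splitlines():
        match = RESTART_RE.match(line)
        if match:
            stamps.append(
                datetime.strptime(f"{year} {match.group(1)}", "%Y %d %b %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(restart_gaps(FWD_LOG))  # -> [9, 11, 9]
```

On the four entries above the gaps come out at 9, 11 and 9 seconds, which suggests vpnd only survives a few seconds after each 4-second restart delay.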
On the positive side, it does not core dump. It does seem to dislike something, probably while loading the imported ECDSA Trusted CA, and keeps restarting in a loop.
Restarting the GW stopped the loop, as the processes were reloaded.
tp_events.elg
05/17/23 14:14:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:15:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:16:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:17:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:18:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:19:03;****CI:0, IPS:0, MALWARE:0, TP:0****
05/17/23 14:20:03;****CI:0, IPS:0, MALWARE:0, TP:0****
(these log entries grow at a rapid rate)
P.S. I used the same fields for both the RSA and the ECDSA Trusted CA + certificate, to be sure I was narrowing down the behaviour:
# with custom fields
[req]
default_bits = 2048
distinguished_name = req_distinguished_name
req_extensions = req_ext
[ req_distinguished_name ]
countryName = Country Name (2 letter code)
stateOrProvinceName = State or Province Name (full name)
localityName = Locality Name (eg, city)
organizationName = Organization Name (eg, company)
commonName = Common Name (e.g. server FQDN or YOUR name)
# Optionally, specify some defaults.
countryName_default = CH
stateOrProvinceName_default = Bern
localityName_default = Bern
organizationName_default = Fooling around
commonName_default = mihaigw.com
organizationalUnitName_default = research
emailAddress_default = mihai@mihaigw.com
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
DNS.1 = mihai.mihaigw.com
DNS.2 = 10.1.0.36
Checkpoint sees the imported Trusted CA as:
Subject: Email=info@aviamix.com,CN=aviamix.com,OU=IT,O=Aviamix,L=Bern,ST=Bern,C=CH
Issuer: Email=info@aviamix.com,CN=aviamix.com,OU=IT,O=Aviamix,L=Bern,ST=Bern,C=CH
Not Valid Before: Wed May 17 16:33:08 2023 Local Time
Not Valid After: Fri Jan 1 16:33:08 2038 Local Time
Serial No.: 0085d75319cc2ea55b
Public Key: ECDSA (256 bits)
Signature: ECDSA with SHA256
Basic Constraint:
is CA
MD5 Fingerprint:
BD:5C:77:56:73:4A:A1:1E:3D:3E:CA:1B:8A:75:C5:11
SHA-1 Fingerprints:
1. D4:F8:BA:4E:8F:1F:05:69:39:CC:55:B9:20:DA:B3:5F:06:C9:63:7A
2. RUSS NOSE HANS IQ TRUE LUST SAC CALL CUTS TOO LAUD LIND
On the Aviatrix/Strongswan side I can see:
198[CFG] added vici connection: gw-10_1_0_36-137_117_143_50
198[CFG] initiating 'net-0_0_0_0_0-0_0_0_0_0'
198[IKE] initiating IKE_SA gw-10_1_0_36-137_117_143_50[25] to 10.10.10.4
198[ENC] generating IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
198[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
211[IKE] retransmit 1 of request with message ID 0
211[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
216[IKE] retransmit 2 of request with message ID 0
216[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
221[IKE] retransmit 3 of request with message ID 0
221[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
226[IKE] giving up after 3 retransmits
226[IKE] establishing IKE_SA failed, peer not responding
So CloudGuard does not send anything back.
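To make the "nothing comes back" observation easy to repeat on longer captures, the strongSwan output can be summarized with a few regular expressions. This is a rough Python sketch under assumptions: the function name `summarize_ike_log` is hypothetical, and it relies on strongSwan marking inbound traffic with `[NET] received packet:` lines, none of which appear in the excerpt above.

```python
import re

# Sample copied from the Aviatrix/strongSwan excerpt above.
STRONGSWAN_LOG = """\
198[CFG] added vici connection: gw-10_1_0_36-137_117_143_50
198[CFG] initiating 'net-0_0_0_0_0-0_0_0_0_0'
198[IKE] initiating IKE_SA gw-10_1_0_36-137_117_143_50[25] to 10.10.10.4
198[ENC] generating IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
198[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
211[IKE] retransmit 1 of request with message ID 0
211[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
216[IKE] retransmit 2 of request with message ID 0
216[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
221[IKE] retransmit 3 of request with message ID 0
221[NET] sending packet: from 10.1.0.36[500] to 10.10.10.4[500] (330 bytes)
226[IKE] giving up after 3 retransmits
226[IKE] establishing IKE_SA failed, peer not responding
"""


def summarize_ike_log(log_text):
    """Count sent/received packets and retransmits in a strongSwan log."""
    return {
        "sent": len(re.findall(r"\[NET\] sending packet:", log_text)),
        "received": len(re.findall(r"\[NET\] received packet:", log_text)),
        "retransmits": len(
            re.findall(r"\[IKE\] retransmit \d+ of request", log_text)
        ),
        "gave_up": "peer not responding" in log_text,
    }


print(summarize_ike_log(STRONGSWAN_LOG))
```

For the excerpt above this reports four packets sent (the initial request plus three retransmits) and zero received, which matches the gateway staying silent while vpnd is restarting.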
At some point (before all these tests) I thought it was an MTU problem and lowered the MTU on the physical interface to 1400 on both sides (I know, enabling MSS clamping would be better).
I did not check the packet length in Wireshark; I was a bit lazy about having to disable RX/TX offloading in order to see the real packet sizes rather than huge values.
Activating vpn debug trunc ALL=5 shows:
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:00:34][io] [NEGOTIATIONS]: Adding negotiation to peer: <my remote ip>. Current negotiations=2
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:00:34] findSAByPeer: Find SA with cookies 9ed9034f6bdbf05c,0000000000000000 from packet
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:00:34] findSAByPeer: Valid ISAKMP SA was not found. me=0, peer=1404818e
..
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:07:22] find_sa_by_ike_peer: Find IKE SA for IKE peer <<my remote ip>,0000000000000000>
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:07:22] find_sa_by_ike_peer: No IKE SA for this IKE peer found
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:07:22][ikev2] vpn1IKEConfiguration::hasExchangeFailed: Identified peer <my remote ip> in failed exchanges list
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:07:22][ikev2] getIkeVersion: ikev2 exchange has failed. try ikev1 (peer: <my remote ip>), failoverFromIKEv2: -1
[vpnd 4335 4071397376]@checkpointgw2[17 May 15:07:22][tunnel] RequestByMethods_ikev1: enter
IKEview shows "waiting for arriving message", final status: failure, on Proposal 1.
I do know it's not a case of Aviatrix not sending them, because when I switch to the RSA CA it all works.
Now I am stuck here, and I have a feeling this is more one for support / TAC :).