Solved: Re: R82 elasticXL lab - Page 2

the_rock · ‎2024-07-01

Hey boys and girls, ladies and gents,

I built an R82 ElasticXL lab following the link below by @HeikoAnkenbrand. I'm not sure whether it fails because I'm using EVE-NG or for some other reason: I created 2 separate ElasticXL instances, but the clustering part fails. If anyone has an idea, I'm happy to hear it 🙂

I couldn't care less if this lab breaks; it's super easy to rebuild anyway.

This is the link I was referring to. I also attached some screenshots and outputs.

Andy

https://community.checkpoint.com/t5/Security-Gateways/R82-Install-ElasticXL-Cluster/td-p/206235

[Expert@CP-EXL-1-s01-01:0]# cphaprob state

Cluster Mode: HA Over LS

ID Unique Address Assigned Load State Name

1 (local) 192.0.2.1 100% ACTIVE(P) CP-EXL-1-s01-01

Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114904
State change: ACTIVE(!) -> ACTIVE
Reason for state change: Reason for ACTIVE! alert has been resolved
Event time: Mon Jul 1 19:40:49 2024
[Expert@CP-EXL-1-s01-01:0]#

[Expert@CP-EXL-02-s01-01:0]# asg monitor
Mon Jul 01 20:44:20 EDT 2024

^C
[Expert@CP-EXL-02-s01-01:0]#

[Expert@CP-EXL-02-s01-01:0]# cphaprob -a if

CCP mode: Automatic

Interface Name: Status:

eth2 UP
eth3 UP
Sync (S) UP
magg1 (LS) UP

S - sync, HA/LS - bond type, LM - link monitor, P - probing

Virtual cluster interfaces: 5

lo 127.0.0.1
eth2 192.168.10.237
eth3 169.254.0.237
Sync 192.0.2.1
magg1 172.16.10.237

[Expert@CP-EXL-02-s01-01:0]#

[Expert@CP-EXL-1-s01-01:0]# cphaprob -a if

CCP mode: Automatic

Interface Name: Status:

eth2 UP
eth3 UP
Sync (S) UP
magg1 (LS) UP

S - sync, HA/LS - bond type, LM - link monitor, P - probing

Virtual cluster interfaces: 5

lo 127.0.0.1
eth2 192.168.10.238
eth3 169.254.0.238
Sync 192.0.2.1
magg1 172.16.10.238

[Expert@CP-EXL-1-s01-01:0]#

And since the ElasticXL cluster object does NOT have an option to add cluster members, there is something obvious I'm missing, but I can't figure out what, so I will check it later.

Andy


genisis__ · ‎2025-02-10

We ended up setting the image to Red Hat 6, so there is clearly a setting for Red Hat 8 (in ESXi) that needs to be altered.

Bob_Zimmerman · ‎2025-02-10

RHEL 8 supports Secure Boot, so the vmx preset in ESXi probably enables it in the boot ROM options. A preset for RHEL 7 would probably also work.

genisis__ · ‎2025-02-10

Thanks Bob.

genisis__ · ‎2025-05-24

Finally - I've got Proxmox up and running, created two R82 GWs with JHFA18 with ElasticXL and VSNext.
I can reach both devices via the magg1 interface (or should I say wrp0).

In SmartConsole, the only interface discovered was wrp0, which has the management IP assigned. I had to add another unassigned interface and give it an IP before SmartConsole would let me install a policy.

In Proxmox I have a linux bridge defined (as below):

Interface Mappings:

eth0 Mgmt
eth1 eth1-Sync

For some reason I have two interfaces with the same MAC:

Sync Link encap:Ethernet HWaddr BC:24:11:82:EB:57
inet addr:192.0.2.1 Bcast:192.0.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:101068 errors:0 dropped:0 overruns:0 frame:0
TX packets:46393 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:35323640 (33.6 MiB) TX bytes:24830205 (23.6 MiB)

eth1-Sync Link encap:Ethernet HWaddr BC:24:11:82:EB:57
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:102131 errors:0 dropped:0 overruns:0 frame:0
TX packets:46393 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:35688253 (34.0 MiB) TX bytes:24830205 (23.6 MiB)

add bonding group 1 mgmt
add bonding group 1024
add bonding group 1 interface Mgmt
add bonding group 1024 interface eth1-Sync
set bonding group 1 mode xor
set bonding group 1 mii-interval 100
set bonding group 1 down-delay 200
set bonding group 1 up-delay 200
set bonding group 1 xmit-hash-policy layer2
set bonding group 1024 mode active-backup
set bonding group 1024 primary eth1-Sync
set bonding group 1024 xmit-hash-policy layer2

I cannot see any configuration information regarding Sync or eth1-Sync other than what I have above. My guess is that it is dedicated to a virtual system, so I can't see it in the main configuration.

Now, how on earth do I change node2 to have a different 192.0.2.x IP? Surely each node needs a unique IP. I would also like to ensure that this node is in site 2, not site 1.

How can we troubleshoot sync interface connectivity if we cannot even see the interface?

I think the fact both nodes have the same sync IP address is why they cannot see each other, again a guess.

Bob_Zimmerman · ‎2025-05-24

In VSNext, the interface named Sync is a bond owned by VS 0. magg1 is owned by VS 500:

[Expert@DallasticXL-s01-01:0]# ls -1 /proc/net/bonding/
Sync
bond2

[Expert@DallasticXL-s01-01:0]# vsenv 500
Context is set to Virtual Switch mgmt-switch (ID 500).

[Expert@DallasticXL-s01-01:500]# ls -1 /proc/net/bonding/
magg1

Bonds share the MAC address of one of their members, which is why the interface named Sync (the bond) has the same MAC as the interface named eth1-Sync (the member).
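To make the bond/member relationship easier to see, here is a minimal Python sketch that parses the text of a /proc/net/bonding/<bond> file and lists each member with its permanent MAC. The field names ("Bonding Mode", "Slave Interface", "Permanent HW addr") follow the standard Linux bonding driver output; the SAMPLE text is a hypothetical illustration built from the MAC quoted earlier, not a capture from a real gateway.

```python
def parse_bonding(text):
    """Return (mode, [(member, permanent_mac), ...]) from /proc/net/bonding text."""
    mode = None
    members = []
    current = None  # name of the member section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Bonding Mode:"):
            mode = line.split(":", 1)[1].strip()
        elif line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Permanent HW addr:") and current is not None:
            mac = line.split(":", 1)[1].strip().lower()
            members.append((current, mac))
            current = None
    return mode, members


# Hypothetical sample, shaped like the Linux bonding driver's proc output.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth1-Sync (primary_reselect always)
Currently Active Slave: eth1-Sync
MII Status: up

Slave Interface: eth1-Sync
MII Status: up
Permanent HW addr: bc:24:11:82:eb:57
"""

if __name__ == "__main__":
    mode, members = parse_bonding(SAMPLE)
    print("mode:", mode)
    for name, mac in members:
        print(name, mac)
```

On a real member you would feed it the contents of /proc/net/bonding/Sync read in the VS 0 context.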

I suspect you went through the initial config wizard on the second member. The second member of an ElasticXL cluster shouldn't have a magg1, VS 500, or wrp0 until after they are talking. You boot it, it offers to join the cluster, and you accept the join offer from a working member. The offer acceptance is where you specify which site it should join.

genisis__ · ‎2025-05-24

I thought that might be the case, so I rebuilt the second node and have not run the configuration wizard on it.
I then ran "show cluster member info" on its console to get the request-id. On the configured node, I then ran

add cluster member method request-id identifier <id no> site-id 1 format json

and got a 401 response message.

I suspect it's because they still can't see each other via Sync, as this does not exist on the newly built member yet.

Has anyone got this working on Proxmox, and could they share their experience? I'm wondering whether a setting is needed there to get the bridge interfaces talking.

I also did:

tcpdump -nnni Sync port 1135, which showed nothing. However, running tcpdump without the port filter showed CCP traffic:

18:22:05.958032 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 44
18:22:05.958035 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 44
18:22:05.958037 IP 0.0.10.1.8116 > 192.0.2.0.8116: UDP, length 81
18:22:06.008247 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 40

I see this on both nodes:

Node1 (Sync)

Node2 (eth1 - which would be sync)

So this tells me basic comms are there.
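A quick way to separate "CCP is flowing" from "the join/detection traffic is flowing" is to count captured UDP packets per destination port. The sketch below is only an illustration: it parses tcpdump text lines in the format shown above, and the CAPTURE list simply repeats those four lines. Port 8116 is CCP; port 1135 is the port the earlier advice in this thread says to watch for new-member detection.

```python
import re
from collections import Counter

# Matches tcpdump lines such as:
# 18:22:05.958035 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 44
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): UDP"
)


def count_udp_dst_ports(lines):
    """Count UDP packets per destination port in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[int(match.group("dport"))] += 1
    return counts


# The four lines quoted in the post above.
CAPTURE = [
    "18:22:05.958032 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 44",
    "18:22:05.958035 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 44",
    "18:22:05.958037 IP 0.0.10.1.8116 > 192.0.2.0.8116: UDP, length 81",
    "18:22:06.008247 IP 0.0.0.0.8116 > 192.0.2.0.8116: UDP, length 40",
]

if __name__ == "__main__":
    counts = count_udp_dst_ports(CAPTURE)
    for port in (8116, 1135):
        print(f"udp/{port}: {counts[port]} packet(s)")
```

Seeing only 8116 and nothing on 1135 would match the symptom described here: the Sync link passes traffic, but the join traffic is not arriving.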

the_rock · ‎2025-05-24

Did you try what was suggested to me for EVE-NG? It does work, though it is unsupported.

Andy


genisis__ · ‎2025-05-24

I'm just trying the renaming of interfaces via /etc/udev/rules.d

Here's what I have listed.

gw-57f232> show interfaces
Mgmt
Sync
eth1
eth1-Sync
eth2
lo

---
I think it's worked. I now see "Pending Gateways" with 1 next to it.

the_rock · ‎2025-05-24

Sounds like that may have done it then.

Best,
Andy

genisis__ · ‎2025-05-25

So I finished off yesterday with a site-1 and a site-2 node, and two gateway objects in SmartConsole: one for the VSNext cluster and one for the VS gateway. (I guess it's logical, but I do wish it was still under the main VSNext cluster object, so it's clear that the VS is part of that cluster.)
One thing I noticed was that the JHFA installed on the active node was not replicated to the standby node. Am I right in saying that when you introduce a new node, or replace a node in a cluster, it's supposed to basically clone the active node's OS and any jumbos so that it is identical?

Unless I'm missing some sort of clone commands to do this?

genisis__ · ‎2025-05-25

Good morning!

So I fired up the lab today, started the two VMs, went to connect to the management IP of the cluster, and was greeted with this:

This page could not be displayed. An internal error has occurred.

I could not find anything R82-related on the support site. Has anyone seen this on R82 ElasticXL? Clearly the WebUI has a problem.

I ran netstat, and the only lines matching 443 were for Sync/eth1-Sync:

[Expert@NODE1-s01-01:0]# netstat -in | grep 443
Sync 1500 0 550605 0 0 0 694436 0 0 0 BMmRU
eth1-Sync 1500 0 550606 0 1 0 694436 0 0 0 BMsRU
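Note that `netstat -in` prints the per-interface counter table, so `grep 443` here matches the digits 443 inside the packet counter 694436, not a listening socket. To check for a listener, the socket table (for example `netstat -tln`) is the thing to look at. The sketch below is a hedged illustration that filters that kind of output for a given port; the SAMPLE text is hypothetical and only mimics the usual Linux column layout.

```python
def listeners_on_port(netstat_output, port):
    """Return local addresses in LISTEN state on `port` from `netstat -tln` text."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, state = fields[3], fields[5]
        # Compare the exact port so that 4434 does not match 443.
        if state == "LISTEN" and local.rsplit(":", 1)[-1] == str(port):
            found.append(local)
    return found


# Hypothetical output, for illustration only.
SAMPLE = """\
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.1:443           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:4434          0.0.0.0:*               LISTEN
"""

if __name__ == "__main__":
    print(listeners_on_port(SAMPLE, 443))
```

Matching on the exact port field avoids the substring problem that a bare grep has.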

The management IP sits on wrp0, which leads to magg1.

Yasushi_Kono1 · ‎2024-11-05

Hi ShaiF,

After configuring everything as described, I got the 192.0.2.254 address on eth1. However, if I enter the command
add cluster member method request-id identifier xxxxxxxxxxxxxxxxxxxxxxxx site-id 1 format json

then I get the message info:
"message" : "No info for request-id with value xxxxxxxxxxxxxxxxxxxxxxxxxxxx",

How could this problem be resolved? Any hints?

Thanks in advance!

I am trying to do the configuration in a cloud-based lab environment specifically designed for our CP classes (Skillable).

ShaiF · ‎2024-11-05

Hi @Yasushi_Kono1,

Have you tried the simpler way to add a member, using hostname / serial-number?

Can you share the output of 'show cluster info provisioning' or 'show cluster info' from gclish?
BTW: you do not need to mask the request-id value with XXX, as it is a public key; no one can do anything with it, and it is meant to be exposed. The private key is on the member you want to join, so it is the only one that can join the cluster if you use request-id as the method.
Regards,

Shai.

Yasushi_Kono1 · ‎2024-11-05

Thank you for your prompt response.

I have added screen shots to clarify the issue. I tried that via Serial Number as well as the Request-ID.

I would expect to see the other node by typing "show cluster info provision" but this is not the case.

ShaiF · ‎2024-11-05

Hi @Yasushi_Kono1,

If you do not see the member in 'show cluster info provisioning' or 'show cluster', it means the SMO is not hearing the new member, and there is no use continuing to add it with any of the methods.

First, check your connectivity. Try a ping from the SMO to 192.0.2.254 and from the new member to the SMO (192.0.2.1).

In addition, see whether you get UDP traffic from 192.0.2.254 on port 1135 on the SMO: tcpdump -nnni Sync port 1135

VMs can bring up interfaces at boot time in the wrong order. In most cases you need to match the MAC on eth1, for example, to the network adapter MAC in the VM hypervisor settings and confirm it is indeed connected to your Sync network.

You need to check the same on the SMO (to get the original MAC on Sync, use ethtool -i Sync).
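The MAC-matching step can be sketched in a few lines of Python. This is an illustration under stated assumptions: the two dictionaries are hypothetical stand-ins for what you would copy by hand from the guest (interface name to MAC) and from the hypervisor's VM settings (adapter MAC to attached network), and the helper names are made up for this example.

```python
def normalize_mac(mac):
    """Normalize a MAC written with ':', '-', '.' or no separators to aa:bb:cc:dd:ee:ff."""
    digits = "".join(c for c in mac if c.isalnum()).lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def network_of(ifname, guest_macs, adapter_networks):
    """Return the hypervisor network whose adapter MAC matches the guest interface."""
    mac = normalize_mac(guest_macs[ifname])
    by_mac = {normalize_mac(m): net for m, net in adapter_networks.items()}
    return by_mac.get(mac)  # None means no adapter carries that MAC


# Hypothetical values, for illustration only.
GUEST_MACS = {"eth0": "BC:24:11:00:00:01", "eth1": "BC:24:11:00:00:02"}
ADAPTER_NETWORKS = {"bc-24-11-00-00-01": "mgmt-bridge", "bc2411000002": "sync-bridge"}

if __name__ == "__main__":
    for ifname in sorted(GUEST_MACS):
        print(ifname, "->", network_of(ifname, GUEST_MACS, ADAPTER_NETWORKS))
```

If eth1 does not come back attached to the network you use for Sync, the adapters were enumerated in a different order than expected.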

Regards,

Shai.

Yasushi_Kono1 · ‎2024-11-07

Hi Shai,

Thank you for your response. In the meantime, I got it running by reinstalling the SMO from scratch.

That led me to another question: is it possible to change the interface designated as the Sync interface, given that eth1 is the expected interface for this? How can I swap it to, let's say, eth4?

Thanks a lot again!

Kind regards,
Yasushi

ShaiF · ‎2024-11-10

Hi @Yasushi_Kono1

If we're talking about a VM, then the best solution is to go to your VM settings and edit the network adapters.

There is also the option to edit this file on the GW (per member):
/etc/sp_core/conf/vm_mapping.csv
So in your case the content will be:
eth0 Mgmt
eth1 eth1
eth2 eth2
eth3 eth3
eth4 eth1-Sync
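Before rebooting with an edited mapping, a small sanity check can catch a duplicated or missing name. The Python sketch below is only an illustration: it assumes the file holds one "kernel-name assigned-name" pair per line, as shown in this post (it also tolerates a comma, since the real delimiter of vm_mapping.csv is not confirmed here), and it treats Mgmt and eth1-Sync as the names that must appear exactly once, which is an assumption drawn from this thread.

```python
def parse_mapping(text):
    """Parse 'kernel-name assigned-name' pairs; blank or malformed lines are skipped."""
    mapping = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    return mapping


def check_mapping(mapping, required=("Mgmt", "eth1-Sync")):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    names = list(mapping.values())
    for name in required:
        if names.count(name) == 0:
            problems.append(f"{name} is not assigned to any interface")
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"{name} is assigned more than once")
    return problems


# The mapping suggested in the post above (eth4 becomes the Sync member).
EXAMPLE = """\
eth0 Mgmt
eth1 eth1
eth2 eth2
eth3 eth3
eth4 eth1-Sync
"""

if __name__ == "__main__":
    mapping = parse_mapping(EXAMPLE)
    print(mapping)
    print(check_mapping(mapping) or "no problems found")
```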

Regards,

Shai.

Yasushi_Kono1 · ‎2024-11-13

Hi Shai,

Thanks a lot for your valuable information. So, do you have to configure this file before running the FTW?

I will try that right away! You made my day!

Kind regards,
Yasushi

David_Robinson · ‎2024-12-03

Hi ShaiF

I'm trying to get ElasticXL working on a Check Point 3200 appliance. Is there a workaround to get it working on an appliance without a dedicated Sync port?

I have tried renaming eth1 to Sync by modifying

/etc/udev/rules.d/00-PB-10-00.rules

The first member is not seeing the second waiting to be provisioned.

emmap · ‎2024-12-03

Per the R82 release notes, the 3000 series appliances don't support ElasticXL. Nor do 5100 or 5200.

Then again, they also say it's not supported on VMs, but it works for lab purposes so there may be a workaround. You can maybe try that file that Shai mentioned a couple of posts up?

ShaiF · ‎2024-12-03

Hi @David_Robinson ,
The best solution is to rename eth0 and eth1 to Mgmt and Sync (in the udev file). After reboot, if you have these interfaces, you will need to re-register the detection daemon by running:

dbset process:exl_detectiond t
dbset :save

Do it on both members (before you run the FTW on the SMO). In this case the appliances will load fresh with Mgmt and Sync, the detection daemon will run, and all should be good (I have not tested this myself, but it should work :)).
Regards,

Shai.

faridb · ‎2025-03-20

Hi dear expert,

I followed all the steps, including the python trick. From the SMO I can ping the 2nd member's IP, 192.0.2.254, and from the member I can ping 192.0.2.1.

I can see requests and replies on UDP port 1135.

From the SMO, I can now see an available gateway in show cluster info provision,

but it seems to be the SMO itself, not the 2nd member.

Maybe I missed something.

Any ideas?

thx,

the_rock · ‎2025-03-20

Can you send the output/screenshot?

Andy


faridb · ‎2025-03-20

the_rock · ‎2025-03-20

Let me see if I can check in the morning, since the lab where I have ElasticXL is not available now, sorry : - (

Andy


faridb · ‎2025-03-20

sure , dude !

i will be glad to make it work

Fyi, i deployed the lab using the latest iso file

the_rock · ‎2025-03-20

I'm 100% positive the ISO is not your issue 🙂

Andy


faridb · ‎2025-03-20

genisis__ · ‎2025-05-24

This is exactly what I'm experiencing.

faridb · ‎2025-03-18

hi ,

Can you help me understand why I did not get the option to initialize the EXL cluster in the first setup wizard, but only a checkbox to make this GW a member of a ClusterXL?

I'm running R82 on the EVE-NG platform.

Regards
