the_rock
Legend

R82 elasticXL lab

Hey boys and girls, ladies and gents,

I built an R82 ElasticXL lab following the link below by @HeikoAnkenbrand. I'm not sure whether it fails because I'm using EVE-NG or for some other reason, but I created 2 separate ElasticXL instances and the clustering part fails. If anyone has an idea, happy to hear it 🙂

I couldn't care less if this lab breaks; it's super easy to rebuild anyway.

This is the link I was referring to. I also attached some screenshots and outputs.

Andy

https://community.checkpoint.com/t5/Security-Gateways/R82-Install-ElasticXL-Cluster/td-p/206235

 

Screenshot_1.png

 

 

Screenshot_2.png

 

 

Screenshot_3.png

 

[Expert@CP-EXL-1-s01-01:0]# cphaprob state

Cluster Mode: HA Over LS

ID Unique Address Assigned Load State Name

1 (local) 192.0.2.1 100% ACTIVE(P) CP-EXL-1-s01-01


Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114904
State change: ACTIVE(!) -> ACTIVE
Reason for state change: Reason for ACTIVE! alert has been resolved
Event time: Mon Jul 1 19:40:49 2024
[Expert@CP-EXL-1-s01-01:0]#

 

[Expert@CP-EXL-02-s01-01:0]# asg monitor
Mon Jul 01 20:44:20 EDT 2024

--------------------------------------------------------------------------------
| System Status - ElasticXL |
--------------------------------------------------------------------------------
| Up time | 39:27 minutes |
| Members | 1 / 1 |
| Version | R82 (Build Number 633) |
Mon Jul 01 20:44:21 EDT 2024
--------------------------------------------------------------------------------
| System Status - ElasticXL |
--------------------------------------------------------------------------------
| Up time | 39:29 minutes |
| Members | 1 / 1 |
| Version | R82 (Build Number 633) |
| FW Policy Date | 01Jul24 20:38 |
| AMW Policy Date | N/A |
--------------------------------------------------------------------------------
| Member ID Site1 |
| ACTIVE |
--------------------------------------------------------------------------------
| 1 ACTIVE |
--------------------------------------------------------------------------------


^C
[Expert@CP-EXL-02-s01-01:0]#

 

[Expert@CP-EXL-02-s01-01:0]# cphaprob -a if

CCP mode: Automatic

Interface Name: Status:

eth2 UP
eth3 UP
Sync (S) UP
magg1 (LS) UP

S - sync, HA/LS - bond type, LM - link monitor, P - probing

Virtual cluster interfaces: 5

lo 127.0.0.1
eth2 192.168.10.237
eth3 169.254.0.237
Sync 192.0.2.1
magg1 172.16.10.237

[Expert@CP-EXL-02-s01-01:0]#

 

[Expert@CP-EXL-1-s01-01:0]# cphaprob -a if

CCP mode: Automatic

Interface Name: Status:

eth2 UP
eth3 UP
Sync (S) UP
magg1 (LS) UP

S - sync, HA/LS - bond type, LM - link monitor, P - probing

Virtual cluster interfaces: 5

lo 127.0.0.1
eth2 192.168.10.238
eth3 169.254.0.238
Sync 192.0.2.1
magg1 172.16.10.238

[Expert@CP-EXL-1-s01-01:0]#

 

And since the ElasticXL cluster object does NOT have an option to add cluster members, there is something obvious I'm missing, but I can't figure out what, so I will check it later.

 

Andy

 

 

Screenshot_1.png

 

 

 

64 Replies
Bob_Zimmerman
Authority

R82 offers to let you build an ElasticXL cluster out of a 3000-series unit, but it fails in rather spectacular fashion. The unit gets stuck in a boot loop which needs hands-on access to fix. You never get the boot menu, so you can't revert to factory defaults without someone cycling power. I get that ElasticXL isn't supported on the 3000-series boxes, but the UI offers it up. It's even the default cluster method for them unless you go out of your way to specify ClusterXL. I expect this will bite a LOT of people when boxes start shipping with R82 by default. Edit: just got word R&D has PMTR-114648 for this boot loop. I bet the fix will be to disable the option in the setup wizard to make the 3000-series into an ElasticXL cluster, but we'll see.

 

While it's not supported, it's possible to set up an ElasticXL cluster for lab use on a pair of 3000-series boxes. I've only tested it on 3600s, since that's what I physically have, but they're all identical in almost all of the ways which matter for this. To build the first member and set up the cluster:

  1. Install R82 (or some later version, I assume)
  2. Boot the system
  3. Connect via console
  4. Edit /etc/udev/rules.d/00-QB-10-00.rules (3600 and 3800 are QB-10; the file for a 3100 or 3200 is 00-PB-10-00.rules)
    1. Replace "eth1" with "Sync"
  5. Reboot
  6. Run the commands to make exl_detectiond check the system again
    1. dbset process:exl_detectiond t
    2. dbset :save
    3. tellpm process:exl_detectiond t
  7. Edit /etc/udev/rules.d/00-QB-10-00.rules
    1. Replace "Sync" with "eth1-Sync"
    2. Do not reboot!
  8. Run the first-time wizard or apply config_system. Be sure to select the ElasticXL clustering method.
  9. Once the system is configured, you will need to run 'add bonding group 1 interface Mgmt' in gclish.

To add another member, you follow steps 1-6, then have one of the working members accept the new member's join request.
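The two rename edits (steps 4 and 7) can be scripted rather than done by hand. This is only a minimal sketch: the helper name `rename_iface` is made up for illustration, and it assumes the interface name appears in the rules file wrapped in double quotes, which the steps above suggest but which should be checked against the actual file before running anything.

```shell
# rename_iface FILE OLD NEW   (hypothetical helper, not a Check Point tool)
# Replaces every quoted "OLD" with "NEW" in FILE and keeps a FILE.bak copy.
# Assumes OLD and NEW are plain interface names with no regex metacharacters.
rename_iface() {
    _file=$1
    _old=$2
    _new=$3
    # Keep a copy of the file as it was before this call.
    cp -p "$_file" "$_file.bak" || return 1
    # Match the surrounding quotes too, so "eth1" does not also hit "eth10".
    sed "s/\"$_old\"/\"$_new\"/g" "$_file.bak" > "$_file"
}

# Intended use on a 3600 (path taken from step 4):
#   rename_iface /etc/udev/rules.d/00-QB-10-00.rules eth1 Sync        # step 4
#   rename_iface /etc/udev/rules.d/00-QB-10-00.rules Sync eth1-Sync   # step 7
```

Each call overwrites the previous `.bak`, so copy the original file somewhere else first if you want to be able to get back to the factory naming.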

Incidentally, a 3600 (or cluster of them, or probably a cluster of 3100 units, 3200 units, or 3800 units) can also run VSNext this way. I haven't yet tried, but I bet it would even work on a 2200, which uses the file 00-T-110-00.rules.

Machine_Head
Collaborator

Regarding the interface names, this worked for me on VMware:

> set interface-name by-name eth0 to Sync

 

Might be a new clish command

Bob_Zimmerman
Authority

That command has existed since at least R80.40 (I don't have anything earlier to check). Only works on open servers and VMs, though.

emmap
Employee

I got my hands on a pair of 5100s that I have built into an EXL cluster. It's also not supported, as they don't have Sync interfaces, but with the info in this thread and some other trickery I made it work. I figured I'd document it here for posterity with the usual caveat: this is for educational purposes only and is not supported in a production environment.

Once R82 is fresh installed on the appliances, log in and get to expert mode. You need to edit the udev rules file, same as with a 3X00, but it's a bit more complicated because of the expansion card slot in the appliance. It'll look very different to a VM or a 3X00 - basically, delete what is in there and put this in:

ID=="0000:03:00.0", NAME="eth1"
ID=="0000:04:00.0", NAME="eth2"
ID=="0000:05:00.0", NAME="eth3"
ID=="0000:06:00.0", NAME="eth4"
ID=="0000:07:00.0", NAME="Sync"
ID=="0000:08:00.0", NAME="Mgmt"

Those ID lines are PCI bus addresses for the interfaces on a 5100; you can check what they are on your system with the command 'lspci'. The lines above rename the existing eth5 interface to Sync and keep the rest as-is.
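If your box is not a 5100, a starting point for the rules file can be generated from the lspci output. This is a rough sketch, not a tested procedure: `eth_pci_ids` and `make_rules` are made-up helper names, it assumes lspci labels the ports as "Ethernet controller", and it assumes the ports appear in the order you want to name them. Check the result against the physical ports before putting it in place.

```shell
# eth_pci_ids: reads `lspci -D` output on stdin and prints the PCI address
# (first column, including the domain) of each Ethernet controller.
eth_pci_ids() {
    awk '/Ethernet controller/ { print $1 }'
}

# make_rules NAME...: reads PCI addresses on stdin and pairs them, in order,
# with the interface names given as arguments, using the rule format above.
# Stops when it runs out of addresses or names.
make_rules() {
    while IFS= read -r _id && [ "$#" -gt 0 ]; do
        printf 'ID=="%s", NAME="%s"\n' "$_id" "$1"
        shift
    done
}

# Intended use (prints the rules for review, writes nothing):
#   lspci -D | eth_pci_ids | make_rules eth1 eth2 eth3 eth4 Sync Mgmt
```

The `-D` flag makes lspci always print the PCI domain, so the addresses come out in the same `0000:03:00.0` form used in the rules.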

The next step is to turn off the 'see if we have a line card installed' code. Edit the /etc/appliance_config.xml file and change the line <loop>yes</loop> to <loop>no</loop>. This file is not writable, so chmod it +w before you edit it and -w afterwards.
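The chmod, edit, chmod sequence can be wrapped in one function so the file does not get left writable by accident. A minimal sketch, assuming the `<loop>yes</loop>` element sits on a single line exactly as quoted; `disable_loop` is a made-up name, and the sketch only toggles the owner's write bit.

```shell
# disable_loop FILE   (hypothetical helper)
# Temporarily makes FILE writable by its owner, flips <loop>yes</loop> to
# <loop>no</loop>, then removes the owner's write permission again.
disable_loop() {
    _cfg=$1
    chmod u+w "$_cfg" || return 1
    _tmp=$(mktemp) || return 1
    # Write through a temp file, then copy the content back so the original
    # file keeps its owner and mode.
    sed 's|<loop>yes</loop>|<loop>no</loop>|' "$_cfg" > "$_tmp" && cat "$_tmp" > "$_cfg"
    _status=$?
    rm -f "$_tmp"
    chmod u-w "$_cfg"
    return "$_status"
}

# Intended use:
#   disable_loop /etc/appliance_config.xml
```

Taking a copy of the original file before running it is cheap insurance, since this is an unsupported change either way.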

Next you have to set the EXL detection like this:

dbset process:exl_detectiond t
dbset :save

Reboot the appliance. It may come up without the Sync interface enabled as it didn't exist in the config before. Set it to state 'on' in clish. Now edit the rules file again to change the Sync interface to eth1-Sync, on gateway 1 only. At this stage you should be able to run the FTW on your gateway 1.

After running the FTW on gateway 1 and getting the cluster going, you need to enable detection on all cluster members to be able to add more gateways into the group.

tellpm process:exl_detectiond t

Edit the Sync interface on your other gateways to be named eth1-Sync here.

At this point I could build the cluster per normal steps, except that one of the 5100s has an -HA license on it. Due to this being an online install, the appliance constantly fetches this license and applies it. This license being applied breaks policy install, because the appliance is not in a CXL cluster. Even with an eval on there, it refuses to take a policy install. Unfortunately for me, I lost the coin toss and this appliance was initially my gateway 1_1, which meant I could not actually build the cluster. I had to rebuild it again, this time setting the other appliance as my gateway 1_1. As it is now, every time gateway 1_2 boots up it goes down due to not taking the policy. I can try to get it working again by:

  1. Making sure I disable accelerated policy install
  2. Pulling the policy file from the SMO (cpha_blade_config pull_config policy 192.0.2.1)
  3. Deleting the appliance license
  4. Doing a 'fw fetch [mgmt server ip]'.
  5. Validate with asg_policy verify
  6. See that the policy install times are different but the policy signature is the same
  7. Flail about with repeating the above steps
  8. Give up because it seems to be working ok for now and it's lunch time

So keep all that in mind if you're planning on using existing cluster gateways that were purchased with the old -HA licensing in your new EXL cluster. It's a pain in the neck.

emmap
Employee

OK I fixed the license fetching issue by changing the MAC address on the Mgmt interface on that box (in local clish). Now it no longer fetches the -HA license and just uses the eval that's on there.
