Question about VSNext Requirements

Oscar_David_Gom · ‎2025-11-13

Hi,

I have a couple of questions about VSNext.

We recently tried to deploy an ElasticXL cluster with two 19100 gateways, using VSNext, to replace two HA ClusterXL clusters. Two VSs were created:

VGW External

VGW Internal

It is a simple architecture:

Internet < Bond10 > VS_EXT (19100 and MGMT's Default GW) < Bond11 > Core < Bond12 > VS_INT < Bond13 > Internal Networks

The activity and maintenance window followed the steps many of you have described here on CheckMates:

1. We configured everything on Member 1: interfaces, bonds, routes, DNS, NTP, etc., on each VS as required (VS0, only one VSW, which was the 500 for mgmt, VS EXT and VS INT). Pretty simple, as VSNext promised. We had 2 firewalls in little time compared with traditional VSX.

2. We left Member 2 with just a clean install.

Then we...

3. Cloned the current policy sets.

4. Connected M1 mgmt port.

5. Created the 3 objects (SMO/VS0, VS_EXT, VS_INT)

6. Replaced whatever we had to with new objects.

7. Installed policy, everything ok.

8. Connected only Member 1 to the network, like this:

Internet < Bond10 > VS_EXT < Bond11 > Core > Normal ClusterXL

We wanted to test everything only for VS_EXT and only with the SMO member before cloning the second one. Basically, we disconnected every fiber from the old external ClusterXL members and connected them to Bond10 and Bond11 on ElasticXL Member 1. AFAIK, traffic only needed to cross Bond10, then the firewall, then Bond11. The other VS was irrelevant in this window.

Everything went from bad to worse. Traffic passed through the firewall, but latency was huge and there was a lot of intermittence: some traffic worked, some didn't. It was a big difference compared with normal ClusterXL HA. Resources were OK: 10% maximum CPU usage, 8 GB of RAM used out of 64 GB, and 95% of traffic accelerated. We had to roll back.

I have some questions about this:

1. Could having the other interfaces disconnected lead to this problem? I don't think so, but who knows.

2. Could not having the second member joined and cloned cause traffic issues? I've been thinking about this one a lot. Every admin guide, video and forum post says: install policy, clone, test. Can you confirm whether this could lead to this kind of problem?

3. The admin guide said we need 4 interfaces: MGMT (connected), Sync (disconnected), External (connected, topology external), Internal (connected, topology based on routes). Could the unconnected Sync cause something like this?

4. Do you think this step-by-step procedure is OK?

For us, the activity should have been almost as simple as disconnecting the older platform and connecting ElasticXL, and that's it. Now I think we are missing something. I'd appreciate every comment. Thanks in advance.

genisis__ · ‎2025-11-13

I wish I could say I have answers, but certainly when I created ElasticXL/VSNext in Proxmox there were issues that just screamed "Not ready".
My suggestion would be to install 'Legacy' VSX, if you need a known technology to deliver your business needs.

Chris_Atkinson · ‎2025-11-13

Proxmox isn't supported so it's not really a fair comparison in this context.

@Oscar_David_Gom Could you share some further detail? Which JHF was the gateway deployed with, and what about things like the CoreXL config, etc.?

CCSM R77/R80/ELITE

Oscar_David_Gom · ‎2025-11-13

Yeah, sure: R82, Take 39. Everything is out of the box: the VS0 config was not touched, VS_INT and VS_EXT have 8 CoreXL instances each, and every other setting is at its default. Like I said, out of the box (R82 clean install, FTW on Member 1, then hotfix, then VSNext config). Member 2 is just a clean install and that's it; not even Sync was connected. It's an ElasticXL cluster of 1 member, hahaha (joke; the plan was to get everything working fine, then clone).

Chris_Atkinson · ‎2025-11-13

sk183481 is something I would investigate further with TAC.

CCSM R77/R80/ELITE

emmap · ‎2025-11-16

On top of this, how were you going to add the second SGM to the cluster? At the moment there is a limitation meaning that you can't add multiple SGMs per site when using VSNext.

Oscar_David_Gom · ‎2025-11-16

Mmmm, what do you mean by how? Like in every video or guide I've watched or read: just adding it with the Pending Gateways option in VS0.

Is the limitation you are talking about the one about having an HA ElasticXL cluster (not LS)? I saw it, but that's one of my questions: can not having the other member (as another site) cause this problem?

emmap · ‎2025-11-16

Yep, sorry, I was asking whether you were adding the second gateway to site 1 or site 2.

Having just gateway 1 in the cluster won't be the direct cause of these problems. It's more likely that the issue Chris pointed out is a cause here. Installing JHF take 44 resolves this.
