Colin_Campbell1
Contributor

VSX Performance limits

Hi,

I am wondering whether there are any inherent limitations in VSX that would cause a single VS to stop processing traffic at around 5000 connections/second. I recently saw a 21400 appliance do exactly that. According to the appliance comparison chart, it should be able to process 130,000 connections/second. I understand that those figures won't be reached when running VSX, but 5000 conns/sec for one VS seems a bit low to me.

The setup:

o 21400 appliance

o R77.30 GAIA

o 12 CPU, 24GB RAM

o 8 x VS each with one fwk

o 2 x SND

Any thoughts/comments?

Colin

10 Replies
Maarten_Sjouw
Champion
Colin,
What makes you say this one VS is failing at 5000 connections? Also, VSX does not support the automatic connections limit, so check the setting on each VS and do not be too picky about the number: the default is 15000, and with 8 VSs you can use around 2GB per VS without any issue.
Keep in mind that you should turn off CoreXL on the machine itself. In cpconfig you configure the number of cores for VS0, which should not handle traffic, so CoreXL should be turned off there.
Then set the number of cores on the CoreXL page of each VS. If your VSs are equally loaded, just set them all to 2; you are allowed to over-subscribe.
Regards, Maarten


Colin_Campbell1
Contributor

Hi,

I didn't say 5000 connections. I said 5000 connections/second. I'm asking about the rate of connections not the number. The connection limit for this VS is manually set at 500,000. We only hit around 220,000 connections but that VS appeared to stop processing traffic when the connection rate hit 5000 connections/second. Nothing was getting through the firewall. For example, network monitors could not reach switches and routers that had to be reached through the firewall. 
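
To make the distinction between the connection count and the connection rate concrete, here is a minimal sketch. It assumes you can sample a cumulative counter of new connections at known times; the function name and the sample values are hypothetical and do not come from any Check Point tool.

```python
from typing import List, Tuple


def connection_rates(samples: List[Tuple[float, int]]) -> List[float]:
    """Return new connections per second between consecutive samples.

    Each sample is (timestamp_seconds, cumulative_new_connections).
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((c1 - c0) / elapsed)
    return rates


# Hypothetical samples: a quiet 10 seconds followed by a burst.
samples = [(0.0, 0), (10.0, 10_000), (20.0, 60_000)]
print(connection_rates(samples))  # [1000.0, 5000.0]
```

The concurrent connection count can stay well below its limit while the rate of new connections climbs, which is the situation described here.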

CoreXL is off. VS0 has one CPU and the other VSes currently use the default allocation of one fwk.

Colin

Maarten_Sjouw
Champion
So you would rather run with cores unused than assign more cores to a loaded VS?
2 SND + 8 for the VSs leaves 2 cores unused. I would really assign more cores to each VS that is under load. Also check whether swap is being used; if so, lower the maximum connections to 300,000. Remember that memory is assigned per connection based on the maximum, so in your case it reserves memory for all 500K.
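
The core accounting above can be sketched as follows. This is an illustration only: it assumes each fwk instance occupies one core, and the function and parameter names are hypothetical rather than part of any Check Point tooling.

```python
from typing import List


def core_headroom(total_cores: int, snd_cores: int, fwk_per_vs: List[int]) -> int:
    """Cores left over after SNDs and fwk instances, counting one core per instance.

    A negative result means the box is over-subscribed.
    """
    return total_cores - snd_cores - sum(fwk_per_vs)


# Setup from this thread: 12 cores, 2 SND, 8 VS with one fwk each.
current = core_headroom(12, 2, [1] * 8)
# Hypothetical alternative: every VS set to 2 instances.
doubled = core_headroom(12, 2, [2] * 8)
print(current, doubled)  # 2 -6
```

A negative figure is not an error in itself, since over-subscription is allowed; it only shows that instances would share cores under simultaneous load.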
Regards, Maarten
Colin_Campbell1
Contributor

Hi,

I understand what you're saying, but in "normal" operation the box is way over-specced for what it does. We occasionally get spikes of legitimate activity, hence the 500,000 connection limit. "Normal" is around 120,000 total connections. However, during the DNS flood (only 20Mbps) all symptoms pointed to the firewall being unable to handle a rate of around 5000 new connections per second. The "normal" rate is around 1000 conns/sec which, as you'd expect, is handled with ease.

Colin

Maarten_Sjouw
Champion
Still, it is best practice to assign enough cores to make sure the VS can handle peak traffic; once the peak is over, the cores are released again. As said before, assign enough cores to each VS that can experience this type of peak. Again, it is a maximum allowed number, not a fixed number of cores assigned to the VS.
Regards, Maarten
Colin_Campbell1
Contributor

Hi,

 

Thanks. That's all useful to know. However, we still haven't really answered my question. Is there something inherent in VSX that severely reduces the throughput of the system it's running on?

A 21400 running as a Security Gateway (I suppose) is capable, according to the Appliance Comparison charts, of 130,000 connections per second. Using some possibly faulty logic, to me that says that on a 21400 with 12 CPUs and CoreXL enabled, running as a Security Gateway:

  • 2 x SND can handle 130,000 connections per second. That's 65,000 connections/second each.
  • 10 x FWK, one each on the remaining CPUs can handle 130,000 connections/second. That's 13,000 connections/second each.

So, on a VSX gateway we still have 2 x SND. Is there some limitation in VSX that means they cannot handle 130,000 connections/second under VSX? There are 8 x VS on the gateway, each with one FWK. Surely they can handle more than 5000 connections/second each. If not, what's the limitation? Will adding more FWKs for certain VSes lift the throughput for that VS?
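
The arithmetic above can be written out as a short sketch. It reproduces only the "possibly faulty" assumption that the data-sheet figure divides evenly across cores; the helper name is hypothetical and the result is not a statement of how the appliance really scales.

```python
def per_core_rate(total_cps: float, cores: int) -> float:
    """Divide a connections/second figure evenly across a number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_cps / cores


DATASHEET_CPS = 130_000  # figure quoted from the appliance comparison chart
OBSERVED_CPS = 5_000     # rate at which the VS appeared to stop passing traffic

per_snd = per_core_rate(DATASHEET_CPS, 2)   # 2 x SND
per_fwk = per_core_rate(DATASHEET_CPS, 10)  # 10 x FWK

print(per_snd)  # 65000.0
print(per_fwk)  # 13000.0
# Observed rate as a fraction of the naive per-FWK share.
print(round(OBSERVED_CPS / per_fwk, 2))  # 0.38
```

Under that even-split assumption, the observed rate is a little over a third of what one FWK would be expected to carry.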

Colin

Maarten_Sjouw
Champion
I think you would be better off using Multi-Queue to make sure that you have enough cores available to handle the interrupts; see sk98348.
Regards, Maarten
FedericoMeiners
Advisor

Colin,

Hope you are doing fine, please find below my personal experience.

First of all I've seen performance boosts regarding VSX on newer versions compared to R77.30. I highly recommend that you upgrade as soon as possible.

Second, one of our customers had an outage in one of their VSX deployments running R77.30, similar to yours but with pure TCP traffic (only HTTPS was going through). The VS had 150k connections. The only way out was to perform a failover. After finishing the forensics we found that the cores associated with VS0 were running very high, together with other cores. Also, the dynamic dispatcher is off by default in R77.30 and, if I remember correctly, there was a limitation with VSX.

In large deployments I like to assign 2 cores to VS0 itself. This is a recent screenshot of the CPU consumption of the cores assigned to VS0 in R80.30 (VSX cluster with 20 VSs at the moment). Both cores are dedicated to VS0 only. There is not enough information on how much magic happens inside VS0, but I like to keep my management plane safe.

Screenshot_20191219_074734.png

To make things worse, many VSX deployments don't have their CoreXL affinity tuned for each VS, so the saturation of one core may lead to serious repercussions in other VSs. In certain environments this simply can't be done (e.g. when there are more VSs than cores). My first piece of advice would be to see whether you can do this, so that each VS has its own dedicated cores.

Also, I think that one fwk for a VS that is running 150k concurrent connections is not enough, especially if you are using other blades (IPS, etc.).

My second piece of advice, as stated by Maarten: I can hardly stress enough how important it is to use MultiQ, even more so if you are using SFP ports. My guess is that your SND cores were saturated by the DNS flood and had a hard time processing new packets. If using SFP on a 21400, I find it best to do a 4/8 split (4 SND and 8 CoreXL); this way you can use 4 cores in MultiQ for SFP.

And again, upgrade to R80.30, VSX works like a charm on it.

Hope it helps,

 

____________
https://www.linkedin.com/in/federicomeiners/
Colin_Campbell1
Contributor

Hi,

 

Thanks for all of this. It is very helpful. R80 is in the wings but we have to upgrade the management servers first (starts in January hopefully).

 

Colin

HeikoAnkenbrand
Champion

Hi @Colin_Campbell1,

Critically, you are running an R77.30 firewall, which is no longer supported!

---

The figures given in the data sheet are very theoretical.

The bottleneck for the connection rate is the CPU speed and the number of cores in use.
Read more here: R80.x - Gateway Performance Metrics

Many good tips have already been given here, but please also take a look at the following. Most points should still be valid for R77.30:

- SecureXL: Which paths are used: F2F, Medium Path, Accelerated Path?
   If possible, use the Accelerated Path!
   Read more here: R80.x - Security Gateway Architecture (Logical Packet Flow)
- CoreXL: Which blades are enabled? Some generate high CPU load, e.g. HTTPS inspection.
- What does the VS do, and which blades are enabled on it?
   Read more here: R80.x - Security Gateway Architecture (Content Inspection)
- Is Multi-Queue enabled? Are the interfaces 1 Gbit/s or 10 Gbit/s?
   Read more here: R80.x - Performance Tuning Tip - Multi Queue
- Is Hyper-Threading enabled (default on appliances)?
   Read more here: R80.x - Performance Tuning Tip - SMT (Hyper Threading)
- Do you have a lot of VPN traffic? (R77.30: no multi-core support)

All of these points can also affect your connection rate.

For more performance tips and architecture background, read here:
Architecture and Performance Tuning Link Collection
and here:
R80.x - Top 20 Gateway Tuning Tips
