Re: HTTPS Inspection Limiting TCP receive window to 262144 bytes and limiting throughput of tcp stream

af0e2c12-4d24-3 · ‎2020-12-02

Hello Checkmates,

Am I the only one who has noticed that, with HTTPS-inspected traffic, Check Point gateways always initiate the outbound TCP connection with a TCP receive window scale factor of 8, and that the gateway's receive window never scales beyond 262144 bytes (32,768*8) in either direction (facing the client or facing the internet server)? This limits the throughput of a single TCP connection and is a problem if either bandwidth or latency is high.
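To make the window arithmetic above concrete, here is a minimal sketch of how a scale factor relates to the advertised window. It assumes standard TCP window scaling (RFC 7323), where the 16-bit window field is left-shifted by the shift count negotiated in the SYN and the "scale factor" shown by capture tools is 2 to the power of that shift. The function names are illustrative only and are not part of any Check Point tooling.

```python
# Sketch of TCP window scaling arithmetic (RFC 7323).
# Assumption: "scale factor" means the multiplier 2**shift, as capture tools display it.

MAX_WINDOW_FIELD = 0xFFFF  # largest value the 16-bit TCP window field can carry


def shift_for_factor(scale_factor: int) -> int:
    """Return the shift count whose multiplier equals scale_factor (a power of two)."""
    if scale_factor < 1 or scale_factor & (scale_factor - 1):
        raise ValueError("scale factor must be a power of two")
    return scale_factor.bit_length() - 1


def effective_window(window_field: int, scale_factor: int) -> int:
    """Receive window in bytes advertised by a segment with this window field."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    return window_field << shift_for_factor(scale_factor)


if __name__ == "__main__":
    print(shift_for_factor(8))                    # 3
    print(effective_window(32768, 8))             # 262144, the ceiling described above
    print(effective_window(MAX_WINDOW_FIELD, 8))  # 524280, the most a factor of 8 can express
```

With a factor of 8 the field could express up to 524280 bytes, so the 262144-byte ceiling described above corresponds to the window field never exceeding 32,768.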

For example, the bandwidthplace.com test uses a single TCP connection. On our 500 Mbps link, with HTTPS inspection we get around 90 Mbps down. With an HTTPS client bypass, we get 350 Mbps down (since the window is then controlled by the client instead of the gateway). This is on an otherwise idle gateway with no CPU usage. It's also not a matter of other active blades; it still happens with HTTPS-inspected traffic even if I turn off everything else. Of course, most web browsers use a single TCP connection for web downloads, so we also see the same thing on HTTPS downloads. It is very clearly a problem of the gateways limiting the TCP receive window.

We have 6500 Gateways on R80.40 take 83. I only noticed since we have a higher bandwidth link now and also some remote users tunneling through high latency paths. I realize that on the aggregate of many connections we still get the throughput, but it's odd to artificially limit the HTTPS throughput of a single connection and make the gateway the bottleneck. Maybe they have to make some arbitrary choice for scalability of the number of TCP connections or something, but it would sure be better to simply follow what the client presents in the TCP SYN packet so the gateway doesn't throttle the throughput.

I have an SR open and we're running this up the tree, but I would be interested to know if anyone else has noticed this limitation or could confirm their gateways do the same thing. I checked by capturing to a file with cppcap, then looking at the connection that the gateway makes to the internet server after the client connects, and then looking at the maximum receive window it reaches. In all cases HTTPS inspection makes the connections as described above, and the gateway's receive window never scales past 262144 bytes, whereas the client on its own without inspection would go up to TCP receive windows like 8 MB and thus get much better throughput.

Also, this has nothing to do with OS settings such as rmem, tcp_rmem, and so on. If you initiate a connection directly from the gateway (telnet to some port), it honors the rmem settings and in my case uses a window scaling factor of 1024, which matches the OS rmem setting. No change in OS TCP settings affects what the HTTPS-inspected connections initiated from the gateway do.

Thanks for any confirmation. I'm just curious whether Check Point is really artificially limiting single-connection TCP throughput for all their customers in similar latency/bandwidth scenarios that use HTTPS inspection and didn't bother to tell us.

Timothy_Hall · ‎2020-12-02

This limit would appear to be controlled by the following kernel variables:

cpas_tcp_xq_default_size = 262144
cpas_tcp_xq_max_limit = 1048576

However I'm not exactly sure what "xq" means in this context, so this could be a red herring. These variables do not seem to be documented anywhere, so I'd advise against changing them without consulting TAC.

Attend my Gateway Performance Optimization R81.20 course
CET (Europe) Timezone Course Scheduled for July 1-2

_Val_ · ‎2020-12-02

@Timothy_Hall Those parameters are related to active streaming and define the buffer size per stream. There are plenty of reasons why they are not documented. They are internal settings of the engine, and should only be changed in specific and very limited cases.

@af0e2c12-4d24-3 The issue is that with HTTPSi, and in some other scenarios, you terminate connections on your security GW and open new ones, so that the data stream can be decrypted and extracted as clear text for further analysis. Receive window parameters should not cause you many issues unless you are routinely transferring huge files over HTTPS. If that is your case, consult with support. I would also challenge your need to decrypt such transfers, before anything else.

The window size is indeed lower than the maximum you can get with a browser. The reason is that it helps avoid a situation where your HTTPS streams become "heavy connections", aka "elephant flows", thus draining the overall performance of your security GW.

af0e2c12-4d24-3 · ‎2020-12-03

I appreciate the responses. I will continue with the SR and see where it lands.

I'm really having a hard time believing that, after the industry worked hard to give us TCP window scaling beyond 64KB, decades later a major firewall vendor would arbitrarily limit TCP receive windows for HTTPS traffic to only 256K. Hopefully there is something odd about our installation leading to this behavior and Check Point hasn't effectively put in a per-connection throughput limit that varies with latency. Of course customers want a normal TCP situation, let congestion control algorithms do their job during bandwidth contention, and get whatever performance the gateway hardware allows. We wouldn't want some arbitrary variable throughput throttling that no one ever told us about, nor would we want yet another reason we have to put in HTTPSi bypasses, as it's enough of a pain already.

256K is clearly inadequate for everyday bandwidth/delay situations and it is indeed throttling throughput for users and leading to complaints. For a US west-coast based company, it is effectively saying that US east-coast server HTTPSi connections will be limited to 32 Mbps, Europe 14 Mbps, India 8 Mbps, not to mention that widely varying latencies can be found even geographically close that will affect the throughput. Allow normally-scaled TCP windows and we get many many times the throughput.
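The figures above follow from the usual single-connection ceiling, throughput ≈ receive window / round-trip time. The sketch below works that relation in both directions. The round-trip times it prints are derived from the quoted Mbps figures and are not measurements from this thread; the function names are illustrative.

```python
# Sketch: throughput ceiling of one TCP connection limited by its receive window.
# Assumption: the receiver window is the only bottleneck (no loss, no CPU limit).

WINDOW_BYTES = 262144  # the 256K ceiling discussed in this thread


def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound in Mbps: at most one full window can be delivered per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def implied_rtt_ms(window_bytes: int, throughput_mbps: float) -> float:
    """Round-trip time at which the window caps throughput at the given rate."""
    return window_bytes * 8 / (throughput_mbps * 1e6) * 1000


if __name__ == "__main__":
    # Throughput figures quoted in the post above; RTTs are back-calculated.
    for label, mbps in (("US east coast", 32), ("Europe", 14), ("India", 8)):
        rtt = implied_rtt_ms(WINDOW_BYTES, mbps)
        print(f"{label}: {mbps} Mbps corresponds to an RTT of about {rtt:.0f} ms")
```

The same relation applied to the roughly 90 Mbps measured on the speed test in the first post implies a round trip of a little over 23 ms to that test server.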

_Val_ · ‎2020-12-03

@af0e2c12-4d24-3 Caleb, if you are 100% sure the current settings are problematic for you, please follow your support process.

af0e2c12-4d24-3 · ‎2020-12-03

I've had an SR open for a week and investigated other avenues with no meaningful response so far. Frankly this is a bad design and should be embarrassing for Check Point. Firewall vendor throttles throughput without telling customers. Idle gateways with 25% peak CPU and 25% peak memory usage have TCP throughput throttled 75% for no valid reason. Customers shell out money for hardware but don't get the performance.

A better default design would be to dynamically adjust windows within some min/max range depending on gateway resources.

_Val_ · ‎2020-12-03

Please send me SR number via PM. Thanks.

_Val_ · ‎2020-12-03

@af0e2c12-4d24-3 Message received and answered. This case is actually documented in sk98871, but without any further details. I have asked R&D to comment. It may take a bit of time, though.

af0e2c12-4d24-3 · ‎2020-12-03

Thanks. I did see sk98871 in my research but I found that even if I turned off all threat prevention blades, the results were still the same. Of course it's not just on speedtest sites but on any HTTPS traffic. In all cases the data I'm seeing matches the throughput math based upon the latency to target server with a 256K receive window.

Anyway, we'll see what R&D says.

af0e2c12-4d24-3 · ‎2021-01-13

After all this time, R&D said I could change cpas_tcp_xq_default_size to 1 MB.

Timothy_Hall · ‎2021-01-13

Interesting, thanks for the follow-up.

Thomas_Eichelbu · ‎2022-03-03

@af0e2c12-4d24-3,
after changing this parameter "cpas_tcp_xq_default_size" from 262144 to 1048576 did you encounter any significant improvement?

would be good to know

Chris_Atkinson · ‎2022-03-03

Please investigate more recent jumbo hotfix takes that include the following enhancement as a first step.

UPDATE: Check Point Active Streaming (CPAS) TCP Window scale factor is now increased up to 6.

CCSM R77/R80/ELITE

Thomas_Eichelbu · ‎2022-03-03

OK, I see it in R80.40 Take 150.

PRJ-32072,
STRM-737

Security Gateway

UPDATE: Check Point Active Streaming (CPAS) TCP Window scale factor is now increased up to 6.

but not yet for R81+ ?

Matt_Ricketts · ‎2022-03-03

Yes. R81.10 JHF_T38 has this Active Streaming update too.

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

MatanYanay · ‎2022-03-05

Hi @Thomas_Eichelbu

For R81, it will be part of our next ongoing jumbo.

For R81.10, as @Matt_Ricketts mentioned, it is already released as part of Take 38 with PRJ-32074.

Thanks

Matan.

Thomas_Eichelbu · ‎2024-03-08

Hello Matan,

I heard rumors and some chatter from colleagues that this option or these fixes are not included in R81.20 yet?
I have not found anything regarding these PMTRs, "active streaming", or "MSS" in the R81.20 release notes so far.
Maybe I am blind, but how about support for this in R81.20?

R81.10 Take 38

Released on 21 February 2022

PRJ-32074,
STRM-737

Security Gateway

UPDATE: Check Point Active Streaming (CPAS) TCP Window scale factor is now increased up to 6.

PRJ-30295,
PMTR-73017

Security Gateway

Enhanced Check Point Active Streaming (CPAS). Refer to sk177025.

also this SK sk177025 seems to be deleted?
how about R81.20?

best regards

MatanYanay · ‎2024-03-09

Hi @Thomas_Eichelbu

I'm checking it internally and will get back to you once I have a clear answer.

Thanks

Matan.

MatanYanay · ‎2024-03-11

Hi @Thomas_Eichelbu

From my internal check with R&D, both fixes are part of R81.20 (take 631).

If you see any issues with these 2 fixes in R81.20, please open a ticket with TAC so we can handle it in the proper way.

Thanks

Matan.

Thomas_Eichelbu · ‎2024-03-11

Hello Matan!

ok thank you for this information.
so if required we will open a TAC case ...

best regards

kbleb · ‎2022-03-07

Hi - I believe that while there was a significant improvement from 256K to 1 MB in test cases, it wasn't enough for the specific scenario we were looking at (some VPN users with tunnel-all through headquarters who had high-bandwidth connections, but with high or variable latency).

Seems like the active streaming update could help such scenarios though.

Thomas_K · ‎2024-03-25

Hi Mates,

We have a freshly installed 7000 test cluster with R81.20 / JHF 53 and similar issues.

With only 1 client connected, and after adjusting the kernel parameters listed below, we get only 60% throughput.

When bypassing HTTPSi, we get 95% throughput.

Please note that R&D recommends NOT using these parameters in production and only adjusting them after coordinating with TAC.

# fw ctl set -f int min_ssthresh 400000
# fw ctl set -f int cpas_max_burst 16
# fw ctl set -f int cpas_tcp_xq_default_size 20485760
# fw ctl set -f int cpas_tcp_xq_max_limit 20485760

Does anyone have similar issues and possible solutions?

What is the expected throughput with HTTPSi enabled and only 1 client connected?

TAC case is still open and I will update the article with the findings.

Thank you!

Cheers Thomas

Timothy_Hall · ‎2024-03-26

The kernel parameters you are listing will allow the TCP window to scale much further when active streaming is occurring via HTTPS Inspection. Generally, increasing these will not help you much unless there is high latency present (100ms or higher), in which case the larger active streaming TCP window allows more bandwidth to get used despite the higher latency.
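As a rough illustration of why latency drives the window requirement, the sketch below computes the bandwidth-delay product, that is, the window a single connection needs in order to fill a link at a given round-trip time. The 500 Mbps link speed is borrowed from the first post in this thread and the list of round-trip times is purely illustrative; neither is a measurement from this cluster.

```python
# Sketch: receive window needed to fill a link (bandwidth-delay product).
# Assumptions: 500 Mbps link (figure from the first post) and illustrative RTTs.

def required_window_bytes(bandwidth_mbps: float, rtt_ms: float) -> int:
    """Bytes that must be in flight to keep the link full for one round trip."""
    return round(bandwidth_mbps * 1e6 / 8 * rtt_ms / 1000)


if __name__ == "__main__":
    link_mbps = 500
    # Window sizes mentioned in this thread: the default and the 1 MB value.
    candidates = {"262144 (default)": 262144, "1048576 (1 MB)": 1048576}
    for rtt_ms in (5, 20, 100, 250):
        needed = required_window_bytes(link_mbps, rtt_ms)
        enough = [name for name, size in candidates.items() if size >= needed]
        print(f"RTT {rtt_ms} ms: need {needed} bytes; sufficient: {enough or 'none'}")
```

At 100 ms the requirement for a 500 Mbps link comes to 6,250,000 bytes, well beyond both values, which is in line with larger windows mattering most on high-latency paths.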

The slowdown you are seeing with HTTPS Inspection sounds about right and is caused by:

1) Active Streaming transparently proxies the connection between two HTTPS connections, needing to decrypt the data from one and encrypt it back into the other. This will obviously incur much more overhead than the simple stateful inspection of HTTPS-encrypted traffic between the client and the server.

2) When first connecting to a new website with HTTPS Inspection active, the remote server's certificate must be forged then sent to the client. This will cause some initial latency when first connecting to the site with HTTPS but should abate once bulk encryption starts.

3) When HTTPS Inspection is active, suddenly there is much more cleartext data for your other enabled blades (APCL/URLF, Threat Prevention) to inspect thus incurring more overhead. Without HTTPS Inspection enabled, once the client and the server start bulk encryption there is nothing at all for those blades to look at. The more blades you have enabled the more overhead will be incurred as a result of this.

4) A single connection (including HTTPS) that is being passively or actively streamed in the medium path, along with all of its packets, can only be handled by a single core, unless Hyperflow is present, which can offload some inspection operations to other less-busy cores. So the number of cores you have on your firewall is not relevant in helping to increase the performance of a single HTTPS connection.

Thomas_K · ‎2024-03-27

Thank you @Timothy_Hall for confirming that 60% is OK, and for the detailed explanation. Hyperflow is enabled, but so are a lot of blades:

fw vpn cvpn urlf av appi ips identityServer SSL_INSPECT anti_bot mon

Cheers Thomas
