Solved: Broken Pipe: Requested input buffer length to big ...

mackrispi · ‎2020-05-02

Hi,

At my company I use the Check Point VPN on Windows 10. Logging in through the GUI works well.

But almost everyone in my team and across the company has switched to Linux (I have Ubuntu 20.04 in dual boot) because the development tools perform much better there, so I am trying to get my VPN connection working with SNX.

I just can't make it work. I tried Ubuntu 18.04, 19.10 and 20.04, and macOS Catalina 10.15.4. On the Mac, logging in through the GUI works as on Win10, but SNX from the CLI fails the same way as on Ubuntu.

Whenever something bigger has to be downloaded from a Git repository, I get a BROKEN PIPE. I even got a broken pipe on a 10 MB repository, but only once.

And my download speed is ridiculous. On Win10 I get a constant 1 MB/s; on Ubuntu it starts at 800 KB/s and drops to as low as 6 KB/s within minutes, averaging 250-300 KB/s.

The whole network is inaccessible after the broken pipe, and even just before it happens: no browser links work and I get no reply when pinging DNS, unless I disconnect SNX and reconnect. Then everything is OK again.

I switched SNX versions from 7075 to 8061 and 10003. They all connect, but none of them works; the behavior is as described above.

I tried starting SNX from the CLI, and I also managed to install Firefox 48 and connect through the Java applet.

Then I added debugging to SNX.

Now I can see "ssl_link:: handle_received_packet: received a too long packet" and "fwasync_mux_in: requested input buffer length to big (1148116496)" in the log.

I connect with: snx -s servername -u user (we only use user/password authentication).

Please, if someone can help me with this, I would be grateful. I have been struggling with it for the last 10 days.

Thank you very much.

Here is the log (also attached as a file):

snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: got 147448 of 148736 bytes == 1288 bytes required
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_do_read: read 1288 bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: managed to read 1288 of 1288 bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: call: 80e2c70 with 4
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link_fwasync_client_handler: state: RECEIVE_PACKT - entering
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link_fwasync_client_handler: RECEIVE_PACKT state - expecting 148736 bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link:: handle_received_packet: received a too long packet 
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link_fwasync_client_handler: received packet ok
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: rc=1, next: 80e2c70 with 3, req: 8r, 0w
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_InputPending 1 pending bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_InputPending 1 pending bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: got 0 of 8 bytes == 8 bytes required
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_do_read: read 8 bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: managed to read 8 of 8 bytes
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: call: 80e2c70 with 3
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link_fwasync_client_handler: state: RECEIVE_TCPT_HEADER - entering
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link:: handle_TCPT_header: TCPT header: with len= 1148116496 and type= -906032325
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: 1: rc=1, next: 80e2c70 with 4, req: 1148116496r, 0w
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_mux_in: requested input buffer length to big (1148116496)
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_end_conn: scheduling the end of connection 1
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_do_end_conn: closing connection 1 (conn=87e4ca0)
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_link:: ssl_link_fwasync_end_handler: ending connection 
snx.elg.8:[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_tunnel::link_failure_cb: got link failure, close tunnel
snx.elg.8:[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ssl_tunnel::link_failure_cb: scheduling reuse to 2000 milli-seconds
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_fwasync_close: start shutdown
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] fwasync_do_end_conn: end closing connection 87e4ca0 1
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_ShutdownHandler: rc=0 (1) SSL negotiation finished successfully
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_ShutdownHandler_in_sock: called
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_ShutdownHandler_in_sock: called
snx.elg.8-[ 5859 -141541312]@kmarin-HP-ZBook-15-G5[2 May 20:41:08] ckpSSL_ShutdownHandler_in_sock: called

Timothy_Hall · ‎2020-05-03

Please see my response in this thread describing a somewhat similar issue:

https://community.checkpoint.com/t5/Endpoint-Security-Products/VPN-download-speed-is-very-slow-on-Li...

Your debug is interesting. The presence of the term "TCPT" would seem to indicate the use of Visitor Mode, which must be handled by the vpnd daemon on your gateway. Considering that this behavior has followed you through several versions of SNX, I'd suggest taking a closer look at vpnd on your gateway:

1) What code level and Jumbo HFA are you using on the gateway? There have been many vpnd stability fixes over the years. The fact that your SNX client is receiving a TCPT header with a stated length of 1148116496 makes no sense, and may indicate some kind of bug in vpnd, which constructed the header.

2) The SNX client could be somehow corrupting or misinterpreting the TCPT header, but this seems unlikely. I can't really think of any Linux component external to SNX that might interfere with traffic like this. Do you have any special or customized network/firewall components on the Linux box itself?

3) Is the vpnd daemon crashing and being immediately restarted by fwd on your gateway? That might explain some of the behavior you are seeing. Check the start time of the vpnd process on your gateway and look in $FWDIR/log/vpnd.elg. These lines from your debug give credence to this theory:

[ 5859 -141541312]@username-HP-ZBook-15-G5[2 May 20:52:15] fwasync_mux_in: 2: got 0 of 8 bytes == 8 bytes required
[ 5859 -141541312]@username-HP-ZBook-15-G5[2 May 20:52:15] ckpSSL_do_Read: failed to read: Connection reset by peer
[ 5859 -141541312]@username-HP-ZBook-15-G5[2 May 20:52:15] fwasync_mux_in: 2: read: Connection reset by peer
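
To put the TCPT header values from the original debug in perspective, they can be printed as raw 32-bit hex. This is only a minimal bash sketch: the helper name to_hex32 is made up, and treating both fields as 32-bit integers is an assumption based on how the log prints them.

```shell
# Print a signed or unsigned integer as an 8-digit, 32-bit hex value.
# to_hex32 is a hypothetical helper name, not part of SNX.
to_hex32() {
    printf '%08x\n' $(( $1 & 0xFFFFFFFF ))
}

to_hex32 148736        # packet size the client expected just before the failure
to_hex32 1148116496    # "len" from the bad TCPT header
to_hex32 -906032325    # "type" from the bad TCPT header
```

This prints 00024500, 446ede10 and c9ff0b3b. The first is a plausible packet length; the other two look more like arbitrary bytes than a length/type pair, which would be consistent with either a malformed header or the client reading the header at the wrong offset in the stream.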

Gaia 4.18 (R82) Immersion Tips, Tricks, & Best Practices Video Course
Now Available at https://shadowpeak.com/gaia4-18-immersion-course

PhoneBoy · ‎2020-05-03

Highly recommend getting the TAC involved here.

The only alternative to SNX would be to use an L2TP client.
There are a couple of articles on CheckMates that explain how to make this work with the R80.30 and R80.40 releases, though it is not formally supported outside of a specific customer release you can obtain through your local Check Point office.

mackrispi · ‎2020-05-04

Thank you to both of you.

Will forward this to our IT to take a look at it.
Best regards,
Kris

mackrispi · ‎2020-05-04

Hi,

I followed the steps suggested in Timothy's first reply to identify the slow network.

Attached is an excerpt from my full Wireshark capture (tcpdump.csv).

I also added netstat output from before and after connecting the VPN.

I have TCP retransmissions happening every 5 seconds, and there are a lot of them.
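
A minimal sketch of how such a before/after netstat comparison could be taken, assuming netstat from net-tools is installed. The function name and the snapshot file names are made up for illustration.

```shell
# Show only the netstat -s counters that changed between two snapshots.
# changed_counters and the snapshot file names are hypothetical.
changed_counters() {
    # diff exits with status 1 when the files differ, which is the expected case here;
    # "|| true" keeps the function from failing when nothing changed.
    diff "$1" "$2" | grep '^[<>]' || true
}

# Usage (commented out so pasting the sketch has no side effects):
#   netstat -s > before.txt
#   ... connect SNX and reproduce the slow download ...
#   netstat -s > after.txt
#   changed_counters before.txt after.txt
```

Lines starting with "<" come from the first snapshot and lines starting with ">" from the second, so a counter that grew shows up as a pair.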

I don't know what to look for, so I kindly ask you to please take a sec and take a look.

Thank you,

Kris

Timothy_Hall · ‎2020-05-04

From your "after" netstat -s output:

1080 packets pruned from receive queue because of socket buffer overrun

It looks like the SNX client is not reading the receive socket buffer fast enough (or this could be a symptom of the overly large stated TCPT size), causing an overflow and packet loss which is killing your performance. The relevant Linux kernel values are:

> /proc/sys/net/ipv4/tcp_rmem
> /proc/sys/net/ipv4/tcp_wmem
> /proc/sys/net/core/rmem_max
> /proc/sys/net/core/wmem_max
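
The four values listed above can be read without root. A small sketch; show_tcp_buffers is a made-up helper name, and the optional argument (an alternative root directory) is only there so it can be tried against a scratch directory.

```shell
# Print the current TCP/socket buffer settings from /proc/sys.
# show_tcp_buffers is a hypothetical helper; $1 optionally overrides the root directory.
show_tcp_buffers() {
    root="${1:-/proc/sys}"
    for f in net/ipv4/tcp_rmem net/ipv4/tcp_wmem net/core/rmem_max net/core/wmem_max; do
        if [ -r "$root/$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$root/$f")"
        else
            printf '%s = (not readable)\n' "$f"
        fi
    done
}

show_tcp_buffers
```

tcp_rmem and tcp_wmem each hold three numbers (minimum, default and maximum buffer size in bytes); rmem_max and wmem_max hold a single value.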

This buffering issue appears to be happening at the Layer 4/TCP level. A few things to check:

1) Run netstat -ni and verify that the Layer 2 DRP/OVR/ERR counters are zero or near zero on the Ubuntu box.

2) Is the Ubuntu box heavily loaded CPU-wise? Any chance the SNX client isn't getting enough CPU slices? When running the SNX client from the command line, try prepending nice -n -20 to it to raise its CPU scheduling priority to maximum, so like this: nice -n -20 snx ... and see if that improves things.

3) Beyond that, either the SNX client itself or the Linux kernel variables will need to be tweaked. You can read more about this topic here:

https://news.ycombinator.com/item?id=10598195
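
For check 1), a rough sketch that prints only the interfaces with non-zero error, drop or overrun counters. It assumes the net-tools column layout of netstat -ni (a header line starting with "Iface" and columns named RX-ERR, RX-DRP, RX-OVR, TX-ERR, TX-DRP, TX-OVR); the function name is made up.

```shell
# Read `netstat -ni` output on stdin and report non-zero ERR/DRP/OVR counters.
# nonzero_if_counters is a hypothetical helper; columns are located by header name.
nonzero_if_counters() {
    awk '
        $1 == "Iface" {
            # Remember which column numbers hold error, drop and overrun counters.
            for (i = 1; i <= NF; i++)
                if ($i ~ /-(ERR|DRP|OVR)$/) col[i] = $i
            next
        }
        {
            # Lines before the header are skipped implicitly because col is still empty.
            for (i in col)
                if ($i + 0 > 0) print $1 ": " col[i] "=" $i
        }
    '
}

# Usage:
#   netstat -ni | nonzero_if_counters
```

No output means every counter on every interface is zero.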

The packet capture didn't show anything useful. I would recommend involving TAC from this point on, especially to look at vpnd on your gateway.


mackrispi · ‎2020-05-04

Thank you for all the help guys.

I checked the CPU, and no process used more than 10%.

I also ran netstat -ni, and all counters were zero before and after.

I will try as you guys suggested and see if we can make this work. Thank you.

Your help is much appreciated.

Best regards,

Kris

mackrispi · ‎2020-05-07

Hello guys,

I want to share how I fixed my problem with the retransmissions happening all the time.

Just before I totally gave up, I came across a "Linux tuning" article about changing various kernel parameters.

To fix my problem, only one parameter had to be changed (its default is 1):

sudo sysctl -w net.ipv4.tcp_window_scaling=0

Then, to make it permanent, update /etc/sysctl.conf by adding this line at the end:

net.ipv4.tcp_window_scaling=0
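
A small sketch of the permanent step that avoids adding the line twice if it is run again. The ensure_line helper is made up for illustration and takes the target file as an argument, so it can be tried on a scratch copy first.

```shell
# Append a line to a file only if that exact line is not already present.
# ensure_line is a hypothetical helper, not a standard tool.
ensure_line() {
    line="$1"; file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage, from a root shell (commented out so pasting the sketch changes nothing):
#   sysctl net.ipv4.tcp_window_scaling     # show the current value
#   ensure_line 'net.ipv4.tcp_window_scaling=0' /etc/sysctl.conf
#   sysctl -p                              # reload /etc/sysctl.conf
```

After a reboot, running sysctl net.ipv4.tcp_window_scaling again should confirm that the value is still 0.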

Best regards,

Kris

Timothy_Hall · ‎2020-05-08

Nice find, so it did turn out to be a problematic interaction between the SNX client and the Linux OS after all!
