Hello,
We are running R81.20 Take 92 on open servers.
We're seeing a lot of "First packet isn't SYN" drops, which also seem to affect legitimate traffic (HTTPS and RDP).
The TCP flag on the dropped packets is almost always PUSH-ACK, with an occasional ACK.
It doesn't happen all the time, only during peak hours, so it seems to be linked to load. It may be worth mentioning that it also happens to traffic from devices that act as proxies (for example, Sophos ZTNA). Direct RDP connections from server/workstation A to server/workstation B work flawlessly, but connections from the ZTNA gateways are occasionally interrupted (with "First packet isn't SYN" logged for a PUSH-ACK packet), and users get disconnected and have to reconnect.
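To make clear what I think is happening, here is a minimal Python sketch of how I understand a stateful firewall ends up logging this drop. It is only my mental model, not Check Point's actual implementation: the `ConnTable` class, the `FlowKey` fields, the one-hour idle timeout and the addresses are all made up for illustration, and it only tracks the client-to-server direction.

```python
from dataclasses import dataclass

# TCP flag bits as defined in the TCP header.
SYN, PSH, ACK = 0x02, 0x08, 0x10


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical connection key (one direction only, for simplicity)."""
    src: str
    sport: int
    dst: str
    dport: int


class ConnTable:
    """Toy connection table with a single idle timeout (an assumption)."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # FlowKey -> timestamp of the last packet

    def handle(self, key, flags, now):
        seen = self.last_seen.get(key)
        # Expire the entry if the connection has been idle for too long.
        if seen is not None and now - seen > self.idle_timeout:
            del self.last_seen[key]
            seen = None
        if seen is None:
            # Unknown connection: only a pure SYN may create a new entry.
            if flags & SYN and not flags & ACK:
                self.last_seen[key] = now
                return "accept (new)"
            return "drop (first packet isn't SYN)"
        # Known connection: refresh the idle timer and let the packet pass.
        self.last_seen[key] = now
        return "accept (established)"


table = ConnTable(idle_timeout=3600)
rdp = FlowKey("10.0.0.5", 50000, "10.0.1.9", 3389)
print(table.handle(rdp, SYN, now=0))            # handshake creates the entry
print(table.handle(rdp, PSH | ACK, now=10))     # data on a known connection
print(table.handle(rdp, PSH | ACK, now=3700))   # entry is gone, so data is dropped
```

In this model, a PUSH-ACK is dropped whenever the gateway no longer has an entry for a connection that both endpoints still consider open, which would match what we see in the logs. What I can't explain is why the entry would disappear under load.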
From the application perspective, we see the following in the logs:
Reading from WebSocket failed. websocket: close 1006 (abnormal closure): unexpected EOF","time
The gateway has plenty of RAM (64 GB), and CPU load is around 35-45%. There are around 25-30K concurrent connections during peak hours, with the maximum configured at 55K.
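For reference, this is how the peak connection counts compare with the configured limit. It is just the arithmetic on the figures above; the `utilization` helper is a made-up name for this post.

```python
MAX_CONNECTIONS = 55_000           # configured maximum
PEAK_CONNECTIONS = (25_000, 30_000)  # observed range during peak hours


def utilization(current, limit):
    """Return the fraction of the connection limit that is in use."""
    return current / limit


for peak in PEAK_CONNECTIONS:
    print(f"{peak} of {MAX_CONNECTIONS}: {utilization(peak, MAX_CONNECTIONS):.1%}")
```

So even at peak we are at roughly 45-55% of the configured maximum, which is why I don't think the connection limit itself is being hit.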
What we've tried so far:
- increased the timeouts for RDP and HTTPS;
- disabled the "Smart Connection Reuse" feature;
- installed the latest Take;
- failed over to the standby member and rebooted.
Thank you in advance for any tips and hints!