Re: FTPS Connection dropped, but sometimes working

Daniel_Hainich · ‎2025-01-15

hi,

We migrated from a 23800 ClusterXL (R81.20 Take 92) to Maestro with one Security Group and two 9800 gateways (R81.20 Take 92).

Since the migration we have an issue with an FTPS connection. Sometimes the connection works, but most of the time it does not.

The control connection is on port 6618 and the data connection is between 7000 and 7500.

All ports are opened for the specific destination.
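One thing worth checking against this port layout is the data port the server actually announces in passive mode. Below is a minimal Python sketch, assuming the server answers PASV with the classic `227` reply format. `parse_pasv` and `port_allowed` are hypothetical helper names, and the sample reply in the usage note uses a documentation address, not the real server. Because the control channel is encrypted, the reply is only visible on the client side (for example in the FTP client's debug log), not to the firewall.

```python
import re

# Matches the "(h1,h2,h3,h4,p1,p2)" part of a 227 reply.
_PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode (...)' reply."""
    if not reply.startswith("227"):
        raise ValueError(f"not a 227 reply: {reply!r}")
    match = _PASV_RE.search(reply)
    if match is None:
        raise ValueError(f"no address tuple found in: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"octet out of range in: {reply!r}")
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, low byte
    return host, port


def port_allowed(port: int, low: int = 7000, high: int = 7500) -> bool:
    """True if the announced data port is inside the range opened in the policy."""
    return low <= port <= high
```

For example, `parse_pasv("227 Entering Passive Mode (192,0,2,10,27,88)")` yields port 27 * 256 + 88 = 7000, which `port_allowed` accepts with the default range. A port outside 7000-7500 would point at the server configuration rather than at the gateway.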

When the connection works, everything is fast and directory browsing is no problem. When it fails, the directory listing times out.

In the logs there are only some drops with "First packet isn't SYN". No other drops or blocks.

I disabled Layer 4 distribution in Maestro: no change. I manually set one gateway down: no change.

I captured the traffic on the internal and external interfaces and got one working connection and one failing connection.

The difference I can see in the pcap: when it is not working, there is a "Client Hello" from the destination back to the client on the outside interface, but this packet is missing on the inside interface. See attached images.

I ran fw ctl zdebug; only "First packet isn't SYN" is logged.

In SmartLog there is a Detect message from URLF: the FTPS certificate isn't from a trusted CA. But that was also the case before we migrated to Maestro.

I have no idea what is going on, or why it sometimes works. Any ideas?

thanks

daniel
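The outside/inside capture comparison described above can also be done mechanically instead of by eye. Here is a rough Python sketch, under the assumption that both captures have already been exported to simple per-packet tuples (for example from a packet analyzer's field export) and that there is no NAT between the two interfaces. `Pkt` and `missing_on_inside` are hypothetical names, and the addresses in the demo are documentation addresses.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Pkt(NamedTuple):
    """Minimal per-packet summary; fields are assumed to survive the gateway unchanged."""
    src: str
    sport: int
    dst: str
    dport: int
    seq: int
    length: int


def missing_on_inside(outside: Iterable[Pkt], inside: Iterable[Pkt]) -> list[Pkt]:
    """Return packets seen in the outside capture that never show up inside."""
    remaining = Counter(inside)  # multiset, so retransmissions are counted separately
    missing = []
    for pkt in outside:
        if remaining[pkt] > 0:
            remaining[pkt] -= 1
        else:
            missing.append(pkt)
    return missing


# Demo with made-up packets: the second outside packet has no inside counterpart.
outside_demo = [
    Pkt("192.0.2.10", 7001, "198.51.100.5", 50000, 1, 0),
    Pkt("192.0.2.10", 7001, "198.51.100.5", 50000, 1, 517),
]
inside_demo = [Pkt("192.0.2.10", 7001, "198.51.100.5", 50000, 1, 0)]
for pkt in missing_on_inside(outside_demo, inside_demo):
    print("missing on inside:", pkt)
```

Only packets travelling from outside to inside should be fed in as `outside`. If NAT is involved, the tuples would need to be normalized first, which this sketch does not attempt.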

AkosBakos · ‎2025-01-15

Hi,

This is really interesting.

From my old memories: there was a behavior where FTP opened a connection back to the client for some reason, but I can't recall the details.

Did you test from the same client (and settings) before and after the migration?
Is passive mode enabled?
Is this connection SFTP or FTP over SSL?
- If the latter: it is not supported: https://support.checkpoint.com/results/sk/sk39793
- But it worked before? 🙂

Akos

----------------
\m/_(>_<)_\m/

Daniel_Hainich · ‎2025-01-15

Hi, it is FTP in passive mode over SSL. The control port is tcp/6618 and the data ports are in the range tcp/7000-7500.

The policy is configured with the inside source, the destination, and both services (control and data). This is what is described in sk39793.

That was working until the migration.

Attached is the working connection on the control and data ports.
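Since the failure is intermittent, a repeatable probe makes it easier to tell whether a change actually helped. The following is a minimal Python sketch using the standard library's `ftplib.FTP_TLS`. It assumes explicit FTPS (AUTH TLS on the control port), which this thread does not confirm. `probe_listing` and `run_probes` are hypothetical helper names, and the `ftp_factory` parameter exists only so the logic can be exercised without a real server.

```python
import ftplib
from typing import Callable


def probe_listing(host: str, user: str, password: str, port: int = 6618,
                  timeout: float = 10.0,
                  ftp_factory: Callable = ftplib.FTP_TLS) -> tuple[bool, str]:
    """Try one passive-mode directory listing over TLS; return (ok, detail)."""
    ftp = ftp_factory(timeout=timeout)
    try:
        ftp.connect(host, port)   # control connection (tcp/6618 in this thread)
        ftp.login(user, password)
        ftp.prot_p()              # encrypt the data connection as well
        ftp.set_pasv(True)        # server announces a data port, client connects to it
        lines: list[str] = []
        ftp.retrlines("LIST", lines.append)
        return True, f"{len(lines)} lines listed"
    except ftplib.all_errors as exc:  # FTP errors, socket errors and timeouts, EOF
        return False, f"{type(exc).__name__}: {exc}"
    finally:
        ftp.close()


def run_probes(attempts: int,
               probe: Callable[[], tuple[bool, str]]) -> tuple[int, list[str]]:
    """Run the probe several times; return (success count, failure details)."""
    results = [probe() for _ in range(attempts)]
    failures = [detail for ok, detail in results if not ok]
    return len(results) - len(failures), failures
```

A possible use is `run_probes(20, lambda: probe_listing("ftps.example.com", "user", "secret"))`, where the host and credentials are placeholders. Comparing the success count before and after each change (distribution mode, bypass rule, exception) gives a rough measure instead of a single manual test.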

AkosBakos · ‎2025-01-16

Hi,

As I understand it, you have 1 MHO with 2 SGMs. Usually 1_1 is the SMO.

Was there a situation where 1_2 was the SMO, then 1_1 came up and took back the SMO task?

Why I'm asking: we had a situation a few weeks ago where one of the TCP connections outside of the Maestro sometimes worked and sometimes did not. (The task handover was what I described above.) We were also bewildered.

The solution was to reboot 1_2, which is not a deep-dive solution, but it solved the problem.

Akos

----------------
\m/_(>_<)_\m/

Daniel_Hainich · ‎2025-01-16

So the SMO master is 1_1, right?

But I disabled one of them and it's the same as when both gateways are active.

AkosBakos · ‎2025-01-16

If it is flapping, that's not good. What is the reason?

What does show smo image md5sum say?

----------------
\m/_(>_<)_\m/

Daniel_Hainich · ‎2025-01-16

1_01:
Image md5sum is fd50c7813ff1250e857a323d57962e9c

1_02:
Image md5sum is fd50c7813ff1250e857a323d57962e9c

I don't know if it is flapping. When I SSH into the SG, I thought I was connected to the SMO master? But sometimes I am on 1_1 and sometimes on 1_2. Or is this related to load balancing?

AkosBakos · ‎2025-01-16

Exactly. When you connect to the SG's IP, you always arrive at the SMO (always check the prompt).

I would suggest you find out why the SMO is flapping.

What is in /var/log/start_smo.log?

A

----------------
\m/_(>_<)_\m/

Daniel_Hainich · ‎2025-01-16

How can I determine which member is the SMO master? I thought with "asg stat -i tasks", and the SMO master has the "local" attribute. Right?

AkosBakos · ‎2025-01-16

Right, with the "asg stat -i tasks" command. If you see "local", that means the SMO is the member you are on. I hope that is understandable.

It is similar to the "cphaprob stat" command. Same methodology.

Akos

----------------
\m/_(>_<)_\m/

Daniel_Hainich · ‎2025-01-16

OK, then 1_1 is the SMO master. But in fact my SSH session sometimes terminates on 1_1 and sometimes on 1_2. It seems that SSH is not bound to the SMO through asg_excp_conf. I think SSH to the SG IP is load-balanced by default.

But my initial problem with FTPS isn't solved. 😞 I opened a TAC case; I hope TAC has an idea how to identify why the packet is dropped and by which component.

Timothy_Hall · ‎2025-01-16

Sounds like it could be a distribution issue of some kind; to see if that is the case you can try to always "stick" all FTPS connections exclusively to the current SMO every time with the asg_excp_conf command, although I'm not sure that will help if the SMO is thrashing back and forth. This procedure was documented by "sk175584: Forwarding specific inbound connections to the SMO Security Group Member" but this SK no longer appears to exist or has been hidden. Looks like this content may have ended up here:

https://sc1.checkpoint.com/documents/R81.20/WebAdminGuides/EN/CP_R81.20_Chassis_AdminGuide/Content/T...

Gaia 4.18 (R82) Immersion Tips, Tricks, & Best Practices Video Course
Now Available at https://shadowpeak.com/gaia4-18-immersion-course

PhoneBoy · ‎2025-01-16

According to the SK (it exists internally), the information was moved into the admin guides. 😉

Lloyd_Braun · ‎2025-01-16

This looks similar to an issue I had where the firewall was doing the TLS negotiation, observable on outside interface pcap, but was not forwarding the packets to the internal host. If I remember correctly, it was holding on to the packets too long and the server side timed out the handshake. It may have been URLF blade trying to categorize the site, based on cert DN, to see if it fell into HTTPS inspection category set. It was also a nonstandard cert that logged as URLF accept.

It popped up after an upgrade as the default config started inspecting for URLF on more nonstandard ports.

Have you tried an explicit URLF and HTTPS inspection exception for the IP address(es)?

Timothy_Hall · ‎2025-01-16

@Lloyd_Braun has brought up an excellent point, this issue with FTPS could be caused by the firewall improperly pulling this traffic into Active Streaming for HTTPS Inspection. An easy mistake is to accidentally set the Service field to "Any" for an HTTPS Inspection rule with an "Inspect" action: sk118574: FTP/SSH/SFTP Traffic fails when HTTPS Inspection and Application Control. Also had the same issue in this thread: CP to Azure S2S vpn issue


Daniel_Hainich · ‎2025-01-16

I installed the root certificate for HTTPS Inspection and configured an HTTPS bypass. Now there is an HTTPS bypass in the logs, but the problem is not solved. There is no "Any" rule in the HTTPS policy.

For testing I also disabled L4 distribution mode: nothing changed. I also set one gateway down: no change. Is disabling L4 distribution mode the same as configuring an exception with asg_excp_conf?

EDIT: I added an application rule for the FTPS server with service "Any" and now it's working.
