David_Herselman
Advisor

HTTP POST fragments not re-assembled, only first packet forwarded

Hi,

 

I've been battling through Check Point support (6-0001758421) and was hoping someone here may be able to provide a workaround. R80.30 Gaia 2.6 on a 6500 appliance is not re-assembling a fragmented HTTP POST request, and instead sends the first segment as the one and only frame.

 

The following is a comparative packet capture between the ingress (bond1.316) and egress (bond1.2020) interfaces. The first packet fragment is selected in both views, frame 28 on the left and 29 on the right. The problem is that the second fragment is not being re-assembled with the first, so its data is not included in the packet that Gaia sends out:

teraco_vigilent_fragment_not_reassembled_partial_frame_retransmitted.jpg

 

The first couple of things to raise suspicion were:

  • Data appears to have been aggressively fragmented.

These connections are to data centre monitoring gateways that collect telemetry from probes in each rack and other points of interest. The connections appear to use the ulxmlrpcpp library, which explains the packetisation size.

  • URI appears to be malformed

The only plausible answer I have here is that developers at Vigilent may have misinterpreted the XML RPC library and defined their connection string as proxy and port.

 

I'll get into those in a moment. Check Point support provided R80.30 on-going hotfix accumulator take 50, which we installed on the standby cluster member and then switched over to. We have tried to ensure that established connections were reset by:

  • temporarily defining a 'block suspicious activity' rule in SmartView Monitor (c:\Program Files (x86)\CheckPoint\SmartConsole\R80.30\PROGRAM\SmartViewMonitor.exe) to force connections between the IPs to be removed from the connection tracking tables:

smartviewmonitor_define_suspicous_activity_rule2.jpg

Then loading a reject rule and removing it a minute later:

smartviewmonitor_define_suspicous_activity_rule_adding_reject.jpg

  • Installing policy
  • Temporarily defining the service object not to sync state, restarting the standby cluster member, waiting for sync to be current and then transferring role:

tcp_4445.jpg

 

These are the layers:

  • Network (Firewall blade only):

teraco_rule_network.jpg

  • Application (Applications & URL Filtering blade only):

image.png

 

Custom TCP service object which does not have protocol defined:

tcp_4445_no_protocol.jpg

 

Process logs warn about the categorisation failure but the connection is allowed:

teraco_vigilent_log1.jpg

Probably simply due to there being no IP, hostname or FQDN to categorise.

teraco_vigilent_log2.jpg

teraco_vigilent_log3.jpg

 

Some things to consider:

  • This is monitoring data for environmental control in several data centres in 5 geographically distributed buildings. The telemetry notifies about failures, hot zones and humidity problems and co-ordinates staff and equipment. This is a critical service.
  • The Check Point allows the connection, following the desired state.

 

The problem is easily reproducible: monitoring requests arrive every 8 seconds, so I constructed a tcpdump filter to show only those packets where the first four characters of the payload match 'POST':

tcpdump -nn -i bond1.316 -s 0 -A 'host 192.168.131.33 and host 192.168.204.67 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x504F5354'
tcpdump -nn -i bond1.2020 -s 0 -A 'host 192.168.131.33 and host 192.168.204.67 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x504F5354'
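
For anyone unfamiliar with the filter expression, here is a rough Python sketch of what it does. Byte 12 of the TCP header holds the data offset in its upper four bits, counted in 32-bit words; masking with 0xf0 and shifting right by 2 turns that into the header length in bytes, which is where the payload starts. The four bytes at that position are then compared with 0x504F5354, the ASCII codes for 'POST'. The function names and the `build_segment` helper are purely illustrative and are not part of tcpdump.

```python
import struct

POST_MAGIC = 0x504F5354  # ASCII codes for "P", "O", "S", "T"


def tcp_header_length(tcp_segment: bytes) -> int:
    """Mirror of (tcp[12:1] & 0xf0) >> 2: header length in bytes."""
    # The upper nibble of byte 12 is the data offset in 32-bit words.
    # Masking and shifting right by 2 equals (nibble value) * 4.
    return (tcp_segment[12] & 0xF0) >> 2


def payload_starts_with_post(tcp_segment: bytes) -> bool:
    """Mirror of tcp[<header length>:4] = 0x504F5354."""
    offset = tcp_header_length(tcp_segment)
    if len(tcp_segment) < offset + 4:
        return False  # fewer than four payload bytes, cannot match
    (first_four,) = struct.unpack_from("!I", tcp_segment, offset)
    return first_four == POST_MAGIC


def build_segment(payload: bytes, options: bytes = b"") -> bytes:
    """Illustrative helper: a TCP header with made-up field values."""
    data_offset_words = (20 + len(options)) // 4
    header = struct.pack(
        "!HHIIBBHHH",
        40000, 4445,             # source and destination ports (made up)
        0, 0,                    # sequence and acknowledgement numbers
        data_offset_words << 4,  # data offset lives in the upper nibble
        0x18,                    # PSH + ACK flags
        65535, 0, 0,             # window, checksum (not computed), urgent
    )
    return header + options + payload


if __name__ == "__main__":
    print(payload_starts_with_post(build_segment(b"POST /RPC2")))
    print(payload_starts_with_post(build_segment(b"<?xml version")))
```

Because the offset is read from the header rather than hard-coded as 20, the match still works when TCP options such as timestamps are present.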

PS: Wireshark and tcpdump both combine fragmented packets automatically. The result is that people often misinterpret the displayed output as packet duplication, whereas the packet in memory is simply appended with each fragment and the content subsequently displayed. The section highlighted in the ingress view (top window) is the content of the second fragment, which now includes the header transmitted in the first fragment.

teraco_vigilent_tcpdump_comparative.jpg

The packet transmitted (bottom) is very clearly exclusively the content of the first fragment.

 

* I'll edit this post to provide details around the library used to generate these packets.

0 Kudos
8 Replies
Timothy_Hall
Legend

Check Point firewalls usually perform virtual reassembly of fragmented packets for the sole purpose of inspection.  Once all fragments are received, virtually reassembled and inspected, assuming they pass inspection the *original fragments* are transmitted and the eventual recipient must still reassemble them.  It looks like you are enforcing APCL which will utilize just passive streaming (PSL) so there will not be any transmission of reassembled packets.  In the case of Active Streaming (CPAS) via the CPASXL path in R80.30+, the firewall may be actively tearing connections apart and putting them back together, so in that case data that enters the firewall in fragmented packets might be transmitted in a reassembled state.  There is not a straightforward way to invoke CPAS against certain connections other than by subjecting them to inspection by a blade that requires it.

The other wrinkle here is the ability of SecureXL to perform virtual reassembly of fragmented packets on its own starting in R80.20+; prior to this release any fragmented traffic would always go F2F for virtual reassembly.  Since you are using APCL it is unlikely that these web connections are fully accelerated; they are probably in the PSLXL path. I'd still suggest selectively disabling SecureXL for these web connections as described in sk104468 (How to disable SecureXL for specific IP addresses) to see if it has any effect on the problem or on what you are seeing in your captures.

Also tcpdump itself does not display reassembled fragments from the CLI (while fw monitor always does), unless you are viewing the tcpdump captures in Wireshark which reassembles fragments by default.

0 Kudos
David_Herselman
Advisor

Firstly many thanks for taking the time to read and provide comments.

I don't see the behaviour you describe in R80.30, where the firewall sends out fragments just as it received them. The problem is perfectly reproducible irrespective of whether SecureXL or IPS are enabled. The following is a dissection of one of the 'once in a blue moon' packets that do traverse the firewall:

teraco_vigilent_fragments_passed_through.jpg

 

Defining a specific Threat Prevention Policy exception also made zero difference:

check_point_threat_prevention.jpg

 

I disagree with your opinion on tcpdump: it does re-assemble the fragments it displays. Reviewing captures in Wireshark confirms this.

 

0 Kudos
David_Herselman
Advisor

Right, for those that feel that the Check Point is not sending the fragment due to it containing an invalid URI:

Why then does it drop the payload and send exclusively the portion with the malformed URI?

 

For those that feel there is some kind of policy violation which results in the packet being ignored:

Nothing is logged, and 4.5 hours' worth of remote Zoom support sessions with numerous kernel debugs yielded no clue to any policy violation. Yes, SecureXL and IPS were disabled with the 'fwaccel off; ips off -n' commands.

What's more, once in a blue moon some packets do traverse the Check Point, although there is absolutely no discernible difference between them and the ones that don't. If the traffic is somehow 'bad' then there is a bug with Check Point leaking these 'bad' requests through...

teraco_vigilent_some_are_transported.jpg

 

Herewith the 'Follow HTTP Stream' view of an example exchange which was forwarded:

teraco_vigilent_some_are_transported_detail.jpg

 

Watching the concurrent tcpdump output shows a 5-6 second delay between the Check Point receiving both fragments and it sending out only the first as a complete frame. I assume this is an inter-frame timeout: as I see it, the Check Point is simply ignoring the second fragment and not logging anything about it.

PS: Packet captures were shared with Check Point support at the beginning of our interaction.

0 Kudos
David_Herselman
Advisor

I did a quick Google search for the user agent and found both documentation and source code for the XML RPC library Vigilent use. The odd fragmentation is explained perfectly by them having set the 'setTcpNoDelay' option, which dispatches the HTTP header whilst it's busy constructing the rest of the XML query.

This is a mature open source library with many vendors making use of it, especially on low powered RISC processor based probes used to collect environmental metrics. Herewith the section from the documentation:

image.png

 

Okay, so although odd, it perfectly explains why the header is fragmented into its own packet whilst the XML data then arrives 0.422931 seconds later (referencing the tcpdump output in my first message). I do not detect any problems with how the library generates the segments, and neither does Wireshark or tcpdump.
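
As a rough illustration of that behaviour (a sketch, not the ulxmlrpcpp code): I'm assuming 'setTcpNoDelay' maps onto the standard TCP_NODELAY socket option, which disables Nagle's algorithm. With it set, a small write is pushed to the wire immediately instead of waiting to be coalesced with the next one, so a header written first and a body written later will typically leave the host as two separate segments. The function names are hypothetical.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def send_in_two_writes(sock, header: bytes, body: bytes) -> None:
    """Write the HTTP header first and the body in a separate call.

    With TCP_NODELAY set, the header is not held back while the body is
    still being built, so it usually goes out in its own segment.
    """
    sock.sendall(header)
    sock.sendall(body)
```

To use it, connect the socket returned by `make_nodelay_socket()` to the target and pass it to `send_in_two_writes()` with the header and body as two byte strings. How the bytes are actually split on the wire still depends on the sending host's TCP stack, so treat this as the likely outcome rather than a guarantee.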

 

Right, so what about the malformed URI?

The problem with the URL being non RFC compliant is that it is missing an IP, hostname or FQDN between the connection type classifier and the port specifier. The connection is quite simply: http://:4445/RPC2
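
A quick way to see what is missing is to run the URL through a standard parser. The sketch below uses Python's urllib.parse, which has nothing to do with how the Check Point or ulxmlrpcpp parse URLs; it only shows that the authority part carries a port but no host. As far as I'm aware, the HTTP specification (RFC 7230, section 2.7.1) does not allow an 'http' URI with an empty host. The `describe_url` name is my own.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL and report the pieces relevant to this thread."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,  # None when the host is empty
        "port": parts.port,
        "path": parts.path,
    }


if __name__ == "__main__":
    print(describe_url("http://:4445/RPC2"))
```

The parser reports the scheme, port 4445 and the /RPC2 path, but the hostname comes back as None, which matches the categorisation warnings in the logs having nothing to look up.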

I believe that developers at Vigilent may have interpreted the references to proxy connections in the ulxmlrpcpp library as referring to their gateways, and subsequently defined the proxy server as the destination IP or hostname. This would result in connections being established to the gateway's IP address but with a malformed URI. Another possibility is that the developers wrote minimal source code, shared a common code base between the gateways (which use relatively low powered RISC processors) and the Vigilent server, and used the proxy string in that context.
The following is again from the ulxmlrpcpp library documentation which explains that the /RPC2 resource is defaulted, whilst there is no reference to a defaulted hostname in the source code. The result is that defining a port and using the proxy connection would result in the URL being constructed in the fashion it is now, whilst the connection is established to the intended target:

image.png

 

So, even if you feel the Check Point should reject the packet because it contains a malformed URI, one should be able to configure the firewall to allow it through regardless. I also need to remind you that this data is part of the segment which is always forwarded.

 

 

0 Kudos
David_Herselman
Advisor

Attached please find simultaneous packet captures of the ingress (bond1.316) and egress (bond1.2020) interfaces.

 

Looks like the baton has been handed over to R&D, case 97336...

0 Kudos
David_Herselman
Advisor

Quite amazing that it's been almost 2 months and Check Point TAC are still asking for management server exports. This is, IMHO, 100% a fragmentation re-assembly issue, as both fragments arrive virtually simultaneously but the firewall transmits only the first fragment, 8 seconds later, when some re-assembly timeout occurs...

 

Lack of resolution this long after logging a reproducible issue, on a $100k+ investment, is really not good. Perhaps we should push for a refund and go to Palo Alto?

0 Kudos
JasMan
Participant

Hey,
Has your issue been resolved, or have you switched to Palo Alto?

We're having the same issue and CP has not been able to solve it for two months now. 😟

As a workaround we put client and server into the same subnet.

Jas Man

0 Kudos
JasMan
Participant

Can anybody else help me with our issue? That would be great!

0 Kudos
