Robert_OBrien
Participant

fw_accel selectively working

I am running into an issue where the fw_accel rules are not catching a particular elephant flow.

I am running Gaia R80.30 Take 200 on Check Point 12600 appliances.

I have used the fw_accel rules to accelerate certain elephant flows that I have identified. However, one particular flow, or more accurately one source address and destination port pair, doesn't get accelerated even though the rules are in place.

Has anyone had an issue where ssh on standard port 22 does not get captured properly by fw_accel rules?

I have other flows being captured and accelerated properly, so I know that the fw_accel service is running and the rule syntax is correct. I have tried a very specific rule with both source and destination configured with /32 masks, and I have tried more general masks to capture the traffic. In every case the hit count stays at zero and the flows continue to show up in the output of "fw ctl multik print_heavy_conn".

I am starting to think it might just be an issue with ssh?

Timothy_Hall
Champion

Is it just not working for a particular port 22 connection, or for all port 22 connections regardless of other attributes such as IP addresses?

This shouldn't really matter, but do you have HTTPS/TLS Inspection enabled? 

I'm also wondering if this is some kind of SSHv1 vs. SSHv2 thing, and perhaps it can't be fast-accel'ed because then there is no way to ensure you aren't using SSHv1?

"Max Capture: Know Your Packets" Video Series
now available at http://www.maxpowerfirewalls.com
Robert_OBrien
Participant

Timothy,

Thanks for the reply.  I am using fw_accel based on the tips from your book (which is incredible), so I really do appreciate any feedback.

Further testing shows that it is not port 22 in general. I created other rules that include port 22 and they appear to work. It seems to be isolated to this source IP.

I do not have HTTPS/TLS inspection turned on.

I see the entries in the elephant flow logs, I see the live traffic when using fw monitor, so I know it passes through the firewall. But the hit count stays at 0.  

I've tried other ports (23 for instance) with the same results. I would absolutely call this a fat-finger error on my part in the subnets or hosts if I hadn't quadruple-checked the rules, had someone else review them, used cut and paste, and expanded the subnet out.

Fw monitor confirms the source of the traffic is the same source IP I wrote the rules for, so I am banging my head against the wall at this point. I have a pair of 12600s and the same thing happens on both boxes, so it isn't specific to the hardware.
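The "expand the subnet out" check described above can also be done mechanically. The following is a minimal sketch in plain POSIX shell, assuming IPv4 only; the function names `ip_to_int` and `in_cidr` are made up for illustration and are not part of any Check Point tooling. It only answers whether an address falls inside a network/prefix pair, nothing about how the firewall evaluates its rules.

```shell
# Convert a dotted-quad IPv4 address to an integer.
# Octets must not have leading zeros (the shell would read them as octal).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# in_cidr ADDRESS NETWORK/PREFIX
# Succeeds (exit status 0) if ADDRESS is inside NETWORK/PREFIX.
in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")      # part before the slash
    bits=${2#*/}                    # part after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)
```

For example, with placeholder addresses: `in_cidr 192.0.2.10 192.0.2.0/24 && echo "covered by the rule's mask"`.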

Timothy_Hall
Champion

Any chance the source IP not being accelerated is an IP address assigned to a firewall interface? Traffic to and from the gateway itself is never accelerated and always goes F2F. The only other thing I can think of is that the source IP address is somehow being considered a broadcast address, which is never accelerated either.

Once the SSH connection is established from that source IP (and hasn't matched the fast_accel) what path does fwaccel conns show it being processed in?  If it is F2F and you can figure out why that traffic is going F2F/slowpath, getting that connection out of the F2F path may make the fast_accel start working.

Beyond that, you'll probably have to run a debug with TAC to determine why that traffic is not being considered eligible for fast_accel.  I doubt it is a fast_accel rule matching problem; there is probably some other condition present making that specific SSH traffic ineligible for fast_accel.

Robert_OBrien
Participant

It is definitely not an IP assigned to the firewall. 

I have successfully gotten traffic on other ports from this IP to match fw_accel rules now.  I hadn't noticed that firewall policy was blocking telnet (port 23) so my tests using that as an alternate would never match a fw_accel rule since it was being denied.  Once I added a rule to allow it, it showed up in the fw_accel rule hit count.

So, this is very strange. I do not see the connection in the fwaccel conns table while it is connected and passing traffic. I confirmed that the traffic is passing through the appliance using fw monitor, and cpview shows it using way too much CPU. I used 'fwaccel conns | grep x.x.x.x' to search. I tried both the destination and source IP and nothing comes back.

This is truly bizarre. I will open a TAC case to see if I can get someone to work a debug with me so we can track down this traffic properly.
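One caveat with a plain 'grep x.x.x.x': grep treats each dot as "any character", and the pattern also matches longer addresses that merely contain it (10.1.1.1 matches 10.1.1.10). Here that can only produce extra matches, not hide one, but a stricter filter makes an empty result easier to trust. A small sketch, where `match_ip` is a hypothetical helper name and nothing is assumed about the column layout of the listing:

```shell
# match_ip ADDRESS
# Reads a connection listing on stdin and prints only the lines that contain
# ADDRESS as a literal string (-F) bounded by non-word characters (-w).
# Exit status is 1 when no line matches.
match_ip() {
    grep -F -w -- "$1"
}
```

Usage would look like `fwaccel conns | match_ip x.x.x.x`. No output and exit status 1 means no line in the listing contained that address.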

Ilya_Yusupov
Employee

Hi,

If you don't see it under fwaccel conns, it means the traffic is in the slow path (F2F).

fast_accel can't help with F2F traffic, so please check why the connection is F2F.

Robert_OBrien
Participant

Ilya,

I know you can use fwaccel dbg to see why connections listed in the fwaccel conns table are in the F2F path, but since this connection does not show up there, I would guess that fwaccel dbg will not capture any information about why it is in the F2F path. What other methods are available to check why the connection is F2F?

Ilya_Yusupov
Employee

You can run fwaccel dbg; it will show why the connection is not offloaded to PPAK.

Robert_OBrien
Participant

To close this loop, I have found the fix. Not sure I know the actual root cause, though.

The access rule that allowed the SSH traffic was configured with a customized service object that was essentially the SSHV2 object with an increased timeout value. Once I changed the object to the built-in ssh service object, I saw the hit counts on the fwaccel rules incrementing, and I no longer saw entries for this traffic in the elephant flow report (fw ctl multik print_heavy_conn).

Not sure why that object caused the issue, but I now know it does. I can't find any documented reason why we need the extended timeout for that rule. If it turns out we do need it, I will address that then, and I now know to focus on that object.

Thanks for the feedback, Timothy Hall and Ilya Yusupov.

Timothy_Hall
Champion

Was the custom SSH timeout larger than 43200 seconds? If so, that exceeds the maximum idle timeout that SecureXL supports and may be what caused the connection to go F2F and therefore not be eligible for fast_accel. This behavior is somewhat alluded to here:

sk101232: Connections are dropped as Out-of-State after some idle time when SecureXL is enabled

Robert_OBrien
Participant

Thanks. It was actually set right at 43200. If they need a long timeout, I will try a shorter value.
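For reference, 43200 seconds is 12 hours. The sketch below only classifies a service timeout against that figure. The limit is taken from the reply above and has not been verified independently here, and the variable and function names are made up for illustration. Since a value exactly at 43200 was in place while the problem occurred in this thread, "at limit" is probably worth treating as suspect too.

```shell
# Maximum idle timeout quoted in the thread (assumption, not verified here).
SECUREXL_MAX_IDLE_SECS=43200

# check_timeout SECONDS
# Prints whether a service timeout is below, at, or above the quoted limit.
check_timeout() {
    if [ "$1" -gt "$SECUREXL_MAX_IDLE_SECS" ]; then
        echo "above limit"
    elif [ "$1" -eq "$SECUREXL_MAX_IDLE_SECS" ]; then
        echo "at limit"
    else
        echo "below limit"
    fi
}
```

For example, `check_timeout 43200` prints "at limit", and `check_timeout 3600` prints "below limit".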
