Victor_MR
Employee

"It's not the firewall"


Hi (check)mates,

We all know that "the firewall" is one of the first things people blame when there is a traffic issue. A security gateway (a "firewall") does a lot of "intelligent stuff" beyond just routing traffic (and, in fact, many network devices today also do "a lot of things"), so I understand there is good reason to suspect the firewall. At the same time, there are many cases where the issue has nothing to do with it, or is not directly related to it.

I'm looking to build a brief list of typical or fairly frequent issues we face, where "the firewall" is reported as the root cause but turns out not to be.

It's quite a generic topic, and in terms of troubleshooting it's probably even more generic. There are several simple tools that one should probably use first, such as traffic logs, fw monitor, tcpdump/cppcap, etc. But what I would like to focus on is not the troubleshooting but the issues themselves. This assumes, of course, that the firewall side is properly configured (a bad configuration would be a "firewall issue").

To narrow the scope, I'm focusing especially on networking issues, but every idea is welcome.

Do you think it would be useful to put together such a list? 🙂 What issues do you usually find?

Something to start

(I'll update this list with new suggested issues):

  • A multicast issue with the switches, impacting the cluster behavior.
  • A VLAN is not propagated to all the required switches involved in the cluster communication, especially in VSX environments where not all the VLANs are monitored by default.
  • Related to remote access VPN (this year has been quite active in that respect): some device on the WAN side is blocking the ISAKMP packets on UDP 4500 directed to our gateways, but not all UDP 4500 traffic. Typically, another firewall 🤣
  • Asymmetric routing issues, where the traffic goes through one member and comes back through the other member of the cluster.
  • Static ARP entries in the "neighbor" routing devices, or ARP cache issues.
  • Any kind of issue with Internet access: DNS queries not allowed to the Internet or to the corporate DNS servers (so we cannot resolve our public domains), TCP ports blocked, or any required URL blocked (typically by a proxy)...
  • Traffic delays: these are typically more difficult to diagnose. fw monitor with timestamps is one of our friends here.
  • Layer 1 (physical) issues. Don't forget to review the hardware interface counters!
  • Missing route at the destination, especially for the routes to the encryption domains in a VPN.
  • Why not: another firewall blocking the communication, of course 😊 Or a forgotten transparent layer-7 device in the middle (like an IPS), installed in a previous age. This may be a variant: "it's not my firewall"
  • An application or server issue. The simplest example is that the server is not listening on the requested port. A more complex one would be an application layer issue.
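For the simplest case in the list (the server is not listening on the requested port), a quick check from a client can already hint at where to look. The following is only a minimal sketch: the function name `probe_tcp_port` and the result labels are hypothetical, and the interpretation in the comments is a rule of thumb, not a proof.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Rule-of-thumb reading (an assumption, not a guarantee):
      "open"    -> something completed the TCP handshake
      "refused" -> a RST came back, often because nothing is listening
                   (a device that rejects instead of dropping can also send it)
      "timeout" -> no answer at all, e.g. silently dropped or a routing problem
      "error"   -> any other OS-level failure (unreachable network, DNS, ...)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

For example, `probe_tcp_port("127.0.0.1", 8443)` on a host with nothing bound to that port would normally come back as "refused" rather than "timeout". Comparing the result with the traffic logs or fw monitor on the gateway helps separate "dropped on the way" from "the server never answered".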

Lastly, a little humor. 😊

the-moment-when-you-prove-its-not-the-firewall.jpg
5c9e05a465c753417dbde949ee285fd9e56f0739ad0254de838a7d8c61c1a318.jpg
169j56.jpg
1sg25m.jpg

 

47 Replies
_Val_
Admin

One of the battle stories I tell in community Live sessions is about just that. When migrating an MDS to new IP addresses, we stumbled on one of the CMAs failing to install policies on remote FWs. I immediately asked about third-party devices, and the customer said firmly: no, we do not have anything else. After 30 minutes of arguing and some traffic traces, I proved to them that they had something else blocking traffic. Two hours later, they identified it as a Juniper FW they had forgotten about eons ago...

CharlieFoxtrot
Explorer

I never expected it, but I've heard that this is a not uncommon distinction between network and telco people: telco techs are used to thinking in circuits/loops, while network guys are just into the packet flow and forget to make sure it can come back. I guess you get used to it before too long either way, but I definitely saw it happen with a lot of new people.

spottex
Participant

Do other vendors' firewall interpretations of RFCs count?

(Albeit I sometimes wonder about CP as well. I can't remember any examples right now, but they were all VPN related.)

E.g. SonicWall / Check Point IKEv1 IPsec VPN using PKI:

The SonicWall firewall does not contain the full certificate chain, so you have to install the subordinate CAs into the Trusted section of CP (this could be the cert issuer not doing something correctly).

By default, SonicWall expects the IKE ID to be the Distinguished Name (DN); CP sends the main IP address.

SonicWall only accepts a cert from CP if the main IP is the first alternate name added to the cert when generating the key. If this is not the case, one-way VPN initiation is still possible, but it fails if CP initiates the connection.

Victor_MR
Employee

OMG, I'm sure that was hard to diagnose.

I've experienced a similar issue with another vendor several years ago.

I'll try not to open the can of worms of "issues with other firewalls" too much in the original list 😊

eliadr
Participant

In one of the places I worked, we had a major update for the AV on the workstations.
This happened a week apart from the upgrade of the FW appliances.
So, ever since upgrading the FWs (or so we thought), surfing was so sloooooow.
A co-worker of mine chased this for half a year, until we somehow figured out the cause was the web reputation engine in the AV.
The engine couldn't get its signatures updated, so every URL took forever, until the engine gave up.

Cristobal_Johns
Employee

In a somewhat similar problem, with files not transferring, the customer even escalated the ticket. Check Point had determined that the problem was on the ISP side. The customer did not believe it: they had already asked the ISP whether it had made changes, and the ISP had sworn it had not.
Then, once tcpdump showed that the problem was on the ISP side, with a reset of the connection, the ISP admitted that it had implemented an improvement in its IPS software. They turned it off for our client and the problem went away.


More firewall fun:

fw-fun-1.PNG

 

fw-fun-2.PNG

 

genisis__
Advisor

Put a smile on my face! 

Vladimir
Champion

Without recounting most of what was written above, here is one from a few years ago:

The client stated that traffic going through a Check Point cluster to one of their multihomed AS/400s was being arbitrarily dropped.

As proof, they shared graphs from the Linux MTR (My traceroute) tool.

On the surface, it did look like CP was the culprit.

It ended up being a completely unrelated issue, but to prove that it was not the firewall, I had to create a mock environment. There I discovered that they were using MTR with its default TTL of 1 per hop, which was being decremented by 1 on each hop. Check Point's processing of traffic at iIoO was decrementing the TTL to 0, resulting in false positives for the tool only, not for the actual traffic.
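To make the TTL effect above easier to picture, here is a toy model in Python. It is only a sketch under loud assumptions: the path, the per-hop decrement values, and the function names are all made up for illustration, and it does not claim to reproduce what the gateway actually did in that case. It only shows how a traceroute-style tool (probes with small, increasing TTLs) can show a distorted picture while normal traffic with a large initial TTL is unaffected.

```python
from typing import List, Optional


def first_expiry_hop(probe_ttl: int, decrements: List[int]) -> Optional[int]:
    """Return the 1-based hop where the TTL reaches 0, or None if the packet gets through."""
    ttl = probe_ttl
    for hop, dec in enumerate(decrements, start=1):
        ttl -= dec
        if ttl <= 0:
            return hop  # this hop would answer with "TTL exceeded"
    return None


def trace(decrements: List[int]) -> List[Optional[int]]:
    """Send probes with TTL 1, 2, 3, ... and record which hop answers each one."""
    return [first_expiry_hop(ttl, decrements) for ttl in range(1, len(decrements) + 1)]


# Hypothetical 4-hop paths: each entry is how much that hop decrements the TTL.
normal_path = [1, 1, 1, 1]
odd_path = [1, 2, 1, 1]  # assumption: hop 2 decrements the TTL twice

print(trace(normal_path))              # [1, 2, 3, 4]
print(trace(odd_path))                 # [1, 2, 2, 3] -> hop 2 answers twice, hop 4 is never seen
print(first_expiry_hop(64, odd_path))  # None -> a normal packet (TTL 64) passes all hops
```

In this toy path the tool's view is skewed, while a packet with an ordinary TTL is delivered, which is the "false positive for the tool only" pattern.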

CP_MTR_Screenshot0.jpg

CP_MTR_Screenshot4.jpg

 

CP_MTR_Screenshot1.jpg

CP_MTR_Screenshot2.jpg

CP_MTR_Screenshot3.jpg

 

genisis__
Advisor

Nice!  I've seen something like this before as well, on a poorly written application.

Victor_MR
Employee

Good story!

Sometimes, it's quite difficult to reproduce an issue, test a device or create an identical lab without adding new conditions.

This reminds me of several examples where the customer was trying to test the throughput capacity of the gateways (not only the firewalls), and how difficult that is, especially when talking about a layer-7 security device.

I'll add something around this to the original list 🙂 Thanks for sharing!

Zerat
Explorer

In the memes section, you've missed the best one - I have it pinned to the board in my office:

dilbert_blame_the_firewall.jpg

If it's there, it must work. Hate to be beta-tester on GA
Timothy_Hall
Champion

A framed, full color copy of this Dilbert strip has been hanging in my ATC training room for years and is always good for a few laughs.

"Max Capture: Know Your Packets" Video Series
now available at http://www.maxpowerfirewalls.com
Tony_Graham
Contributor

Well, most times it seems to be something in between that botches the connection, but I have a good story...

I set up a demo computer in a conference room so a company could come in and do a 'dog and pony' show (yah, this was 'back in the day'). The computer was working perfectly, hardwired and ready to go (pre-wireless networks). I don't recall the version of Check Point at the time, probably whatever came just after 2.1c. So the vendor shows up with a team of like 4 people (why, I have no idea). They are really trying to give a hard sell. They try to pull up their website and it doesn't work: 404 Not Found. I get called back into the conference room. "It doesn't work!" Hmm, a quick perusal of websites says it's fine; it just doesn't work with their newfangled website. I tell them I have no idea. Well, that got them in a huff and they called me all manner of names, questioned my competence, etc. I just said, "I cannot help you, there is no problem with the machine and nothing is blocking the connection." Fast forward about 40 minutes: one of their people and my manager decide to try it on my boss's PC... it doesn't work there either. So the vendor gets on the line with her support... turns out they were trying to connect to a URL that their support team had dismantled. Oops. Yah, it wasn't the firewall!

Robert_OBrien
Participant

Like others have said, the biggest one we see is that the remote side isn't even listening on the port they claim is blocked.

_Val_
Admin

@Robert_OBrien nah, it is another firewall lol

Robert_OBrien
Participant

@_Val_ 

Sure is....the Windows firewall on the box.  🙂

 

Robin_H
Participant

IPS side effects.

Ten years ago a consultant asked me to configure all Lync (later Skype) relevant port objects without a protocol setting. No sooner said than done, disregarding the fact that we hadn't even enabled IPS/SmartDefense back then.

Recently a new external SBC and an internal analogue gateway (with a static NAT IP) were installed, and they wanted a fixed SIP TCP port and a few SIP UDP ports. This time I used the sip-tcp port object with protocol "SIP_TCP_PROTO", because the IPS activation project was coming along and IPS was already in monitoring mode. IPS is important and needs to be used as much as possible, doesn't it?

The calls didn't go through. Signaling happened, but there was no sound.

During a three-hour session, walking through different configurations within the VoIP devices, the SBC admins finally noticed that the firewall had replaced the IP address in the SIP message.
The outgoing message from the external SBC to our analogue gateway showed the public IP in the CALL-ID.
The internal gateway behind the firewall received the message with the CALL-ID containing the private IP of the analogue gateway.

Using a different port object without a protocol setting solved the issue.
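A rewrite like the one described above can be spotted faster by comparing the same SIP message as captured on both sides of the firewall. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the two sample messages and all addresses in them (documentation and private ranges) are invented, and repeated headers such as multiple Via lines are simply overwritten, so it is not a full SIP parser.

```python
from typing import Dict, Optional, Tuple


def parse_sip_headers(message: str) -> Dict[str, str]:
    """Parse the header block of a SIP message into {lower-cased name: value}."""
    head = message.replace("\r\n", "\n").split("\n\n", 1)[0]
    headers: Dict[str, str] = {}
    for line in head.split("\n")[1:]:  # skip the request/status line
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return headers


def changed_headers(sent: str, received: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the headers whose value differs between the two captures."""
    a, b = parse_sip_headers(sent), parse_sip_headers(received)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Invented sample captures: same INVITE before and after the firewall.
outside = (
    "INVITE sip:gateway@203.0.113.10 SIP/2.0\r\n"
    "Via: SIP/2.0/TCP 198.51.100.5:5060\r\n"
    "Call-ID: abc123@203.0.113.10\r\n"
    "CSeq: 1 INVITE\r\n\r\n"
)
inside = (
    "INVITE sip:gateway@203.0.113.10 SIP/2.0\r\n"
    "Via: SIP/2.0/TCP 198.51.100.5:5060\r\n"
    "Call-ID: abc123@192.168.10.20\r\n"
    "CSeq: 1 INVITE\r\n\r\n"
)

print(changed_headers(outside, inside))
# {'call-id': ('abc123@203.0.113.10', 'abc123@192.168.10.20')}
```

The message text would come from captures taken on the external and internal interfaces (for example with tcpdump or fw monitor), exported as plain text.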

( I never actually followed up on this with Check Point. Let me know if you think I should. )
