Hi (check)mates,
We all know that "the firewall" is one of the first things people blame when there is a traffic issue. A security gateway (a "firewall") does a lot of "intelligent stuff" beyond just routing traffic (and, in fact, many network devices today also do "a lot of things"), so I understand there is good reason to suspect the firewall. At the same time, there are many cases where the issue has nothing to do with it, or is only indirectly related.
I'd like to build a brief list of typical or fairly frequent issues we face, where "the firewall" is reported as the root cause but turns out not to be.
It's quite a generic topic, and in terms of troubleshooting it's probably even more generic. There are several simple tools that one should probably use first, such as traffic logs, fw monitor, tcpdump/cppcap, etcetera. But what I would like to focus on is not the troubleshooting, but the issues themselves. Of course, this assumes the firewall side is properly configured (a bad configuration would be a genuine "firewall issue").
To narrow the scope, I'm focusing especially on networking issues, but every idea is welcome.
Do you think it would be useful to put together such a list? 🙂 What issues do you usually find?
Something to start with
(I'll update this list with new suggested issues):
- A multicast issue with the switches, impacting the cluster behavior.
- A VLAN is not propagated to all the required switches involved in the cluster communication, especially in VSX environments where not all the VLANs are monitored by default.
- Related to remote access VPN (this year has been quite active in that respect): some device on the WAN side is blocking the ISAKMP packets on UDP 4500 directed to our gateways, but not all UDP 4500 traffic. Typically, another firewall 🤣
- Asymmetric routing issues, where the traffic goes through one member and comes back through the other member of the cluster.
- Static ARP entries in the "neighbor" routing devices, or ARP cache issues.
- Any kind of issue with Internet access: DNS queries not allowed to the Internet or to the corporate DNS servers (so we cannot resolve our public domains), TCP ports blocked, or a required URL blocked (typically by a proxy)...
- Traffic delays: these are typically more difficult to diagnose. fw monitor with timestamps is one of our friends here.
- Layer 1 (physical) issues. Don't forget to review the hardware interface counters!
- A missing route at the destination, especially for the encryption domains in a VPN.
- Why not: another firewall blocking the communication, of course 😊 Or a forgotten transparent layer-7 device in the middle (like an IPS), installed ages ago. This one may be a variant: "it's not my firewall".
- An application or server issue. The simplest example is that the server is not listening on the requested port. A more complex one would be an application layer issue.
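For that last item, here is a minimal Python sketch of a quick check, just as an illustration and not a replacement for the tools mentioned above. The function name `probe_tcp` and its return labels are my own assumptions. The idea is that a "refused" result usually means something answered with a TCP RST (often the server itself, with nothing listening on that port), while a "timeout" suggests the packets are being silently dropped somewhere on the path. Keep in mind that a device in the middle can also send resets, so treat the result as a hint, not proof.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    The labels are illustrative only (hypothetical helper).
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # An RST came back: typically nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: possibly dropped on the path.
        return "timeout"
    except OSError as exc:
        # Anything else: DNS failure, no route to host, and so on.
        return f"error: {exc}"
```

Run from a host on the same side as the client, something like `probe_tcp("server.example", 443)` (a placeholder name and port) gives a first idea of whether to look at the server or at the path before opening a case against the firewall.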
Lastly, a little humor. 😊