Kaspars_Zibarts
Employee

Spoofing group calculated with subnets from wrong interface

Before I jump in and raise an SR, just wondering if anyone has seen this (I can't seem to find any relevant SK).

One of our admins made a straightforward routing change on a plain firewall cluster (R80.10), a single route. Then they did "get interfaces with topology" and all hell broke loose at that site.

At this stage I can see from the audit logs that auto topology applied all subnets from interface "X" to the spoofing group on interface "Y". So the only affected interface is "Y" (since it has the full list of wrong subnets). Interface "X" has its own correct topology as per static routes. Put another way, the contents of the spoofing groups for the X and Y interfaces are 100% identical at the moment.

The only correlation between the two is that they have exactly opposite interface element indexes in the cluster member object on the corresponding gateways.

Really strange. If anyone can chip in with an idea, that would be great.

10 Replies
Vladimir
Champion

Kaspars, are you saying that the interfaces on cluster members are inverted?

I.e. Clustered interface External; Member1/eth0 ; Member2/eth1

      Clustered interface Internal ; Member1/eth1 ; Member2/eth0

If I understood your description accurately, and the above scenario is correct, it may be that Get Interfaces with Topology executes against the active cluster member, which may have changed since the initial ingestion.

Kaspars_Zibarts
Employee

Actually no. 

It's only one interface that's wrong. It has copied routes into its spoofing group not from its own interface but from a totally different one. In full.

The details of how it happened are unfortunately sketchy. It's a remote site managed locally, but the audit logs confirm that wrong subnets were injected into the group. Strange.

I'm still trying to replicate it in the lab..

Timothy_Hall
Legend

As a rule, I never select "Get Interfaces with Topology" unless it is a brand new firewall install and I'm positive that the routing table of the firewall is correct. Even then I still look it over carefully afterwards, and I even warn about this situation in my book although it is not directly performance-related. In my opinion, once a firewall is in production, one should always select "Get Interfaces" from that point on and manually configure antispoofing for any new interfaces, as the consequences of an improper topology calculation can be as dire as you regrettably discovered. Hopefully you know about this downtime-free trick I covered in my CPX presentation to disable firewall antispoofing on the fly if it is cut off from the SMS:

fw ctl set int fw_antispoofing_enabled 0
sim feature anti_spoofing off; fwaccel off; fwaccel on

Now that I've climbed off my soapbox: in your specific situation, do the two interfaces involved have relatively long interface names, or are they subinterfaces with a long VLAN tag appended? I'm wondering if perhaps the topology calculation has some kind of length limit for interface names. So for example, if the limit were a (purely hypothetical) 9 characters, it would confuse the first two interfaces in this list but the third one would be fine:

eth12.1001

eth12.1007

eth12.1011
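To make the hypothetical concrete, here is a minimal Python sketch of how a name-length limit would make two interfaces indistinguishable. The 9-character limit and the helper name `find_truncation_collisions` are purely illustrative assumptions; nothing here reflects how the topology calculation is actually implemented.

```python
from collections import defaultdict


def find_truncation_collisions(names, limit):
    """Group interface names that become identical when cut to `limit` characters."""
    groups = defaultdict(list)
    for name in names:
        # Keep only the first `limit` characters, as a length-limited field would.
        groups[name[:limit]].append(name)
    # A truncated key shared by more than one interface is a collision.
    return {key: members for key, members in groups.items() if len(members) > 1}


interfaces = ["eth12.1001", "eth12.1007", "eth12.1011"]
collisions = find_truncation_collisions(interfaces, limit=9)  # hypothetical limit
print(collisions)  # {'eth12.100': ['eth12.1001', 'eth12.1007']}
```

With a limit of 10 or more characters all three names stay distinct, so the check returns an empty dictionary.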

Only other thing I could possibly think of is some kind of VLSM or subnet overlap between the routes associated with the two interfaces?  Perhaps if you could post the ifconfig and routing details of the two interfaces it intermingled (redacted as needed of course) that might help.

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Kaspars_Zibarts
Employee

We were doing manual updates up until R77.30, but then it became a fine balance: "not that experienced" administrators either forgot to update spoofing after routing changes or did it the wrong way, so we ended up with outages anyway. So we moved to full auto mode after R77.30, and it has actually been working faultlessly for a couple of years. Till now, really. It is really beneficial for big routing / interface move type changes, as then you usually have to update hundreds of subnets and multiple groups manually. Get interfaces with topology does it in one click :) And we use autogenerated topology on all VSX platforms that have hundreds of routes and interfaces, so it felt like it should all be worked out by now.

I will update this thread once I have had time to do a full replica in the lab with all the steps that my admin took, to see if I can replicate the problem. I tried adding a dummy route yesterday (a similar action to the one that caused the original problem) and it worked just fine.

Tomer_Sole
Mentor

Would you consider Compliance Blade to assist in those cases? 

Kaspars_Zibarts
Employee

Jason_Dance
Collaborator

Unfortunately it's not free. You might want to talk to your SE for a trial if you're interested in looking at it further though.

-Jason

Tomer_Sole
Mentor

1 year evaluation license upon purchase of a new Management appliance, and then it's a subscription license. 

Kaspars_Zibarts
Employee

Just thought I'd update with the solution. Now that I know what the problem was, it's rather embarrassing.. :)

Since the site is managed by a local administrator, you fully rely on some basic stuff like keeping the routing table in sync. It was not the smallest table, with nearly 30 interfaces and 300 routes, many of those /32 host routes. And the site has gone through a major migration project in which VLANs and subnets were moved and relocated multiple times in a short span of time. Chuck in a HW/SW upgrade for the full picture.

It turned out one /28 route was missing from the secondary cluster member and one /32 had a different last octet on the two members.. So no wonder the spoofing calculation went nuts. After "syncing" up the routing tables, "get interfaces with topology" works like clockwork again.

Of course it's easy to say now that I should have diffed the routing tables first thing, but one must have some trust in others :) And in hindsight it would be nice if the topology calculation threw errors for cluster interfaces that do not have matching routes associated with them.

And before someone suggests it: yes, we could have used a cloning group, but that's a whole different discussion :)

So in a nutshell: do a diff on the routing tables when topology acquisition seems to fail to calculate correctly. Obviously this is going into our daily automated checks now.
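For anyone wanting to automate the same check, a rough Python sketch of the diff could look like the following. It assumes you have already exported each member's routing table as plain text with one route per line. The export step, the line format, and the addresses are all made-up placeholders, not real output from any gateway.

```python
def normalize(route_dump):
    """Return the set of non-empty route lines with whitespace collapsed."""
    return {" ".join(line.split()) for line in route_dump.splitlines() if line.strip()}


def diff_routes(member_a, member_b):
    """Return (routes only on A, routes only on B), each as a sorted list."""
    a, b = normalize(member_a), normalize(member_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical exports: the second member lacks a /28 and has a /32
# whose last octet differs from the first member's.
member1 = """
10.10.20.0/28 via 192.0.2.1
10.10.30.7/32 via 192.0.2.1
10.10.40.0/24 via 192.0.2.9
"""
member2 = """
10.10.30.8/32   via 192.0.2.1
10.10.40.0/24 via 192.0.2.9
"""

only_on_1, only_on_2 = diff_routes(member1, member2)
print("only on member 1:", only_on_1)
print("only on member 2:", only_on_2)
```

Anything showing up in either list means the members disagree, which is worth fixing before running the topology fetch. Comparing whole lines keeps the sketch independent of the exact export format, at the price of flagging cosmetic differences too.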

Vladimir
Champion

I'll take a pound of the fresh hindsight :)
