Vladimir
Champion

Traffic is originating from a VS with the VSX internal communication address

And no, I do not have 192.168.196.0/24 anywhere on my network, which is the scenario that sk101448, "Traffic is originating from a VS (virtual system) with the VSX internal communication address", describes.

One of my clients has just reported seeing the same issue in his environment. I've spun up a VSX VSLS cluster on ESXi to see what's what, and I am seeing the same thing (each VS is active on a different cluster member):

and from a practically identical VS on the same cluster:

At the end of the previously mentioned sk, there is a workaround suggesting:

"If the 192.168.196.0/24 network can not be removed from the local network, then the 192.168.196.0/24 network must be removed from the NAT policy"

 

But I'd like to hear about the implications of doing it before trying it out.
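
For anyone who wants to rule out the overlap the sk talks about before touching the NAT policy, here is a minimal sketch using Python's standard `ipaddress` module. The list of local networks is hypothetical and only there for illustration; substitute the networks actually defined in your topology.

```python
import ipaddress

# Network named in sk101448, as quoted above.
VSX_INTERNAL_NET = ipaddress.ip_network("192.168.196.0/24")

def overlapping(networks):
    """Return the networks (as strings) that overlap the VSX internal range."""
    return [n for n in networks
            if ipaddress.ip_network(n).overlaps(VSX_INTERNAL_NET)]

# Hypothetical local networks, for illustration only.
local_networks = ["10.10.10.0/24", "192.168.0.0/16", "172.16.0.0/12"]
print(overlapping(local_networks))
```

An empty result would suggest the first condition in the sk does not apply, which matches what I see in my lab.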

27 Replies
Kaspars_Zibarts
Employee

From memory (and I might be wrong, will check tomorrow), to get around this I had to add an explicit rule with the vs2 object as the source and the appropriate destination and port. What does your rule #1 look like?

Vladimir
Champion

Hi Kaspars.

Here it is:

Both VS' external ports are connected to the same VSWitch1.

Kaspars_Zibarts
Employee

Total bummer: log indexing crashed at some point last night or yesterday on the MLM, so it's on the back foot at the moment trying to catch up with logs :( That means I can't compare to mine, and the lab is dead at the moment too as I'm upgrading mgmt to R80.20... I'll check next week :)

Vladimir
Champion

Ouch! Good luck with recovery and the upgrade!

Are you doing export/import from R80.10 to R80.20?

I've done one recently in a non-MDS environment.

So far so good, but that is a brand new infrastructure and the load on it is very light at the moment.

Kaspars_Zibarts
Employee

Actually, I just completed a full upgrade (primary/secondary MDS + MLM, 20 domains) using CPUSE. It took quite long, but that's purely down to my lab ESX I/O speed and CPU limitations. Tested by pushing topologies and policies to 20-odd VSes on two different VSX clusters, looking good so far :) Will comment on the upgrade documentation soon :)

Vladimir
Champion

Knock on wood that this will become the way to do upgrades in the future.

CPUSE does work awesome, especially compared to the mess we had in the pre-R77 years.

Kaspars_Zibarts
Employee

Same in my VSX. Traffic originates from 192.168 and is NATed behind the interface IP when leaving the box. Is it causing issues for your customer? Because if it isn't, then there's nothing to worry about :)

Vladimir
Champion

I am trying to sort their issues out.

To the best of my understanding, they had an issue with one of the VSes in VSLS not failing over properly from one unit to the other.

They have rebuilt the VS and are now observing this behavior. I think the VS itself is working now, but I will have to verify.

They were just a bit stumped by this NAT action and, frankly, I am as well. Searching the KB does not really return anything meaningful.

The thing is, these guys are relying on dynamic routing, and I am a bit concerned about how that will behave if additional NAT is introduced on traffic from a VS that also participates in OSPF.

Check Point guys and girls, please chime in with an explanation of the VSX VSLS implied XLATE behavior for virtual systems and its possible impact on dynamic routing.

Thank you,

Vladimir

Kaspars_Zibarts
Employee

But these internal VSX subnets (192.168) should not be distributed anywhere or be part of anything. NAT does the correct thing and hides them. Why would you want to take it away?

You only want to see internal nets between nodes for ClusterXL; that should be a straight L2 VLAN.

But then again, we never run dynamic routing on VSX :) so it's hard to comment.

Vladimir
Champion

At this point, I'm thinking that it is working as designed and am not inclined to do anything about it, but simply would like to understand the mechanism of what is going on in the background better. 

Vladimir
Champion

By the way, if all of them had behaved identically, I could dig that, but as you can see in my screenshots, the source for one of the VSes is the assigned IP, and for the other it is the internal network address.

Perhaps there is a logic behind it, but I have no idea what it is.

Maarten_Sjouw
Champion

The screenshots show logs from the same VS2, but the direction is swapped, so yes, this is what you would expect.

In the first, your ping is from VS2 to VS1, with source NAT on VS2.

In the second, you're doing a ping from VS1 to VS2, with destination NAT on VS2.

If you looked at the same log entries from VS1, you would see the opposite.

Regards, Maarten

Vladimir
Champion

Thank you for pointing this out! 

For a second there, I thought that it all makes sense. Alas, it still does not.

As you mentioned, when the logs are viewed from each Origin VS, Xlate (NAT) seems to be working correctly:

From VS1:

and from VS2:

But when I've pinged the external host from each VS, this is what I am seeing:

and:

Unless I am reading this wrong, there is a discrepancy in Source and Origin data in the logs.

Maarten_Sjouw
Champion

They all look perfectly fine to me. On the VS that originates the ping you will always see source NAT; this is the same whether you ping another VS or any other host.

When you ping another VS, you will see the real IP (172.31.196.x) being NATted to the VS official IP (the 10.10.10.x in the sample above) before it leaves the first VS. On entry to the second VS, the official IP 10.10.10.y is destination NATted back to the real IP 172.31.196.y.

Let's assume you have these IPs:

VS1                           VS2                           Host
real IP         VS-IP         real IP         VS-IP         IP
172.31.196.101  10.10.10.1    172.31.196.102  10.10.10.2    10.10.10.10

Now we start a ping from VS1 to VS2:

Step 1   172.31.196.101  -->  10.10.10.2        from kernel to outbound interface
Step 2   10.10.10.1      -->  10.10.10.2        from interface to 'cable' leaving VS1
Step 3   10.10.10.1      -->  10.10.10.2        from 'cable' to interface entering VS2
Step 4   10.10.10.1      -->  172.31.196.102    from inbound interface to kernel

Now we start a ping from VS1 to Host:

Step 1   172.31.196.101  -->  10.10.10.10       from kernel to outbound interface
Step 2   10.10.10.1      -->  10.10.10.10       from interface to 'cable' leaving VS1
Step 3   10.10.10.1      -->  10.10.10.10       from 'cable' to interface entering Host
Step 4   10.10.10.1      -->  10.10.10.10       from inbound interface to kernel

I hope this makes the picture complete and more understandable: when you go to an external host, the 4th step is not NATted.
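
The two walkthroughs can also be sketched as a small Python model. This is only an illustration of the four steps as described in this post, not how the VSX kernel actually implements NAT; the `VS` class and `trace_ping` function are hypothetical names, and the addresses are the sample ones from the table.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class VS:
    name: str
    real_ip: str  # internal "real" address from the sample table
    vs_ip: str    # official address seen on the wire

def trace_ping(src: VS, dst_ip: str, systems: List[VS]) -> List[Tuple[str, str]]:
    """Return the (source, destination) pair seen at each of the four steps."""
    # The target counts as a VS only if its official IP matches dst_ip.
    dst_vs = next((v for v in systems if v.vs_ip == dst_ip), None)
    final_dst = dst_vs.real_ip if dst_vs else dst_ip
    return [
        (src.real_ip, dst_ip),   # step 1: kernel to outbound interface
        (src.vs_ip, dst_ip),     # step 2: after source NAT, leaving the VS
        (src.vs_ip, dst_ip),     # step 3: arriving at the target interface
        (src.vs_ip, final_dst),  # step 4: destination NAT only if target is a VS
    ]

vs1 = VS("VS1", "172.31.196.101", "10.10.10.1")
vs2 = VS("VS2", "172.31.196.102", "10.10.10.2")

for target in ("10.10.10.2", "10.10.10.10"):
    print("ping from VS1 to", target)
    for step, (s, d) in enumerate(trace_ping(vs1, target, [vs1, vs2]), 1):
        print("  Step", step, s, "-->", d)
```

The only difference between the two traces is step 4, which is the point of the example.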

To make it all even crazier: when both VSes run on the same host, you would see both of them with the IP 172.31.196.101, because a VS uses 172.31.196.101 on host 1 and 172.31.196.102 on host 2.

Regards, Maarten

Vladimir
Champion

Maarten,

I think you meant to write 10.10.10.10 on the right side of your example above in the "Now we start a ping from VS1 to Host" section. Please correct it for others to follow and for my poor brain not to get even more twisted over this :)

I still do not comprehend why source VSX_VS2 is getting source NATed to the IP of VS1:

I'll regroup and try to look at it tomorrow again.

Thank you for trying to help me to get to the bottom of this.

Maarten_Sjouw
Champion

Previous post updated. 😉

I think this is due to the confusing naming you used and the fact that, as I mentioned, the real IPs are reused in different VSes, so you can be led to believe that the other VS is handling the traffic.

What I just wrote may not make sense, but I have seen this before, and it should not happen.

Regards, Maarten

Vladimir
Champion

OK then, just to nail it down, I'll rename the VSes to VSA and VSB and rerun the tests.

Pretty much, if the real IPs are being re-used dynamically by the VSX within clusters, we cannot rely on those at all for troubleshooting based on baseline data.

I would really like for someone in Check Point VSX R&D to get involved and explain the logic of this to us.

Dameon Welch-Abernathy, any chance you can rattle some cages to make it happen?

PhoneBoy
Admin

The VS internal communication addresses generally shouldn't show up like this (to the best of my knowledge).

We probably need a TAC case.

Vladimir
Champion

My client, who has encountered this in production, should have an SR open.

I am working in the lab environment with no contracts.

Speaking of which: please let me know if you have figured out how to swing the accounts to another UC without the loss of content, history, and brag factors.

If I recall correctly, when you've tried doing something like that for Heiko, something didn't go as intended.

I will probably be better off moving to my own account, since I do carry a support contract.

PhoneBoy
Admin

I have swapped accounts for people in the past, it just has to be coordinated.

Contact me offline :)

Vladimir
Champion

OK. I am calling a BUG:

Environment consists of 2 VSX units in VSLS deployed on ESXi.

R80.20 management with R77.30 VSX.

VSA   10.10.10.101

VSB   10.10.10.102

VSC   10.10.10.103

In this example, a ping from VSB to VSC, I am seeing the Xlate (NAT) Destination IP labeled as VSX1_VSA.

If the internal network IPs are actually rotated and used dynamically, then it may make sense not to display the name of the VSX host and VS in the NAT properties at all when those cannot be attributed correctly.

Kaspars_Zibarts
Employee

I'm getting consistent results from all VSes. VSX running R80.10, managed by R80.10 MDS.

Traffic always originates from "internal" VSX subnet and is translated to real interface IP

Ping from VS to VS

Ping to another IP

Vladimir
Champion

Perhaps VSX R80.10 behaves differently, but you see my results above with R77.30. Would be nice to get CP input on it.

PhoneBoy
Admin

Traffic should never be originating from those IPs.

Someone else I know saw a similar issue when they upgraded to R80.20 (don’t know the specifics).

We probably need a TAC case to gather the relevant debug.

Simon_Taylor
Contributor

This resembles a case we had on R77.10 which was resolved by TAC by adding the following:

  • fwha_enable_state_machine_by_vs=1
  • fwx_old_icmp_nat=0

Perhaps this can fix your issue?

Vladimir
Champion

Thank you Simon!

I'll give it a shot tomorrow and will let you know.

Kaspars_Zibarts
Employee

From my notes, they are not applicable anymore :)

for old_icmp_nat

for state machine

I've removed them from our fwkern 
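
If you want to check whether either of the two parameters is still set somewhere, a minimal Python sketch along these lines could help. It assumes the settings are stored as plain `name=value` lines, as they were written earlier in this thread; the `find_params` helper and the sample text are hypothetical, and you would feed it the contents of your own fwkern file.

```python
# Parameter names exactly as quoted earlier in the thread.
PARAMS = ("fwha_enable_state_machine_by_vs", "fwx_old_icmp_nat")

def find_params(conf_text, names=PARAMS):
    """Return {name: value} for each listed parameter found in conf_text."""
    found = {}
    for line in conf_text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip lines that are not name=value pairs or name other parameters.
        if sep and key.strip() in names:
            found[key.strip()] = value.strip()
    return found

# Hypothetical file contents, for illustration only.
sample = """fwha_enable_state_machine_by_vs=1
fwx_old_icmp_nat=0
"""
print(find_params(sample))
```

An empty dictionary would mean neither parameter is present in the text you passed in.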
