flachance
Advisor

Getting lots of time out on Internet access

We had an issue today where our Internet access was weird. A lot of sites would time out and only work eventually.

For example, on the active gateway I could ping 8.8.8.8 and it would work most of the time, but regularly we’d get 100% packet loss.

Looking at it with fw monitor, we could always see that the traffic was leaving the gateway.

 

About 3 weeks ago we replaced the hardware of our cluster gateways (open servers on HP ProLiant).

We also upgraded to R81.20 JHF Take 14.

 

As a test we tried forcing a failover to make sure there wasn’t an issue with the active gateway. It didn’t make a difference.

We then rebooted the now standby gateway and triggered another failover to make it active again. After that everything was working smoothly again.

 

Was it something else, and just a coincidence that everything started working again? I don’t know.

 

Is there anything we can look at and investigate to find out what was going on with the gateway?

If the issue starts again, what can we look at that could explain behavior like that?

 

Thanks

Francis

0 Kudos
14 Replies
the_rock
Legend

Hey Francis,

Personally, I would investigate any relevant logs from that time period, and also run cpview and check its history. For example, run cpview -t from expert mode, then press t and choose the time frame. That should give you some more details.
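
If it happens again, it may help to have exact timestamps of the loss so you know which time frame to pick in cpview. A rough sketch, assuming a Linux-style ping whose summary line contains "N% packet loss"; the function names and the log path are made up for illustration:

```shell
#!/bin/sh
# Extract the packet-loss percentage from a ping summary line read on stdin.
# Assumes the usual Linux wording "... N% packet loss ...".
loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Send $2 pings (default 5) to $1 and print "timestamp target loss%".
# Hypothetical helper; meant to be called in a loop while the issue is active.
probe() {
    target=$1
    count=${2:-5}
    loss=$(ping -c "$count" -W 1 "$target" 2>/dev/null | loss_pct)
    printf '%s %s %s%%\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$target" "${loss:-unknown}"
}

# Example (log path is an assumption):
# while true; do probe 8.8.8.8 5 >> /var/tmp/ping_probe.log; sleep 10; done
```

Running it from the gateway and from a host behind it at the same time could also show whether the loss is seen on both sides.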

Andy

biskit
Advisor

Did you check whether anything else is using the same IP as the firewall or the VIP? That sounds suspiciously like an ARP clash... What else besides the firewall is in the switch/VLAN that uplinks to the ISP router?

0 Kudos
flachance
Advisor

No duplicate IP. I'll check with the network guys whether there is something else on that VLAN, but there shouldn't be.

0 Kudos
genisis__
Leader

Long shot, but did you check the duplex settings on the open servers, just in case a duplex mismatch is causing the packet loss?

0 Kudos
flachance
Advisor

It was worth checking, but the duplex settings all match.

0 Kudos
the_rock
Legend

Were you able to get any more details from the logs at all?

Andy

0 Kudos
flachance
Advisor

No, I'm not seeing anything that stands out to me.

0 Kudos
the_rock
Legend

I have another suggestion... run the commands below on both members and check the time when this issue happened. Maybe you will get more details:

grep -i /var/log/messages* DOWN

grep -i /var/log/messages* CLUSTER

Andy

0 Kudos
the_rock
Legend

Sorry, this is the right syntax:

Andy

grep -i DOWN /var/log/messages* 

grep -i CLUSTER /var/log/messages* 

grep -i PNOTE /var/log/messages*
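
To narrow those greps to the time of the incident, something like the following might help. It is only a sketch: it assumes the classic syslog timestamp at the start of each line (for example 'Jan  5 14:'), and cluster_events_at is a hypothetical helper name, not a Check Point tool:

```shell
#!/bin/sh
# Print DOWN/CLUSTER/PNOTE lines whose timestamp starts with the given prefix.
# Usage: cluster_events_at 'PREFIX' FILE...
cluster_events_at() {
    prefix=$1
    shift
    # -h drops file names so the timestamp stays at the start of each line,
    # which lets the second grep anchor on it.
    grep -hiE 'DOWN|CLUSTER|PNOTE' "$@" | grep "^$prefix"
}

# Example (the date and hour are placeholders):
# cluster_events_at 'Jan  5 14:' /var/log/messages*
```

Note that single-digit days are padded with two spaces in that timestamp layout, so the prefix has to match it exactly.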

0 Kudos
flachance
Advisor

I don't see anything at the time of the issue. Everything has been fine since. I'm not even 100% sure the firewall was the cause. Hopefully it doesn't happen again.

0 Kudos
genisis__
Leader

What about the connections table limit? Again a long shot, but let's get these out of the way.
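
One way to look at connections table usage is the summary output of fw tab -t connections -s, which lists current (#VALS) and peak (#PEAK) entries. A small sketch that pulls those two numbers out; the header-based column lookup and the name conn_usage are assumptions:

```shell
#!/bin/sh
# Read the summary printed by `fw tab -t connections -s` on stdin and print
# the current and peak entry counts. Columns are located via the header row
# (#VALS, #PEAK) instead of being hard-coded.
conn_usage() {
    awk '
        # Until the header row is seen, look for the #VALS and #PEAK columns.
        !found {
            for (i = 1; i <= NF; i++) {
                if ($i == "#VALS") v = i
                if ($i == "#PEAK") p = i
            }
            if (v && p) found = 1
            next
        }
        # Data row for the connections table.
        $2 == "connections" { printf "current=%s peak=%s\n", $v, $p }
    '
}

# On a gateway, in expert mode:
# fw tab -t connections -s | conn_usage
```

A peak that sits close to whatever limit is in effect would be worth a closer look; this only reports the counters and does not show the limit itself.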

the_rock
Legend

That's a very good point, you never know...

0 Kudos
flachance
Advisor

Not sure on this one (how to see the limit). I have the maximum limit for concurrent connections set to Automatically. (Attachment: Capture.JPG)

0 Kudos
the_rock
Legend

That's good, leave it as is. It's the better option anyway, since the gateway calculates the number of connections on its own, based on CPU/memory load. By the way, I'm slightly concerned (though I want to be positive) that you may have cluster issues, considering the problem went away when you failed over.

Just to be sure, can you run the commands below on both members, to make sure all is good?

Andy

cphaprob state

cphaprob roles

cphaprob -a if

cphaprob list

cphaprob syncstat
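
To make the outputs from both members easy to compare later, they could be captured into a file with timestamps. A minimal sketch; the snapshot function and the output path are hypothetical:

```shell
#!/bin/sh
# Output file; the default path is an assumption, override with OUT=...
OUT=${OUT:-/var/tmp/cluster_snapshot_$(hostname).txt}

# Run each argument as a command and append its output (stdout and stderr)
# to $OUT under a timestamped header.
snapshot() {
    for cmd in "$@"; do
        printf '=== %s  [%s] ===\n' "$cmd" "$(date '+%Y-%m-%d %H:%M:%S')" >> "$OUT"
        sh -c "$cmd" >> "$OUT" 2>&1
        printf '\n' >> "$OUT"
    done
}

# On each member, in expert mode:
# snapshot 'cphaprob state' 'cphaprob roles' 'cphaprob -a if' 'cphaprob list' 'cphaprob syncstat'
```

Taking one snapshot while everything is healthy gives a baseline to compare against if the problem comes back.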

0 Kudos
