hti
Participant

Lightspeed accelerated connection problems after R81.20 JHF Take 111

Hello,

Is anybody else experiencing issues with Lightspeed-accelerated connections after upgrading to R81.20 JHF Take 111 or 113?

We had been running R81.20 JHF Take 98 with some custom hotfixes on a QLS450 and it was quite stable. After upgrading to JHF Take 111, Lightspeed-accelerated connections no longer work. They are not reported as dropped; the packets simply disappear.

To rule out any issues with the custom hotfixes, we performed a clean install of R81.20 JHF Take 113 and still have the same problems. With low latency mode disabled, far fewer connections are Lightspeed accelerated and the packet loss is reduced. The log files show no error messages.

Check Point support has been unable to reproduce the issue so far, but since it affects a clean install with only the firewall blade enabled, I suspect many more customers are affected.

Any help is appreciated.

Best regards,
hti

11 Replies
PhoneBoy
Admin

That sounds like a possible UPPAK-related issue.
In any case, keep us posted.

Lesley
MVP Gold

Can you share the fwaccel stat output?

Are you running it as a security gateway, Maestro, VSX, or VSNext?

The QLS450 needs a special image; did you take this into account?

Why install Take 113 when Take 118 is now the recommended one?

Any crash dumps in /var/log/dump/usermode/?

-------
Please press "Accept as Solution" if my post solved it 🙂
hti
Participant

Hi Lesley,

It's hard to gather the fwaccel stat output while the gateway is active, as we lose connectivity when doing so. We will capture the fwaccel stat output from the serial console during our next maintenance window.

The QLS450 is running as a security gateway with only the firewall blade active.

We did indeed use the wrong Blink image for the clean install. Thanks for pointing this out! We will try again using the correct image.

Take 113 was the recommended take a few weeks ago when we reimaged the appliance. We will upgrade to take 118 after the clean install with the correct image.

No crash dumps in /var/log/dump/usermode/

Lesley
MVP Gold

OK, for the crash files I was thinking about: https://support.checkpoint.com/results/sk/sk180298

I don't see any relevant fixes for this issue in Take 118.

It might also be worth running hcp -r all while the issue is occurring (if possible).

Is all traffic affected or only VPN traffic?

Extra tip: if you are troubleshooting and see high load in top, uptime, etc., this is normal -> https://support.checkpoint.com/results/sk/sk180299

Take 98 was not a good take for you; it had a critical bug that was fixed in your current, higher take. Maybe that fix is what is affecting you now:

https://support.checkpoint.com/results/sk/sk183181

 

-------
Please press "Accept as Solution" if my post solved it 🙂
hti
Participant

Hi Lesley,

TAC said that a special image for QLS appliances is no longer needed. It was only necessary for R81.10; for R81.20 the normal Blink images can be used, as there is no difference.

the_rock
MVP Gold

Did you open a TAC case for it?

Best,
Andy
the_rock
MVP Gold

Please update the thread when it gets solved.

Best,
Andy
hti
Participant

Hi all,

Last week we received a private hotfix for JHF Take 113. We have been monitoring the new patch closely and it seems to have completely solved our issues. Another customer had the same problem, and the hotfix provided for them also worked in our environment.

Issue description:

It was caused by the "peer_bind_done_map", which was not being set for the correct peer. As a result, the unbind was not performed when the interface was brought down. The next time the interface was brought up, the bind would fail, disrupting inter-port hair-pinned traffic. The fix corrects the code so that peer_bind_done_map is set on the correct index, the one that maps to the peer interface.
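
For anyone trying to picture the failure mode, here is a toy Python model of it. This is only a sketch of my understanding of the description above, not Check Point code: apart from the name peer_bind_done_map, everything in it (the PortPair class, the hw_bound set, the two-port layout, the method names) is a hypothetical stand-in.

```python
class PortPair:
    """Toy model of two ports that hairpin traffic to each other.

    Hypothetical illustration only; the real driver logic is not public.
    """

    def __init__(self, buggy: bool):
        self.buggy = buggy
        self.peer_of = {0: 1, 1: 0}      # port index -> peer port index
        self.peer_bind_done_map = 0      # bitmap: bit i set => bind done for peer i
        self.hw_bound = set()            # peers with a live binding in "hardware"

    def interface_up(self, port: int) -> bool:
        """Bind to the peer; return False if a stale binding blocks it."""
        peer = self.peer_of[port]
        if peer in self.hw_bound:
            return False                 # earlier binding was never removed
        self.hw_bound.add(peer)
        # Buggy variant records the bind at the wrong index (own port),
        # fixed variant records it at the index of the peer interface.
        index = port if self.buggy else peer
        self.peer_bind_done_map |= 1 << index
        return True

    def interface_down(self, port: int) -> None:
        """Unbind from the peer, but only if the map says a bind was done."""
        peer = self.peer_of[port]
        if self.peer_bind_done_map & (1 << peer):
            self.hw_bound.discard(peer)
            self.peer_bind_done_map &= ~(1 << peer)


def rebind_works_after_flap(buggy: bool) -> bool:
    """Bring port 0 up, down, and up again; report whether the rebind works."""
    pair = PortPair(buggy)
    pair.interface_up(0)
    pair.interface_down(0)
    return pair.interface_up(0)


if __name__ == "__main__":
    print("buggy rebind ok:", rebind_works_after_flap(True))
    print("fixed rebind ok:", rebind_works_after_flap(False))
```

In the buggy variant the flag lands on the wrong bit, so the down handler skips the unbind and the second bind fails. With the flag on the peer's index, the interface flap is clean.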

the_rock
MVP Gold

Excellent, thanks for the update!

Best,
Andy
hti
Participant

The fix PRHF-41893 has been included in JHF Take 119 released today.

the_rock
MVP Gold

Indeed.

PRJ-63769, PRHF-41893 (SecureXL): When VLAN interfaces are created on top of bond interfaces configured for Load Sharing, connections may not be hardware accelerated if the bond uses multiple ports from the same physical NIC.

Best,
Andy