What happens to long-term (long open) connections on a Scale In event?
What happens to long-term (long open) connections on a Scale In event, where the connections are being handled by the gateway marked for termination, which is then terminated?
I checked the documentation and it is not clear to me.
This is what they have in there right now:
“Scale In
A scale in event occurs as a result of a decrease of the current load. When a scale in event triggers, Azure Autoscale designates one or more of the gateways as candidates for termination. The External Load Balancer stops forwarding new connections to these gateways, and Autoscale ends them. The Check Point Security Management Server detects that these CloudGuard Network Security Security Gateways are stopped and automatically deletes these gateways from its database.
Note - We recommend that you have at least two Security Gateways for redundancy and availability purposes.”
This is what I have sent in as feedback:
“This sentence does not seem to make sense:
" The External Load Balancer stops forwarding new connections to these gateways, and Autoscale ends them. "
It would help to understand the Azure and Check Point behaviour with regard to connection handling during Scale In events and on deleted gateways.
One detail missing is the handling of long-term connections by the deleted gateway, and the connection possibly moving to another gateway when there is no synchronisation within the VMSS group.”
I wonder what is meant by "Autoscale ends them".
Can't test this now.
Any feedback or shared experience appreciated.
Don
The way I read this is: they die because the load balancer won’t forward the packets to a different gateway.
Even if the load balancer did, we don’t sync state information between gateways in this situation.
"we don’t sync state information between gateways in this situation" - Agreed. Done by design.
"they die because the load balancer won’t forward the packets to a different gateway."
Thoughts:
Azure seems to have some options which I need to look into.
This one does not seem to be well described. I can't find anything on it in their docs:
'Apply force delete to scale-in operations' (also see attachment/screenshot)
This one looks interesting, but would it work for CloudGuard?
Maybe there is an Azure VMSS best practice or ATRG that I have missed, or maybe they don't exist yet but should 😉
Nothing CloudGuard in here:
https://support.checkpoint.com/results/sk/sk111303
The Admin Guide has lots of useful info, but the Scale In section doesn't seem to have enough detail.
https://sc1.checkpoint.com/documents/IaaS/WebAdminGuides/EN/CP_VMSS_for_Azure/Content/Topics-Azure-V...
Maybe the scale-in policy can be configured to allow connections to drain for a limited time (a rough sketch of the relevant settings is below).
But it would be good to hear from R&D on this.
This is good info too:
Autoscaling guidance - Best practices for cloud applications | Microsoft Learn
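A rough sketch of where those scale-in settings live, assuming a hypothetical VMSS called cgns-vmss in resource group rg-cgns. I have not verified that the portal's 'Apply force delete to scale-in operations' toggle maps to scaleInPolicy.forceDeletion, but that is what the property name suggests, so treat this as a starting point rather than a recipe:
# Show the current scale-in policy, including any force-delete setting
az vmss show -g rg-cgns -n cgns-vmss --query scaleInPolicy
# Choose which instances Azure removes first (Default, NewestVM or OldestVM)
az vmss update -g rg-cgns -n cgns-vmss --set scaleInPolicy.rules='["OldestVM"]'
# Presumably the force-delete toggle, if the property mapping above is right
az vmss update -g rg-cgns -n cgns-vmss --set scaleInPolicy.forceDeletion=false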
Cheers,
Don
Hi @Don_Paterson - TCP flows and connection draining are all based on the standard Azure Load Balancer (az lb) health probe function. For example, if the az lb health probe that is configured for the backend pool marks a gateway as unhealthy, then the default TCP timeout is 60 seconds. UDP flows would immediately move to a healthy gateway.
https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-custom-probe-overview
A probe down signal always allows TCP flows to continue until idle timeout or connection closure in a Standard Load Balancer.
In order to ensure a timely response is received, health probes have built-in timeouts. The following are the timeout durations for TCP and HTTP/S probes:
- TCP probe timeout duration: 60 seconds
- HTTP/S probe timeout duration: 30 seconds (60 seconds for establishing a connection)
Additional TCP flow timer information:
Azure Load Balancer has an idle timeout range of 4 minutes to 100 minutes for Load Balancer rules, Outbound Rules, and Inbound NAT rules.
By default, it's set to 4 minutes. If a period of inactivity is longer than the timeout value, there's no guarantee that the TCP or HTTP session is maintained between the client and your cloud service.
https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-tcp-reset
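For anyone who wants to check their own setup, the idle timeout on the load balancing rules can be inspected and adjusted roughly like this. The resource group, load balancer, and rule names below are placeholders, and the parameter names are worth verifying against the current az CLI:
# List the rules with their idle timeout (minutes) and TCP reset setting
az network lb rule list -g rg-cgns --lb-name cgns-elb \
  --query "[].{name:name, idle:idleTimeoutInMinutes, tcpReset:enableTcpReset}" -o table
# Raise the idle timeout on one rule and send a TCP RST when it expires
az network lb rule update -g rg-cgns --lb-name cgns-elb -n fw-traffic-rule \
  --idle-timeout 30 --enable-tcp-reset true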
HTH
That is great info, thanks Bryan!
Adding a note here after feedback from R&D via Gil Frantsus:
"This is the information we received from RnD: The Azure Load Balancer does not support connection draining, which means that the connection will be lost, however, the Azure Application Gateway does support it.
The use of a Gateway Load Balancer is supported and mentioned in the Azure VMSS admin guide.
Refer to sk170304 for instructions on how to enable connection draining within CloudGuard Network Security. I added the SK to the Azure VMSS admin guide.
For Azure Application Gateway with connection draining support refer to https://learn.microsoft.com/en-us/azure/application-gateway/features."
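For what it's worth, connection draining on an Application Gateway backend HTTP setting can apparently be enabled along these lines. The gateway and setting names are placeholders, the timeout is in seconds (0 disables draining), and the flag name should be double-checked against the current az CLI:
# Enable connection draining on a backend HTTP setting (hypothetical names)
az network application-gateway http-settings update -g rg-cgns \
  --gateway-name cgns-appgw -n cgns-backend-http-settings \
  --connection-draining-timeout 60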
Preview of sk170304:
"Solution
If you would like this feature to be added to the Azure load balancer, contact Microsoft or your Microsoft partner and request it."
"fw tab -t connections -s
fw ctl get int cloud_balancer_port
fw ctl set int cloud_balancer_port 0
fw tab -t connections -s"