Jonathan Lebowitsch

Not web-based, not proxied traffic through an Autoscaling Group

Discussion created by Jonathan Lebowitsch Employee on Jun 9, 2017
Latest reply on Oct 5, 2017 by Vladimir Yakovlev

As you may know, the normal, documented deployment of an autoscaling group (in AWS and elsewhere) is always sandwiched between load balancers.

For egress traffic, the documented way is to use an internal load balancer as a front-end proxy. The ELB (which is deployed as part of the CloudFormation template) is configured to load-balance across the ASG. The ASG is configured to act as an HTTP/HTTPS proxy. Customers then have to point their web clients (browsers, wget, curl, Windows Update, etc.) at that ELB as their proxy, and the ELB forwards the traffic to the right instance in the vSEC ASG.
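To make the client side of this concrete, here is a minimal sketch of pointing a Python HTTP client at the internal ELB as its proxy. The ELB DNS name and port below are hypothetical placeholders, not values from the template; substitute whatever your deployment exposes.

```python
import urllib.request

# Hypothetical internal ELB DNS name and proxy port (assumptions for illustration).
ELB_PROXY = "http://internal-vsec-elb.example.internal:8080"


def build_proxied_opener(proxy_url: str = ELB_PROXY) -> urllib.request.OpenerDirector:
    """Build an opener that sends both HTTP and HTTPS requests via the ELB."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxied_opener()
# opener.open("https://example.com") would now be forwarded by the ELB
# to one of the instances in the ASG.
```

Every client has to carry an equivalent setting, which is the operational burden the rest of this post tries to avoid.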


The challenge is that, not infrequently, some egress traffic is not web-based, or it is operationally challenging to enforce proxy settings. In such cases it would be beneficial to use a more conventional, routing-based approach to make sure that outbound traffic is inspected by some member of the ASG.


There are two approaches one can take here. The first is to "reserve" some instances of the ASG so that they won't be "scaled in", and then manually set routes against these specific instances, making them the default gateway for internet-bound traffic. While this works, it does not provide any measure of resiliency or high availability: if a reserved instance fails, the routes pointing at it are not moved anywhere else.
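As a rough sketch of what the first approach could look like with boto3, assuming that "reserving" is done through scale-in protection (one possible mechanism, not necessarily the one you would choose): the function below takes already-created `autoscaling` and `ec2` clients, and the function name and its parameters are hypothetical.

```python
from typing import Iterable


def pin_default_gateway(autoscaling, ec2, asg_name: str, instance_id: str,
                        route_table_ids: Iterable[str],
                        cidr: str = "0.0.0.0/0") -> None:
    """Protect one ASG member from scale-in and route traffic through it.

    `autoscaling` and `ec2` are expected to be boto3 clients, e.g.
    boto3.client("autoscaling") and boto3.client("ec2").
    """
    # Keep the ASG from terminating this instance on scale-in.
    autoscaling.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ProtectedFromScaleIn=True,
    )
    # Point the default route of each table at the protected instance.
    # replace_route assumes a route for `cidr` already exists in the table;
    # create_route would be needed otherwise.
    for rtb_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=rtb_id,
            DestinationCidrBlock=cidr,
            InstanceId=instance_id,
        )
```

Nothing here reacts to the instance becoming unhealthy, which is exactly the gap the second approach addresses.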


The second approach is what this post is about. Below you'll find a link to a PoC of a Lambda function, plus some instructions on how to set the trigger for this function. When deployed properly, the Lambda function listens to notifications and alerts about the ASG and, in response, automatically maintains an optimized mapping of route tables to active gateways. This allows hands-free use of members of the ASG as default gateways, so the ASG can protect any type of outbound traffic, not only proxied web traffic. As currently implemented, whenever a member of the ASG becomes unavailable or "OutOfService" according to the ELB, the code finds all the route tables that used this gateway and moves them to another, healthy gateway.
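The remapping step described above can be illustrated with a small, self-contained sketch. This is not the PoC code: the function name, the input shapes, and the "least-loaded healthy gateway" placement rule are all assumptions made for illustration.

```python
from typing import Dict, Iterable


def rebalance_routes(routes: Dict[str, str], healthy: Iterable[str]) -> Dict[str, str]:
    """Return {route_table_id: new_gateway_id} for tables that must move.

    `routes` maps each route table to the gateway instance it currently
    uses; `healthy` lists the instances currently considered in service.
    """
    healthy_ids = sorted(set(healthy))
    if not healthy_ids:
        raise RuntimeError("no healthy gateway available")

    # Count how many route tables each healthy gateway already serves.
    load = {gw: 0 for gw in healthy_ids}
    for gw in routes.values():
        if gw in load:
            load[gw] += 1

    changes: Dict[str, str] = {}
    for rtb_id in sorted(routes):
        if routes[rtb_id] in load:
            continue  # current gateway is healthy, leave the route alone
        # Assumed policy: move to the healthy gateway with the fewest tables,
        # breaking ties by instance ID so the result is deterministic.
        target = min(healthy_ids, key=lambda gw: (load[gw], gw))
        changes[rtb_id] = target
        load[target] += 1
    return changes


plan = rebalance_routes(
    {"rtb-a": "i-1", "rtb-b": "i-2", "rtb-c": "i-2", "rtb-d": "i-3"},
    healthy=["i-1", "i-3"],
)
# plan == {"rtb-b": "i-1", "rtb-c": "i-3"}
```

In a real Lambda handler, each entry of the resulting plan would then be applied to the VPC, for example with the EC2 ReplaceRoute API, and the list of healthy instances would come from the ELB health state.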


The PoC code and deployment instructions are available here.


Note 1: This approach could be combined with other Lambda-based solutions that keep Route 53 synchronized with the ASG, so as to also allow non-TCP ingress traffic to flow through the ASG.


Note 2: This code has not been extensively tested. I'd recommend additional testing before you use it in production.