Outbound SSL Inspection: A war story

FedericoMeiners · ‎2019-07-20

Hello everyone,

I'd like to share my experience and concerns regarding the current state of SSL Inspection, both in general and with Check Point. I'd also like to hear how you approach this matter.

Important: This post is strictly about Outbound SSL Inspection; my experience with inbound inspection on Check Point is limited.

I'll start with a vendor-agnostic statement. One of the things that bothers me most about this subject is that many people state that they have this technology implemented in their organizations and have no issues at all. After meeting those organizations, I found one of the following:

- They have 10 users at most.
- Their SSL Inspection technology is shady: SSL Inspection is enabled, but there is a working and HUGE fail open at the bottom of the engine. If something may cause issues, the connection is accepted, the log shows that the packet was "Inspected", and everyone is happy.
- They think they have SSL Inspection but don't have it enabled; instead they have a feature similar to Categorize HTTPS sites.

My personal experience

We all know the importance of inspecting HTTPS traffic in our networks for visibility and security, but at times I feel like this way of thinking is my InfoSec persona speaking, the one that only gives woo-woo advice to the business regarding security and doesn't even know how to install antivirus software. Why? Because properly implementing this solution is just painful. I also know that most customers don't even want to dive into this matter and prefer to leave this layer of security to their endpoint solutions. Even if it's not the same, I respect that.

Among our customers we have one in particular that is very tech savvy, has 900+ users, and likes to turn on all the features in their NGFW. One of the requirements was to enable full outbound SSL inspection on the Check Point firewall. We started with this customer years ago with R77.30 on the gateways, so you can imagine that I've been through all kinds of fun experiences with SSL Inspection.

We have fully tested SSL Inspection in the following versions: R77.30 / R80.10 / R80.30

During this journey we went through a lot of information: the many awesome posts here in CheckMates, SKs, SRs with the TAC, you name it.

Issues that we faced

The following is a summarized list of issues that we had.

- Heavy performance issues in R77.30 (fixed in R80.10+).
- Many pages fail to load, or load only some of the time.
- Many pages work until you enter a specific section, such as a login page, and then fail.
- The Sophos Antivirus solution doesn't work, for either installation or updates. There are some posts about this issue.
- AWS Connectors failing (solved in R80.30).

The issue with Check Point and Outbound SSL inspection

I really like Check Point firewalls, but sometimes I feel that they take control away from the administrator. One of the first things we did was to disable all the options that drop connections that don't follow the RFC line by line, and to allow connections with untrusted certificates. We tried it all, and we still faced a lot of issues.

Bypassing the connection? Good luck with that. Check Point firewalls always inspect the first packet, and that alone is enough to make many connections fail.

Probe bypass to mitigate the previous issue? Sure, be prepared to have other issues due to SNI verification.

Fail open in probe bypass? This was a huge surprise after it was changed in Take 189, but we still had a lot of issues after enabling this flag.

WSTLSD debug? We ran it many times. Good luck not taking down the firewall with it, and be prepared to wait a long time for the TAC to inspect it. That's not their fault; it's just really hard to troubleshoot these issues.

The only way we found to properly bypass connections was to exclude them COMPLETELY from the SSL policy. Example: let's say that you have two network segments and you only want to inspect traffic in one of them:

What most people do

In this example all traffic from 10.0.0.0 will be inspected. However, you will probably have some issues in the 192.168.0.0 network as well, since the bypass action still forces inspection of the first packet of the SSL handshake.

Only way do nothing with the connections

In our research we found that the only way to properly bypass a connection was to exclude it completely from the policy. Obviously this approach is not scalable and is somewhat utopian in a big network.
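The difference between the two rulebases described above can be sketched in Python. This is only a toy model of the behaviour reported in this post, not Check Point's actual matching logic; the `Rule` class, the `handling` function and the /8 and /16 masks are assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List


@dataclass
class Rule:
    source: str  # source network in CIDR notation (masks are assumed)
    action: str  # "inspect" or "bypass"


def handling(rules: List[Rule], src_ip: str) -> str:
    """Return how a connection from src_ip is treated in this toy model."""
    addr = ip_address(src_ip)
    for rule in rules:  # first matching rule wins
        if addr in ip_network(rule.source):
            if rule.action == "inspect":
                return "full inspection"
            # As reported in the post: a Bypass rule still looks at the
            # first packet of the handshake before letting the connection go.
            return "first packet inspected, then bypassed"
    # No rule matched: the SSL policy does nothing with the connection.
    return "untouched"


# "What most people do": an explicit Bypass rule for the second segment.
with_bypass = [Rule("10.0.0.0/8", "inspect"), Rule("192.168.0.0/16", "bypass")]
# "Only way do nothing": the second segment is absent from the policy.
excluded = [Rule("10.0.0.0/8", "inspect")]

print(handling(with_bypass, "192.168.1.20"))
print(handling(excluded, "192.168.1.20"))
```

In this model only the second rulebase leaves the 192.168.0.0 hosts completely alone, which matches what we saw in practice.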

The most stable scenario that we reached

We reached a state of stability in R80.10 with the JHF prior to 189 and by enabling the following flags and features:

- appi_urlf_ssl_cn_allow_not_rfc_ssl_protocols=1 (I don't remember where I got this, and there is no documentation about it)
- enhanced_ssl_inspection 1 (Probe bypass)
- bypass_on_enhanced_ssl_inspection 1 (Fail open probe bypass)
- Almost all features that drop packets turned off in SSL Inspection.
- HTTPS Categorization turned on.

Sophos antivirus worked fine, issues with web pages went down to a minimum.

The journey to R80.30

We decided to migrate one of the cluster members to R80.30 to test the new SSL Inspection engine and to solve some issues that we had with UserCheck. After the deployment we had some issues regarding Proxy ARP:

https://community.checkpoint.com/t5/Enterprise-Appliances-and-Gaia/Proxy-ARP-after-upgrade-to-R80-30...

Some Inspection Settings also started to cause issues in R80.30 that they had not caused previously.

We sorted them all and the first impression was just great:

- AWS Connectors worked flawlessly without enabling any of the previously stated kernel flags.
- Sophos Antivirus could get updated without enabling any of the previously stated kernel flags.
- All the services detailed in the preliminary testing document worked great.

The next day a waterfall of user complaints started to appear:

- Sophos Antivirus could not be installed: updates worked fine but installation failed. According to the logs, no traffic was dropped (logs and output from fw ctl zdebug). The only log was a Detect regarding untrusted certificates, which we had configured to be accepted in the SSL settings. We tried the flags, and we tried setting up an FQDN object just for *.sophos.com and using it in the SSL Inspection policy, and it still failed.
- The main billing service of the company stopped working; again, no logs or possible leads as to why. We even looked at PCAPs and everything seemed fine from the firewall's perspective. As soon as we routed this traffic through the pfSense, everything worked flawlessly.
- Another invoice service stopped working: again, no leads whatsoever. After we routed this traffic through pfSense, everything started to work.
- Web pages did not load properly or had some functions affected.

We tried everything, including taking captures with SecureXL turned off, and even the bypass flags could not solve these issues in R80.30. We had these same issues in R80.10, where turning on the different flags made everything work, but that did not help in this new version.

At this point there were many issues impacting production; we plugged one hole and another 10 appeared. It was just impossible to properly troubleshoot each issue, so we had no other option but to go back to our most stable version.

Future plans

There is no way to deploy SSL Inspection without issues. The problem is that these issues will probably affect your production environment heavily. There is no way to test all of your organization's use cases, and there is no way to properly assure functionality in a lab environment.

Our main concern now is the remote possibility that we will have to live on R80.10 for life. We know that we will have to upgrade sooner or later, which is why we are now implementing a parallel Check Point perimeter gateway, much like our fallback pfSense. This new gateway will run R80.30 with the same features; think of it as a hybrid testing/production environment.

The main idea is to route certain subnets to the R80.30 gateway, study their behaviour, and troubleshoot without all the users complaining.
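As a rough illustration of that idea, the routing decision boils down to choosing the next hop by source subnet. The sketch below is plain Python, not router or Gaia configuration; the gateway names and the pilot subnet are hypothetical.

```python
from ipaddress import ip_address, ip_network
from typing import Iterable


def pick_gateway(src_ip: str, pilot_subnets: Iterable[str],
                 pilot_gw: str, production_gw: str) -> str:
    """Send traffic from pilot subnets to the test gateway, the rest to production."""
    addr = ip_address(src_ip)
    for subnet in pilot_subnets:
        if addr in ip_network(subnet):
            return pilot_gw
    return production_gw


# Hypothetical names: one pilot subnet goes through the parallel R80.30 gateway.
pilot = ["10.20.30.0/24"]
print(pick_gateway("10.20.30.15", pilot, "gw-r8030-pilot", "gw-r8010-prod"))
print(pick_gateway("10.20.40.15", pilot, "gw-r8030-pilot", "gw-r8010-prod"))
```

Growing the pilot list one subnet at a time keeps the number of affected users small while each issue is investigated.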

Concerns regarding the current state of SSL Inspection

- Check Point firewalls don't provide a proper way to bypass desired SSL traffic, so it is really hard to deploy this solution in a big environment.
- Lab tests are not representative; the only way to test is in production.
- Really difficult to troubleshoot: many times there are no leads and everything seems fine on the firewall, forcing you to take PCAPs on different parts of the network and run debugs.
- Check Point's current approach to SSL Inspection hurts the perception of the brand: I hear it all the time, "I have a friend who inspects SSL traffic with YYY and has no issues". Most people don't know that it's more of a general technology issue with SSL/TLS than a Check Point fault, but other vendors offer this functionality with fallback mechanisms that work without the user knowing. It's less secure, but at the end of the day the main metric is functionality and not security in most cases.
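When the firewall logs show nothing, one piece of evidence that can be collected from an affected client is who issued the certificate the client actually received: on an inspected connection it is re-signed by the gateway's inspection CA, on a bypassed one it is the site's real issuer. Below is a minimal Python sketch using the standard ssl module. The CA name "Example Inspection CA" and the sample certificate dictionaries are hypothetical; `fetch_peer_cert` only succeeds if the client already trusts the presented chain.

```python
import socket
import ssl
from typing import Optional


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Return the server certificate as seen by this client.

    Verification stays enabled, so this only works when the client trusts
    the chain (for example, the inspection CA is in its trust store).
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def issuer_common_name(cert: dict) -> Optional[str]:
    """Extract the issuer commonName from a getpeercert() dictionary."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def is_inspected(cert: dict, inspection_ca_cn: str) -> bool:
    """True if the certificate was issued by the gateway's inspection CA."""
    return issuer_common_name(cert) == inspection_ca_cn


# Hypothetical sample data in the format returned by getpeercert().
resigned = {"issuer": ((("organizationName", "Example Corp"),),
                       (("commonName", "Example Inspection CA"),))}
original = {"issuer": ((("organizationName", "Some Public CA"),),
                       (("commonName", "Some Public CA R1"),))}

print(is_inspected(resigned, "Example Inspection CA"))
print(is_inspected(original, "Example Inspection CA"))
```

Running `fetch_peer_cert` against a failing service from both an inspected and an excluded subnet at least tells you whether the bypass you configured is really taking effect for that destination.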

I hope this post helps you implement this feature with as little pain as possible.

Regards,

____________
https://www.linkedin.com/in/federicomeiners/

HeikoAnkenbrand · ‎2019-07-21

Yes, historically there have been problems with SSL interception. For example, SNI is only supported from version R80.30. For HTTPS inspection Check Point uses CPAS. You can find more information in my article R80.x Security Gateway Architecture (Content Inspection). If you have performance problems with certain URLs, please give us an example. As a first step I would upgrade everything to R80.30 to have SNI support and be able to use newer ciphers.

This is new in R80.30 (for more, see the R80.30 Release Notes):

SSL Inspection

Server Name Indications (SNI)
- Improved TLS implementation for TLS Inspection and categorization.
- Next Generation Bypass - TLS inspection based on Verified Subject Name.
TLS 1.2 support for additional cipher suites
- TLS_RSA_WITH_AES_256_GCM_SHA384.
- TLS_RSA_WITH_AES_256_CBC_SHA256.
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256.
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256.
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384.
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256.
- TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384.
- TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256.
- X25519 Elliptic Curve.
- P-521 Elliptic Curve.
- Full ECDSA support.
- Improved fail open/close mechanism.
- Improved logging for validations.
- For the complete list of supported cipher suites see sk104562.

Information on CPAS:

Active Streaming (CPAS) - Check Point Active Streaming allows the data to be changed; we play the role of “man in the middle”. CPAS breaks the connection into two parts using our own stack, which means we are responsible for all the stack work (dealing with options, retransmissions, timers, etc.). An application registers with CPAS when a connection starts and supplies callbacks for the event handler and the read handler. Several protocols use CPAS, for example: Client Authentication, VoIP (SIP, Skinny/SCCP, H.323, etc.), the Data Leak Prevention (DLP) blade, Security Servers processes, etc. In some scenarios HTTP also uses CPAS, when IPS Web Intelligence protections are enabled. For now, CPAS doesn't support accelerated traffic, which means that each CPAS packet will be F2F.

CPAS works through the F2F path in R80.10 and R77.30. In R80.20, CPASXL is now offered in the SecureXL path, which should lead to higher performance.

General overview:

- CPAS breaks the connection into two parts using our own stack, which means we are responsible for all the stack work (dealing with options, retransmissions, timers, etc.).
- An application registers with CPAS when a connection starts and supplies callbacks for the event handler and the read handler.
- On each packet, CPAS sends the application the packet data with cpas_read, allows the application to change the data as it likes, and sends the data forward with cpas_write.
- For now, CPAS doesn't support accelerated traffic, which means that each CPAS packet will be F2F.
- Since all packets pass through SecureXL (even if they are F2F), the connection needs to be offloaded to SecureXL.
- The offload is done on the outbound. Without CPAS, the offload occurs when the SYN packet leaves for the server; because CPAS breaks the connection, the first outbound occurs on the SYN-ACK that is sent to the Client.
- CPAS does not synchronize connections between cluster members. If a failover occurs and the connection was handled by CPAS, the connection will terminate. The only exception is VoIP connections.

CPAS is visible in the input fw monitor chain as "TCP streaming (cpas)" and in the output chain as "TCP streaming (cpas)" and "TCP streaming post VM (cpas)".

Active Streaming – https:

With PSL, connections encrypted with SSL (TLS) were not supported. The encryption keys are known only to the Client and the Server, since they are the ones that initiated the connection (performed the SSL handshake), so we couldn’t get the data out of the packet and the application couldn’t scan it for malicious content. CPAS plays the role of “man in the middle”: it can intercept the SSL handshake and change the keys so that it is able to read the encrypted data. The Client performs an SSL handshake with the gateway (thinking it is the Server), while the Server performs an SSL handshake with the gateway (thinking it is the Client). The gateway has both keys, so it can open the encryption, check the packet, and re-encrypt it with the corresponding keys. To encrypt and decrypt the SSL connection, CPAS adds another layer before the application queue. The new layer sends the packet to the SSL engine for decryption/encryption and then resumes the normal flow.

Active Streaming – https content step by step:

Packets of the SSL handshake are passed to the SSL engine to exchange keys. When the connection and the SSL handshake are fully established, a hook is registered for this connection to handle the decryption and encryption of the packets. When a packet arrives at CPAS, a trap is sent and the SSL engine receives the encrypted packet, decrypts it, and returns it to CPAS. The packet enters the receive queue, where the application can work on it; once it is done, it sends the packet to the write queue. The packet then passes to the SSL engine for encryption and is sent on to the other side (Client or Server).
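The flow described above (decrypt with one side's key, let the application work on the plaintext, re-encrypt with the other side's key) can be sketched as a toy Python model. This is not CPAS code: the `ActiveStream` class and its method names are hypothetical, and the XOR "cipher" is only a stand-in for a real TLS session so that the example stays self-contained.

```python
from collections import deque
from typing import Callable


def xor_cipher(data: bytes, key: int) -> bytes:
    """Toy symmetric 'cipher' standing in for a TLS session. NOT real crypto."""
    return bytes(b ^ key for b in data)


class ActiveStream:
    """Toy man-in-the-middle with one session key per side of the connection."""

    def __init__(self, client_key: int, server_key: int,
                 app: Callable[[bytes], bytes]) -> None:
        self.client_key = client_key  # key agreed in the Client<->gateway handshake
        self.server_key = server_key  # key agreed in the gateway<->Server handshake
        self.app = app                # application callback that may change the data
        self.receive_queue: deque = deque()
        self.write_queue: deque = deque()

    def from_client(self, encrypted: bytes) -> bytes:
        # 1. The SSL engine decrypts the packet with the client-side key.
        self.receive_queue.append(xor_cipher(encrypted, self.client_key))
        # 2. The application reads the plaintext and writes its (possibly
        #    modified) version to the write queue.
        self.write_queue.append(self.app(self.receive_queue.popleft()))
        # 3. The SSL engine re-encrypts with the server-side key.
        return xor_cipher(self.write_queue.popleft(), self.server_key)


def redact(plaintext: bytes) -> bytes:
    """Example application: mask a keyword in the cleartext."""
    return plaintext.replace(b"secret", b"******")


stream = ActiveStream(client_key=0x21, server_key=0x42, app=redact)
on_the_wire = xor_cipher(b"GET /secret", 0x21)  # what the Client sends
to_server = stream.from_client(on_the_wire)     # what the gateway forwards
print(xor_cipher(to_server, 0x42))              # what the Server decrypts
```

The point of the model is that the two sides never share a key: each one has a session with the gateway only, which is why the gateway can read and even change the data.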

➜ CCSM Elite, CCME, CCTE ➜ www.checkpoint.tips

FedericoMeiners · ‎2019-07-21

Heiko,
Thanks for sharing, I've been through all your posts and your contributions are great.

I'm aware of how SSL Inspection works at a theoretical level; my point is how unreliable it is to make exceptions properly.
If you are deploying a solution in a production environment, most of the time it's assumed that these kinds of features will work out of the box and will require only minor adjustments.

I know from experience that Check Point does things differently than other vendors in order to favor security, but you always have some tricks or settings to bypass these. Some examples:
- During a migration from ASA to CHKP: after we deployed our firewall, all VoIP phones stopped working. Further troubleshooting indicated that some signatures in the Inspection Settings saw SIP traffic as malformed; we removed all these protections and it just worked. You have clear ways to troubleshoot this.
- IPS: it's no secret that some services will probably be affected after deploying IPS, but you can clearly troubleshoot this and make the exceptions.

Going back to SSL Inspection, implementing it without issues is more luck than knowledge and skill, even more so if we consider that there is no way to properly bypass traffic. It doesn't matter if you set it to fail open or set it not to drop revoked or untrusted certificates; issues will still appear for the end users.

Regards,

____________
https://www.linkedin.com/in/federicomeiners/

Nick_Doropoulos · ‎2019-07-21

Thank you for sharing 🙂

FedericoMeiners · ‎2019-07-24

Hello,

Just wanted to share with you that today I confirmed with Check Point that in R80.30 the probe bypass feature doesn't work anymore.

fw ctl set int enhanced_ssl_inspection 1
fw ctl set int bypass_on_enhanced_ssl_inspection 1

I suspected this, since the new SSL engine has a different inspection method. The main purpose of enabling these flags prior to R80.30 was to solve connection issues with various applications. However, take for example the installation process of Sophos Antivirus: prior to R80.30 we could make it work with these flags, but in the latest version we can't find a way to make it work.

If you are using probe bypass, I strongly advise you to test the company applications that need it on a parallel R80.30 gateway before upgrading.

Thank you for reading,

Federico Meiners

____________
https://www.linkedin.com/in/federicomeiners/

_Val_ · ‎2019-07-31

Very nice write-up, I appreciate your efforts to write this down.

David_T · ‎2019-07-31

Hi

It's good to see that we are not the only ones with these problems. We upgraded to R80.20 and had huge problems: we were not able to log in to any website anymore. That problem got fixed, but we still have problems with many sites, which we have now had to bypass.

After all these problems we decided to evaluate an alternative proxy solution.

514numbers · ‎2020-02-05

This sounds quite similar to the issues we encountered.

Keep up the good write-ups!

Fabio_Fukushima · ‎2021-11-29

FedericoMeiners,

You changed my life by sharing your knowledge of "Only way do nothing with the connections"!

Excluding the source IP can't be done, for many reasons. However, I'm excluding the destination IP address (when possible) by using a personal "Internet object" as the destination for the entire HTTPS Inspection rulebase. This personal Internet object is a "network object --> group --> group with exclusions..." that contains All_internet except an object created with the IP addresses of the websites that I want to exempt from HTTPS inspection.
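The matching logic of such a "group with exclusions" destination can be sketched in Python as follows. This only models the idea (everything in the base group except the excluded addresses); it is not how the gateway implements it, and the addresses are documentation ranges used as hypothetical examples.

```python
from ipaddress import ip_address, ip_network
from typing import Iterable


def in_group_with_exclusions(dst_ip: str, include: Iterable[str],
                             exclude: Iterable[str]) -> bool:
    """True if dst_ip is in the base group and not in the excluded objects."""
    addr = ip_address(dst_ip)
    if any(addr in ip_network(net) for net in exclude):
        return False  # destination never matches the rulebase: left untouched
    return any(addr in ip_network(net) for net in include)


# Hypothetical objects: "All_internet" and the sites exempted from inspection.
all_internet = ["0.0.0.0/0"]
exempt_sites = ["203.0.113.10/32", "198.51.100.0/24"]

print(in_group_with_exclusions("203.0.113.10", all_internet, exempt_sites))
print(in_group_with_exclusions("192.0.2.55", all_internet, exempt_sites))
```

Because the exempted destinations never match any rule, they get the same "do nothing" treatment as in the original post. The obvious limit is sites whose IP addresses change often, hence the "when possible".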

It's working very well for me at a lot of customers.

Many thanks for sharing!!

FedericoMeiners · ‎2021-11-30

You made my day with your post 🙂

Great workaround! Are you still using these techniques post R80.40?

Thanks!

____________
https://www.linkedin.com/in/federicomeiners/

Fabio_Fukushima · ‎2021-12-02

Yes, at a customer running R81, although in this environment the "bypass" action seems to work better than on previous Gaia versions.

These techniques save time...

Daniel_Kavan · ‎2022-10-28

Thank you Federico,

Just curious whether, after the implementation, you've witnessed outbound HTTPS inspection blocking malicious attacks.

Outbound HTTPS Inspection protects internal users and perimeter servers from malicious attacks coming from the Internet on connections originated from inside the organization

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...
