Filters: Example 2
The customer has a very large deployment and has hundreds of GB of logs per day.
His vendor charges him per bit sent, and the customer is looking to reduce his footprint anyway he can.
He needs all his logs, so filtering out logs is not an option. Instead, he’s looking to reduce the size of each log sent.
The whitelist approach won’t work this time as there are simply too many different fields. Instead, the customer is trying to identify fields which don’t contain relevant information and filter them out (a sort of blacklist approach).
Let’s start with a random sample log.
<134>1 2018-03-21 18:09:26 MDS-72 CheckPoint 13752 - [action:"Accept"; conn_direction:"Outgoing"; flags:"6307840"; ifdir:"inbound"; ifname:"eth1"; logid:"320"; loguid:"{0x5a9fba17,0x0,0x5b20a8c0,0x149c}"; origin:"192.168.32.91"; originsicname:"CN=GW91,O=Domain2_Server..cuggd3"; sequencenum:"1"; time:"1521648566"; version:"5"; __policy_id_tag:"product=VPN-1 & FireWall-1[db_tag={BACD59B6-0BBC-A544-A5F9-A136152F0B37};mgmt=Domain2_Server;date=1520501096;policy_name=Standard\]"; aggregated_log_count:"242539"; bytes:"17427"; client_inbound_bytes:"5647"; client_inbound_packets:"65"; client_outbound_bytes:"11780"; client_outbound_packets:"62"; connection_count:"121149"; creation_time:"1520417303"; dst:"111.11.11.21"; duration:"1231263"; hll_key:"6523019790322755370"; inzone:"Internal"; last_hit_time:"1521648533"; layer_name:"Network"; layer_name:"Application"; layer_uuid:"d2787740-4872-4342-a0c1-58470e2d9bef"; layer_uuid:"cdeb4bd1-f11f-4d36-a78f-03cfa317d06d"; match_id:"1"; match_id:"16777219"; parent_rule:"0"; parent_rule:"0"; rule_action:"Accept"; rule_action:"Accept"; rule_name:"Network_Rule_One"; rule_name:"Appi_Cleanup rule"; rule_uid:"51419c04-5fc4-4263-8cca-e5d14f2dcf56"; rule_uid:"5440fb90-dd92-4e6a-8191-8957c279f3a9"; outzone:"External"; packets:"127"; product:"VPN-1 & FireWall-1"; proto:"17"; protocol:"DNS-UDP"; server_inbound_bytes:"11780"; server_inbound_packets:"62"; server_outbound_bytes:"5647"; server_outbound_packets:"65"; service:"53"; service_id:"domain-udp"; sig_id:"12"; src:"10.32.91.190"; update_count:"2054"; ]
We are starting out at 1549 bits.
Right off the bat, we need to decide if we need the header.
No? Let’s remove it for 55bits (per log).
Next, I notice that the default field separator is semi-colon+space. I can reduce this to just the semi-colon for an extra 53bits.
Now we try to identify fields which don’t interest me. Obviously, this is customer specific but I’d say that these fields can probably safely be removed in most cases:
flags, originsicname, sequencenum, version, __policy_id_tag, layer_uuid, server_inbound_bytes, server_inbound_packets, server_outbound_bytes, server_outbound_packets ( those are duplicates of client_outbound which already exists)
I could be much more aggressive with what I cut out but this is a good start.
Down to 982bits.
Now the next step is something that’s a bit more extreme but is from an actual use case where it was done.
There is no point in sending out client_inbound_packets:"65" which has a large key and small value when I can just as easily send out F11:”65”.
I can create a mapping file on the receiving end (assuming the SIEM supports this) which knows to translate F11 back into the relevant key field.
So I create the relevant mapping file where fields which I want to cut get the ‘<exported>false</exported>’ property, and the rest of the fields will be mapped to relevant alpha-numeric codes.
We end up with:
F1:"Accept";F2:"Outgoing";F3:"inbound";F4:"eth1";F5:"320";F6:"{0x5a9fba17,0x0,0x5b20a8c0,0x149c}";F7:"192.168.32.91";F8:"1521648566";F9:"242539";F10:"17427";F11:"5647";F12:"65";F13:"11780";F14:"62";F15:"121149";F16:"1520417303";F17:"111.11.11.21";F18:"1231263";F19:"6523019790322755370";F20:"Internal";F21:"1521648533";F22:"Network";F22:"Application";F23:"1";F23:"16777219";F24:"0";F24:"0";F25:"Accept";F25:"Accept";F26:"Network_Rule_One";F26:"Appi_Cleanup rule";F27:"51419c04-5fc4-4263-8cca-e5d14f2dcf56";F27:"5440fb90-dd92-4e6a-8191-8957c279f3a9";F28:"External";F29:"127";F30:"VPN-1 & FireWall-1";F31:"17";F32:"DNS-UDP";F33:"53";F34:"domain-udp";F35:"12";F36:"10.32.91.190";F37:"2054";
687bits. We started with 1549bits so a reduction of ~55%
And I can get better results if I’m willing to be fairly aggressive with my field filters.
Edit: As was correctly pointed out, the unit of measurement is not actually bits, but bytes (number of characters as given by a text editor word count).