Azure Data Center Objects - Inaccessible
Scenario
- Smart-1 Cloud
- CloudGuard Azure HA cluster R81.20
We configured integration with Azure as a data center object. Today it stopped working, giving
- validation errors in Smart Console on objects derived from this integration (inaccessible/doesn't exist)
- can't browse objects in the object explorer (after Import in rulebase)
- Connection test works
- curl_cli --verbose https://management.azure.com --cacert $CPDIR/conf/ca-bundle-public-cloud.crt is ok
- azure_ha_test.py is ok
- Azure side seems ok
I found an SK referencing an HTTP/1.1 429 error and a Microsoft Learn article, "Understand how Azure Resource Manager throttles requests - Azure Resource Manager | Microsoft Learn". I can't find either of them now!
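In case it is useful, here is a minimal sketch of how one might check for that 429 from the management shell. It assumes curl_cli passes standard curl options straight through to curl, and that Retry-After is the header ARM returns when it throttles; both assumptions are worth verifying against the Microsoft Learn article.
# Re-run the connectivity test, printing just the HTTP status code and
# any Retry-After header instead of the full verbose output.
curl_cli -s -o /dev/null -D - \
    -w 'HTTP status: %{http_code}\n' \
    --cacert $CPDIR/conf/ca-bundle-public-cloud.crt \
    https://management.azure.com | grep -iE 'HTTP status|Retry-After'
# A plain 4xx on this unauthenticated request is not interesting by itself;
# a 429 together with a Retry-After header would point at ARM throttling.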
Anyway, I found the azure_had.elg file, which has had loads of errors in it for a long time, but only today do we see any manifestation in SmartConsole. A couple of questions:
- We have not configured any objects in the rulebase yet. If we had, would this be service affecting?
- Does the cloud_proxy.elg file exist in R81.20? Is it on the management server?
- Anywhere else I can look for clues?
Thanks in advance
Hello,
1. Data Center Objects are cached on the gateway.
How long this cache is kept varies according to the configuration. By default, it is 1 week (10080 minutes), and it can be increased to up to 1 month.
The configuration is done in the file $FWDIR/conf/vsec.conf and uses the following value:
# TTL (mins) for objects expiration on GW in case there are no updates
# from the Controller
# min value=5
# max value=43200
# Default value: 10080
enforcementSessionTimeoutInMinutes=10080
See the CloudGuard Controller configuration parameters documentation in the R81.20 CloudGuard Controller Administration Guide for additional information.
This is a security feature that aims to prevent obsolete data from being used in firewall rule enforcement.
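If you want to check or adjust that TTL, here is a minimal sketch, assuming shell access to the machine running the CloudGuard Controller; how the new value is picked up (policy installation, process restart, etc.) should be confirmed in the admin guide referenced above.
# Show the current DCO cache TTL (value is in minutes).
grep enforcementSessionTimeoutInMinutes $FWDIR/conf/vsec.conf
# Example: raise the TTL to the maximum of one month (43200 minutes),
# keeping a backup of the original file first.
cp $FWDIR/conf/vsec.conf $FWDIR/conf/vsec.conf.bak
sed -i 's/^enforcementSessionTimeoutInMinutes=.*/enforcementSessionTimeoutInMinutes=43200/' $FWDIR/conf/vsec.conf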
2. The $FWDIR/log/cloud_proxy.elg file exists in R81.20 and is the first log to look at when facing issues with Data Center Objects.
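For example, a minimal sketch (the search strings below are just a starting point, not an authoritative list of error patterns):
# Follow the CloudGuard Controller log live while reproducing the issue.
tail -f $FWDIR/log/cloud_proxy.elg
# Or search the current log and its rotations for throttling / scan failures.
grep -iE '429|error|fail' $FWDIR/log/cloud_proxy.elg*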
If you wish to attach it here (or send it privately), we will be happy to have a look and share our input on it.
3. If you have already opened an SR, I will be happy to take a look if you can share the SR # in a DM.
Thanks,
Aviv Shabo
CloudGuard Network R&D
Thanks for the reply Aviv,
I have opened a case but it is with our Collaboration Support partner just now.
So cloud_proxy.elg is on the CloudGuard Controller, i.e. the management server? (We have Smart-1 Cloud.)
I just found the CloudGuard Controller ATRG (sk115657) and checked the logs (blade: CloudGuard IaaS). They show the mapping is OK, with failures every few minutes.
I'll attach a few screenshots that might help.
Your first point is relevant if the SMS/MDS loses connectivity to the Data Centre Object. My understanding is that if the SMS/MDS loses trust with the Generic Data Centre object (the remote certificate is changed, or the certificate in the local certificate store gets deleted/corrupted), then by design the gateway will clear the object cache, impacting traffic until trust is re-established - so worth bearing in mind.
So there are indeed 2 scenarios here:
- Mgmt is no longer able to complete data center scanning
- Mgmt is no longer able to communicate with the gateway
In the first case, as long as communication between the Management and the gateway is working, the Data Center Objects' (DCOs) time to live (TTL) will keep getting extended. This is because the CloudGuard Controller running on the Management understands that this is a scanning issue, so enforcement should continue working with the existing information.
In the second case, the Management is no longer able to send updates to the gateway. On the gateway side, we cannot assume the reason for this, so once the TTL expires, these DCOs will no longer be enforced.
For this reason, our best practice is to use DCOs for whitelisting (allow rules) rather than blacklisting (blocking rules).
The validation errors you are getting suggest that the access you provided for scanning your Azure data center was enough to establish a connection, but not enough to scan any supported Data Center Object.
Our best practice for providing Azure access to CloudGuard Controller is to create a service principal.
The minimum recommended permission is Reader.
You can assign the Reader permission in one of these ways (see the sketch after this list):
- Assign it to all Resource Groups from which you want to pull objects
- Add the permission at the subscription level
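If you need to re-create that access, here is a minimal sketch using the Azure CLI; <sp-name>, <subscription-id> and <rg-name> are placeholders, and the choice of scope (whole subscription vs. individual Resource Groups) maps to the two options above.
# Create a service principal with Reader on the whole subscription (second option above).
az ad sp create-for-rbac \
    --name <sp-name> \
    --role Reader \
    --scopes /subscriptions/<subscription-id>
# Or assign Reader only on a specific Resource Group instead (first option above).
az role assignment create \
    --assignee <appId-from-the-output-above> \
    --role Reader \
    --scope /subscriptions/<subscription-id>/resourceGroups/<rg-name>
The appId, password and tenant values returned by the first command should correspond to the Application ID, Application Key and Directory ID fields that the Azure Data Center object asks for in SmartConsole.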
If you haven't had a chance to look at the CloudGuard Controller for Azure section of the CloudGuard Controller Administration Guide, it might be worth your while to do so now.
Seems like the issue started at the time of the first mapping failure (13.14:29).
