Cloudguard controller debug guides
Hi
Does anyone have a description of the cloudguard debug option? I haven't been able to find an sk on it.
It's the debug option for troubleshooting the CloudGuard Controller, which connects to Azure as a Data Center object.
What kind of debug output is logged with "cloudguard debug on" versus "cloudguard debug full", and where can one find the debug logs?
Looking forward to hearing your answers.
Thanks
Kim
Have you looked at sk115657 ATRG: CloudGuard Controller?
@Tal_Ron Yes, I have, and it doesn't explain how the debug option works.
I have used it a lot over the last couple of weeks with TAC, with nothing really useful coming out of it.
I have seen the debug feature, but I don't know where to look for the debug output.
The sk should include more information about the CloudGuard Controller debug feature.
I cannot read sk136352 - Common Azure HTML API Error codes in Azure CloudGuard or CloudGuard Controller logs
Could this be the one you are thinking of?
We have retired the "debug on" and "debug full" options because they generate too much noise.
You can enable debug for specific parts of the product in the $VSECDIR/lib/log4j.properties (or .xml) file.
Depending on what you want to debug, change the relevant lines from error to debug or trace.
The output goes to the $FWDIR/log/cloud_proxy.elg file.
What is the issue you need to debug?
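For anyone who wants to script the level change described above, here is a minimal Python sketch. It assumes the file uses the common log4j 1.x properties form `log4j.logger.<name>=<LEVEL>[, appender]`; that format and the logger names `com.example.scanner` and `com.example.other` are assumptions for illustration, not taken from the real $VSECDIR/lib/log4j.properties. It works on text, so it can be tried on a copy of the file first.

```python
import re

def set_logger_level(properties_text, logger_name, new_level):
    """Return properties_text with the level of one log4j logger replaced.

    Assumes log4j 1.x style lines: log4j.logger.<name>=<LEVEL>[, appender...]
    """
    pattern = re.compile(
        r"^(\s*log4j\.logger\." + re.escape(logger_name) + r"\s*=\s*)(\w+)(.*)$"
    )
    out = []
    for line in properties_text.splitlines():
        match = pattern.match(line)
        if match:
            # Keep the key and any appender list, swap only the level.
            line = match.group(1) + new_level.upper() + match.group(3)
        out.append(line)
    return "\n".join(out)

# Hypothetical sample content; read the real logger names from the file itself.
sample = (
    "log4j.logger.com.example.scanner=ERROR\n"
    "log4j.logger.com.example.other=ERROR, file"
)
updated = set_logger_level(sample, "com.example.scanner", "debug")
print(updated)
```

Only the named logger is touched, which keeps the rest of the log quiet while one component is being investigated.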
@Gil_Sudai I am having two issues. The CMA is not able to automatically fetch dynamic objects from Azure, and when I try to import new Azure objects in SmartConsole I get a blank dialog without any objects. After a while I get the error message "The Data Center is still initializing, it may take a moment. Please try again later."
As a workaround from SmartConsole: objects started getting imported from Azure after performing an "Install database" on the management server.
The DC Scanner should fetch this automatically.
We changed azure.connectTimeoutInMilliseconds=60000 to azure.connectTimeoutInMilliseconds=6000000 in the file $FWDIR/conf/vsec.conf, which didn't help.
I have been following the ATRG: CloudGuard Controller guide (sk115657), which shows no errors while troubleshooting.
I still get the following error in cloud_proxy.elg:
[Expert@mgmt:0]# tail -f cloud_proxy.elg
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
16/03/23 09:39:14,490 ERROR datacenter.scanner.DcScanner [scanner-Azure-207686665]: Mapping of Data Center Azure [Application id xxxxxx, directory id xxxxxx] failed . Next mapping is in 300 seconds.
^C
I am also seeing another issue in cloud_proxy.elg:
20/03/23 13:09:23,598 ERROR enforcement.amon.AmonRequestGwsStatusManager [amon-request-sender:30012]: Failed to send Amon request to FWM port 30012
javax.xml.ws.WebServiceException: Could not send Message.
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:150)
at com.sun.proxy.$Proxy54.getAndUpdate(Unknown Source)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager.sendAmonRequest(AmonRequestGwsStatusManager.java:28)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager.access$100(AmonRequestGwsStatusManager.java:37)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager$AmonRequestRunner.run(AmonRequestGwsStatusManager.java:12)
at java.lang.Thread.run(Thread.java:820)
Caused by: java.net.ConnectException: ConnectException invoking https://localhost:30012/amonstatus_service: Connection refused (Connection refused)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:83)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:57)
at java.lang.reflect.Constructor.newInstance(Constructor.java:437)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.mapException(HTTPConduit.java:1365)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.close(HTTPConduit.java:1349)
at org.apache.cxf.transport.AbstractConduit.close(AbstractConduit.java:56)
at org.apache.cxf.transport.http.HTTPConduit.close(HTTPConduit.java:652)
at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:62)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
at org.apache.cxf.endpoint.ClientImpl.doInvoke(ClientImpl.java:516)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:425)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:326)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:279)
at org.apache.cxf.frontend.ClientProxy.invokeSync(ClientProxy.java:96)
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:139)
... 5 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:380)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:236)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:218)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.net.Socket.connect(Socket.java:682)
at com.ibm.jsse2.av.connect(av.java:694)
at sun.net.NetworkClient.doConnect(NetworkClient.java:187)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:494)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:589)
at com.ibm.net.ssl.www2.protocol.https.c.<init>(c.java:193)
at com.ibm.net.ssl.www2.protocol.https.c.a(c.java:204)
at com.ibm.net.ssl.www2.protocol.https.d.getNewHttpClient(d.java:70)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1174)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1068)
at com.ibm.net.ssl.www2.protocol.https.d.connect(d.java:71)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1352)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1327)
at com.ibm.net.ssl.www2.protocol.https.b.getOutputStream(b.java:82)
at org.apache.cxf.transport.http.URLConnectionHTTPConduit$URLConnectionWrappedOutputStream.setupWrappedStream(URLConnectionHTTPConduit.java:183)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.handleHeadersTrustCaching(HTTPConduit.java:1308)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.onFirstWrite(HTTPConduit.java:1268)
at org.apache.cxf.transport.http.URLConnectionHTTPConduit$URLConnectionWrappedOutputStream.onFirstWrite(URLConnectionHTTPConduit.java:210)
at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:47)
at org.apache.cxf.io.AbstractThresholdOutputStream.write(AbstractThresholdOutputStream.java:69)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.close(HTTPConduit.java:1321)
... 15 more
[Expert@mgmt:0]#
The trace above is the second issue, which I found while looking into cloud_proxy.elg.
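The second trace is long, but in a Java stack trace the last "Caused by:" entry is normally the innermost exception, so that is usually the line to read first. A minimal Python sketch of pulling it out, using an abbreviated copy of the trace above as input (the helper name `root_cause` is made up for this example):

```python
def root_cause(stack_trace):
    """Return the text of the last 'Caused by:' line, or the first line if there is none."""
    lines = [line.strip() for line in stack_trace.splitlines() if line.strip()]
    causes = [
        line[len("Caused by:"):].strip()
        for line in lines
        if line.startswith("Caused by:")
    ]
    if causes:
        return causes[-1]
    return lines[0] if lines else ""

# Abbreviated copy of the trace posted above.
trace = """javax.xml.ws.WebServiceException: Could not send Message.
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:150)
Caused by: java.net.ConnectException: ConnectException invoking https://localhost:30012/amonstatus_service: Connection refused (Connection refused)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:279)
... 5 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.Socket.connect(Socket.java:682)
... 15 more"""

print(root_cause(trace))
```

For this trace the innermost cause is a plain TCP "Connection refused" on localhost port 30012, in other words nothing accepted the connection on that port at that moment.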
After editing vsec.conf you need to run "vsec stop; vsec start".
Right above the Azure errors there should also be Python errors. What do they say?
The CloudGuard Controller ATRG sk has a command line for running the Azure scanning code directly. Please try it. What error do you get?
Sorry, yes, I did restart the service with "vsec stop; vsec start" and waited 30 minutes, because the DC scanner runs every 30 minutes.
I had to extract the information from the logs quickly. I can send you the SR ticket number if you can access that.
Kim
Please ask the support engineer who handles the SR to contact me or my team.
(I am on CheckMates with the wrong account.)
Thanks. I have asked your support engineer to reach out to you or your team.
I am missing a better way of troubleshooting the CloudGuard Controller. The ATRG: CloudGuard Controller guide does provide some initial guidance.
The Azure debug query ran curl and fetched the Azure objects without any issues (no errors), yet cloud_proxy.elg still shows an exception stack trace, and TAC support was searching east and west in areas that in my eyes are not related to the issue.
Just searching for cloud_proxy.elg in the Support Center doesn't provide any hints at all.
If you would like feedback on this, please don't hesitate to reach out to me. I have been working closely with R&D team managers.
This is the whole exception found in cloud_proxy.elg:
23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: java.util.concurrent.TimeoutException
23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process
23/03/23 09:48:05,776 ERROR util.process.ProcessExecutor [pool-2136-thread-1]: protectedWait: java.lang.InterruptedException
23/03/23 09:48:06,118 ERROR util.process.ProcessExecutor [Thread-235]: ProcessStreamReader: stderr - run: java.io.IOException: Stream closed
23/03/23 09:48:06,120 ERROR scanner.azure.AzureDeployment [scanner-Azure-2065078223]: com.checkpoint.datacenter.util.exception.ProcessExecutionException: Failed running process
23/03/23 09:48:06,120 ERROR datacenter.scanner.DcScanner [scanner-Azure-2065078223]: Error during scan - attempting to reconnect for scanner Azure [Application id xxxxxx, directory id xxxxxx]
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
23/03/23 09:48:06,121 ERROR scanner.util.DcScannerUtils [scanner-Azure-2065078223]: Exception while connecting to Azure [Application id xxxxxx, directory id xxxxx]. Return unknown problem.
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
23/03/23 09:48:06,121 ERROR datacenter.scanner.DcScanner [scanner-Azure-2065078223]: Mapping of Data Center Azure [Application id xxxxxx, directory id xxxxxx] failed . Next mapping is in 300 seconds.
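When the file gets noisy after raising log levels, it can help to separate the timestamped entries from the stack-trace lines. A minimal Python sketch, assuming every entry follows the layout seen in the lines above (timestamp, level, component, thread in square brackets, message); the regular expression is inferred from these samples only and is not a documented format:

```python
import re
from collections import Counter

# Layout inferred from the sample lines above, not from product documentation.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<component>\S+) \[(?P<thread>[^\]]+)\]: (?P<message>.*)$"
)

def parse_entries(log_text):
    """Yield a dict per log entry; stack-trace lines that do not match are skipped."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            yield match.groupdict()

sample = """23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: java.util.concurrent.TimeoutException
23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process
23/03/23 09:48:06,120 ERROR scanner.azure.AzureDeployment [scanner-Azure-2065078223]: com.checkpoint.datacenter.util.exception.ProcessExecutionException: Failed running process
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)"""

entries = list(parse_entries(sample))
per_component = Counter(entry["component"] for entry in entries)
for entry in entries:
    print(entry["timestamp"], entry["component"], entry["message"])
print(per_component)
```

Read this way, the first entries in the burst are the ProcessExecutor timeout lines, and the "unknown problem" exception that follows is a consequence of the killed process.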
I have been troubleshooting by editing the entries for specific parts of the product in the $VSECDIR/lib/log4j.properties (or .xml) file.
Do all of the parts support ERROR, DEBUG and TRACE, or do some of the parts in $VSECDIR/lib/log4j.properties support only ERROR and DEBUG but not TRACE?
How long does a change in $VSECDIR/lib/log4j.properties take to show up in the $FWDIR/log/cloud_proxy.elg file?
Do I need to run "vsec stop; vsec start" after every change?
BR
Kim
Hi.
All parts support trace > debug > info > error.
According to the data you sent:
23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process
It looks like the Azure scan takes longer than 1200 seconds.
Can you confirm that by running the vsec.py command line manually and checking how long it takes? You can use 'date' or similar.
I suggest that you edit vsec.conf and change the entry (or add it if missing)
azure.connectTimeoutInMilliseconds
to a high value. For example:
azure.connectTimeoutInMilliseconds=5000000
Then run "vsec stop;vsec start" to restart the CloudGuard Controller process.
Does it help?
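As an alternative to wrapping the manual run in 'date' calls, here is a minimal Python sketch that times a command. The vsec.py command line itself is documented in the ATRG sk and is not reproduced here, so the sketch uses a placeholder command that has to be replaced by hand; `time_command` is a made-up helper name.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run argv, wait for it to finish, and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode, time.monotonic() - start

# Placeholder so the sketch runs anywhere; substitute the vsec.py command
# line from the ATRG sk (an assumption here, not shown in this thread).
placeholder = [sys.executable, "-c", "pass"]
exit_code, elapsed = time_command(placeholder)
print(f"exit code {exit_code}, took {elapsed:.1f} s (scanner limit in the log: 1200 s)")
```

If the measured time is close to or above the 1200 seconds reported in cloud_proxy.elg, that supports the timeout explanation.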
Hi @Gil_Sudai
That's true: raising the value of azure.connectTimeoutInMilliseconds in $FWDIR/conf/vsec.conf solved the timeout. However, the value 5000000 is too high and the system does not seem to handle it. I lowered the value to 600000, and that seems to have solved the "Timeout reached: 1200 seconds, killing process" error in cloud_proxy.elg.
How does the RequestGwsStatusManager function work?
enforcement.amon.AmonRequestGwsStatusManager [amon-request-sender:30012]: Failed to send Amon request to FWM port 30012
When I run tcpdump on the management server, open another SSH session to the same host and run "telnet 127.0.0.1 30012", I see the connection being closed, and the attempt shows up in the tcpdump output. If I keep tcpdump running, I see multiple attempts whose timestamps match those in cloud_proxy.elg.
Does RequestGwsStatusManager look for a Security Gateway that is blocked or not responding, or how does it work?
Br
Kim
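The telnet test can also be scripted so it is easy to repeat before and after a restart. A minimal Python sketch, assuming only that the service listens on TCP port 30012 on localhost as the log line says; it checks whether the TCP handshake succeeds and does not attempt the HTTPS request itself. The helper name is made up for this example.

```python
import socket

def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False

# Port number taken from the cloud_proxy.elg line above.
print(port_accepts_connections("127.0.0.1", 30012))
```

A result of False matches the "Connection refused" cause in the stack trace. A result of True while the error keeps appearing would suggest that the port is only unavailable intermittently, or that the failure happens after the TCP connection is established.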
Try restarting the FWM app. Did that fix it? You can use the "cpwd_admin" command on the management server for that; running "cpwd_admin" on its own prints usage examples.