Moberg
Participant

CloudGuard Controller debug guides

Hi

Does anyone have a description of the cloudguard debug option? I haven't been able to find any SK on this.

It's the debug option for troubleshooting the CloudGuard Controller, which connects to Azure as Data Center objects.


What kind of debug output is logged with "cloudguard debug on" vs. "cloudguard debug full", and where can one find the debug logs?

Looking forward to hearing your answers.

Thanks

Kim

0 Kudos
14 Replies
Tal_Paz-Fridman
Employee

0 Kudos
Moberg
Participant

@Tal_Ron Yes, I have, and it doesn't explain how it works.

I have used it a lot over the last couple of weeks with TAC, but nothing really useful came out of it.

I have seen the debug feature, but I don't know where to look for the debug information.

The SK should include more information about the cloudguard debug controller feature etc.

 

0 Kudos
Moberg
Participant

I cannot read sk136352 - Common Azure HTML API Error codes in Azure CloudGuard or CloudGuard Controller logs

Could this be the one you are thinking of?

0 Kudos
Gil_Sudai
Employee

We have retired the debug on and debug full options as they generate too much noise.

You can enable debug for specific parts of the product in the $VSECDIR/lib/log4j.properties (or .xml) file.

Depending on what you want to debug, change the relevant lines from error to debug or trace.

The output is written to the $FWDIR/log/cloud_proxy.elg file.
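
As a rough illustration of the edit described above, here is a minimal Python sketch that rewrites the level of a single logger in log4j 1.x-style properties text. The `log4j.logger.<name>=<level>` layout, the sample content, and the helper name `set_logger_level` are assumptions made for illustration. The real file on a management server may look different (it may be the .xml variant, for example), so inspect it and keep a backup before changing anything.

```python
import re


def set_logger_level(properties_text, logger_name, new_level):
    """Return properties_text with the level of one log4j.logger.<name> entry replaced.

    Only the first token after '=' is changed, so an appender list such as
    ', file' that follows the level is kept as-is.
    """
    pattern = re.compile(
        r"^(\s*log4j\.logger\." + re.escape(logger_name) + r"\s*=\s*)(\w+)(.*)$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new_level + m.group(3), properties_text)


# Hypothetical sample content, NOT copied from a real log4j.properties file.
# The package name is taken from the stack traces quoted in this thread.
SAMPLE = (
    "log4j.rootLogger=error, file\n"
    "log4j.logger.com.checkpoint.datacenter.scanner=error\n"
)

updated = set_logger_level(SAMPLE, "com.checkpoint.datacenter.scanner", "debug")
print(updated)
```

The function works on text rather than on the file itself, so the result can be reviewed before anything is written back.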

What is the issue you need to debug?

0 Kudos
Moberg
Participant

@Gil_Sudai I am having two issues. The CMA is not able to automatically fetch dynamic objects from Azure, and when I want to import new Azure objects in SmartConsole I get a blank dialog without any objects. After a while I get the error message "The Data Center is still initializing, it may take a moment. Please try again later."

As a work around I can from SmartConsole "

Objects started getting imported from Azure after performing an "Install database" on the management server.
The DC Scanner should fetch this automatically.

We changed azure.connectTimeoutInMilliseconds=60000 to azure.connectTimeoutInMilliseconds=6000000 in the file $FWDIR/conf/vsec.conf, which didn't help.

I have been following the ATRG: CloudGuard Controller guide (sk115657), which shows no errors while troubleshooting.

I still get the following error in cloud_proxy.elg:

[Expert@mgmt:0]# tail -f cloud_proxy.elg
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
16/03/23 09:39:14,490 ERROR datacenter.scanner.DcScanner [scanner-Azure-207686665]: Mapping of Data Center Azure [Application id xxxxxx, directory id xxxxxx] failed . Next mapping is in 300 seconds.
^C

 

I am also getting another issue, found in cloud_proxy.elg:

 

20/03/23 13:09:23,598 ERROR enforcement.amon.AmonRequestGwsStatusManager [amon-request-sender:30012]: Failed to send Amon request to FWM port 30012
javax.xml.ws.WebServiceException: Could not send Message.
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:150)
at com.sun.proxy.$Proxy54.getAndUpdate(Unknown Source)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager.sendAmonRequest(AmonRequestGwsStatusManager.java:28)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager.access$100(AmonRequestGwsStatusManager.java:37)
at com.checkpoint.datacenter.enforcement.amon.AmonRequestGwsStatusManager$AmonRequestRunner.run(AmonRequestGwsStatusManager.java:12)
at java.lang.Thread.run(Thread.java:820)
Caused by: java.net.ConnectException: ConnectException invoking https://localhost:30012/amonstatus_service: Connection refused (Connection refused)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:83)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:57)
at java.lang.reflect.Constructor.newInstance(Constructor.java:437)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.mapException(HTTPConduit.java:1365)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.close(HTTPConduit.java:1349)
at org.apache.cxf.transport.AbstractConduit.close(AbstractConduit.java:56)
at org.apache.cxf.transport.http.HTTPConduit.close(HTTPConduit.java:652)
at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:62)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
at org.apache.cxf.endpoint.ClientImpl.doInvoke(ClientImpl.java:516)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:425)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:326)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:279)
at org.apache.cxf.frontend.ClientProxy.invokeSync(ClientProxy.java:96)
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:139)
... 5 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:380)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:236)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:218)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.net.Socket.connect(Socket.java:682)
at com.ibm.jsse2.av.connect(av.java:694)
at sun.net.NetworkClient.doConnect(NetworkClient.java:187)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:494)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:589)
at com.ibm.net.ssl.www2.protocol.https.c.<init>(c.java:193)
at com.ibm.net.ssl.www2.protocol.https.c.a(c.java:204)
at com.ibm.net.ssl.www2.protocol.https.d.getNewHttpClient(d.java:70)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1174)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1068)
at com.ibm.net.ssl.www2.protocol.https.d.connect(d.java:71)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1352)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1327)
at com.ibm.net.ssl.www2.protocol.https.b.getOutputStream(b.java:82)
at org.apache.cxf.transport.http.URLConnectionHTTPConduit$URLConnectionWrappedOutputStream.setupWrappedStream(URLConnectionHTTPConduit.java:183)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.handleHeadersTrustCaching(HTTPConduit.java:1308)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.onFirstWrite(HTTPConduit.java:1268)
at org.apache.cxf.transport.http.URLConnectionHTTPConduit$URLConnectionWrappedOutputStream.onFirstWrite(URLConnectionHTTPConduit.java:210)
at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:47)
at org.apache.cxf.io.AbstractThresholdOutputStream.write(AbstractThresholdOutputStream.java:69)
at org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream.close(HTTPConduit.java:1321)
... 15 more
[Expert@mgmt:0]#

 

The issue above was found while looking into cloud_proxy.elg.

0 Kudos
Gil_Sudai
Employee

After editing vsec.conf you need to run "vsec stop; vsec start" .

 

Right above the Azure errors there should also be Python errors. What do they say?

In the CloudGuard Controller ATRG SK there is a command line for running the Azure scanning code directly. Please try it. What error do you get?

0 Kudos
Kim_Moberg
Advisor

Sorry, yes, I did restart the service with "vsec stop; vsec start" and waited 30 minutes, because the DC scanner runs every 30 minutes.

I had to quickly extract the information from the logs. I can send you the SR ticket number, if you can access that?

Best Regards
Kim
Gil_Sudai
Employee

Please ask our support engineer who handles the SR to contact me or my team.

Moberg
Participant

I was on CheckMates with the wrong account.

Thanks. I have asked your support engineer to reach out to you or your team.

I am missing a better way of troubleshooting the CloudGuard Controller. The ATRG CloudGuard Controller guide does provide some initial guidance.

The Azure debug query did curl and fetch Azure objects without any issues (no errors), yet cloud_proxy.elg still shows an exception stack trace, and TAC support was searching east and west in areas that, in my eyes, are not related to the issue.

Just searching for cloud_proxy.elg in Support Center doesn't provide any hints at all.

If you would like feedback on this, please don't hesitate to reach out to me. I have been working closely with R&D team managers.

 

0 Kudos
Moberg
Participant

This is the whole exception found in cloud_proxy.elg:

23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: java.util.concurrent.TimeoutException
23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process
23/03/23 09:48:05,776 ERROR util.process.ProcessExecutor [pool-2136-thread-1]: protectedWait: java.lang.InterruptedException
23/03/23 09:48:06,118 ERROR util.process.ProcessExecutor [Thread-235]: ProcessStreamReader: stderr - run: java.io.IOException: Stream closed
23/03/23 09:48:06,120 ERROR scanner.azure.AzureDeployment [scanner-Azure-2065078223]: com.checkpoint.datacenter.util.exception.ProcessExecutionException: Failed running process
23/03/23 09:48:06,120 ERROR datacenter.scanner.DcScanner [scanner-Azure-2065078223]: Error during scan - attempting to reconnect for scanner Azure [Application id xxxxxx, directory id xxxxxx]
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
23/03/23 09:48:06,121 ERROR scanner.util.DcScannerUtils [scanner-Azure-2065078223]: Exception while connecting to Azure [Application id xxxxxx, directory id xxxxx]. Return unknown problem.
com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem
at com.checkpoint.datacenter.scanner.azure.AzureDeployment.getAzureResponse(AzureDeployment.java:223)
at com.checkpoint.datacenter.scanner.azure.AzureScanner.innerRun(AzureScanner.java:135)
at com.checkpoint.datacenter.scanner.DcScanner.run(DcScanner.java:120)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:820)
23/03/23 09:48:06,121 ERROR datacenter.scanner.DcScanner [scanner-Azure-2065078223]: Mapping of Data Center Azure [Application id xxxxxx, directory id xxxxxx] failed . Next mapping is in 300 seconds.
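
The header lines in the excerpt above follow a regular "date time LEVEL logger [thread]: message" pattern, so they can be summarized with a few lines of Python. This is only a sketch: the regular expression is inferred from the lines quoted in this thread, not from any documented format, and the helper names `parse_line` and `count_by_logger` are made up for illustration.

```python
import re
from collections import Counter

# Pattern assumed from the log lines quoted in this thread:
# "DD/MM/YY HH:MM:SS,mmm LEVEL logger [thread]: message"
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) \[(?P<thread>[^\]]+)\]: (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of a log header line, or None for stack-trace and other lines."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def count_by_logger(lines, level="ERROR"):
    """Count header lines of the given level, grouped by logger name."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == level:
            counts[entry["logger"]] += 1
    return counts


# Lines copied from the excerpt above; the last two are stack-trace lines.
SAMPLE = [
    "23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process",
    "23/03/23 09:48:06,120 ERROR scanner.azure.AzureDeployment [scanner-Azure-2065078223]: com.checkpoint.datacenter.util.exception.ProcessExecutionException: Failed running process",
    "com.checkpoint.datacenter.util.exception.UnknownProblemException: Failed querying Azure, unknown problem",
    "at java.lang.Thread.run(Thread.java:820)",
]

for logger, count in count_by_logger(SAMPLE).most_common():
    print(count, logger)
```

To run it against the real file, pass an open file handle for $FWDIR/log/cloud_proxy.elg to `count_by_logger` instead of `SAMPLE`.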

0 Kudos
Kim_Moberg
Advisor

@Gil_Sudai 

I have been working on the troubleshooting by editing the specific parts of the product in the $VSECDIR/lib/log4j.properties (or .xml) file.

Do all of the parts support ERROR, DEBUG and TRACE, or do some of the specific parts of the product in $VSECDIR/lib/log4j.properties support only ERROR and DEBUG but not TRACE?

How long does it take for a change in $VSECDIR/lib/log4j.properties to take effect and show up in the $FWDIR/log/cloud_proxy.elg file?

Do I need to run "vsec stop; vsec start" after every change?

BR

Kim

Best Regards
Kim
0 Kudos
Gil_Sudai
Employee

Hi.

All parts support trace > debug > info > error.

According to the data you sent: 

23/03/23 09:48:05,775 ERROR util.process.ProcessExecutor [scanner-Azure-2065078223]: Timeout reached: 1200 seconds, killing process

It looks like the Azure scanning takes longer than 1200 seconds.

Can you confirm that by running the vsec.py command line manually and checking how much time it takes? You can use 'date' and such.
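
One way to do the timing without reading 'date' output by hand is a small wrapper like the sketch below. It times an arbitrary command and compares the result with the 1200-second limit from the log line above. The helper names are hypothetical, and the actual vsec.py command line is not reproduced here; take it from the ATRG and pass it in as the argument list.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv, wait for it to finish, and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    completed = subprocess.run(argv, check=False)
    return completed.returncode, time.monotonic() - start


def exceeds_timeout(elapsed_seconds, timeout_seconds=1200):
    """True if the run took longer than the timeout reported in cloud_proxy.elg."""
    return elapsed_seconds > timeout_seconds


# Harmless demo command; replace the list with the command line from the ATRG.
exit_code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit code {exit_code}, took {elapsed:.1f} s, over limit: {exceeds_timeout(elapsed)}")
```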

I suggest that you edit vsec.conf and change the entry (or add it if missing)

azure.connectTimeoutInMilliseconds

to a high value. For example:

azure.connectTimeoutInMilliseconds=5000000

Then run "vsec stop;vsec start" to restart the CloudGuard Controller process.

Does it help?

0 Kudos
Kim_Moberg
Advisor

Hi @Gil_Sudai 

That's true: raising the value of azure.connectTimeoutInMilliseconds in $FWDIR/conf/vsec.conf solved the timeout. However, the value 5000000 is too high, and the system cannot figure out how to handle it. I lowered the value to 600000, and that seems to have solved the "Timeout reached: 1200 seconds, killing process" error in cloud_proxy.elg.
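
Since the setting is in milliseconds while the log message is in seconds, a quick conversion makes the values tried in this thread easier to compare. This is plain arithmetic; the helper name `describe_timeout` is made up for illustration, and nothing here says how the controller itself interprets the setting.

```python
def describe_timeout(milliseconds):
    """Render a millisecond value as seconds and minutes for easier comparison."""
    seconds = milliseconds / 1000
    minutes = seconds / 60
    return f"{milliseconds} ms = {seconds:g} s = {minutes:.1f} min"


# Values mentioned in this thread for azure.connectTimeoutInMilliseconds.
for value in (60000, 600000, 5000000, 6000000):
    print(describe_timeout(value))
```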

What about the RequestGwsStatusManager function: how does it work?

enforcement.amon.AmonRequestGwsStatusManager [amon-request-sender:30012]: Failed to send Amon request to FWM port 30012

 

When I run tcpdump on the mgmt, open another SSH session to the same host, and run "telnet 127.0.0.1 30012", I see the connection closed and output in the tcpdump capture. If I keep tcpdump running, I see multiple attempts that match the date/time from cloud_proxy.elg.
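
The same check can be scripted. The sketch below only tests whether anything accepts TCP connections on the port named in the "Connection refused" stack trace earlier in the thread. It does not speak the HTTPS/SOAP protocol used by the Amon service, and the helper name `port_accepts_connections` is hypothetical.

```python
import socket


def port_accepts_connections(host="127.0.0.1", port=30012, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts and unreachable hosts.
        return False


print("port 30012 reachable:", port_accepts_connections())
```

A result of False here is consistent with the ConnectException in cloud_proxy.elg, which suggests nothing was listening on that port at the time.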

Does RequestGwsStatusManager search for a Security Gateway that is blocked or not responding, or how does it work?

Br

Kim

Best Regards
Kim
0 Kudos
Gil_Sudai
Employee

Try restarting the FWM app. Did that fix it? You can use the "cpwd_admin" command on the mgmt for that; running "cpwd_admin" by itself also prints usage examples.

0 Kudos
