Arik_Ovtracht
Employee

Skyline - a new monitoring solution for Check Point devices - on EA now

Hi,

I am excited to announce the availability of Skyline - Check Point’s new solution for real-time monitoring of the Quantum Family devices.

Skyline uses modern technologies (based on OpenTelemetry) to report telemetry data from Check Point devices, and is designed to fit your existing monitoring environments - or you can create a simple new monitoring server using Prometheus and Grafana.

 

You can view a short presentation + demo of Skyline in this video.

More details on Skyline and how to set it up can be found in sk178566.

Disclaimer: This Early Availability version reports a basic set of monitoring data, which will be expanded in the future.

Please contact me with any questions about Skyline.

(1)
65 Replies
JozkoMrkvicka
Mentor

Just to link the upcoming TechTalk happening on September 28, 2022, related to Skyline here:

https://community.checkpoint.com/t5/CheckMates-Events/Introduction-to-Skyline-a-new-monitoring-solut...

Kind regards,
Jozko Mrkvicka
0 Kudos
Arik_Ovtracht
Employee

Hi,

I am excited to announce the release of Skyline - Check Point’s new solution for real-time monitoring of the Quantum Family devices!

Skyline has been released as part of the ongoing R81.10 JHF take 79, and will be released for versions R80.40 and R81 within the relevant JHF takes in the following weeks.

For more information about Skyline, and download links for some sample Grafana dashboards, see sk178566

JozkoMrkvicka
Mentor

Can you please elaborate on the following two known limitations:

  • "On a VSX Gateway / VSX Cluster, Skyline shows the information only for the context VS0 (the VSX Gateway / VSX Cluster Member itself)." Does this mean that there is no data from any of the configured VSs, only from VS0?
  • "Skyline deployment on a VSX Gateway / VSX Cluster with many Virtual Systems may increase the load on CPU cores. It is recommended to install it only on systems of up to 10 Virtual Systems." When will this limitation be fixed?

Kind regards,
Jozko Mrkvicka
0 Kudos
Arik_Ovtracht
Employee
  • There will be data from VSs other than VS0; however, due to a bug, that data will be the same as the data from VS0. This bug affects the current R81.10 and R81 releases of Skyline, and will be fixed in the next JHF take on those versions. The R80.40 version (which is not yet released) will not contain this bug.
  • The limitation of up to 10 VSs applies to all versions at the moment, but we are currently working to remove it. I hope we will be able to remove it fairly soon, possibly in a couple of months.
Simon_Macpherso
Advisor

Hi @Arik_Ovtracht , is Maestro (R81.10 JHF79) fully supported in the GA version? 

If I run tcpdump on the Prometheus server for a few minutes, only the following metrics are received:

.__name__..skyline_build_info
.__name__..system_uptime
.__name__..target
.__name__..cpview_info
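
The metric names that Prometheus has actually stored can also be listed through its HTTP API instead of a packet capture. The sketch below is a minimal example, assuming Prometheus is reachable at a base URL such as http://localhost:9090 (a hypothetical address, not taken from this thread). The path /api/v1/label/__name__/values is the standard Prometheus endpoint for label values; the helper names are illustrative only.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def metric_names_url(base_url: str) -> str:
    """Build the URL of the Prometheus endpoint that lists all metric names."""
    return urljoin(base_url.rstrip("/") + "/", "api/v1/label/__name__/values")


def parse_metric_names(body: str) -> list[str]:
    """Extract the sorted metric names from the endpoint's JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus returned status {payload.get('status')!r}")
    return sorted(payload["data"])


def fetch_metric_names(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query a live Prometheus server (needs network access to it)."""
    with urlopen(metric_names_url(base_url), timeout=timeout) as resp:
        return parse_metric_names(resp.read().decode("utf-8"))


# Usage (hypothetical address):
#   for name in fetch_metric_names("http://localhost:9090"):
#       print(name)
```

If only a handful of names come back, the gap is on the sending side or in transit rather than in the dashboards.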

Regards,

Simon

0 Kudos
Arik_Ovtracht
Employee

Same answer as in https://community.checkpoint.com/t5/Security-Gateways/Skyline/m-p/167360/emcs_t/S2h8ZW1haWx8c3Vic2Ny...

Yes, they are fully supported.

Would it be possible for me to investigate your environment, to get a better understanding of the issue?

You can contact me directly at ariko@checkpoint.com 

0 Kudos
Simon_Macpherso
Advisor

I have emailed you.

There is definitely an issue, either specific to our environment or with support for Maestro.

I upgraded an HA cluster to R81.10 JHF 81 today and all expected metrics are received in Prometheus.   

0 Kudos
Arik_Ovtracht
Employee

Same answer as https://community.checkpoint.com/t5/Security-Gateways/Skyline/m-p/167371/emcs_t/S2h8ZW1haWx8c3Vic2Ny...

Yes, they are fully supported.

Would it be possible for me to investigate your environment, to get a better understanding of the issue?

You can contact me directly at ariko@checkpoint.com 

0 Kudos
milunb
Participant

Hi,
Can you help me stop Skyline?

-Note: To disable Skyline completely, change the "enabled" attribute inside the payload file (the top one), to "false" and re-run the script.-
I've tried following this note, but Skyline is still running.
#
{
"enabled": "false",
"export-targets": {"add": [
{
"enabled": true,
"type": "prometheus-remote-write",
"url": "http://x.x.x.x:9090/api/v1/write"
}
]}
}
#
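
One thing that may be worth checking in the payload above: the top-level "enabled" is the string "false", while the nested one is the JSON boolean true. Whether the script treats the string differently from the boolean is an assumption, not something confirmed in this thread. The sketch below (the helper name string_booleans is hypothetical) flags such values so they can be replaced with a plain true or false.

```python
import json

# Payload as posted above (the URL is the poster's placeholder).
PAYLOAD = """
{
  "enabled": "false",
  "export-targets": {"add": [
    {
      "enabled": true,
      "type": "prometheus-remote-write",
      "url": "http://x.x.x.x:9090/api/v1/write"
    }
  ]}
}
"""


def string_booleans(node, path=""):
    """Yield paths of values that are the strings "true"/"false" instead of JSON booleans."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from string_booleans(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from string_booleans(value, f"{path}/{index}")
    elif isinstance(node, str) and node.lower() in ("true", "false"):
        yield path


suspicious = list(string_booleans(json.loads(PAYLOAD)))
print(suspicious)
```

For the posted payload this reports the top-level /enabled path only.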

Thanks

0 Kudos
Elad_Chomsky
Employee

Hi @milunb , 

Please contact me privately at eladch@checkpoint.com, so we can try to assist you with your issue.

Thanks, Elad

0 Kudos
Norbert_Bohusch
Advisor

Hi,

Has anything changed for R81.20? After upgrading my test environment from R81.10 JHF 79, I had to reconfigure Skyline via the REST.py script. The first issue was that the script could not find Python, which I was able to fix myself.

Afterwards, the exporter starts and seems to connect (netstat shows the port as established and tcpdump shows some packets), but the only data I now see in the dashboards is the uptime!

Here is the output from my test configuration:

[Expert@cp-mgmt:0]# ./GetOTDynamicConfig.sh
{"exporters": {"prometheusremotewrite": {"endpoint": "http://10.2.232.82:9090/api/v1/write", "tls": {"ca_file": "/opt/CPshrd-R81.20/conf/ca-bundle.crt"}}}, "service": {"pipelines": {"metrics": {"exporters": ["prometheusremotewrite"]}}}}

./REST.py --show_open_telemetry
{"export-targets": [{"enabled": true, "type": "prometheus-remote-write", "url": "http://10.2.232.82:9090/api/v1/write", "client-auth": {"token": {"header-bearer-token": "N/A", "custom-header": {"key": "N/A", "value": "N/A"}}, "basic": {"username": "N/A", "password": "N/A"}}, "server-auth": {"ca-public-key": {"type": "Default", "value": "N/A"}}}], "enabled": true}
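
The two JSON outputs above can be cross-checked mechanically. The sketch below is a minimal example with hypothetical helper names: it confirms that the endpoint in the collector's dynamic configuration matches an enabled export target. It only verifies that the configuration is consistent, not that metrics are actually flowing.

```python
import json

# Output of ./GetOTDynamicConfig.sh, as posted above.
DYNAMIC_CONFIG = '''{"exporters": {"prometheusremotewrite": {"endpoint": "http://10.2.232.82:9090/api/v1/write", "tls": {"ca_file": "/opt/CPshrd-R81.20/conf/ca-bundle.crt"}}}, "service": {"pipelines": {"metrics": {"exporters": ["prometheusremotewrite"]}}}}'''

# Output of ./REST.py --show_open_telemetry, as posted above.
SHOW_OUTPUT = '''{"export-targets": [{"enabled": true, "type": "prometheus-remote-write", "url": "http://10.2.232.82:9090/api/v1/write", "client-auth": {"token": {"header-bearer-token": "N/A", "custom-header": {"key": "N/A", "value": "N/A"}}, "basic": {"username": "N/A", "password": "N/A"}}, "server-auth": {"ca-public-key": {"type": "Default", "value": "N/A"}}}], "enabled": true}'''


def configured_targets(show_output: str) -> list[str]:
    """Return the URLs of enabled export targets, or [] if Skyline itself is disabled."""
    cfg = json.loads(show_output)
    if not cfg.get("enabled"):
        return []
    return [t["url"] for t in cfg.get("export-targets", []) if t.get("enabled")]


def collector_endpoint(dynamic_config: str) -> str:
    """Return the remote-write endpoint the collector was started with."""
    cfg = json.loads(dynamic_config)
    return cfg["exporters"]["prometheusremotewrite"]["endpoint"]


def config_is_consistent(show_output: str, dynamic_config: str) -> bool:
    """True when the collector's endpoint is one of the enabled targets."""
    return collector_endpoint(dynamic_config) in configured_targets(show_output)


print(config_is_consistent(SHOW_OUTPUT, DYNAMIC_CONFIG))
```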

[Expert@cp-mgmt:0]# ./CPotelcolCli.sh show
{
"active_after_reboot":"true",
"status":"Collector is up"
}

 

2022-11-22T15:57:09.088+0100 info service/telemetry.go:102 Setting up own telemetry...
2022-11-22T15:57:09.089+0100 info service/telemetry.go:137 Serving Prometheus metrics {"address": ":8888", "level": "basic"}
2022-11-22T15:57:09.090+0100 info extensions/extensions.go:42 Starting extensions...
2022-11-22T15:57:09.090+0100 info extensions/extensions.go:45 Extension is starting... {"kind": "extension", "name": "health_check"}
2022-11-22T15:57:09.090+0100 info healthcheckextension@v0.56.0/healthcheckextension.go:44 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Port":0,"TCPAddr":{"Endpoint":"0.0.0.0:13133"},"Path":"/","CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2022-11-22T15:57:09.120+0100 info extensions/extensions.go:49 Extension started. {"kind": "extension", "name": "health_check"}
2022-11-22T15:57:09.120+0100 info pipelines/pipelines.go:74 Starting exporters...
2022-11-22T15:57:09.120+0100 info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite"}
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:82 Exporter started. {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite"}
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:86 Starting processors...
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:90 Processor is starting... {"kind": "processor", "name": "batch", "pipeline": "metrics"}
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:94 Processor started. {"kind": "processor", "name": "batch", "pipeline": "metrics"}
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:98 Starting receivers...
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:102 Receiver is starting... {"kind": "receiver", "name": "otlp", "pipeline": "metrics"}
2022-11-22T15:57:09.123+0100 info otlpreceiver/otlp.go:70 Starting GRPC server on endpoint /opt/CPotelcol/grpc_otlp.sock {"kind": "receiver", "name": "otlp", "pipeline": "metrics"}
2022-11-22T15:57:09.123+0100 info pipelines/pipelines.go:106 Receiver started. {"kind": "receiver", "name": "otlp", "pipeline": "metrics"}
2022-11-22T15:57:09.123+0100 info healthcheck/handler.go:129 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2022-11-22T15:57:09.123+0100 info service/collector.go:215 Starting otelcol... {"Version": "CPotelcol_0.56.0", "NumCPU": 8}
2022-11-22T15:57:09.123+0100 info service/collector.go:128 Everything is ready. Begin running and processing data.

 

0 Kudos
Arik_Ovtracht
Employee

Hi @Norbert_Bohusch ,

Skyline is not included in the R81.20 GA release; it will be added in one of the first Jumbo Hotfix takes on top of it.

What you have is only part of the Skyline installation, which is installed automatically when upgrading but does not actually transmit any data.

If you would like to use Skyline on top of R81.20, please hold on a few more weeks until we can release the JHF with it.

0 Kudos
Norbert_Bohusch
Advisor

Oh ok,

The folders appear via AutoUpdater a few minutes after the upgrade, and according to the "resolved issues and enhancements" SK, R81.20 is based on R81.10 JHF 79, which includes Skyline. So I thought it should work.

But thanks for clearing this up. Looking forward to the release of Skyline for R81.20 then.

0 Kudos
JozkoMrkvicka
Mentor

Will any new features be added and known limitations resolved in Skyline for R81.20? Or will it be the same version as the current one for R81.10?

Kind regards,
Jozko Mrkvicka
0 Kudos
Arik_Ovtracht
Employee

Skyline has (and will have) the same basic content in all versions. We will release updates to the GA version in all of them, partly via JHFs and partly as automatic updates.

That said, there are some basic differences between the versions, which means that some metrics are only available in certain versions (and later).

RamGuy239
Advisor

@Arik_Ovtracht do you have any estimate of when we can expect at least an on-going JHF for R81.20 that adds support for Skyline? It's been quite some time now, and nothing has been released for R81.20 so far. With GA on November 21st, being in March without even an on-going JHF for R81.20 seems rather strange.

Certifications: CCSA, CCSE, CCSM, CCSM ELITE, CCTA, CCTE, CCVS, CCME
0 Kudos
Arik_Ovtracht
Employee

The R81.20 JHF should be released as on-going next week, including Skyline.

milunb
Participant

Finally, thanks!

0 Kudos
JozkoMrkvicka
Mentor

Can we expect the R81.20 JHF to remove the Skyline limitation of a maximum of 10 VSs per VSX?

Kind regards,
Jozko Mrkvicka
0 Kudos
JozkoMrkvicka
Mentor

I see an update on the Skyline homepage: a maximum of 25 VSs is now supported with R81.10 Take 93.

Kind regards,
Jozko Mrkvicka
0 Kudos
Arik_Ovtracht
Employee

Actually, it was a mistake - we removed the VS number limitation completely 😀

Arik_Ovtracht
Employee

Check out this video tutorial on how to set up Skyline easily:

https://www.youtube.com/watch?v=FO2Rp9x31i0

Huge thanks to @Manning for creating this tutorial!

(1)
Diego_Escobar_A
Employee

Hi CheckMates,

To complement this post, I have taken the liberty of creating a GitHub repository. It defines a Docker Compose stack of three containers: Prometheus, Grafana, and nginx. The nginx container works as a reverse proxy and is responsible for balancing requests to the other two containers, using certificates and HTTPS to secure access.

Here you have:
https://github.com/dearevalillo/easy_telemetry_chkp_majoraccount

grafana_machines_overview.png

grafana_machines_overview2.png

maestro_5.png

Enjoy

darkdefender
Explorer

Hi Diego,

I have run into some problems while deploying your Docker stack.

The Grafana container keeps restarting, and the nginx server returns a "bad request" error.

I think I've made a lot of progress, and if I can get past this I will be able to complete the setup.

0 Kudos
Arik_Ovtracht
Employee

Hi everyone,

I am happy to share that Skyline now supports AWS Managed Prometheus as a data target, in addition to a regular Prometheus server!

Check sk178566 to learn how to configure it to transmit the data to your AWS Managed Prometheus server.

0 Kudos
Arik_Ovtracht
Employee

Hi everyone, 

Our plan is to add Skyline integration with other 3rd-party monitoring tools besides Prometheus, and for that purpose we are conducting a survey - which targets would you like to see Skyline supporting next?

Your answers to this survey will affect our plans, so this is your opportunity to influence it!

Please take the survey here: 

https://forms.office.com/r/rQSPNUB5f6

Sven_Glock
Advisor

Hi Arik,

Are there any plans to let users switch between pushing and pulling metrics?
In my environment, Check Point is the only product that pushes metrics to Prometheus.
For other products, we have to pull, as we did with SNMP.
In general, pulling seems to have more advantages than pushing.
For example, it is easier to control what is returned, and settings such as the pull interval can be defined at one central point.

Could you also share why you decided to push metrics in Skyline?
Maybe I am missing some advantages.

Thanks in advance.
Regards
Sven

0 Kudos
Arik_Ovtracht
Employee

Hi @Sven_Glock ,

We decided to push the data for two main reasons:

1. Communication goes in only one direction, from the monitored device outward, which reduces the risk to the monitored device (no incoming communication means fewer attack vectors).

2. Polling the monitored device, as in SNMP, can be more difficult to control, especially on performance-sensitive devices like Security Gateways. Excessive polling can affect performance, which can lead to traffic drops and similar problems. That is why we chose a fixed interval (15 seconds) and did not allow users to change it, at least for the moment.

That said, I will look into the option of letting users select data scraping instead of pushing the data.
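
To put the fixed 15-second interval in perspective, the arithmetic can be sketched as follows. The series count of 500 in the usage comment is a made-up figure for illustration, not a measured Skyline value, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PUSH_INTERVAL_SECONDS = 15  # fixed Skyline interval stated in the reply above


def samples_per_day(series_count: int, interval_seconds: int = PUSH_INTERVAL_SECONDS) -> int:
    """Return how many samples arrive per day for the given number of time series."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return series_count * (SECONDS_PER_DAY // interval_seconds)


# One series pushed every 15 seconds yields 5760 samples per day.
# A gateway exposing a hypothetical 500 series would send 500 * 5760 = 2,880,000.
print(samples_per_day(1), samples_per_day(500))
```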

0 Kudos
Elad_Chomsky
Employee

Hi @Sven_Glock ,

We are now working on support for Prometheus scraping data from Skyline. Please contact me at eladch@checkpoint.com. I am interested in hearing about your use case, so we can incorporate the feedback into our product.

Thanks, Elad

0 Kudos
Sven_Glock
Advisor

Hi @Elad_Chomsky ,

In the meantime, we had to move from Prometheus to Grafana Mimir.
I guess this is less interesting for you, isn't it?

Regards
Sven

0 Kudos
