Arik_Ovtracht
Employee

Skyline - a new monitoring solution for Check Point devices - on EA now

Hi,

I am excited to announce the availability of Skyline - Check Point’s new solution for real-time monitoring of the Quantum Family devices.

Skyline uses modern technologies (based on OpenTelemetry) to report telemetry data from Check Point devices, and is designed to fit your existing monitoring environments - or you can create a simple new monitoring server using Prometheus and Grafana.
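As a rough illustration of the Prometheus side of such a setup, here is a minimal Python sketch that builds an instant-query URL for the standard Prometheus HTTP API (`/api/v1/query`) and parses the JSON it returns. The host name, the metric name `cpu_usage_percent`, and the use of the `instance` label are assumptions made for the example; they are not taken from sk178566.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return the Prometheus instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict:
    """Map each series' 'instance' label to its latest sample value."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed")
    values = {}
    for series in payload["data"]["result"]:
        # Series without an 'instance' label are grouped under 'unknown'.
        instance = series["metric"].get("instance", "unknown")
        _timestamp, raw_value = series["value"]  # the value arrives as a string
        values[instance] = float(raw_value)
    return values


# Hypothetical server address and metric name, for illustration only.
url = build_query_url("http://prometheus.example:9090/", "cpu_usage_percent")
```

The URL can be fetched with any HTTP client, and the response body passed to `parse_instant_vector`.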

 

You can view a short presentation + demo of Skyline in this video.

More details on Skyline and how to set it up can be found in sk178566.

Disclaimer: This Early Availability version reports a basic set of monitoring data, which will be expanded in the future.

Please contact me for any questions on Skyline.


Accepted Solutions
Arik_Ovtracht
Employee

Hi,

I am excited to announce the release of Skyline - Check Point’s new solution for real-time monitoring of the Quantum Family devices!

Skyline has been released as part of the ongoing R81.10 JHF take 79, and will be released for versions R80.40 and R81 within the relevant JHF takes in the following weeks.

For more information about Skyline, and download links for some sample Grafana dashboards, see sk178566


64 Replies
Daniel_
Advisor

Looks good.

Do you know netdata? I am still dreaming of such a monitoring solution on Check Point. Not only can you see high CPU usage at one-second resolution, you also have a chance to find out which process is causing it. Even if it's running in Docker...

https://learn.netdata.cloud/docs/agent/demo-sites

Arik_Ovtracht
Employee

Yes, Skyline is quite similar to netdata, but more geared toward Check Point devices and their relevant data.

Yunusyavuz
Explorer

Very nice. Thank you for the explanation.
Where can I get the datasheet?

Yunus YAVUZ

yunus.yavuz@neteks.com.tr

_Val_
Admin

Datasheet of what? The SK is linked above, and if you have more questions, ask here.

Kivanc
Participant

Hi 

 

Does it support Maestro?

Thanks

Kıvanç

Arik_Ovtracht
Employee

Yes, Maestro will be fully supported in the GA version.

In the EA version, Skyline works on Maestro devices, but there aren't any Maestro-specific meaningful metrics in the exported data yet.

JozkoMrkvicka
Mentor

It was mentioned in the video, but it would be really great if, in the near future, Skyline could monitor specific Virtual Systems on VSX as separate "devices", rather than all VSs together as a whole for the VSX box.

Will there be support for SNMP queries? Some data is not available in CPview but is available via SNMP. If so, will it support SNMPv3?

Is the idea to push monitoring data FROM the monitored device TO the monitoring server? What if the monitored device is too busy to handle such traffic? Real production traffic should be preferred over monitoring traffic. Are there any safeguards in this area?

Kind regards,
Jozko Mrkvicka
Arik_Ovtracht
Employee

VSX is supported already, and you can select a specific VS and view it as a separate device even with the EA version.

We don't plan to support SNMP, as the new OpenTelemetry method is much better and is designed to replace SNMP. However, even data that is not currently in CPview can be added to Skyline. If you have a list of such data that you would like to see in Skyline, please tell me!

Regarding your question about traffic handling - we are considering saving a local cache of data in cases where the device is not able to send it out due to load. This is not planned to be part of the GA version, but probably in one of the next versions.
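To make the local-cache idea concrete, here is a minimal Python sketch of a bounded buffer that holds samples while sending fails and flushes them in order once sending succeeds again. This only illustrates the general technique; the class name, the drop-oldest policy, and the `send` callback are assumptions for the example and do not describe how Skyline is or will be implemented.

```python
from collections import deque


class SampleBuffer:
    """Bounded FIFO buffer that keeps the newest samples while sending fails."""

    def __init__(self, max_samples: int):
        # With maxlen set, appending to a full deque drops the oldest entry.
        self._pending = deque(maxlen=max_samples)

    def add(self, sample) -> None:
        self._pending.append(sample)

    def flush(self, send) -> int:
        """Send pending samples oldest-first; stop at the first failure.

        `send` is a hypothetical callback returning True on success.
        Returns the number of samples sent.
        """
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break  # keep the sample and retry on the next flush
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Bounding the buffer means monitoring data can never grow without limit on a device that is already under load; the trade-off is that the oldest samples are lost if the outage lasts longer than the buffer can cover.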

Mikael
Collaborator

The SK mentions :

  • Skyline deployment on a VSX Gateway / VSX Cluster with many Virtual Systems may increase the load on CPU cores.

Can you please share some more information on the environment this was observed in?

How many cores / VS's / amount of traffic...

Anything that can help us predict the behaviour would be good...

Cheers

Arik_Ovtracht
Employee

I don't have the exact numbers yet; we are still testing it in multiple environments with different numbers of VSs to determine them. We will share them in the SK once we have them.

Vincent_Bacher
Advisor

Did you already test on systems with many VSs? We have up to 30.

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arik_Ovtracht
Employee

A new version of Skyline will soon be released, where we have removed the limitation of number of VSs.

We have tested it successfully for 25 VSs, and believe that the actual limit is much higher, and depends mostly on the strength of the VSX device.

Vincent_Bacher
Advisor

That's fine. Now a port to R80.20SP would be interesting, even if this release is out of support soon. But we cannot upgrade to R81* because of lack of support for MSG on R81*

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Franktum
Contributor

Hi!

Do you have an update on the CPU core load on VSX clusters? The SK still shows the message that Mikael mentioned.

Thanks

Elad_Chomsky
Employee

Hi @Franktum ,

Since then (Jan 23), various improvements have been pushed to Skyline to mitigate this issue. The new version of Skyline in the latest Jumbos was tested internally in our labs with 50-75 VSs, and it worked. Based on that, we assume there is currently no known reason for it not to work with a larger number of VSs.

Please see the sk updated text:

Skyline deployment on a VSX Gateway / VSX Cluster with many Virtual Systems may increase the load on CPU cores

We do emphasize that, naturally, the more VSs you have, the more data you send as part of the Skyline flow, so you should be aware of the increased resource consumption. However, that does not imply it will not work.

Thanks, Elad

JozkoMrkvicka
Mentor

If there is no plan to integrate classic SNMP handling, then it would be great to have all data from the SNMP MIBs available via the OpenTelemetry method.

For example, I don't see any telemetry related to:

1. VPN (neither Remote Access nor S2S VPN)

2. HW health (RAID status, HW sensors status (PSU, voltage, fan, temperature, ...) )

3. ClusterXL status

4. Logging statistics

 

How often is the data transferred from the gateway? Every 5 minutes? Every second? Can we set the frequency based on our needs? What if I want to send an email immediately when something happens on the gateway/management? SNMP traps were used for such cases. Let's say I want Skyline to raise an alert and notify me as soon as an interface goes down.

Kind regards,
Jozko Mrkvicka
Arik_Ovtracht
Employee

Our plan is to add important metrics to Skyline, including VPN, HW health, ClusterXL and Logging stats, and much more. We are adding these statistics as an ongoing effort. The GA version will include some of them, and we are going to add more after the release as well.

The data is sampled every 15 seconds, and we currently don't allow users to control this period. We might add the ability to do that in a later version.
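For a sense of scale, the fixed 15-second period translates into sample counts as follows. This is a small Python sketch of the arithmetic only; the window lengths are illustrative, and the counts are per individual time series.

```python
SAMPLE_INTERVAL_SECONDS = 15  # sampling period stated above


def samples_per_series(window_seconds: int,
                       interval_seconds: int = SAMPLE_INTERVAL_SECONDS) -> int:
    """Number of samples one time series produces within a time window."""
    return window_seconds // interval_seconds


per_minute = samples_per_series(60)        # 60 / 15 = 4
per_hour = samples_per_series(60 * 60)     # 3600 / 15 = 240
per_day = samples_per_series(24 * 3600)    # 86400 / 15 = 5760
```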

You can use any monitoring tool's alerting capabilities to do what you asked - for example, Grafana has some alerts which you can use to send a notification via email, sms and more.
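Independent of the tool, an "interface went down" alert boils down to spotting an up-to-down transition in the sampled series. The Python sketch below shows that logic on a list of `(timestamp, state)` pairs. The data layout and the convention that 1 means up and 0 means down are assumptions for illustration, not Skyline's actual metric format; in practice the monitoring tool's own alert rules would do this work.

```python
def interface_down_events(samples):
    """Return the timestamps at which the state changes from up to down.

    `samples` is an iterable of (timestamp, state) pairs ordered by time,
    where a truthy state means "up" (hypothetical layout).
    """
    events = []
    previous = None  # unknown before the first sample
    for timestamp, is_up in samples:
        if previous is True and not is_up:
            events.append(timestamp)
        previous = bool(is_up)
    return events


# Illustrative series sampled every 15 seconds.
history = [(0, 1), (15, 1), (30, 0), (45, 0), (60, 1), (75, 0)]
down_at = interface_down_events(history)
```

With 15-second sampling, a transition can only be noticed once the next sample arrives, so a notification will lag the actual event by up to one sampling period plus whatever evaluation delay the alerting tool adds.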

 

Aaron_Vivadelli
Contributor

First off, this is really awesome.

I know it's still beta, so I'm testing in my home lab, but I had issues with some combination of R81.10 Take 55, this package, and DHCP relay. routed kept crashing and restarting, and it broke my DHCP relay. I also wasn't getting CPU, memory, or disk data from cpview, so although some data was making it into Prometheus, the Grafana templates weren't working because the instance name wasn't recognized (I also wasn't getting anything from system.os.configuration_info). More importantly, though, DHCP wasn't working.

I uninstalled Take 55, and that got my DHCP relay working again. cpview also started displaying stats, but when I tried to start Skyline it said the 'cpview -a' flag wasn't recognized. I figured something was wrong with Skyline now, so I planned to reinstall it. The uninstall ran fine, but when it rebooted, the 'defaultfilter' policy was loaded and I had to console in. I tried cpstart, and fwstart was failing. I decided to reinstall Skyline, and this is where I finally had success.

I'm still running R81.10 Take 44, but Skyline is installed and working, and so is my DHCP relay, with routed stable. Production users, be warned: this was a frustrating process with hard-down scenarios. I'm not sure whether this was Take 55, the Skyline beta package, or a combination of the two.

Now, for my recommendation/request. I started using Zabbix for SNMP monitoring. Someone developed a Zabbix plugin for Grafana that lets you easily create queries in Grafana. You select the group, the host, the metric category, and the metric itself. It even auto-populates all the possible entries. From what I can tell, creating your own graphs and queries in Skyline relies heavily on the query language and requires proper syntax. It would be nice to have a plugin to help create these queries.
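Until such a plugin exists, a small helper can at least take care of the syntax. The following Python sketch assembles a PromQL selector of the form `metric{label="value",...}` from a metric name and label values, escaping the characters that would otherwise break the quoted strings. The metric and label names in the example are hypothetical, not actual Skyline metric names.

```python
def promql_selector(metric: str, **labels: str) -> str:
    """Build a PromQL instant-vector selector from a metric name and labels."""
    if not labels:
        return metric

    def escape(value: str) -> str:
        # Escape backslashes first, then quotes and newlines, so the value
        # stays valid inside a double-quoted PromQL string.
        return (value.replace("\\", "\\\\")
                     .replace('"', '\\"')
                     .replace("\n", "\\n"))

    # Sort labels so the same input always yields the same query text.
    matchers = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}"


# Hypothetical metric and label names, for illustration only.
query = promql_selector("cpu_usage_percent", instance="gw-1", vs="3")
```

The resulting string can be pasted into a Grafana panel's query field or sent to the Prometheus HTTP API.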

Excellent work!

Aaron_Vivadelli
Contributor

My installation is also a cluster and both members had the same exact issue and were resolved in the same exact way.  Each step was able to be replicated exactly.

Arik_Ovtracht
Employee

Hi Aaron,

We are aware of certain issues which happen when combining newer takes of the Jumbo with Skyline, and for that reason we have removed the Skyline installation hotfix from the sk for the moment.

As far as we know, it only happens when installing Skyline on top of the newer takes of Jumbo (newer than the minimum ones listed in the sk), or when installing a newer Jumbo take on top of Skyline (which should not have been allowed, that is one of the issues).

We will publish a newer version of Skyline (still beta) soon, which will fix those issues.

In the meantime, I recommend installing Skyline only on top of the Jumbo takes listed in the sk, and not installing any Jumbo takes on top of it.

Sorry for the inconvenience, and glad to hear that you still think it's good!

JozkoMrkvicka
Mentor

Hi Arik,

I see the Skyline homepage has been updated, but the link to download the package is still not working. I am fine with installing Skyline only on top of Take 30 of R81.10, just for home testing. Would you please check it?

Thanks.

Kind regards,
Jozko Mrkvicka
Arik_Ovtracht
Employee

Yes, we removed the Skyline hotfix from the download server after we discovered that there is a bug which allows users to install newer Jumbo Hotfix takes on top of it, which could cause issues with Skyline.

Please contact me directly at ariko@checkpoint.com and I can send you the hotfix installation file.

Oliver_Fink
Advisor

It would be really nice if there were a short notice that the downloads have been removed. Just deleting the files and leaving the user with a broken download link does not seem to be good practice.

_Val_
Admin

We have notified the relevant team about this and requested that such a note be shown until the package is back.

JozkoMrkvicka
Mentor

As of June 20, the beta Skyline packages are available for download again.

Kind regards,
Jozko Mrkvicka
JSG1
Explorer

Game changer... how many times have we all been in a situation where we needed live stats! Having to log into the gateway to retrieve them isn't scalable. This is exactly what is needed in production environments. SmartView Monitor is very dated, and although it's good for getting some data in a pinch, it's not good enough for 24/7 monitoring. Looking forward to when this becomes GA. Great work!

Some ideas for metrics to monitor:

- CoreXL statistics - it would be great to see dynamic dispatcher stats (fw ctl multik stat / vpn tu mstats)

- Route changes / BGP / OSPF neighbour changes.

- ClusterXL monitoring

- Process statistics (CPU and RAM consumption)

- Active debugs running

milunb
Participant

Hello,

Do you plan to update Skyline so that it can work with newer Jumbo Hotfix takes, for example Take 66?

Arik_Ovtracht
Employee

Hi,

We are very near to releasing the first Skyline GA version, and it will be released as part of one of the next Jumbo takes for each supported version (currently R80.40, R81 and R81.10). 

They are all planned to be released during October 2022.

milunb
Participant

Thank you very much for the information.
