Hi guys!
I went through almost every post in this forum section to understand what is going on with our OpenTelemetry/Prometheus/Grafana setup. Some facts:
- Our gateways (3 of them) are on R81.20
- Our Prometheus is already receiving data from many devices other than Check Point without issues, so it is not the problem. Same for Grafana: we use it heavily with many other dashboards.
- I followed the Skyline config doc to deploy it. We are using the noTLS payload. I have already checked the remote write settings many times.
- I checked the outgoing connection with tcpdump from the gateway and also the ingestion on the Prometheus instance. All fine, I can see connections to port 9090 being established.
- For some reason cpview -a does not exist here, only -s
- When I run cpview I get into an old DOS kind of app with all the metrics showing up fine.
So, once I ran sklnctl export --set with the JSON, it all went fine (I also ran it with g_all to apply it to all 3 gateways). I cannot see any errors in the 3 log files related to OpenTelemetry, no issue at all. When I look at the metrics in use, I see all the default ones:
system.uptime
agent.gaia.os.role
agent.network.info
profiles.count
tdlog.debug
psql.size
threadpool.status
revisions.count
agent.connections.bytes
system.network.interface_group_bond_id
admins.count
domains.count
agent.hitcount
deadlock.status
heap.size
I can see data coming into Prometheus. For example, system.uptime and agent.gaia.os.role give me numbers in the explorer. I also installed the dashboards available on the CP website, and in the CP Single Machine dashboard I can see uptime in one of the panels in Grafana.
I went through the Grafana JSON of this and other dashboards, plus the metrics repository, and collected a lot of metric names from them. I wrote the list to a text file, taking care of the _ versus . differences in the names, and ran the command to add those metrics, feeding it the list with cat from the text file.
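For anyone comparing names between the two sides, here is a minimal Python sketch of the "_ and ." conversion. It assumes the usual convention that a dotted OpenTelemetry-style name reaches Prometheus with every character outside [a-zA-Z0-9_:] replaced by an underscore. The function name is hypothetical, and the exact names Skyline emits may differ (for example, an exporter may append unit or type suffixes).

```python
import re


def to_prometheus_name(otel_name: str) -> str:
    """Convert a dotted OpenTelemetry-style name to a Prometheus-style name.

    Assumption: invalid characters simply become underscores. Exporters may
    additionally append unit or type suffixes, which this sketch ignores.
    """
    # Prometheus metric names may only contain letters, digits, '_' and ':'.
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", otel_name.strip())
    # A Prometheus metric name must not start with a digit.
    if name and name[0].isdigit():
        name = "_" + name
    return name


# A few of the default metric names from the list above.
for metric in ("system.uptime", "agent.gaia.os.role",
               "system.network.interface_group_bond_id"):
    print(metric, "->", to_prometheus_name(metric))
```

Note that the reverse direction is ambiguous: a name like system_network_interface_group_bond_id does not reveal which underscores were originally dots, so the dotted names should be taken from the metrics repository rather than derived from the dashboard queries.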
OK, when I list the metrics again, I can see all of them in the active list.
Then comes the problem: no matter what I do, those metrics do not show up in Prometheus. I've tried virtually every start/stop command found here for the OpenTelemetry components (agent, CLI, etc.) and nothing works. I tried reverting to the default config, running export --set again and reapplying the metrics, and still nothing. I also tried waiting a few hours to see if they would arrive.
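To pin down exactly which of the added metrics never arrive, a check along these lines could help. It is only a sketch: it uses the Prometheus HTTP API endpoint /api/v1/label/__name__/values to list every metric name Prometheus currently knows, then reports the expected names with no match. The URL, the function names and the prefix-matching rule (to tolerate possible unit or type suffixes) are assumptions, not part of the Skyline setup.

```python
import json
import urllib.request

# Hypothetical address; replace with the real Prometheus instance.
PROMETHEUS_URL = "http://prometheus.example.local:9090"


def fetch_metric_names(base_url: str, timeout: float = 10.0) -> set:
    """Return all metric names Prometheus currently has series for."""
    url = base_url.rstrip("/") + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError("Prometheus returned status %r" % payload.get("status"))
    return set(payload["data"])


def find_missing(expected, present):
    """Return the expected names that have no match in `present`.

    A name counts as present on an exact match or when a stored name starts
    with it followed by '_' (assumption: an exporter may append suffixes).
    """
    missing = []
    for name in expected:
        if not any(p == name or p.startswith(name + "_") for p in present):
            missing.append(name)
    return sorted(missing)


def report(base_url, expected):
    """Print each expected metric name that Prometheus does not know about."""
    for name in find_missing(expected, fetch_metric_names(base_url)):
        print("missing:", name)
```

Calling report(PROMETHEUS_URL, ["system_uptime", "heap_size"]) with the underscore form of the names from the text file would print only the ones that are absent, which should show whether the gap is limited to the newly added metrics or also affects some of the defaults.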
Your help is appreciated, thanks!