Kaspars_Zibarts
Employee

MDS backup too big and slow in R80.10

For those running the MDS management solution: what's your take on backup after R80.10? In our case the R77.30 backup was approximately 3 GB in size, and it took less than half an hour to restore the MDS and have it up and running. With R80.10 the backup has grown to 18 GB(!) within a year, and the actual process takes well over an hour, if not closer to two. As an engineer I might accept the argument that R80.10 brought in so many new features that the backup size had to increase, but from a business and disaster-recovery point of view it is a complete shambles.

Ironically, it makes even the support process painfully slow: I was asked to upload the MDS backup yesterday, and considering that the Check Point FTP servers are over 50 ms away from us, it will take a couple of hours to complete.
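
For a rough sense of where that time goes, here is a small back-of-the-envelope sketch (plain Python, nothing Check Point specific) that estimates upload time for a TCP transfer limited by the window size and round-trip time. The window sizes below are illustrative assumptions; only the ~50 ms RTT and the 18 GB backup size come from the post.

# Back-of-the-envelope estimate: how long does uploading a backup take when the
# transfer is limited by the TCP window and the round-trip time to the server?
# The window sizes are assumptions for illustration, not measured values.

def max_throughput_bytes_per_s(tcp_window_bytes: int, rtt_s: float) -> float:
    """Per-connection ceiling for a window-limited TCP transfer."""
    return tcp_window_bytes / rtt_s

def upload_time_hours(size_gb: float, throughput_bytes_per_s: float) -> float:
    """Transfer time in hours for a payload of size_gb gigabytes."""
    return size_gb * 1024**3 / throughput_bytes_per_s / 3600

if __name__ == "__main__":
    rtt = 0.050                          # ~50 ms to the remote FTP server
    for window_kb in (64, 256, 1024):    # assumed effective TCP window sizes
        tput = max_throughput_bytes_per_s(window_kb * 1024, rtt)
        hours = upload_time_hours(18, tput)   # the 18 GB backup mentioned above
        print(f"window {window_kb:>4} KB -> ~{tput / 2**20:.1f} MB/s, ~{hours:.1f} h for 18 GB")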

I have been raising SRs trying to point out the inefficiency of the MDS backup process for years: the same MDS TGZ is archived and compressed four times... Seriously. In order to restore a backup now (official MDS Gaia backup) we would need nearly 100 GB of free disk space. Not that it costs much money, but it makes the process so slow.
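
A quick pre-flight check along these lines could save a failed restore. The sketch below is a minimal example; the four-copies multiplier, the safety factor and the target path are assumptions based on the behaviour described above, not an official requirement.

# Minimal sketch: verify that a partition has enough free space before starting a
# restore, assuming the backup gets unpacked/copied several times in the process.
# The multiplier, safety factor and path are illustrative assumptions.

import shutil

def check_restore_space(path: str, backup_gb: float,
                        unpack_copies: int = 4, safety_factor: float = 1.2) -> bool:
    """Return True if `path` has room for `unpack_copies` expansions of the backup."""
    required = backup_gb * unpack_copies * safety_factor * 1024**3
    free = shutil.disk_usage(path).free
    print(f"need ~{required / 1024**3:.0f} GB, have {free / 1024**3:.0f} GB free on {path}")
    return free >= required

if __name__ == "__main__":
    # Hypothetical values: an 18 GB backup restored onto /var/log (adjust to your layout).
    check_restore_space("/var/log", backup_gb=18)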

I'm not expecting many votes, as probably not that many run MDS, but it would still be good to hear opinions on the matter.

41 Replies
Mike_A
Advisor

Great news, Kaspars Zibarts

Ran_Kopelman
Employee

Hi Kaspars,

 

My name is Ran; I'm a TL in Management R&D responsible for the Management server's repository, including the PostgreSQL database.

 

We have identified the major issues causing the DB growth. We created a fix to prevent it from happening and a tool to clean up the historic growth, which is not cleared by purge at the moment.

We are working to add them both to one of the next JHFs.

The good news is that you don't have to wait; you can get them both privately now. All you need to do is open a support ticket with the relevant details ('MDS backup is too big') and the current JHF you are using (latest is always recommended). You can mention my name as well so I can handle it faster.

 

Once we get your ticket we will create a private HF with the relevant content for you to deploy. We will also share the instructions for how to run the tool to clear up the historic growth.

 

Thanks,

Ran    

Sander_Zumbrink
Contributor

Hello Ran,

Good to hear that there could be a solution for it. I had a long case regarding the sync between multiple domain servers, caused by the big PostgreSQL database. There was a lot of (old) data in it regarding compliance. Does the fix also help with the sync between MDS servers if the PostgreSQL database is smaller?

Kind Regards,

Sander Zumbrink

Ran_Kopelman
Employee

Hi Sander,

 

I'm sorry, but I don't see the connection between sync issues and a big database. The only related impact I can think of is the time it takes for a Full Sync operation to complete.

Maybe in your case the database was so big that, as a result, the disk was completely full?

If you have an SR number I can check.

 

Thanks,

Ran

Sander_Zumbrink
Contributor

Hello Ran,

At that moment the issues were a full disk and the time it took.
But that issue has now been resolved.

The sync is still "slow", though, due to the large PostgreSQL database.
So I was curious whether the hotfix also speeds up the sync between MDS servers.

Sander

Ran_Kopelman
Employee

Sander,

What do you mean by 'slow sync'?

The time it takes for changes to appear on the second machine?

The time it takes for a Full Sync to complete?

Thanks,

Ran 

Brian_Deutmeyer
Collaborator

Sander-

What JHF version are you running? There were sync improvements, and JHF154 should help with that...

Ran_Kopelman
Employee

Hi Brian,

How are you?

I'm personally familiar with your environment, so we have already been working with Support since Monday to prepare this fix for you 🙂

Brian_Deutmeyer
Collaborator

Thanks, Ran! 

Kaspars_Zibarts
Employee

Interesting... my case has been open for months, and the last update, on 9 November, said part #1 is done 🙂

cosmos
Advisor

I know it's 3 years later and 3 versions on... I thought a 7 GB MDS backup was large; is this now the standard for an R80 MDS?

I don't recall my NG-AI MDS backups being more than a gig, for hundreds of customers/policies/revisions, yet I have a 4-CMA R80.40 MDS with about 80 policies (VSX, mind you) and can't get the backup any smaller than 7 GB. I'm sure I have seen odd things like binaries in backups too.
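
One way to see what is actually taking the space is to list the largest members of the backup archive. Here is a minimal sketch in Python, assuming the .tgz has been copied somewhere it can be inspected; the file name below is hypothetical.

# List the biggest files inside a backup archive to spot unexpected content
# (large logs, binaries, etc.). The archive path is a placeholder.

import tarfile

def largest_members(archive_path: str, top_n: int = 20) -> None:
    with tarfile.open(archive_path, "r:*") as tar:
        members = [(m.size, m.name) for m in tar if m.isfile()]
    for size, name in sorted(members, reverse=True)[:top_n]:
        print(f"{size / 2**20:10.1f} MB  {name}")

if __name__ == "__main__":
    largest_members("/tmp/example_mds_backup.tgz")  # hypothetical file name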

I agree with the UI improvements in R80, but I just don't think the whole layers-of-databases approach works for firewall management, given all the problems I've had with it since. In contrast, I manage another multi-tenanted environment with hundreds of devices and can export the entire configuration (including that of the devices) to a flat file of about 50 MB, most of which is certificates and images used in response pages. I can also restore those environments with confidence in minutes and don't burn hours on "take forever, get to 99% and fail" situations.

</rant> but I don't see this improving without a complete re-think and simplification of management (sorry, Dorit), including things like gateway interaction, SIC (especially in VSX), configuration management and unification (one UI for everything), instead of wrapping it all up in another database 🙂

Tomer_Noy
Employee

@cosmos, there is always more room to improve, and we welcome the feedback. I'd like to point out a few things we have already done in the Check Point Management product:

  1. Since the launch of R80.10 we indeed had a few edge cases that caused bloat in the DB and led to large backups. These were fixed in various JHFs and released for multiple R80.x versions.
  2. Some customers have large DBs because they accumulated many changes and did not purge revisions. For this we added auto-purge options and even made them on by default in later versions.
  3. It's important to pay attention to the flags of the command when taking backups. If you include logs, that can obviously take up a lot of space (see the size-comparison sketch after this list).
  4. In R81.10 we did a significant re-architecture of the way we store data. We leveraged new capabilities in the underlying DB (Postgres) and removed the need to store object indexing in Solr. There are many benefits to this approach (stability, performance), but a major benefit is a reduction in DB size, backup size and backup/restore duration. I recommend going for this version if these scenarios are important for you.
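
As a rough illustration of point 3, the sketch below compares the on-disk size of a log directory against a management data directory, to gauge how much including logs would add to a backup. Both paths are assumptions and should be adjusted to the actual deployment.

# Compare how much of a would-be backup comes from logs versus management data.
# Both directory paths below are assumptions, not authoritative locations.

import os

def dir_size_gb(path: str) -> float:
    """Total size of all regular files under `path`, in gigabytes."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # skip files that vanish or are unreadable
    return total / 1024**3

if __name__ == "__main__":
    for label, path in (("logs", "/var/log/opt"), ("management data", "/opt/CPmds")):
        print(f"{label:16s} ~{dir_size_gb(path):.1f} GB under {path}")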
