Eve_Z
Contributor

Tuning best practices for crawlers

Hello Community,

I am aware that CloudGuard WAF licenses are counted by the number of HTTP/HTTPS requests. On some of our websites, most of the prevented traffic comes from Googlebot IP addresses. These bots are crawlers that index websites for search engines, but CloudGuard WAF classifies their traffic as malicious, prevents it, and counts it against the license.

What tuning best practices do you recommend to reduce the number of requests counted against our license?

Regards.

1 Solution

Accepted Solutions
Danny
Champion

WAF security-practices

Try these crawler tuning tips:

  • Whitelist known crawlers
    • Create custom rules that allow traffic from verified crawler addresses (example list)

  • Tune the bot classification features in anti-bot protection to better distinguish between good and malicious bots
  • Adjust security practices
    • Activate web application protection and web API protection in detect/learn mode before switching to prevent mode. This lets the system learn traffic patterns and reduces false positives
  • Reduce inspection scope
    • Exclude static resources (images, .css, .js) from inspection if they are frequently accessed by crawlers, by configuring URL patterns or content types to bypass WAF inspection
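
Before changing any rules, it can help to estimate from your access logs how many requests these tips would actually affect. The following is only a rough offline sketch in Python, not CloudGuard WAF configuration or API. The names `CRAWLER_RANGES`, `STATIC_EXTENSIONS`, `is_allowlisted_crawler` and `is_static_resource` are made up for illustration, and the ranges shown are documentation placeholders, so replace them with the ranges the crawler operator actually publishes.

```python
import ipaddress
from urllib.parse import urlsplit

# Placeholder ranges (reserved documentation prefixes), NOT real crawler ranges.
# Assumption: you replace these with the ranges published by the crawler operator.
CRAWLER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]

# Assumed set of static-file extensions; adjust to match the site.
STATIC_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", ".css", ".js")

_NETWORKS = [ipaddress.ip_network(cidr) for cidr in CRAWLER_RANGES]


def is_allowlisted_crawler(ip: str) -> bool:
    """Return True if the address falls inside one of the configured ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed addresses in a log line are treated as not allowlisted.
        return False
    return any(addr in net for net in _NETWORKS)


def is_static_resource(url: str) -> bool:
    """Return True if the URL path ends in one of the static extensions."""
    # urlsplit drops the query string, so "/site.css?v=3" still matches ".css".
    path = urlsplit(url).path.lower()
    return path.endswith(STATIC_EXTENSIONS)


if __name__ == "__main__":
    # Hypothetical (source IP, requested URL) pairs standing in for log entries.
    sample = [
        ("192.0.2.10", "/products/42"),
        ("198.51.100.7", "/assets/site.css?v=3"),
        ("2001:db8::1", "/img/logo.png"),
    ]
    for ip, url in sample:
        print(ip, url, is_allowlisted_crawler(ip), is_static_resource(url))
```

Running this over a day of logs gives a rough count of requests from allowlisted addresses and requests for static files, which indicates whether the allow rule or the inspection bypass is likely to make the bigger difference. Matching on IP ranges alone trusts the published list, so keep it up to date.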


