<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Tuning best practices for crawlers in WAF</title>
    <link>https://community.checkpoint.com/t5/WAF/Tuning-best-practices-for-crawlers/m-p/253699#M329</link>
    <description>&lt;P&gt;Hello Community,&lt;/P&gt;&lt;P&gt;I am aware that CloudGuard WAF licenses are counted per number of HTTP/HTTPS requests. On some of our websites, most of the prevented traffic comes from Googlebot IP addresses. These bots are crawlers that index the websites for search engines, but CloudGuard WAF classifies the traffic as malicious, prevents it, and counts it against the license.&lt;/P&gt;&lt;P&gt;What tuning best practices do you recommend to reduce the number of requests counted against our license?&lt;/P&gt;&lt;P&gt;Regards.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Jul 2025 19:27:41 GMT</pubDate>
    <dc:creator>Eve_Z</dc:creator>
    <dc:date>2025-07-22T19:27:41Z</dc:date>
    <item>
      <title>Tuning best practices for crawlers</title>
      <link>https://community.checkpoint.com/t5/WAF/Tuning-best-practices-for-crawlers/m-p/253699#M329</link>
      <description>&lt;P&gt;Hello Community,&lt;/P&gt;&lt;P&gt;I am aware that CloudGuard WAF licenses are counted per number of HTTP/HTTPS requests. On some of our websites, most of the prevented traffic comes from Googlebot IP addresses. These bots are crawlers that index the websites for search engines, but CloudGuard WAF classifies the traffic as malicious, prevents it, and counts it against the license.&lt;/P&gt;&lt;P&gt;What tuning best practices do you recommend to reduce the number of requests counted against our license?&lt;/P&gt;&lt;P&gt;Regards.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 19:27:41 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/WAF/Tuning-best-practices-for-crawlers/m-p/253699#M329</guid>
      <dc:creator>Eve_Z</dc:creator>
      <dc:date>2025-07-22T19:27:41Z</dc:date>
    </item>
    <item>
      <title>Re: Tuning best practices for crawlers</title>
      <link>https://community.checkpoint.com/t5/WAF/Tuning-best-practices-for-crawlers/m-p/254560#M330</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://waf-doc.inext.checkpoint.com/concepts/security-practices" target="_self"&gt;WAF security-practices&lt;/A&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;try these crawler tuning tips:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;U&gt;whitelist known crawlers&lt;/U&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P&gt;create custom rules to allow traffic from verified crawler addresses (&lt;A href="https://udger.com/resources/ua-list/crawlers" target="_self"&gt;example list&lt;/A&gt;)&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;&lt;U&gt;tune bot classification features&lt;/U&gt; in anti-bot protection to better &lt;SPAN&gt;distinguish between good and malicious bots&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;U&gt;adjust security practices&lt;/U&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;SPAN&gt;activate &lt;EM&gt;web application protection&lt;/EM&gt; and &lt;EM&gt;web API protection&lt;/EM&gt; in detect/learn mode before switching to prevent mode; this allows the system to learn traffic patterns and reduce false positives&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;&lt;U&gt;reduce inspection scope&lt;/U&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;SPAN&gt;exclude static resources (images, CSS, JavaScript) from inspection if crawlers access them frequently, by configuring URL patterns or content types to bypass WAF inspection&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 05 Aug 2025 14:31:36 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/WAF/Tuning-best-practices-for-crawlers/m-p/254560#M330</guid>
      <dc:creator>Danny</dc:creator>
      <dc:date>2025-08-05T14:31:36Z</dc:date>
    </item>
  </channel>
</rss>

