URL Categorization

jberg712 · ‎2025-07-02

I wanted to ask whether anyone else has had to fight with this. We have established a policy to block all AI web tools except Copilot, since we are in the cloud now. I built an inline layer: when traffic matches the AI category, it goes through the inline rules, which allow Copilot and block everything else. As per best practice, since this rule might not be hit as often, it was placed lower than some others. What we found is that some AI tools, such as Google Gemini, were still accessible. When I inspected this, Google Gemini was allowed because of 2 rules that allowed "Google Services" and "Google Ads". I ended up moving the AI rule above these rules so it would catch the traffic. I don't understand why 'gemini.google.com' would still be categorized under, or match, the objects Google Services or Google Ads. I can understand the possibility of Google Services, but Google Ads and the Web Advertisements category? Really? That's strange to me, and I don't follow the logic by which Gemini would fall under that one.

Is there anything that can be done to limit certain sites to only 1 category or object? Or, if this has come up before, what have others done to keep sites/services that match other categories from hitting other rules and being allowed or blocked unintentionally?

Chris_Atkinson · ‎2025-07-02

Hello,

Could you please add some additional details for context:

- HTTPS Inspection?

- QUIC traffic blocked or inspected?

- Gateway version/JHF?

- Categorization mode hold or background?

CCSM R77/R80/ELITE

jberg712 · ‎2025-07-02

- HTTPS Inspection? YES

- QUIC traffic blocked or inspected? BLOCKED and DISABLED on end user workstations by Group Policy

- Gateway version/JHF? R81.20 JHF Take 92

- Categorization mode hold or background? HOLD

Lloyd_Braun · ‎2025-07-02

What does the HTTPS inspection policy look like? Are you applying it via category set? I have seen some quirky behavior with the "categorize https sites" best-effort functionality, where something may fall into a different category set based on limited visibility of the certificate and SNI.

the_rock · ‎2025-07-02

Good point there, Lloyd, though considering HTTPS inspection is enabled, that should cover it.

Andy

Best,
Andy

jberg712 · ‎2025-07-02

In our HTTPS Inspection policy, we have bypass rules that use custom Application/Site groups with specific URLs, and those go in the Category/Custom Application column. We also have bypass rules that bypass Any category but only to a specific destination. Only 1 rule has 2 categories in it: a Microsoft Office category and a Web conferencing category. All of these are bypass rules. We have just 1 outbound Inspect rule, while the rest are inbound Inspect. And we do have SNI enabled, so the domains in the SAN field are checked and hopefully categorized properly.

the_rock · ‎2025-07-02

I had another look at the policy you have, and to me it makes 100% logical sense. I do a very similar thing in my lab, though I'm not sure whether you have multiple ordered layers like I do. Regardless, I totally see the points you made about bypass and having to move the rules around; it can get very confusing.

Andy

Chris_Atkinson · ‎2025-07-02

Is it just Google/Gemini that is problematic?

Note also for awareness the following JHF fixes:

T101:

PRJ-59616, PRHF-38387 (Application Control): Some custom applications in the HTTPS Inspection policy are not matched if they are part of a Group object. Refer to sk183176.

T99:

PRJ-55652, PRHF-34020 (Application Control): In some scenarios, a custom application does not match a URL Filtering rule.

the_rock · ‎2025-07-02

Good idea, Chris. Considering 105 is the recommended take, I would probably go with that one.

Andy

the_rock · ‎2025-07-02

Do you mind sending screenshots of how you have it configured in the rulebase? Please blur out any sensitive data.

Andy

jberg712 · ‎2025-07-02

Is there any way I can limit what I post? I have a large rulebase, and it would take quite a few screenshots to show what is needed. Basically, in the 2 I have, the AI rule was below the Group Web policy rules. Some of these Group Web policies allow Web Advertisements (for example, Marketing) and Google Services. Now I'm finding out that all of GitHub is being blocked, because all of it is matching the AI rule. I know it has an AI piece, but why all of GitHub is being listed as AI, I don't know.

the_rock · ‎2025-07-02

Just that specific rule.

Best,
Andy

jberg712 · ‎2025-07-02

I edited my post. But this really seems to be more about how things are being categorized or which application they fall under. It's like it's miscategorizing them, OR, since some things fall under multiple categories, maybe it's matching the secondary categories instead of the primary?

the_rock · ‎2025-07-02

Sadly, I can't tell from that screenshot, but here is what I would do: create an INLINE layer like the one below. I attached it; hope it makes sense.

Andy

the_rock · ‎2025-07-02

@jberg712

Also, super important... MAKE SURE to enable the URLF blade in the inline layer if you are going to try that, since by default only the FW blade is on.

Andy

jberg712 · ‎2025-07-02

The rulebase with the rules in the screenshot has Application and URLF enabled, and those are the only blades selected. But by the logic you have there, Andy, wouldn't all traffic hit the inline rule? If you look at mine, the inline rule isn't applied if the request isn't for something in the AI category.

jberg712 · ‎2025-07-02

In the services column...

the_rock · ‎2025-07-02

Hang on, sorry, I see the screenshot okay now. You have it right, as far as I'm concerned.

Questions I have:

1) Are CNB net users NOT able to access the copilot sites?

2) Any hits on rule 8.1?

3) If access for those users is failing, what do logs show?

If nothing helps, I would simply create a custom category, add a *copilot* entry to it, add that in Services, and see if that helps.

Andy
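
Before relying on a broad *copilot* entry, it may help to see how wide a net such a wildcard casts. The sketch below uses Python's shell-style `fnmatch` purely as an illustration; it is an assumption that a custom category entry behaves similarly, since the gateway's own matching semantics for custom applications may differ. The `matches` helper and the sample URLs (including `copilot.example.net`) are hypothetical.

```python
from fnmatch import fnmatch

def matches(url, pattern="*copilot*"):
    """Case-insensitive shell-style wildcard match (illustrative only)."""
    return fnmatch(url.lower(), pattern.lower())

# Hypothetical sample URLs, not taken from any log in this thread
samples = [
    "copilot.microsoft.com",
    "github.com/features/copilot",
    "copilot.example.net/login",
    "gemini.google.com",
]

for url in samples:
    print(url, matches(url))
```

Under these assumptions, anything with "copilot" anywhere in the URL would match, including GitHub's Copilot pages and unrelated lookalike domains, so a narrower entry may be safer.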

the_rock · ‎2025-07-02

Well, that's the logic I apply to this. You have that inline layer ONLY for what you are trying to do. So essentially you allow Copilot and block anything else at the bottom. BUT the key is that the parent rule of the layer has to be hit first.

Andy
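
The parent-rule point can be sketched as a toy model in Python. This is not the gateway's matching engine: the dictionary layout, the `evaluate` function, the rule names, and the "Microsoft Copilot" label are all hypothetical (only the 8.1 numbering is borrowed from the question above). It assumes that once a parent rule matches, the decision is made inside the inline layer and evaluation does not return to the parent layer.

```python
def evaluate(rules, labels):
    """Walk rules top-down; enter an inline layer only if its parent rule matches."""
    for rule in rules:
        if not (rule["match"] & labels or "Any" in rule["match"]):
            continue  # this rule does not apply, try the next one
        if "inline" in rule:
            # Parent matched: the verdict now comes from the inline layer
            return evaluate(rule["inline"], labels)
        return rule["name"], rule["action"]
    return "implicit cleanup", "Drop"

# Hypothetical inline layer: allow Copilot, drop all other AI traffic
ai_layer = [
    {"name": "8.1 Allow Copilot", "match": {"Microsoft Copilot"}, "action": "Accept"},
    {"name": "8.2 Block other AI", "match": {"Any"}, "action": "Drop"},
]

rulebase = [
    {"name": "8 AI parent", "match": {"Artificial Intelligence (AI)"}, "inline": ai_layer},
    {"name": "9 General web", "match": {"Any"}, "action": "Accept"},
]

print(evaluate(rulebase, {"Artificial Intelligence (AI)", "Microsoft Copilot"}))
print(evaluate(rulebase, {"Artificial Intelligence (AI)"}))
print(evaluate(rulebase, {"Web Advertisements"}))
```

In this model, traffic that does not carry the AI label never enters the inline layer at all, which is why the parent rule's match column and its position in the rulebase both matter.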

jberg712 · ‎2025-07-02

Correct. I understand that concept. The issue is that when the rule was below the Group User roles, Google Gemini was still being allowed, because the traffic log indicated that Google Gemini fell under Google Services and Google Web Advertisements: a different object and a different category. So users hit the rule that allowed Google Services and Google Web Advertisements (which has to be allowed for certain things to work), and they could get to Google Gemini.

For example,

Rule 15 --> Marketing --> Access granted to Google Web Advertisements
Rule 19 --> Inline Rule for Blocking AI except for CoPilot

Marketing users try to use Google Gemini. Google Gemini being AI is accepted on Rule 15 because of Google Web Advertisements being allowed. (miscategorization or secondary categorization).

Google Gemini is strictly AI and shouldn't fall under this.

Now I've moved my rule above the Marketing rule which now blocks Google Gemini.

The part I'm battling with is that Google Gemini SHOULD NOT BE categorized under Google Web Advertisements, correct? But if this is somehow correct (and I don't understand why it would be), what can be done differently without moving the AI rule? I mean, if Gemini has a primary category of AI and a secondary category of Web Advertisements, can the rulebase somehow be set to match only the primary category?

And that's assuming the reason Gemini matches the Marketing rule in the first place is that they are allowed the Web Advertisements category. They don't have the AI category in their rule. I just don't want Gemini to match under the category Web Advertisements or Google Services.

I hope this is clear enough.
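
The Rule 15 / Rule 19 example can be sketched as a toy first-match model in Python. It is not the gateway's actual matching engine: the `Rule` class, the rule contents, and the idea that a connection carries a flat set of labels (categories plus application objects) are assumptions made only to mirror the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    number: int
    name: str
    labels: frozenset  # categories/applications the rule matches on
    action: str        # "Accept" or "Drop"

def first_match(rules, connection_labels):
    """Return the first rule (top-down) sharing at least one label with the connection."""
    for rule in rules:
        if rule.labels & connection_labels:
            return rule
    return None

# Assumed labels for a gemini.google.com connection, loosely based on the log description
gemini = frozenset({"Artificial Intelligence (AI)", "Google Services", "Web Advertisements"})

marketing = Rule(15, "Marketing", frozenset({"Web Advertisements", "Google Services"}), "Accept")
block_ai = Rule(19, "Block AI except Copilot", frozenset({"Artificial Intelligence (AI)"}), "Drop")

original_order = [marketing, block_ai]  # AI rule below Marketing
reordered = [block_ai, marketing]       # AI rule moved above Marketing

print(first_match(original_order, gemini).name)  # Marketing
print(first_match(reordered, gemini).name)       # Block AI except Copilot
```

Under this model there is no notion of a primary category: any label shared with an earlier rule is enough for that rule to win, which is consistent with the behavior seen before the AI rule was moved up.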

the_rock · ‎2025-07-02

Well, I get what you are saying. Sadly, that's not something customers can change; it comes from the company in question, in this case Google. Technically, I doubt even CP can change that. See, for example, the output from my lab gateway for Google Services.

Andy

[Expert@R82:0]# dynamic_objects -uo "Google Services"

object name: Google Services
range 0 : 8.8.4.0 8.8.4.255
range 1 : 8.8.8.0 8.8.8.255
range 2 : 8.34.208.0 8.34.223.255
range 3 : 8.35.192.0 8.35.207.255
range 4 : 23.236.48.0 23.236.63.255
range 5 : 23.251.128.0 23.251.159.255
range 6 : 34.0.0.0 34.3.1.255
range 7 : 34.3.3.0 34.3.4.255
range 8 : 34.3.8.0 34.3.127.255
range 9 : 34.4.0.0 34.191.255.255
range 10 : 35.184.0.0 35.199.191.255
range 11 : 35.200.0.0 35.247.255.255
range 12 : 57.140.192.0 57.140.255.255
range 13 : 64.15.112.0 64.15.127.255
range 14 : 64.233.160.0 64.233.191.255
range 15 : 66.22.228.0 66.22.229.255
range 16 : 66.102.0.0 66.102.15.255
range 17 : 66.249.64.0 66.249.95.255
range 18 : 70.32.128.0 70.32.159.255
range 19 : 72.14.192.0 72.14.255.255
range 20 : 74.114.24.0 74.114.31.255
range 21 : 74.125.0.0 74.125.255.255
range 22 : 104.154.0.0 104.155.255.255
range 23 : 104.196.0.0 104.199.255.255
range 24 : 104.237.160.0 104.237.191.255
range 25 : 107.167.160.0 107.167.191.255
range 26 : 107.178.192.0 107.178.255.255
range 27 : 108.59.80.0 108.59.95.255
range 28 : 108.170.192.0 108.170.255.255
range 29 : 108.177.0.0 108.177.127.255
range 30 : 130.211.0.0 130.211.255.255
range 31 : 136.22.160.0 136.22.186.255
range 32 : 136.124.0.0 136.125.255.255
range 33 : 142.250.0.0 142.251.255.255
range 34 : 146.148.0.0 146.148.127.255
range 35 : 152.65.208.0 152.65.211.255
range 36 : 152.65.214.0 152.65.215.255
range 37 : 152.65.218.0 152.65.219.255
range 38 : 152.65.222.0 152.65.255.255
range 39 : 162.120.128.0 162.120.255.255
range 40 : 162.216.148.0 162.216.151.255
range 41 : 162.222.176.0 162.222.183.255
range 42 : 172.110.32.0 172.110.39.255
range 43 : 172.217.0.0 172.217.255.255
range 44 : 172.253.0.0 172.253.255.255
range 45 : 173.194.0.0 173.194.255.255
range 46 : 173.255.112.0 173.255.127.255
range 47 : 192.104.160.0 192.104.161.255
range 48 : 192.158.28.0 192.158.31.255
range 49 : 192.178.0.0 192.179.255.255
range 50 : 193.186.4.0 193.186.4.255
range 51 : 199.36.154.0 199.36.156.255
range 52 : 199.192.112.0 199.192.115.255
range 53 : 199.223.232.0 199.223.239.255
range 54 : 207.223.160.0 207.223.175.255
range 55 : 208.65.152.0 208.65.155.255
range 56 : 208.68.108.0 208.68.111.255
range 57 : 208.81.188.0 208.81.191.255
range 58 : 208.117.224.0 208.117.255.255
range 59 : 209.85.128.0 209.85.255.255
range 60 : 216.58.192.0 216.58.223.255
range 61 : 216.73.80.0 216.73.95.255
range 62 : 216.239.32.0 216.239.63.255
range 63 : 216.252.220.0 216.252.223.255
range 64 : 2001:4860:: 2001:4860:ffff:ffff:ffff:ffff:ffff:ffff
range 65 : 2404:6800:: 2404:6800:ffff:ffff:ffff:ffff:ffff:ffff
range 66 : 2404:f340:: 2404:f340:ffff:ffff:ffff:ffff:ffff:ffff
range 67 : 2600:1900:: 2600:190f:ffff:ffff:ffff:ffff:ffff:ffff
range 68 : 2605:ef80:: 2605:ef80:ffff:ffff:ffff:ffff:ffff:ffff
range 69 : 2606:40:: 2606:40:ffff:ffff:ffff:ffff:ffff:ffff
range 70 : 2606:73c0:: 2606:73c0:ffff:ffff:ffff:ffff:ffff:ffff
range 71 : 2607:1c0:241:40:: 2607:1c0:241:4f:ffff:ffff:ffff:ffff
range 72 : 2607:1c0:300:: 2607:1c0:3ff:ffff:ffff:ffff:ffff:ffff
range 73 : 2607:f8b0:: 2607:f8b0:ffff:ffff:ffff:ffff:ffff:ffff
range 74 : 2620:11a:a000:: 2620:11a:a0ff:ffff:ffff:ffff:ffff:ffff
range 75 : 2620:120:e000:: 2620:120:e0ff:ffff:ffff:ffff:ffff:ffff
range 76 : 2800:3f0:: 2800:3f0:ffff:ffff:ffff:ffff:ffff:ffff
range 77 : 2a00:1450:: 2a00:1450:ffff:ffff:ffff:ffff:ffff:ffff
range 78 : 2c0f:fb50:: 2c0f:fb50:ffff:ffff:ffff:ffff:ffff:ffff

Looking for domains for 'Google Services' and its children objects:

The updatable object Google Services does not contain any domains (object is enforced by IP addresses only)

Operation completed successfully
[Expert@R82:0]#
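
Since the output says the object is enforced by IP addresses only, one way to reason about a surprising match is to check whether a destination address falls inside the listed ranges. The Python sketch below copies a handful of ranges from the output above; the `in_ranges` helper and the test addresses are hypothetical and are not resolved addresses for gemini.google.com.

```python
import ipaddress

# A few (start, end) ranges copied from the dynamic_objects output above
GOOGLE_SERVICES_RANGES = [
    ("142.250.0.0", "142.251.255.255"),
    ("172.217.0.0", "172.217.255.255"),
    ("216.58.192.0", "216.58.223.255"),
    ("2607:f8b0::", "2607:f8b0:ffff:ffff:ffff:ffff:ffff:ffff"),
]

def in_ranges(ip, ranges):
    """Return True if ip lies inside any inclusive (start, end) range."""
    addr = ipaddress.ip_address(ip)
    for start, end in ranges:
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        # Only compare addresses of the same IP version
        if addr.version == lo.version and lo <= addr <= hi:
            return True
    return False

print(in_ranges("142.250.1.1", GOOGLE_SERVICES_RANGES))   # True
print(in_ranges("203.0.113.10", GOOGLE_SERVICES_RANGES))  # False
```

If a site's resolved address lands in one of these ranges, an IP-only object like this could plausibly match it no matter what URL category the site carries.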

the_rock · ‎2025-07-02

Mind you, Google Gemini falls under the AI category.

Andy

URL Categorization

For: https://gemini.google.com/app

Current Categories: Artificial Intelligence (AI), Low Risk

Artificial Intelligence (AI)

This category includes URLs that provide Chatbots and virtual assistance using natural language processing, Machine Learning tools and Generative AI models for creating new content including images, music, deepfake, text and more. Artificial Intelligence simulates human intelligence processes by machines, especially computer systems. Examples : chat.openai.com, bard.google.com

Low Risk

Applications and Websites that are potentially non business related yet low risk.

jberg712 · ‎2025-07-02

This is what my log says and under matched rules you see that it takes a different application and rule even though the details tab shows the categorization.

jberg712 · ‎2025-07-02

This is where the "logic" of the matching categories and applications eludes me. The details state the category as AI and Medium Risk, but the rule it matches is for the Google Services application and the category Computers/Internet. Unless they overlap somehow or are nested somehow, it shouldn't be matching these rules.

the_rock · ‎2025-07-02

That's my thinking as well. It sounds like it's overlapping in some way; otherwise, it would not make logical sense.

Andy

the_rock · ‎2025-07-02

I get what you mean; that's puzzling to me as well. Might be worth a TAC case.

Andy

jberg712 · ‎2025-07-02

Do the developers not watch these threads anymore? At one point, for a weird issue I posted about, a dev or high-level engineer was responding.

the_rock · ‎2025-07-02

Let's see if someone from CP can confirm.

Andy

jberg712 · ‎2025-07-02

That would be great, Andy. Thank you. It's gotten to the point that I have to rearrange the entire rulebase, moving rules around just to get the behavior I want and expect. And when I move things around, the overlapping starts blocking some things for some users, but then they may hit another rule because of a different category for the same resource. It's really getting convoluted and confusing.

the_rock · ‎2025-07-02

I totally agree, it can get very cumbersome, for lack of a better term. I wish it were easier. I was hoping R82 would solve some of those issues, but it does not seem like it. Based on my lab experience, it's exactly the same. Yes, there are a lot of new SSL inspection features in SmartConsole that make life easier, but as far as the example you gave, I don't see any difference at all.

Andy
