How to Batch Categorize URLs
Author
@Sebastien_Rho
You can look up the category of a website using Check Point’s URL categorization website (https://www.checkpoint.com/urlcat/main.htm).
Because the site only lets you query one site at a time, checking a long list by hand is slow. The steps below automate the lookups with curl_cli and a short bash script. Run all commands from the same folder, since the cookie file and the site list are referenced by relative paths.
1. Create a cookie for your session on the website
Log in to the site with curl_cli and store your session cookie in a file named cookie in the current folder (replace email@domain.com and “password” with your UserCenter credentials):
curl_cli -k -v --cookie-jar ./cookie -X POST -d "customer=&g-recaptcha-response=&needCaptcha=false&userName=email@domain.com&password=password&userOr..." https://www.checkpoint.com/urlcat/login.htm
N.B.: If your password contains special characters that bash might misinterpret, you may have to escape them with “\”.
E.g. Pas$word should be entered as Pas\$word
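As a minimal sketch of the quoting behaviour described in the note (the variable names are illustrative and not part of the original procedure): inside double quotes bash expands "$", so it has to be escaped, whereas inside single quotes it is taken literally.

```shell
#!/bin/bash
# Illustrative only: the same example password written with two quoting styles.
escaped="Pas\$word"   # double quotes: the $ must be escaped with a backslash
literal='Pas$word'    # single quotes: bash performs no expansion at all

# Both variables now hold the identical string.
printf '%s\n' "$escaped" "$literal"
```

The login command above wraps its POST data in double quotes, which is why the backslash form is the one that applies there.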
2. Create the list of the websites you want to query
[Expert@yourSMS]# vi sites.txt
www.yahoo.com
www.cnn.com
www.gmail.com
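If the list was pasted from a spreadsheet or a Windows editor, it may contain carriage returns, blank lines, or duplicates that would waste queries. The following is a rough sketch of one way to tidy it first; the file name sites_raw.txt is a hypothetical name for the unclean list, and the sample content is made up for illustration.

```shell
#!/bin/bash
# Hypothetical raw list: one Windows line ending, one blank line, one duplicate.
printf 'www.yahoo.com\r\n\nwww.cnn.com\nwww.yahoo.com\nwww.gmail.com\n' > sites_raw.txt

# Strip carriage returns, skip empty lines, and keep only the first
# occurrence of each site (original order is preserved).
tr -d '\r' < sites_raw.txt | awk 'NF && !seen[$0]++' > sites.txt

cat sites.txt
```

Note that this overwrites any existing sites.txt in the current folder.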
Then create a bash script named categorize.sh with the following content:
[Expert@yourSMS]# vi categorize.sh
#!/bin/bash
while IFS= read -r p; do
result=$(curl_cli -k -v --cookie ./cookie -X POST -d "action=post&actionType=submitURL&urlCategorization=$p" https://www.checkpoint.com/urlcat/main.htm 2>/dev/null | grep -A4 "Categories:" | tr -d '\n' | grep -oP '(?<=).*?(?=
)' | sed 's/^[ \t]*//')
echo "$p,$result"
sleep 1
done <sites.txt
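To make the text-extraction part of the pipeline easier to follow, here is a sketch that runs the same stages against a small HTML fragment. The fragment and the two tags used in the lookbehind and lookahead are assumptions made purely for illustration; the real results page may be structured differently, so inspect its HTML before relying on any particular pattern. It also assumes a grep that supports -P (GNU grep).

```shell
#!/bin/bash
# HYPOTHETICAL markup, invented for this example; not taken from the real page.
sample_html='<p><b>Categories:</b>
   Computers / Internet
<br>
</p>'

# Same stages as the script: keep the lines following "Categories:", join them
# into one line, cut out the text between two (assumed) tags, then trim
# leading whitespace.
category=$(printf '%s\n' "$sample_html" \
  | grep -A4 "Categories:" \
  | tr -d '\n' \
  | grep -oP '(?<=</b>).*?(?=<br>)' \
  | sed 's/^[ \t]*//')

echo "$category"
```

Joining the lines with tr before the second grep matters because grep matches one line at a time, and the category text may sit on a different line from the label.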
Make the script executable (chmod +x categorize.sh). When you run it, it prints each site followed by the categories that site is associated with:
[Expert@yourSMS]# ./categorize.sh
01com.com,Computers / Internet
020jbxsgqwpse.changeip.org,Computers / Internet
022btrarqcfuk.changeip.org,Computers / Internet
026kordzsydup.changeip.org,Computers / Internet
1001-love.com,Sex
You can also redirect the script's output to a file so that you can save it or send it to someone:
[Expert@yourSMS]# ./categorize.sh >sites.csv
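Once the results are in a CSV file, standard text tools can summarize them. The sketch below counts how many sites fall under each category string; it uses a hypothetical file name, sample_sites.csv, filled with a few rows copied from the example output so that it does not touch your real sites.csv.

```shell
#!/bin/bash
# Sample rows copied from the example output shown earlier.
cat > sample_sites.csv <<'EOF'
01com.com,Computers / Internet
020jbxsgqwpse.changeip.org,Computers / Internet
1001-love.com,Sex
EOF

# Drop the first field (the site name), then count how often each
# remaining category string occurs, most frequent first.
summary=$(cut -d, -f2- sample_sites.csv | sort | uniq -c | sort -rn)
echo "$summary"
```

Using -f2- rather than -f2 keeps the whole remainder of the line, in case a category string itself contains a comma.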