Mark,
To your questions:
1. Can a VSX gateway (physical appliance) be clustered for HA (high availability)?
YES
2. Can a VSX gateway (physical appliance) be clustered for LS (load sharing)?
YES
3. Is it possible and does it even make sense to cluster virtual systems (virtual firewall on the vsx gateway) for HA on a VSX gateway or is this not necessary as both rely on the same hardware and would fail if the underlying hardware fails anyways?
You can't cluster a virtual system on a single piece of hardware, but your virtual system is highly available if you run a VSX-cluster.
4. Is it possible and does it even make sense to cluster virtual systems (virtual firewall on the vsx gateway) for LS on a VSX gateway or not?
No, you can't run any cluster for a virtual system on a single VSX gateway.
5. Is it possible to cluster virtual systems over two or more VSX gateways (physical appliances)?
Yes, but you don't create a cluster of virtual systems. You must build a cluster with your VSX hardware.
Please have a look at the VSX documentation to better understand the VSX concept and VSX-cluster:
https://sc1.checkpoint.com/documents/R80.30/WebAdminGuides/EN/CP_R80.30_VSX_AdminGuide/html_frameset...
There is a good section on clustering.
To get high availability with VSX you have to use more than one hardware appliance to build a VSX-cluster. This VSX-cluster can run in HA or LS mode.
In HA mode all virtual systems are active on one node and fail over to the other node in case of a problem.
Load sharing mode lets you run virtual systems active on all of your VSX-cluster nodes, for example virtual system A running on node A and virtual system B running on node B. If node A fails, virtual system A fails over to node B and both virtual systems are then active on node B.
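If it helps to picture the difference between the two modes, here is a small toy model in Python. It is only an illustration, not Check Point code or configuration: the function names place and fail_node are made up, and the round-robin distribution in LS mode is an assumption chosen to reproduce the "virtual system A on node A, virtual system B on node B" example.

```python
# Toy model only: not Check Point code, commands, or configuration.

def place(virtual_systems, nodes, mode):
    """Return {virtual_system: active_node} for a healthy cluster."""
    if mode == "HA":
        # HA: every virtual system is active on the same (first) node.
        return {vs: nodes[0] for vs in virtual_systems}
    if mode == "LS":
        # LS: spread virtual systems over all nodes.
        # Round-robin is an assumption made for this sketch.
        return {vs: nodes[i % len(nodes)] for i, vs in enumerate(virtual_systems)}
    raise ValueError(f"unknown mode: {mode}")


def fail_node(placement, failed, nodes):
    """Move every virtual system active on the failed node to a surviving node."""
    survivors = [n for n in nodes if n != failed]
    if not survivors:
        raise RuntimeError("no surviving node left in the cluster")
    return {vs: (survivors[0] if node == failed else node)
            for vs, node in placement.items()}


nodes = ["node A", "node B"]
ls = place(["VS A", "VS B"], nodes, "LS")
print(ls)                               # VS A on node A, VS B on node B
print(fail_node(ls, "node A", nodes))   # both virtual systems on node B
```

In both modes the failover happens at the level of the physical VSX-cluster nodes. The virtual systems only move between them.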
There are some limitations with VSX-cluster in LS mode (for example, you can't use a virtual router), so you have to check them against your requirements.
Wolfgang