_Val_
Admin

Lakera bulletin - This week in AI - Dec 1, 2025

It’s been a busy week in AI security, with new vulnerabilities emerging in agentic tooling, fresh research showing how stylistic prompts can bypass safety filters, and ongoing discussions about supply-chain risks and open-source guardrails. Add in a major enterprise model update and a couple of upcoming events (including one from the Lakera team), and there’s plenty to cover.

Let’s jump right into it.

Google Antigravity: Indirect Prompt Injection Steals Developer Secrets

A malicious prompt hidden inside a reference guide can trick Antigravity into running terminal commands, reading .env files, and exfiltrating credentials through its browser subagent. Default settings make the attack easy to miss, highlighting real risks in agentic IDE workflows.
🔗 Read the Antigravity exploit write-up

Adversarial Poetry: A Universal Single-Turn Jailbreak

A new paper finds that recasting harmful prompts as verse dramatically improves jailbreak success across 25 frontier models, making attacks up to 18× more effective than their prose equivalents. It exposes a systemic gap in current alignment methods: a stylistic shift alone can dismantle safety filters.
🔗 Read the adversarial poetry paper

Whisper Leak: New Research Shows Metadata Side-Channel Risks in LLM Traffic

Published earlier this month, the Whisper Leak paper demonstrates how attackers can infer sensitive prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns. Tested across 28 major models, the attack achieves near-perfect classification, even identifying topics like “money laundering” with 100% precision, showing how metadata alone can compromise privacy under network surveillance.
🔗 Read the Whisper Leak paper
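To see why metadata alone can leak topics, here is a minimal, self-contained sketch of the general idea (not the paper's actual pipeline): streaming LLM responses emit a sequence of packets whose sizes correlate with the content, so even over encryption a simple classifier on size profiles can separate topics. All traffic here is synthetic, and the nearest-centroid classifier is an illustrative stand-in for the paper's real models.

```python
# Toy illustration of a traffic-metadata side channel, assuming only
# packet sizes are observable (contents stay encrypted). Synthetic data.
import random

random.seed(0)

def synth_trace(mean_size, n=50):
    # Streaming responses send one packet per chunk; sizes vary with the
    # generated text, which is what leaks topic information.
    return [max(1, int(random.gauss(mean_size, 5))) for _ in range(n)]

# Pretend two topics produce systematically different size profiles.
topics = {"weather": 40, "finance": 70}
training = {t: [synth_trace(m) for _ in range(20)] for t, m in topics.items()}

def centroid(traces):
    n = len(traces[0])
    return [sum(tr[i] for tr in traces) / len(traces) for i in range(n)]

centroids = {t: centroid(trs) for t, trs in training.items()}

def classify(trace):
    # Nearest-centroid by squared distance over the size sequence.
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(trace, c))
    return min(centroids, key=lambda t: dist(centroids[t]))

print(classify(synth_trace(70)))  # likely "finance"
```

The point is not the classifier (the paper uses far stronger methods across 28 models) but that no decryption is needed: the observable shape of the traffic carries the signal.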

AI Supply-Chain Risks Expected to Fuel 2026 Cybercrime

Security analysts warn that open-source fragility and increasingly automated attack tooling will shape the 2026 threat landscape. With AI driving both exploitation and defense, supply-chain security continues to grow in importance.
🔗 Read the 2026 outlook

RuleHub: New Open-Source Guardrails for LLM Ops

RuleHub introduces an open-source, “policy-as-code” framework aimed at helping teams define and enforce safety and governance rules across ML and LLM workflows. It’s an interesting entry in the growing ecosystem of community-driven guardrail tooling, especially for teams experimenting with lightweight or DIY approaches.
🔗 Explore RuleHub
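For readers new to "policy-as-code", the pattern is worth a concrete sketch. The example below is illustrative only and does not use RuleHub's actual API (which may differ): guardrail rules are declared as data, then evaluated uniformly against model inputs or outputs in a pipeline step.

```python
# Hypothetical policy-as-code sketch, not RuleHub's real interface:
# rules live as data, so they can be versioned and reviewed like code.
import re

POLICIES = [
    {"id": "no-secrets", "pattern": r"(?i)api[_-]?key\s*[:=]", "action": "block"},
    {"id": "no-email", "pattern": r"[\w.+-]+@[\w-]+\.[\w.]+", "action": "redact"},
]

def enforce(text):
    """Apply each policy in order; block returns None, redact rewrites."""
    for policy in POLICIES:
        if re.search(policy["pattern"], text):
            if policy["action"] == "block":
                return None  # refuse to pass the text downstream
            text = re.sub(policy["pattern"], "[REDACTED]", text)
    return text

print(enforce("Contact alice@example.com"))  # → "Contact [REDACTED]"
print(enforce("my api_key = 123"))           # → None (blocked)
```

Keeping rules declarative like this is what makes such frameworks attractive for lightweight or DIY guardrail setups: policies can be audited and changed without touching application logic.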

Claude Opus 4.5 Lands With Enterprise-Focused Upgrades

Anthropic’s latest flagship boosts reasoning, code generation, and long-running agent workflows, aiming to serve as a full-stack enterprise assistant. Another step in the growing race to build frontier-grade models for business use.
🔗 Read the Claude Opus 4.5 coverage

In Case You Missed It

Check Point Event: Lakera Joins the Conversation

Check Point’s December 4 virtual event dives into securing AI-powered innovation, and Lakera will be part of the discussion. It’s a great chance to hear how our combined teams are approaching hybrid mesh security and AI-agent defense heading into 2026.
🔗 Register for the event

Lakera Online Event: The Year of the Agent

On December 10, we’re hosting a look back at 2025’s biggest AI-driven threats and what’s coming next. Mateo Rojas-Carulla, David Haber, and guest practitioners will unpack real attack trends and how defenders are preparing for 2026. If you work with AI in production, you’ll want to join us.
🔗 Save your spot

  • AI