This week’s issue centers on security-first stories: new exploits and prompt-injection research, large open-data/model moves, and Lakera’s own release of the Backbone Breaker (b3) benchmark.
Expect agent attack examples, enterprise-scale defenses, and an update on where open models and datasets are heading.
LotL Attack Hides Malware in Windows Native AI Stack
A researcher showed a living-off-the-land technique that abuses Windows’ native AI stack to deliver malware, demonstrating a new, stealthy attack vector for AI-enabled tooling on endpoints. The finding shows how OS-level AI integrations add novel supply-chain and persistence risks security teams must account for.
🔗 DarkReading coverage
AI Agents Can Leak Company Data Through Simple Web Searches
New research reveals enterprise AI agents that combine LLMs, retrieval, and web tools can be manipulated by indirect prompt injections, allowing attackers to exfiltrate internal data via seemingly innocuous web content. The work shows agent architectures need threat modeling for external content and tool trust boundaries, not just model alignment.
🔗 Help Net Security report
Google’s Built-In AI Defenses on Android Now Block 10 Billion Scam Messages a Month
Google says Android’s on-device AI protections now block over 10 billion suspected malicious calls and messages monthly, signaling both the scale of AI-driven fraud and the increasing role of ML-based defenses in platform security. The stat is a reminder that defenders are deploying generative/ML tools at massive scale to match attackers’ tactics.
🔗 The Hacker News coverage
NVIDIA Releases Open Models and Large Open-Source Dataset for Physical AI
NVIDIA published new open models plus a massive multimodal driving dataset (1,700+ hours of sensor data) aimed at physical-world AI research, accelerating reproducible work on robotics, autonomy, and sensor fusion. The release highlights industry moves to democratize high-quality data and models for real-world AI tasks.
🔗 NVIDIA blog announcement
Chinese Start-up MiniMax Releases M2 Model, Leading Among Open Models
MiniMax unveiled M2, a 200B-parameter Mixture-of-Experts open model that is now competing with top open-weight systems, illustrating how Chinese startups are pushing on open-source model performance and narrowing gaps with proprietary providers. The launch sharpens the global open-model landscape and competition on capabilities.
🔗 SCMP coverage
Microsoft Positions GitHub as AI Agent Platform with “Agent HQ”
Microsoft announced a push to make GitHub a central hub for AI coding agents through “Agent HQ,” enabling integrations of third-party agents and more automation across the developer workflow. The move builds on Microsoft’s strategy to make agents first-class developer tools—but also concentrates more agent attack surface in widely used ecosystems.
🔗 The Verge coverage
In Case You Missed It from Lakera
The Backbone Breaker Benchmark (b3)
Lakera and the UK AI Security Institute published the Backbone Breaker Benchmark (b3), a human-grounded benchmark that evaluates the security of backbone LLM calls via “threat snapshots” derived from ~194,000 human red-team attempts. b3 isolates the exact LLM decision points attackers exploit and shows that step-by-step reasoning improves resilience while open models are closing the gap with closed systems.
🔗 Read the release
Agent Breaker: Episode 6 (OmniChat Desktop)
Episode 6 of Agent Breaker demos indirect prompt injection against a desktop chatbot using the Model Context Protocol (MCP), showing metadata and tool descriptions can be weaponized to bend agent behavior without a user-visible jailbreak. The episode is a practical, watchable example of how trusted tools and metadata expand an agent’s attack surface.
🔗 Watch Episode 6 on YouTube
This week highlights the blunt reality that agentic AI multiplies both capability and attack surface: from OS-level LotL techniques to metadata and web-borne prompt injections. At the same time, the ecosystem is responding: large open datasets, agent platforms, and new benchmarks like b3 are already reshaping how we measure and defend agent security.
See you next week!