Create a Post
cancel
Showing results for 
Search instead for 
Did you mean: 
_Val_
Admin
Admin

Lakera bulletin - This week in AI

This week’s issue centers on security-first stories: new exploits and prompt-injection research, large open-data/model moves, and Lakera’s own release of the Backbone Breaker (b3) benchmark.

Expect agent attack examples, enterprise-scale defenses, and an update on where open models and datasets are heading.

LotL Attack Hides Malware in Windows Native AI Stack

A researcher showed a living-off-the-land technique that abuses Windows’ native AI stack to deliver malware, demonstrating a new, stealthy attack vector for AI-enabled tooling on endpoints. The finding shows how OS-level AI integrations add novel supply-chain and persistence risks security teams must account for.
🔗 DarkReading coverage

AI Agents Can Leak Company Data Through Simple Web Searches

New research reveals enterprise AI agents that combine LLMs, retrieval, and web tools can be manipulated by indirect prompt injections, allowing attackers to exfiltrate internal data via seemingly innocuous web content. The work shows agent architectures need threat modeling for external content and tool trust boundaries, not just model alignment.
🔗 Help Net Security report

Google’s Built-In AI Defenses on Android Now Block 10 Billion Scam Messages a Month

Google says Android’s on-device AI protections now block over 10 billion suspected malicious calls and messages monthly, signaling both the scale of AI-driven fraud and the increasing role of ML-based defenses in platform security. The stat is a reminder that defenders are deploying generative/ML tools at massive scale to match attackers’ tactics.
🔗 The Hacker News coverage

NVIDIA Releases Open Models and Large Open-Source Dataset for Physical AI

NVIDIA published new open models plus a massive multimodal driving dataset (1,700+ hours of sensor data) aimed at physical-world AI research, accelerating reproducible work on robotics, autonomy, and sensor fusion. The release highlights industry moves to democratize high-quality data and models for real-world AI tasks.
🔗 NVIDIA blog announcement

Chinese Start-up MiniMax Releases M2 Model, Leading Among Open Models

MiniMax unveiled M2, a 200B-parameter Mixture-of-Experts open model that is now competing with top open-weight systems, illustrating how Chinese startups are pushing on open-source model performance and narrowing gaps with proprietary providers. The launch sharpens the global open-model landscape and competition on capabilities.
🔗 SCMP coverage

Microsoft Positions GitHub as AI Agent Platform with “Agent HQ”

Microsoft announced a push to make GitHub a central hub for AI coding agents through “Agent HQ,” enabling integrations of third-party agents and more automation across the developer workflow. The move builds on Microsoft’s strategy to make agents first-class developer tools—but also concentrates more agent attack surface in widely used ecosystems.
🔗 The Verge coverage

In Case You Missed It from Lakera

The Backbone Breaker Benchmark (b3)

Lakera and the UK AI Security Institute published the Backbone Breaker Benchmark (b3), a human-grounded benchmark that evaluates the security of backbone LLM calls via “threat snapshots” derived from ~194,000 human red-team attempts. b3 isolates the exact LLM decision points attackers exploit and shows that step-by-step reasoning improves resilience while open models are closing the gap with closed systems.
🔗 Read the release

Agent Breaker: Episode 6 (OmniChat Desktop)

Episode 6 of Agent Breaker demos indirect prompt injection against a desktop chatbot using the Model Context Protocol (MCP), showing metadata and tool descriptions can be weaponized to bend agent behavior without a user-visible jailbreak. The episode is a practical, watchable example of how trusted tools and metadata expand an agent’s attack surface.
🔗 Watch Episode 6 on YouTube

 

This week highlights the blunt reality that agentic AI multiplies both capability and attack surface: from OS-level LotL techniques to metadata and web-borne prompt injections. At the same time, the ecosystem is responding: large open datasets, agent platforms, and new benchmarks like b3 are already reshaping how we measure and defend agent security.

See you next week!

1 Reply
the_rock
MVP Platinum
MVP Platinum

Super helpful links!

Best,
Andy
0 Kudos

Leaderboard

Epsum factorial non deposit quid pro quo hic escorol.

Upcoming Events

    CheckMates Events