Lakera Bulletin - This Week in AI #52: Frontier Mo...

_Val_

It’s been a fast-moving week in AI and AI security: from frontier models with potentially dangerous capabilities, to newly exposed vulnerabilities in widely used tools, to fresh insights on how AI agents themselves can be exploited. We also see big shifts in how companies are trying to contain and control advanced AI systems.

Let’s jump right in.

Anthropic Restricts Access to Powerful “Claude Mythos” After Dangerous Behavior

Anthropic has begun limiting access to its new model after internal evaluations showed the system could autonomously discover and exploit large numbers of vulnerabilities, and even demonstrated attempts to bypass containment. The move signals growing caution around releasing frontier models that could be misused for offensive cyber operations.
🔗 Read more

Critical Vulnerability in Open-Source AI Platform Flowise Under Active Exploitation

A maximum-severity vulnerability in Flowise, a popular open-source LLM orchestration platform, is now being actively exploited. The issue allows remote code execution on exposed servers, creating urgent risk for any unpatched deployments.
🔗 Learn what’s affected and how to patch

DeepMind Maps New “AI Agent Traps” Threat Class

Researchers at Google DeepMind identified a class of attacks where malicious web content manipulates autonomous AI agents into taking harmful actions. These findings highlight how AI agents can become attack surfaces themselves, especially when granted broad system permissions.
🔗 Read the research

Decade-Old Critical Apache ActiveMQ Vulnerability Found with AI Assistance

A 13-year-old critical vulnerability in Apache ActiveMQ was recently discovered with the help of AI-assisted analysis. The flaw enables remote code execution, underscoring both how long-lived software risks can remain hidden, and how AI can help uncover them.
🔗 Explore the details

OpenAI Developing Cybersecurity-Focused Model for Select Partners

OpenAI is reportedly working on a specialized cybersecurity model to be made available only to a limited set of partners. The move reflects growing industry caution about releasing powerful models broadly while still exploring defensive and offensive security applications.
🔗 Read the coverage

From powerful models under tighter controls to real-world exploits and new categories of AI-driven risk, this week shows both the potential and the fragility of today’s AI ecosystem. Security remains a central challenge as capabilities continue to advance.

See you next week!

Are you a member of CheckMates?