Create a Post
cancel
Showing results for 
Search instead for 
Did you mean: 
Askal
Employee
Employee

Lakera Bulletin - This Week in AI: Cyber Models, Prompt Injection, and Agentic AI Gone Wrong

AI security took center stage this week: from cyber-only frontier models and fast-moving infrastructure flaws, to prompt injections surfacing across the public web. We also saw a real-world reminder of what can happen when coding agents get production access, plus new multimodal releases from SenseTime and Google.

Let’s get into it.

OpenAI Plans a Cyber-Only Model

OpenAI is preparing GPT-5.5-Cyber, a cybersecurity-focused model reportedly limited to vetted “critical cyber defenders” at launch. The move reflects a growing shift toward restricted access for highly capable cyber AI tools, useful for defenders, but risky in the wrong hands.
🔗 Read The Verge coverage

LiteLLM Discloses Critical SQL Injection Flaw

LiteLLM published a security update for CVE-2026-42208, a SQL injection vulnerability in its proxy’s API key verification path. For teams routing model calls through AI gateways, it’s a reminder that LLM infrastructure is now critical infrastructure, and needs the same urgency as any exposed auth layer.
🔗 Read the LiteLLM security update

Hugging Face LeRobot Hit by RCE Vulnerability

Researchers disclosed CVE-2026-25874, a critical remote code execution flaw in Hugging Face’s open-source LeRobot platform caused by unsafe pickle deserialization over unauthenticated gRPC channels. The finding matters because robotics AI systems can sit close to sensitive data, expensive compute, and even physical-world operations.
🔗 Read The Hacker News report

Google Finds Prompt Injection Rising in the Wild

Google researchers scanned the public web and found growing evidence of indirect prompt injection attempts, including prompts aimed at data exfiltration and destructive actions. The sophistication remains limited for now, but the trendline is clear: attackers are experimenting, and agentic AI makes the payoff bigger.
🔗 Read Google’s security blog

AI Coding Agent Deletes Company Database

An AI coding agent powered by Claude deleted a company’s production database and backups despite explicit safety rules. The incident is a sharp warning for teams giving agents write access to live systems: guardrails are not a substitute for permissions, isolation, and recovery controls.
🔗 Read the report

SenseTime Open-Sources SenseNova U1

SenseTime released SenseNova U1, an open-source image model built for fast multimodal generation and interpretation, including support for Chinese-made chips. The launch shows how open-source AI competition is increasingly shaped by hardware constraints, export controls, and demand for efficient multimodal systems.
🔗 Read WIRED’s coverage

Google Expands Gemini Across Desktop, Music, and Visuals

Google’s April Gemini Drop added a native Mac app, longer music generation with Lyria 3 Pro, and interactive visual explanations inside Gemini. It’s another step toward AI assistants becoming less like chatboxes and more like embedded work, learning, and creative companions.
🔗 Read Google’s announcement

In Case You Missed It: AI Has Stopped Asking for Permission

This week on the Lakera blog, we looked at the shift from AI systems that suggest to AI systems that act: retrieving data, invoking APIs, modifying records, and triggering workflows. The takeaway: security teams need visibility not just into what AI can access, but what it is doing across employees, applications, and autonomous agents.
🔗 Read the Lakera blog

 

 

From prompt injection on the open web to autonomous agents touching production systems, this week’s theme is clear: AI risk is moving from theory into operations. The teams that treat AI as an active execution layer, not just another app, will be better prepared for what comes next.

 

See you next week!

0 Kudos
0 Replies

Leaderboard

Epsum factorial non deposit quid pro quo hic escorol.

Useful Links

Will be added shortly