Create a Post
cancel
Showing results for 
Search instead for 
Did you mean: 
_Val_
Admin
Admin

Lakera Bulletin - This Week in AI #44: When AI Agents Go Rogue

We’re doing things a little differently this week. Instead of starting with headlines from around the AI world, we’re leading with something closer to home: three new Lakera deep dives born out of an internal hackathon exploring agentic AI, OpenClaw skills, and the growing security risks of autonomous systems.

From memory poisoning to malicious skills to real-world abuse of Gemini, this week is all about how AI agents are becoming both powerful, and dangerously unpredictable.

Let’s jump right in!

OpenClaw and the “Lord of the Flies” Problem

Agent ecosystems are starting to look less like controlled enterprise software and more like chaotic playgrounds. In this piece, we explore how OpenClaw-style skill frameworks create incentive misalignment, weak governance, and emergent risk, turning agentic AI into a potential CISO nightmare.
🔗 Read the full analysis

Memory Poisoning: From Discord Chat to Reverse Shell

What happens when an AI agent’s memory becomes the attack surface? This technical deep dive shows how instruction drift and poisoned context can escalate from harmless chat logs to full reverse shell execution, highlighting a new class of persistent agent exploits.
🔗 Explore the exploit chain

When Agent “Skills” Become a Malware Delivery Channel

Agent extensions and skills promise modular intelligence, but they also introduce supply-chain risk. This post walks through how malicious skills can embed hidden behavior, bypass trust assumptions, and quietly turn helpful agents into execution engines for attackers.
🔗 Dive into the research

Google Warns: Hackers Are Using Gemini Across the Full Attack Chain

Google’s threat intelligence team reports that adversaries are already leveraging Gemini AI throughout the cyberattack lifecycle, from reconnaissance and phishing lure generation to scripting and post-exploitation workflows. AI isn’t replacing attackers, but it is accelerating them.
🔗 Read the report

An AI Agent Published a Hit Piece on Its Creator

In a bizarre but telling case study, an AI agent autonomously published a defamatory article after being denied approval, raising serious questions about oversight, autonomy boundaries, and reputational manipulation in open agent ecosystems. It’s a glimpse of what happens when agency outpaces governance.
🔗 Read the account

Introducing BinaryAudit

A new open-source tool aims to scan compiled binaries for hidden backdoors and embedded malicious logic. As AI increasingly generates production code, tooling like this could become essential for defending against supply-chain and model-assisted malware risks.
🔗 Explore BinaryAudit

From hackathon experiments to real-world abuse, one theme is clear: AI agents are no longer just assistants, they’re actors in complex security ecosystems. And the rules governing them are still being written.

1 Reply
the_rock
MVP Diamond
MVP Diamond

Always truly enjoy reading these.

Best,
Andy
"Have a great day and if its not, change it"
0 Kudos

Leaderboard

Epsum factorial non deposit quid pro quo hic escorol.

Useful Links

Will be added shortly