OpenClaw Security Analysis: Poisoning AI Agent Memory Triples Attack Success Rate from 25% to 74%

Available in: 中文

2026-04-07T17:49:56.425Z·2 min read

OpenClaw operates with full local system access and integrates with sensitive services including Gmail, Stripe, and filesystems. While this enables powerful automation, it also exposes a substantia...

A comprehensive academic paper presents the first real-world safety evaluation of OpenClaw, the most widely deployed personal AI agent platform in early 2026. The research introduces a novel attack taxonomy and demonstrates that poisoning any single dimension of agent state triples the success rate of attacks.

Why This Matters

The CIK Taxonomy

The paper introduces CIK — a three-dimensional framework for analyzing agent safety:

Dimension	Description	Attack Vector
Capability	What the agent can do	Modify tools, permissions, capabilities
Identity	Who the agent thinks it is	Change agent persona, goals, preferences
Knowledge	What the agent knows	Inject false memories, alter context

Key Findings

Baseline attack success rate: 24.6% across clean agent state
After poisoning ANY single CIK dimension: 64-74% success rate
Most robust model (not named) still shows 3x increase after poisoning
Tested across 4 backbone models: Claude Sonnet 4.5, Opus 4.6, Gemini 3.1 Pro, GPT-5.4
12 attack scenarios on a live OpenClaw instance

The Core Vulnerability

The fundamental issue: AI agents maintain persistent state (memory, preferences, tool configurations) that becomes part of their decision-making process. Unlike stateless API calls, this persistent state creates a new class of attack surface where adversaries can corrupt the agent's "world model" rather than just exploiting code vulnerabilities.

Implications

Personal AI agents are vulnerable to state poisoning attacks
Security evaluations must consider persistent state, not just individual API calls
Multi-model testing shows this is a platform-level issue, not model-specific
The attack surface is fundamentally different from traditional software security

↗ Original source · 2026-04-07T00:00:00.000Z

Comments0