Snowflake Cortex AI Sandboxed Escape: Prompt Injection Bypasses Human-in-the-Loop
Security researchers at PromptArmor have disclosed a vulnerability in Snowflake's Cortex Code CLI that allowed malware execution via indirect prompt injection — bypassing both the sandbox and human-in-the-loop approval mechanisms.
The Vulnerability
Snowflake Cortex Code CLI is a command-line coding agent (similar to Claude Code and OpenAI's Codex) with built-in SQL integration for Snowflake. Two days after release, researchers found a flaw in the command validation system that allowed specially constructed malicious commands to:
- Execute arbitrary commands without triggering human-in-the-loop approval
- Escape the Cortex CLI sandbox entirely
- Leverage the victim's active credentials to perform malicious actions in Snowflake (exfiltrate data, drop tables)
The Attack Chain
- User opens Cortex and enables sandbox mode
- User asks for help with a third-party open-source codebase found online
- A prompt injection hidden in the repository's README manipulates Cortex into running a dangerous command
- The command validation fails — Cortex does not validate commands inside prompt-injected content
- Malware executes outside the sandbox with the user's credentials
Key Issues Identified
- No workspace trust — Cortex lacks the "workspace trust" convention that warns users about untrusted directories (used by VS Code and most agentic CLIs)
- Prompt injection surface — any untrusted data (README files, web search results, database records, MCP responses, terminal output) can carry an injection
- Credential reuse — the victim's active Snowflake credentials are immediately available to the attacker
The Fix
Snowflake's security team validated and remediated the vulnerability. The fix shipped in Cortex Code CLI version 1.0.25 on February 28, 2026.
Why It Matters
This is a textbook example of the emerging threat landscape for agentic AI tools:
- AI coding agents have deep system access — filesystem, network, database credentials
- Prompt injection is a real, exploitable attack vector — not theoretical
- Sandbox boundaries are insufficient if the agent itself can be manipulated
- Human-in-the-loop is a last resort — if it can be bypassed, everything else fails
The disclosure adds to growing evidence that securing agentic AI requires fundamentally new security architectures, not just traditional sandboxing.
Source: PromptArmor | HN Discussion