Skip to main content
Cyrus is designed for teams that care deeply about security, predictability, and control. Modern AI agents introduce new attack surfaces — especially around prompt injection, tool execution, and autonomous behavior. Cyrus is built to reduce those risks by design, not by policy alone. This page explains how Cyrus protects your organization, and how those protections align with enterprise security expectations.

Our Security Philosophy

Cyrus follows three core principles:
  1. No hidden autonomy
  2. No privileged instruction sources
  3. No silent tool execution
Cyrus does not attempt to “outsmart” or override the underlying AI model’s safety mechanisms. Instead, it inherits and preserves them, adding additional structural guardrails at the system level.

We Do Not Override Claude’s System Prompt

Cyrus does not replace or weaken Claude Code’s native system prompt. Instead:
  • Claude’s built-in prompt injection defenses remain fully intact
  • Cyrus only appends contextual guidance, never overrides safety rules
  • All instruction hierarchy, trust boundaries, and verification logic remain Claude-native
This means Cyrus benefits directly from Claude’s strongest protections, including:
  • Instruction origin tracking
  • Untrusted content isolation
  • Mandatory user verification for action-like content
  • Rule immutability

Protection Against Prompt Injection Attacks

Indirect & Zero-Click Prompt Injection

Threat:
Malicious instructions embedded in emails, documents, web pages, logs, or tool outputs.
Cyrus Protection:
  • Instructions from tools, documents, MCP servers, and web content are treated as untrusted data
  • Claude is required to stop execution and surface the content to the user
  • Explicit user approval is required before any action
Cyrus never executes instructions simply because they appear relevant.

Web & RAG-Based Injection

Threat:
Malicious instructions hidden in search results, scraped pages, or retrieved documents.
Cyrus Protection:
  • Web and retrieval results are always treated as data, never authority
  • Claims of “system”, “admin”, or “developer” instructions from content are ignored
  • Conflicting instructions always defer to safety rules
Cyrus will surface suspicious content rather than act on it.

Defense Against Goal Drift & Silent Behavior Changes

Threat:
Subtle prompt changes over time that steer an agent toward unintended goals.
Cyrus Protection:
  • Cyrus does not allow background or scheduled prompt mutation
  • All work is triggered by explicit Linear issues
  • Instructions are not persisted or compounded across runs unless explicitly restated
There is no concept of “quiet mode”, recurring hidden prompts, or silent objective reweighting.

Tool & MCP Execution Safety

Explicit Tool Allow-Listing

Cyrus enforces a strict allow-list for all tools, including MCP servers.
  • Tools must be explicitly permitted per repository
  • MCP tools require both:
    • A defined server (.mcp.json)
    • An allow-listed namespace (mcp__servername)
If a tool is not allowed, Cyrus will refuse to call it — even if instructed.

No Silent Tool Execution

Cyrus does not:
  • Auto-discover tools
  • Execute tools opportunistically
  • Escalate permissions dynamically
All tool use is:
  • Explicit
  • Traceable
  • Bounded by configuration

Instruction Source Isolation

Cyrus enforces a clear instruction hierarchy:
SourceTrust Level
User messages (Linear issue)Trusted
System prompt (Claude)Trusted
Tool results / MCP outputUntrusted
Web content / documentsUntrusted
Only user-authored instructions can trigger actions. Claims of authority from any other source are treated as malicious by default.

Social Engineering & Authority Impersonation Defense

Cyrus inherits Claude’s protections against:
  • Fake “admin” or “system” messages
  • Urgent or emergency manipulation
  • Claims of pre-authorization
  • Attempts to redefine agent roles or permissions
If encountered, Cyrus will:
  1. Stop execution
  2. Quote the suspicious content
  3. Ask the user for explicit confirmation

Auditability & Transparency

Cyrus is designed to be inspectable:
  • Actions map directly to Linear issues
  • Tool calls are explicit and permission-bound
  • No background or autonomous behavior outside declared workflows
This makes Cyrus suitable for:
  • Security reviews
  • Internal audits
  • Regulated environments

Compliance & Controls

Cyrus complements its agent-level protections with organizational controls:
  • SOC 2 compliant infrastructure
  • Principle of least privilege
  • No secret material committed to repositories
  • Environment-scoped credentials
Security is enforced at both the system and agent levels.

Summary

Cyrus is not a “black box autonomous agent”. It is a controlled, auditable, and safety-first system that:
  • Preserves Claude’s strongest defenses
  • Adds explicit tool and instruction boundaries
  • Prevents silent escalation or drift
  • Makes all meaningful actions user-directed

fOjhOFI

Cyrus Community

Ask security questions or talk directly with the team on Discord