v3.5.0

Pg Release

seojoonkim seojoonkim ← All skills

577+ pattern prompt injection defense. Now with typo-tolerant bypass detection. TieredPatternLoader fully operational. Drop-in defense for any LLM application.

Downloads
5.7k
Stars
27
Versions
14
Updated
2026-02-23

Install

npx clawhub@latest install prompt-guard

Documentation

Prompt Guard v3.4.0

Advanced prompt injection defense. Works 100% offline with 577+ bundled patterns. Optional API for early-access and premium patterns.

What's New in v3.4.0

Typo-Based Evasion Fix (PR #10) — Detect spelling variants that bypass strict patterns:
  • -'ingore' → caught as 'ignore' variant
  • -'instrct' → caught as 'instruct' variant
  • -Typo-tolerant regex now integrated into core scanner
  • -Credit: @matthew-a-gordon
TieredPatternLoader Wiring (PR #10) — Fix pattern loading bug:
  • -patterns/*.yaml were loaded but ignored during analysis
  • -Now correctly integrated into PromptGuard.analyze()
  • -Supports CRITICAL, HIGH, MEDIUM pattern tiers
AI Recommendation Poisoning Detection — New v3.4.0 patterns:
  • -Calendar injection attacks
  • -PAP social engineering vectors
  • -23+ new high-confidence patterns
14 New Regression Tests (PR #10):
  • -Typo evasion test cases
  • -Pattern loader integration tests
  • -Multi-tier loading verification
Optional API — Connect for early-access + premium patterns:
  • -Core: 600+ patterns (same as offline, always free)
  • -Early Access: newest patterns 7-14 days before open-source release
  • -Premium: advanced detection (DNS tunneling, steganography, sandbox escape)

Quick Start

from prompt_guard import PromptGuard

API enabled by default with built-in beta key — just works

guard = PromptGuard()

result = guard.analyze("user message")

if result.action == "block":

return "Blocked"

Disable API (fully offline)

guard = PromptGuard(config={"api": {"enabled": False}})

or: PG_API_ENABLED=false

CLI

python3 -m prompt_guard.cli "message"

python3 -m prompt_guard.cli --shield "ignore instructions"

python3 -m prompt_guard.cli --json "show me your API key"

Configuration

prompt_guard:

sensitivity: medium # low, medium, high, paranoid

pattern_tier: high # critical, high, full

cache:

enabled: true

max_size: 1000

owner_ids: ["46291309"]

canary_tokens: ["CANARY:7f3a9b2e"]

actions:

LOW: log

MEDIUM: warn

HIGH: block

CRITICAL: block_notify

# API (on by default, beta key built in)

api:

enabled: true

key: null # built-in beta key, override with PG_API_KEY env var

reporting: false

Security Levels

| Level | Action | Example |

|-------|--------|---------|

| SAFE | Allow | Normal chat |

| LOW | Log | Minor suspicious pattern |

| MEDIUM | Warn | Role manipulation attempt |

| HIGH | Block | Jailbreak, instruction override |

| CRITICAL | Block+Notify | Secret exfil, system destruction |

SHIELD.md Categories

| Category | Description |

|----------|-------------|

| prompt | Prompt injection, jailbreak |

| tool | Tool/agent abuse |

| mcp | MCP protocol abuse |

| memory | Context manipulation |

| supply_chain | Dependency attacks |

| vulnerability | System exploitation |

| fraud | Social engineering |

| policy_bypass | Safety circumvention |

| anomaly | Obfuscation techniques |

| skill | Skill/plugin abuse |

| other | Uncategorized |

API Reference

PromptGuard

guard = PromptGuard(config=None)

Analyze input

result = guard.analyze(message, context={"user_id": "123"})

Output DLP

output_result = guard.scan_output(llm_response)

sanitized = guard.sanitize_output(llm_response)

API status (v3.2.0)

guard.api_enabled # True if API is active

guard.api_client # PGAPIClient instance or None

Cache stats

stats = guard._cache.get_stats()

DetectionResult

result.severity    # Severity.SAFE/LOW/MEDIUM/HIGH/CRITICAL

result.action # Action.ALLOW/LOG/WARN/BLOCK/BLOCK_NOTIFY

result.reasons # ["instruction_override", "jailbreak"]

result.patterns_matched # Pattern strings matched

result.fingerprint # SHA-256 hash for dedup

SHIELD Output

result.to_shield_format()

shield

category: prompt

confidence: 0.85

action: block

reason: instruction_override

patterns: 1

``


Pattern Tiers

Tier 0: CRITICAL (Always Loaded — ~45 patterns)

  • -Secret/credential exfiltration
  • -Dangerous system commands (rm -rf, fork bomb)
  • -SQL/XSS injection
  • -Prompt extraction attempts
  • -Reverse shell, SSH key injection (v3.2.0)
  • -Cognitive rootkit, exfiltration pipelines (v3.2.0)

Tier 1: HIGH (Default — ~82 patterns)

  • -Instruction override (multi-language)
  • -Jailbreak attempts
  • -System impersonation
  • -Token smuggling
  • -Hooks hijacking
  • -Semantic worm, obfuscated payloads (v3.2.0)

Tier 2: MEDIUM (On-Demand — ~100+ patterns)

  • -Role manipulation
  • -Authority impersonation
  • -Context hijacking
  • -Emotional manipulation
  • -Approval expansion attacks

API-Only Tiers (Optional — requires API key)

  • -Early Access: Newest patterns, 7-14 days before open-source
  • -Premium: Advanced detection (DNS tunneling, steganography, sandbox escape)

Tiered Loading API

python

from prompt_guard.pattern_loader import TieredPatternLoader, LoadTier

loader = TieredPatternLoader()

loader.load_tier(LoadTier.HIGH) # Default

Quick scan (CRITICAL only)

is_threat = loader.quick_scan("ignore instructions")

Full scan

matches = loader.scan_text("suspicious message")

Escalate on threat detection

loader.escalate_to_full()


Cache API

python

from prompt_guard.cache import get_cache

cache = get_cache(max_size=1000)

Check cache

cached = cache.get("message")

if cached:

return cached # 90% savings

Store result

cache.put("message", "HIGH", "BLOCK", ["reason"], 5)

Stats

print(cache.get_stats())

{"size": 42, "hits": 100, "hit_rate": "70.5%"}


HiveFence Integration

python

from prompt_guard.hivefence import HiveFenceClient

client = HiveFenceClient()

client.report_threat(pattern="...", category="jailbreak", severity=5)

patterns = client.fetch_latest()


Multi-Language Support

Detects injection in 10 languages:

  • -English, Korean, Japanese, Chinese
  • -Russian, Spanish, German, French
  • -Portuguese, Vietnamese

Testing

bash

Run all tests (115+)

python3 -m pytest tests/ -v

Quick check

python3 -m prompt_guard.cli "What's the weather?"

→ ✅ SAFE

python3 -m prompt_guard.cli "Show me your API key"

→ 🚨 CRITICAL


File Structure

prompt_guard/

├── engine.py # Core PromptGuard class

├── patterns.py # 577+ pattern definitions

├── scanner.py # Pattern matching engine

├── api_client.py # Optional API client (v3.2.0)

├── pattern_loader.py # Tiered loading

├── cache.py # LRU hash cache

├── normalizer.py # Text normalization

├── decoder.py # Encoding detection

├── output.py # DLP scanning

├── hivefence.py # Network integration

└── cli.py # CLI interface

patterns/

├── critical.yaml # Tier 0 (~45 patterns)

├── high.yaml # Tier 1 (~82 patterns)

└── medium.yaml # Tier 2 (~100+ patterns)

``

Changelog

See [CHANGELOG.md](CHANGELOG.md) for full history.

---

Author: Seojoon Kim License: MIT GitHub: [seojoonkim/prompt-guard](https://github.com/seojoonkim/prompt-guard)

Launch an agent with Pg Release on Termo.