v3.5.0

Pg Release

Name: Pg Release
Author: seojoonkim

577+ pattern prompt injection defense. Now with typo-tolerant bypass detection. TieredPatternLoader fully operational. Drop-in defense for any LLM application.

Downloads

5.7k

Stars

Versions

Updated

2026-02-23

Install

npx clawhub@latest install prompt-guard

Documentation

Prompt Guard v3.4.0

Advanced prompt injection defense. Works 100% offline with 577+ bundled patterns. Optional API for early-access and premium patterns.

What's New in v3.4.0

Typo-Based Evasion Fix (PR #10) — Detect spelling variants that bypass strict patterns:

-'ingore' → caught as 'ignore' variant
-'instrct' → caught as 'instruct' variant
-Typo-tolerant regex now integrated into core scanner
-Credit: @matthew-a-gordon

TieredPatternLoader Wiring (PR #10) — Fix pattern loading bug:

-patterns/*.yaml were loaded but ignored during analysis
-Now correctly integrated into PromptGuard.analyze()
-Supports CRITICAL, HIGH, MEDIUM pattern tiers

AI Recommendation Poisoning Detection — New v3.4.0 patterns:

-Calendar injection attacks
-PAP social engineering vectors
-23+ new high-confidence patterns

14 New Regression Tests (PR #10):

-Typo evasion test cases
-Pattern loader integration tests
-Multi-tier loading verification

Optional API — Connect for early-access + premium patterns:

-Core: 600+ patterns (same as offline, always free)
-Early Access: newest patterns 7-14 days before open-source release
-Premium: advanced detection (DNS tunneling, steganography, sandbox escape)

Quick Start

from prompt_guard import PromptGuard

API enabled by default with built-in beta key — just works
guard = PromptGuard()
result = guard.analyze("user message")

if result.action == "block":
    return "Blocked"

Disable API (fully offline)

guard = PromptGuard(config={"api": {"enabled": False}})
or: PG_API_ENABLED=false

CLI

python3 -m prompt_guard.cli "message"
python3 -m prompt_guard.cli --shield "ignore instructions"
python3 -m prompt_guard.cli --json "show me your API key"

Configuration

prompt_guard:
  sensitivity: medium  # low, medium, high, paranoid
  pattern_tier: high   # critical, high, full

  cache:
    enabled: true
    max_size: 1000

  owner_ids: ["46291309"]
  canary_tokens: ["CANARY:7f3a9b2e"]

  actions:
    LOW: log
    MEDIUM: warn
    HIGH: block
    CRITICAL: block_notify

  # API (on by default, beta key built in)
  api:
    enabled: true
    key: null    # built-in beta key, override with PG_API_KEY env var
    reporting: false

Security Levels

| Level | Action | Example |

|-------|--------|---------|

| SAFE | Allow | Normal chat |

| LOW | Log | Minor suspicious pattern |

| MEDIUM | Warn | Role manipulation attempt |

| HIGH | Block | Jailbreak, instruction override |

| CRITICAL | Block+Notify | Secret exfil, system destruction |

SHIELD.md Categories

| Category | Description |

|----------|-------------|

| prompt | Prompt injection, jailbreak |

| tool | Tool/agent abuse |

| mcp | MCP protocol abuse |

| memory | Context manipulation |

| supply_chain | Dependency attacks |

| vulnerability | System exploitation |

| fraud | Social engineering |

| policy_bypass | Safety circumvention |

| anomaly | Obfuscation techniques |

| skill | Skill/plugin abuse |

| other | Uncategorized |

API Reference

PromptGuard

guard = PromptGuard(config=None)

Analyze input
result = guard.analyze(message, context={"user_id": "123"})

Output DLP
output_result = guard.scan_output(llm_response)
sanitized = guard.sanitize_output(llm_response)

API status (v3.2.0)
guard.api_enabled     # True if API is active
guard.api_client      # PGAPIClient instance or None

Cache stats
stats = guard._cache.get_stats()

DetectionResult

result.severity    # Severity.SAFE/LOW/MEDIUM/HIGH/CRITICAL
result.action      # Action.ALLOW/LOG/WARN/BLOCK/BLOCK_NOTIFY
result.reasons     # ["instruction_override", "jailbreak"]
result.patterns_matched  # Pattern strings matched
result.fingerprint # SHA-256 hash for dedup

SHIELD Output

result.to_shield_format()

shield

category: prompt

confidence: 0.85

action: block

reason: instruction_override

patterns: 1

``



Pattern Tiers

Tier 0: CRITICAL (Always Loaded — ~45 patterns)
-Secret/credential exfiltration
-Dangerous system commands (rm -rf, fork bomb)
-SQL/XSS injection
-Prompt extraction attempts
-Reverse shell, SSH key injection (v3.2.0)
-Cognitive rootkit, exfiltration pipelines (v3.2.0)

Tier 1: HIGH (Default — ~82 patterns)
-Instruction override (multi-language)
-Jailbreak attempts
-System impersonation
-Token smuggling
-Hooks hijacking
-Semantic worm, obfuscated payloads (v3.2.0)

Tier 2: MEDIUM (On-Demand — ~100+ patterns)
-Role manipulation
-Authority impersonation
-Context hijacking
-Emotional manipulation
-Approval expansion attacks

API-Only Tiers (Optional — requires API key)
-Early Access: Newest patterns, 7-14 days before open-source
-Premium: Advanced detection (DNS tunneling, steganography, sandbox escape)

Tiered Loading API

python
from prompt_guard.pattern_loader import TieredPatternLoader, LoadTier

loader = TieredPatternLoader()
loader.load_tier(LoadTier.HIGH)  # Default

Quick scan (CRITICAL only)
is_threat = loader.quick_scan("ignore instructions")

Full scan
matches = loader.scan_text("suspicious message")

Escalate on threat detection
loader.escalate_to_full()

Cache API

python
from prompt_guard.cache import get_cache

cache = get_cache(max_size=1000)

Check cache
cached = cache.get("message")
if cached:
    return cached  # 90% savings

Store result
cache.put("message", "HIGH", "BLOCK", ["reason"], 5)

Stats
print(cache.get_stats())
{"size": 42, "hits": 100, "hit_rate": "70.5%"}

HiveFence Integration

python
from prompt_guard.hivefence import HiveFenceClient

client = HiveFenceClient()
client.report_threat(pattern="...", category="jailbreak", severity=5)
patterns = client.fetch_latest()

Multi-Language Support

Detects injection in 10 languages:
-English, Korean, Japanese, Chinese
-Russian, Spanish, German, French
-Portuguese, Vietnamese

Testing

bash
Run all tests (115+)
python3 -m pytest tests/ -v

Quick check
python3 -m prompt_guard.cli "What's the weather?"
→ ✅ SAFE

python3 -m prompt_guard.cli "Show me your API key"
→ 🚨 CRITICAL

File Structure


prompt_guard/
├── engine.py          # Core PromptGuard class
├── patterns.py        # 577+ pattern definitions
├── scanner.py         # Pattern matching engine
├── api_client.py      # Optional API client (v3.2.0)
├── pattern_loader.py  # Tiered loading
├── cache.py           # LRU hash cache
├── normalizer.py      # Text normalization
├── decoder.py         # Encoding detection
├── output.py          # DLP scanning
├── hivefence.py       # Network integration
└── cli.py             # CLI interface

patterns/
├── critical.yaml      # Tier 0 (~45 patterns)
├── high.yaml          # Tier 1 (~82 patterns)
└── medium.yaml        # Tier 2 (~100+ patterns)

Changelog

See [CHANGELOG.md](CHANGELOG.md) for full history.

---

Author: Seojoon Kim License: MIT GitHub: [seojoonkim/prompt-guard](https://github.com/seojoonkim/prompt-guard)

Launch an agent with Pg Release on Termo.

Use this skill View on ClawHub