Layered defense design for a customer support agent deployment against context window overflow attacks, using canary tokens in system prompt on Claude Haiku 4.
Layered defense design for a customer support agent deployment against direct prompt injection attacks, using privilege separation between tool tiers on GPT-4.1.
Layered defense design for a customer support agent deployment against jailbreak prefix attacks, using privilege separation between tool tiers on Qwen 2.5 72B.
Layered defense design for a customer support agent deployment against role-play jailbreak attacks, using re-prompting with quoted user input on Gemini 2.5 Pro.
Layered defense design for a customer support agent deployment against multi-turn manipulation attacks, using re-prompting with quoted user input on GPT-4o-mini.
Layered defense design for a customer support agent deployment against data exfiltration via summaries, using signed instruction boundaries on o1.
Layered defense design for a customer support agent deployment against system prompt extraction attacks, using content provenance tagging on DeepSeek-V3.
Layered defense design for a customer support agent deployment against payload smuggling in code blocks, using content provenance tagging on Claude 3.5 Sonnet.
Layered defense design for a customer support agent deployment against markdown image exfiltration attacks, using retrieval trust scoring on o3.
Layered defense design for a customer support agent deployment against instruction smuggling in URLs, using retrieval trust scoring on DeepSeek-R1.
Layered defense design for a customer support agent deployment against PDF/OCR-layer injection attacks, using structured function-call-only interface on Claude 3.7 Sonnet.
Layered defense design for a customer support agent deployment against context window overflow attacks, using structured function-call-only interface on o3-mini.