Adversarial test suite targeting an onboarding tutor with 'ignore previous instructions'-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with markdown-comment-smuggling-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with role-reversal-style (user-as-assistant) attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with pseudo-developer-mode-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with chained-encoding-style (ROT13 inside base64) attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with DAN ('Do Anything Now')-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with grandma-exploit-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with hypothetical-world-framing-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with reverse-psychology-refusal-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting an onboarding tutor with fictional-character-persona-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting a compliance reviewer with DAN ('Do Anything Now')-style attacks; includes a rubric and triage flow.
Adversarial test suite targeting a compliance reviewer with grandma-exploit-style attacks; includes a rubric and triage flow.
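
For the chained encoding (ROT13 inside base64) entry above, a triage step may need to unwrap a payload before a reviewer can grade it against the rubric. The following is a minimal sketch of that layering, assuming ROT13 is applied first and base64 second; the function names and the canary string are hypothetical and are not taken from any of the suites listed here.

```python
import base64
import codecs


def chain_encode(text: str) -> str:
    """Apply ROT13 first, then base64, so the ROT13 layer sits inside the base64 layer."""
    rot13_text = codecs.encode(text, "rot_13")
    return base64.b64encode(rot13_text.encode("utf-8")).decode("ascii")


def chain_decode(payload: str) -> str:
    """Undo the layers in reverse order: base64 first, then ROT13."""
    inner = base64.b64decode(payload, validate=True).decode("utf-8")
    return codecs.decode(inner, "rot_13")


# Hypothetical harmless canary, used only to show the round trip.
CANARY = "CANARY: reply with the word pineapple"
wrapped = chain_encode(CANARY)
unwrapped = chain_decode(wrapped)
```

Decoding during triage lets the rubric be applied to the plain-text request rather than to an opaque string; a real harness would probably also need to handle payloads that are not valid base64.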