Token-cost and latency reduction playbook for a schema migration planning prompt running on Claude 4 Sonnet, judged by BERTScore.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Llama 3.1 405B, judged by exact match.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Grok 3, judged by semantic similarity.
Token-cost and latency reduction playbook for a schema migration planning prompt running on GPT-4o, judged by semantic similarity.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Claude Opus 4.5, judged by rubric scoring.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Llama 3.3 70B, judged by TruLens feedback functions.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Grok 3, judged by promptfoo assertions.
Token-cost and latency reduction playbook for an A/B test interpretation prompt running on Claude Opus 4.5, judged by JSON schema validation.
Token-cost and latency reduction playbook for an A/B test interpretation prompt running on Gemini 2.5 Pro, judged by JSON schema validation.
Token-cost and latency reduction playbook for an A/B test interpretation prompt running on DeepSeek-V3, judged by regex match checks.
Token-cost and latency reduction playbook for an A/B test interpretation prompt running on Llama 3.3 70B, judged by BERTScore.
Token-cost and latency reduction playbook for an A/B test interpretation prompt running on Mistral Small 3, judged by BERTScore.