Diagnose why a Least-to-Most prompt is failing on API design decisions with Grok 3 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on incident post-mortems with Llama 3.1 405B and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on technical spec writing with Claude Sonnet 4.5 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on data pipeline debugging with Command R+ and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on funnel analysis with Mistral Large and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on resume screening with Claude Haiku 4 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on math word problems with GPT-4o and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on SQL query writing with Qwen 2.5 72B and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on medical triage with Gemini 2.5 Pro and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on research synthesis with GPT-4.1 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on legal brief summarization with o1 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on API design decisions with Gemini 2.0 Flash and produce a fix plan.