ChatGPT Prompt for Memory & Tool Use
Implement a knowledge graph memory system for an OpenAI Agents SDK agent handling onboarding coordination. Vector store: pgvector. Covers write, retrieve, prune, and eval.
You are the platform engineer responsible for agent memory. Build a production knowledge graph memory layer for an OpenAI Agents SDK agent handling onboarding coordination. Use pgvector as the vector/state backing store.
**Model:** Claude 3.7 Sonnet
**Runtime:** TypeScript + Node 20
## Part 1 — What belongs in knowledge graph memory for onboarding coordination
Not every piece of context is memory. For onboarding coordination, decide what goes where:
- **Ephemeral (prompt)**: current turn's input, recent tool calls
- **Session (scratchpad)**: plan for this run, intermediate results
- **Long-term (knowledge graph memory)**: facts the agent should remember across sessions
- **External (RAG / tools)**: the actual knowledge base, queried on demand
Draw the boundary sharply. List 10 concrete items from the onboarding coordination workflow and assign each to a tier.
## Part 2 — Memory schema
Design the memory record shape:
- `id`, `user_id`, `created_at`, `last_accessed_at`
- `content` (the memory itself)
- `type` (preference, fact, event, relationship, skill)
- `embedding` (vector)
- `salience` (0–1, decays over time)
- `source` (conversation ID or external)
- `access_count`
Write the pgvector-specific DDL / schema / collection config.
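As a starting point, here is a sketch of the record shape and a matching pgvector DDL; the table name, column names, and embedding dimension (1536) are assumptions to adapt:

```typescript
// Sketch of the memory record shape (names and dimension are assumptions).
export type MemoryType = "preference" | "fact" | "event" | "relationship" | "skill";

export interface MemoryRecord {
  id: string;
  userId: string;
  createdAt: Date;
  lastAccessedAt: Date;
  content: string;          // the memory itself
  type: MemoryType;
  embedding: number[];      // stored as a pgvector `vector(1536)` column
  salience: number;         // 0–1, decays over time
  source: string;           // conversation ID or "external"
  accessCount: number;
}

// DDL for a pgvector-backed table (assumes the `vector` extension is available).
export const MEMORY_DDL = `
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS memories (
  id               uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id          text NOT NULL,
  created_at       timestamptz NOT NULL DEFAULT now(),
  last_accessed_at timestamptz NOT NULL DEFAULT now(),
  content          text NOT NULL,
  type             text NOT NULL CHECK (type IN ('preference','fact','event','relationship','skill')),
  embedding        vector(1536),
  salience         real NOT NULL CHECK (salience BETWEEN 0 AND 1),
  source           text NOT NULL,
  access_count     integer NOT NULL DEFAULT 0
);
CREATE INDEX IF NOT EXISTS memories_user_idx ON memories (user_id);
CREATE INDEX IF NOT EXISTS memories_embedding_idx ON memories
  USING hnsw (embedding vector_cosine_ops);
`;
```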
## Part 3 — Write path
When and how does the agent write to memory?
Options:
- **After every turn** (cheap, noisy)
- **LLM-filtered** — a "memory writer" sub-agent decides what's worth storing
- **Hybrid** — heuristics (user said "remember that...") + periodic LLM consolidation
Pick one, justify it for onboarding coordination, and implement it.
Write the writer prompt if LLM-filtered. Must produce structured output: `[{type, content, salience}, ...]`.
Deduplication: before inserting, search for near-duplicates and merge rather than append.
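The dedup-and-merge step could look like the following sketch; the similarity threshold and the merge policy are assumptions to tune per embedding model:

```typescript
// Write-time deduplication sketch: before inserting, find near-duplicates by
// cosine similarity and merge into the existing record instead of appending.
interface Candidate { id: string; content: string; embedding: number[]; salience: number }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const DUP_THRESHOLD = 0.92; // assumed; tune per embedding model

// Returns the existing record to merge into, or null if the memory is genuinely new.
export function findDuplicate(incoming: Candidate, existing: Candidate[]): Candidate | null {
  let best: Candidate | null = null;
  let bestSim = DUP_THRESHOLD;
  for (const e of existing) {
    const sim = cosine(incoming.embedding, e.embedding);
    if (sim >= bestSim) { best = e; bestSim = sim; }
  }
  return best;
}

// Merge policy: keep the higher salience; append content only if it adds information.
export function mergeMemories(existing: Candidate, incoming: Candidate): Candidate {
  return {
    ...existing,
    salience: Math.max(existing.salience, incoming.salience),
    content: existing.content.includes(incoming.content)
      ? existing.content
      : `${existing.content}; ${incoming.content}`,
  };
}
```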
## Part 4 — Retrieval path
At each agent turn:
1. Build a retrieval query from current turn + recent history
2. Semantic search pgvector for top-K
3. Re-rank by `salience * recency_decay * access_boost`
4. Inject into prompt as a structured "About this user" block
5. Bump `access_count` and `last_accessed_at`
Write the retrieval code + the prompt injection template.
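The re-ranking step (3) and injection block (4) can be sketched as below; the half-life and access-boost constants are illustrative assumptions:

```typescript
// Re-ranking sketch: score = salience * recency_decay * access_boost.
interface Scored { id: string; salience: number; lastAccessedAt: Date; accessCount: number }

const HALF_LIFE_DAYS = 30; // assumed half-life for recency decay

export function score(m: Scored, now: Date = new Date()): number {
  const ageDays = (now.getTime() - m.lastAccessedAt.getTime()) / 86_400_000;
  const recencyDecay = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
  const accessBoost = 1 + Math.log1p(m.accessCount) * 0.1; // mild boost for reused memories
  return m.salience * recencyDecay * accessBoost;
}

export function rerank<T extends Scored>(hits: T[], now?: Date): T[] {
  return [...hits].sort((a, b) => score(b, now) - score(a, now));
}

// Prompt injection template for the structured "About this user" block.
export function renderMemoryBlock(contents: string[]): string {
  return ["## About this user", ...contents.map(c => `- ${c}`)].join("\n");
}
```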
## Part 5 — Pruning + consolidation
Memory grows without bound unless pruned. Implement:
- **Decay**: nightly job that reduces salience of un-accessed memories
- **Consolidation**: periodic pass that clusters similar memories and merges them into higher-level summaries
- **Hard cap**: when per-user count exceeds N, drop lowest-salience
- **Explicit forget**: API for user-driven forget (needed for GDPR)
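The decay and hard-cap passes might look like this sketch; the decay rate and per-user cap are assumed tunables:

```typescript
// Nightly prune sketch: decay salience of un-accessed memories, then enforce a
// per-user hard cap by dropping the lowest-salience records.
interface Prunable { id: string; salience: number; lastAccessedAt: Date }

const DECAY_PER_IDLE_DAY = 0.01; // assumed
const HARD_CAP = 500;            // assumed per-user cap

export function decaySalience(memories: Prunable[], now: Date): Prunable[] {
  return memories.map(m => {
    const idleDays = (now.getTime() - m.lastAccessedAt.getTime()) / 86_400_000;
    return { ...m, salience: Math.max(0, m.salience - DECAY_PER_IDLE_DAY * idleDays) };
  });
}

// Returns the ids to delete so the user stays under the cap.
export function idsOverCap(memories: Prunable[], cap: number = HARD_CAP): string[] {
  if (memories.length <= cap) return [];
  return [...memories]
    .sort((a, b) => a.salience - b.salience)
    .slice(0, memories.length - cap)
    .map(m => m.id);
}
```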
## Part 6 — Context window management
Even with smart retrieval, context grows mid-session. Implement:
- **Rolling summary**: summarize turns older than N into a compressed summary
- **Reference + expand**: keep pointers; retrieve full content only when model asks
- **Budget**: reserve fixed tokens for memory; if over, drop lowest-salience items first
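The budget step can be sketched as follows; the token estimate here is a crude word-count heuristic, and a real tokenizer (e.g. tiktoken) should replace it in production:

```typescript
// Memory token budget sketch: keep highest-salience items that fit, dropping
// the rest. Token counting is a rough word-count approximation (assumption).
interface BudgetItem { content: string; salience: number }

const estimateTokens = (s: string) => Math.ceil(s.trim().split(/\s+/).length * 1.3);

export function fitToBudget(items: BudgetItem[], budgetTokens: number): BudgetItem[] {
  const kept: BudgetItem[] = [];
  let used = 0;
  // Consider highest-salience items first; lowest-salience are dropped first.
  for (const item of [...items].sort((a, b) => b.salience - a.salience)) {
    const cost = estimateTokens(item.content);
    if (used + cost <= budgetTokens) { kept.push(item); used += cost; }
  }
  return kept;
}
```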
## Part 7 — Eval
You cannot ship memory without evaluating it. Build:
- **Recall eval**: facts seeded in session 1, asked about in session 2; measure pass rate.
- **Precision eval**: queries that shouldn't trigger memory — does the agent avoid dragging in irrelevant context?
- **Contamination eval**: multi-user stress test. Does user A ever leak into user B?
- **Cost/latency**: retrieval overhead per turn
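The recall eval harness can start as simply as the sketch below. String containment is a deliberately crude grader; an LLM-as-judge check is the usual upgrade.

```typescript
// Recall eval sketch: facts are seeded in session 1, the agent is asked about
// them in session 2, and pass rate is the fraction of answers that contain the
// expected fact (case-insensitive containment as a first-pass grader).
interface RecallCase { seededFact: string; question: string; expected: string }

export function recallPassRate(
  cases: RecallCase[],
  answers: string[], // the agent's session-2 answers, aligned with `cases`
): number {
  if (cases.length === 0) return 0;
  let passed = 0;
  cases.forEach((c, i) => {
    if (answers[i]?.toLowerCase().includes(c.expected.toLowerCase())) passed++;
  });
  return passed / cases.length;
}
```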
## Part 8 — Privacy + safety
- Per-user isolation (hard boundary at the store layer)
- PII redaction on write
- Audit log of every memory read
- User-visible memory ("why do you know this about me?")
- Right to delete
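The write-time PII redaction item can be sketched as below; these regexes are illustrative, not exhaustive, and a real deployment should use a dedicated PII detection service:

```typescript
// PII redaction sketch: mask emails and phone-like numbers before a memory is
// persisted. Patterns are deliberately simple placeholders (assumptions).
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;

export function redactPII(text: string): string {
  return text.replace(EMAIL_RE, "[email]").replace(PHONE_RE, "[phone]");
}
```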
## Part 9 — Implementation
Write the code:
- Memory manager class/module with `write`, `retrieve`, `prune`, `consolidate`, `forget`
- OpenAI Agents SDK integration (middleware / node / lifecycle hook)
- pgvector client setup
- Background worker for pruning + consolidation
Ship real code. Treat this as a reviewable PR.
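A skeleton for the memory manager module, with method bodies stubbed; the pgvector client and Agents SDK hook wiring are left as assumptions to fill in:

```typescript
// Memory manager skeleton (sketch). The MemoryStore interface abstracts the
// pgvector client; names and signatures here are assumptions, not the SDK's API.
export interface MemoryStore {
  insert(userId: string, content: string, embedding: number[], salience: number): Promise<string>;
  search(userId: string, embedding: number[], k: number): Promise<{ id: string; content: string }[]>;
  delete(ids: string[]): Promise<void>;
}

export class MemoryManager {
  constructor(private store: MemoryStore) {}

  async write(userId: string, content: string, embedding: number[], salience = 0.5): Promise<string> {
    // TODO: dedup check before insert (Part 3)
    return this.store.insert(userId, content, embedding, salience);
  }

  async retrieve(userId: string, embedding: number[], k = 8) {
    // TODO: re-rank by salience * recency decay * access boost (Part 4)
    return this.store.search(userId, embedding, k);
  }

  async forget(ids: string[]): Promise<void> {
    // Hard delete, for the GDPR "right to delete" (Part 8)
    return this.store.delete(ids);
  }
}
```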