Agent Memory Markets

AI agents learn through experience. An agent that spends 20 rounds assessing DeFi risk develops heuristics a fresh agent doesn’t have. The question I kept coming back to: can that learned behavior be extracted, verified, and traded? Built a protocol to test it.

Yes. With 95–110% transfer efficiency across two domains.

A buyer agent using a purchased memory artifact matches or exceeds expert performance — 109.9% transfer efficiency in DeFi risk assessment, 95.5% in cybersecurity vulnerability scoring. Three trials each, statistically significant.

The harder problem is trust. Memory artifacts have an information asymmetry worse than traditional lemons markets: revealing the artifact to prove quality destroys its value. You cannot inspect what you are buying without consuming it.

The referee protocol solves this without exposing artifact contents. An independent, disposable referee agent runs the sealed artifact on a held-out benchmark — a benchmark the seller has never seen. Four adversarial probes run in parallel: bias detection with trap protocols, consistency testing through input perturbation, steganographic scanning for hidden instructions, and overfitting comparison between seen and unseen data.

The aggregate score determines the verdict. Pass, warn, or fail. The artifact contents remain sealed throughout. The buyer receives a verification certificate and a trust score. They never need to trust the seller.

Poisoned artifact detection works. A test seller claiming 95% transfer efficiency measured at -39% — the protocol flagged it with a trust score of 35.8/100, a bias score of 50, and a stego score of 100 after detecting hidden instructions embedded in the artifact text. The buyer was never exposed.

The memory artifact schema is formal: M = (D, K, P, A, H). Domain, knowledge, provenance, attestation, content-addressed hash. The framework is domain-agnostic — adding a new domain requires only a config file. No changes to the benchmark, verification, or adversarial code.

Stack: Python · Anthropic Claude API · SQLite trust registry Status: Paper complete. Protocol implemented. Two domains tested. ArXiv-ready. GitHub · Paper (PDF)

About this article

Tags

More from Research

Related

Agent Memory Markets

Lineage

Appears in Threads

About this article

Tags

More from Research

Related