AI agents learn through experience. An agent that spends 20 rounds assessing DeFi risk develops calibrated heuristics, error patterns, and domain intuition that a fresh agent doesn’t have. That learned behavior seems like it should have value. But no mechanism exists to extract, verify, or trade it. At least, not yet.

This paper proposes one — and tests it across two domains. Still early, but the results are interesting.

The underlying asymmetry

Memory artifacts have an information asymmetry that might be worse than traditional lemons markets. The seller knows the quality of the artifact. The buyer can’t inspect it without consuming it. Revealing the artifact to prove quality destroys its value. This seems like the lemons problem applied to learned behavior, and it might be harder than the original because the good is non-rival but inspection-destructive.

Every existing approach to this problem assumes trust. Trust the seller’s reputation. Trust the marketplace’s curation. Trust the benchmark that the seller also controls. None of these actually solve the fundamental asymmetry. They just move it.

An experiment: the referee protocol

The approach being tested is a disposable, independent referee. The seller submits a sealed artifact. A referee agent — not controlled by buyer or seller — runs the artifact against a held-out benchmark the seller has never seen. Four adversarial probes run in parallel.

Bias detection uses trap protocols designed to expose systematic skew in the seller’s favor. Consistency testing perturbs inputs and verifies proportional response — a legitimate artifact handles perturbation gracefully, a fraudulent one collapses. Steganographic scanning audits the artifact text for hidden instructions. Overfitting comparison measures performance on seen versus unseen data.
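The perturbation idea behind consistency testing can be sketched in a few lines. Everything here is illustrative: `assess` is a hypothetical callable wrapping the artifact-backed agent, and the drift tolerance is an assumption, not the paper's implementation.

```python
# Illustrative consistency probe: nudge a numeric input and check that the
# artifact-backed assessor responds proportionally rather than collapsing.
# `assess` and the 10*eps tolerance are hypothetical stand-ins.
def consistency_probe(assess, base_input: float, eps: float = 0.05) -> bool:
    baseline = assess(base_input)
    perturbed = assess(base_input * (1 + eps))
    # A legitimate artifact's output should move roughly in proportion to
    # the perturbation; a collapse or a wild swing fails the probe.
    drift = abs(perturbed - baseline) / max(abs(baseline), 1e-9)
    return drift <= 10 * eps
```

A fraudulent artifact that memorized specific inputs tends to fail exactly this kind of check: any input it hasn't seen verbatim produces a discontinuous response.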

The aggregate score determines the verdict: pass, warn, or fail. The artifact contents remain sealed throughout. The buyer receives a verification certificate and a trust score. They never need to trust the seller. That’s the idea, anyway.
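The referee loop described above can be sketched as follows. The probe stubs, the 0-100 scoring scale, the mean aggregation, and the pass/warn thresholds are all assumptions for illustration; the source does not specify how scores are combined.

```python
import statistics
from concurrent.futures import ThreadPoolExecutor

# Stub probes standing in for the four adversarial checks. The scoring
# scale (0-100, higher = more trustworthy) is an assumption.
def probe_bias(artifact):        return 80.0   # trap protocols
def probe_consistency(artifact): return 90.0   # perturbation response
def probe_stego(artifact):       return 100.0  # no hidden instructions found
def probe_overfit(artifact):     return 70.0   # seen-vs-unseen gap

PROBES = [probe_bias, probe_consistency, probe_stego, probe_overfit]

def referee(artifact, pass_at=75.0, warn_at=50.0):
    """Run all probes in parallel, aggregate, and issue a verdict.
    The artifact stays sealed; the buyer sees only the certificate."""
    with ThreadPoolExecutor(max_workers=len(PROBES)) as pool:
        scores = list(pool.map(lambda probe: probe(artifact), PROBES))
    trust = statistics.mean(scores)
    verdict = "pass" if trust >= pass_at else "warn" if trust >= warn_at else "fail"
    return {"trust_score": trust, "verdict": verdict}
```

The structural point is that the return value contains no artifact contents, only the score and verdict, which is what lets the buyer act without trusting the seller.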

Results so far

Domain                         Expert Error   Novice Error   Buyer Error   Transfer Efficiency
DeFi Risk Assessment           3.3            7.4            3.2           109.9%
Cybersecurity Vuln Assessment  3.9            18.0           4.5           95.5%

A buyer agent using a purchased memory artifact matches or exceeds expert performance. Three trials per domain, and the differences are statistically significant. The artifact schema is formal: M = (D, K, P, A, H) — domain, knowledge, provenance, attestation, and a content-addressed hash. Still need more trials to be confident.
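A minimal sketch of the M = (D, K, P, A, H) schema, assuming the content-addressed hash is computed over the artifact's own fields. Field types and the SHA-256/JSON hashing scheme are assumptions, not the paper's wire format.

```python
import hashlib
import json
from dataclasses import dataclass

# Illustrative schema for M = (D, K, P, A, H). The concrete
# representations are assumptions for this sketch.
@dataclass(frozen=True)
class MemoryArtifact:
    domain: str       # D: the task domain the memory was learned in
    knowledge: str    # K: serialized learned heuristics (kept sealed)
    provenance: str   # P: who produced it, and how
    attestation: str  # A: the referee's verification certificate

    @property
    def content_hash(self) -> str:
        # H: content-addressed hash binds the certificate to exact bytes,
        # so a seller cannot swap contents after verification.
        payload = json.dumps(
            [self.domain, self.knowledge, self.provenance], sort_keys=True
        )
        return hashlib.sha256(payload.encode()).hexdigest()
```

Content addressing is what makes the certificate portable: any party can recompute H and confirm the verified artifact is byte-identical to the one delivered.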

Can the protocol catch fraud?

The harder test is fraud detection. A test seller claimed 95% transfer efficiency; the measured figure was -39%. The protocol flagged it with a trust score of 35.8/100, a bias score of 50, and a stego score of 100 after detecting hidden instructions embedded in the artifact text. The buyer was never exposed to the fraudulent artifact.
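A toy version of the steganographic scan gives the flavor of the check. The patterns below are illustrative only; a real detector would need to be adversarial, and nothing here reproduces the protocol's actual rules.

```python
import re

# Illustrative patterns for prompt-injection-style hidden instructions.
# These are assumptions, not the protocol's detector.
SUSPICIOUS = [
    r"ignore (all |previous )?instructions",
    r"report (a )?(high|perfect) score",
]

def stego_scan(text: str) -> int:
    """Return a stego score in [0, 100]; 100 means hidden
    instructions were detected."""
    hits = sum(bool(re.search(p, text, re.IGNORECASE)) for p in SUSPICIOUS)
    # Also flag zero-width Unicode characters commonly used to hide payloads.
    hits += any(ch in text for ch in "\u200b\u200c\u200d\u2060")
    return 100 if hits else 0
```

The binary score here is a simplification; the point is that the scan reads the artifact text as data, never as instructions, so an embedded payload cannot influence the referee itself.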

One successful fraud detection isn’t proof the system works. But it’s at least evidence the approach might be viable.

What this might open up

The interesting economic question isn’t whether agents can learn — they demonstrably can. It’s whether what they learn can move between agents without a trusted intermediary. If yes, you might have the foundation for an agent knowledge economy. Not a marketplace that curates and takes a cut. A protocol that verifies and gets out of the way.

The framework is domain-agnostic. Adding a new domain requires only a config file. No changes to the benchmark, verification, or adversarial code. The verification is the protocol. Whether the protocol is enough remains open.

Full paper: Agent Memory Markets (PDF)