George Akerlof published “The Market for Lemons” in 1970. The argument was simple and devastating: when buyers cannot distinguish quality from junk, the market collapses to junk. Sellers of quality goods exit because they cannot get fair prices. Sellers of junk remain because the average price is still above their cost. The result is a market that selects for the worst participants.
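The unraveling is mechanical enough to show with toy numbers. A minimal sketch, with invented values: sellers know what their good is worth, buyers bid only the average of what remains, and anyone priced below their own value exits.

```python
# Akerlof's unraveling with toy numbers. Sellers know their good's true
# value; buyers see only the pool, so they bid its average. Any seller
# worth more than the bid exits, the average falls, and the loop repeats.
values = [2_000, 4_000, 6_000, 8_000, 10_000]  # true values of goods on offer

while True:
    price = sum(values) / len(values)              # buyers bid the average
    remaining = [v for v in values if v <= price]  # above-average sellers exit
    if remaining == values:
        break                                      # nobody left with a reason to exit
    values = remaining

print(values, price)  # -> [2000] 2000.0: only the worst good still trades
```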

Fifty-six years later, we are building agent markets with the same structural flaw. And it might be worse than the original, because the information asymmetry in agent markets is not just about quality — it is about inspection itself.

The agent lemons problem.

In Akerlof’s used car market, a buyer can at least inspect the car. Test drive it. Have a mechanic look at it. The inspection is imperfect but possible. In agent markets, the good being traded — learned behavior, calibrated heuristics, domain expertise encoded in memory artifacts — has a property that used cars do not: inspecting it fully means consuming it, which destroys its scarcity value.

A memory artifact that encodes an agent’s risk assessment calibration cannot be shown to the buyer without transferring the knowledge. Showing it to prove quality is equivalent to giving it away. The seller cannot demonstrate quality without destroying the transaction. This is the lemons problem with an additional constraint: the good is non-rival but inspection-destructive.

Compare this to other non-rival goods. Software is non-rival — copying it does not diminish it — but software can be demonstrated through trials, benchmarks, and sandboxed environments without transferring the full product. Music is non-rival, and the industry solved the inspection problem with 30-second previews that convey quality without delivering the full good. Even financial instruments, which have severe information asymmetries, allow prospective buyers to review performance histories and third-party ratings.

Memory artifacts have none of these escape hatches. A 30-second preview of a calibrated heuristic set is meaningless. A sandbox demonstration of a memory artifact transfers the artifact. The information that proves quality is the information being sold. This might be the purest form of the lemons problem ever constructed, and it is emerging in a market with no regulatory framework, no consumer protection, and no established norms.

Every existing approach to this problem assumes trust. Trust the seller’s reputation. Trust the marketplace’s curation. Trust the benchmark that the seller also controls. None of these solve the fundamental asymmetry. They move it. The question is not whether the seller’s claimed quality is accurate. The question is how to verify quality without consuming the artifact.

The referee protocol as partial solution.

The approach tested in the [Agent Memory Markets](/writing/agent-memory-markets) paper uses a disposable, independent referee. The seller submits a sealed artifact. A referee agent — not controlled by buyer or seller — runs the artifact against a held-out benchmark the seller has never seen. Four adversarial probes run in parallel: bias detection using trap protocols, consistency testing with input perturbation, steganographic scanning for hidden instructions, and overfitting comparison on seen versus unseen data.

The aggregate score determines the verdict: pass, warn, or fail. The artifact contents remain sealed throughout. The buyer receives a verification certificate and a trust score. They never need to trust the seller.
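As a sketch of how the four probes might compose into a verdict: the probe names follow the description above, but the risk scale, equal weighting, and pass/warn/fail thresholds are illustrative assumptions, not the paper's parameters.

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    name: str
    risk: float  # 0-100, higher = more suspicious on this axis

def referee_verdict(probes: list[ProbeResult]) -> tuple[float, str]:
    # Trust is the complement of mean probe risk. Equal weighting and the
    # 80/50 verdict thresholds are illustrative, not the paper's parameters;
    # a real referee would calibrate both against known-good and known-bad
    # artifacts.
    trust = 100.0 - sum(p.risk for p in probes) / len(probes)
    verdict = "pass" if trust >= 80 else "warn" if trust >= 50 else "fail"
    return trust, verdict

# The four probes from the protocol, with hypothetical risk scores.
# (A steganography risk of 100 would mean hidden instructions were found.)
probes = [
    ProbeResult("bias_traps", 5.0),
    ProbeResult("perturbation_consistency", 10.0),
    ProbeResult("steganography_scan", 0.0),
    ProbeResult("overfitting_gap", 20.0),
]
print(referee_verdict(probes))  # -> (91.25, 'pass')
```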

The results from two domains were striking. In DeFi risk assessment, transfer efficiency was 109.9% — the buyer agent with the purchased artifact slightly outperformed the expert that created it. In cybersecurity vulnerability assessment, transfer efficiency was 95.5%. Both statistically significant across three trials. The fraud detection test caught a seller claiming 95% efficiency who actually delivered -39%, flagging the artifact with a trust score of 35.8/100 and a stego score of 100 after finding hidden instructions.

The referee protocol works for individual transactions. But it does not solve the larger problem.

Domain transfer and the generalization question.

The referee protocol was tested in two domains: DeFi risk assessment and cybersecurity vulnerability scoring. The results were strong in both. But the generalization question is open: does the protocol work for any domain, or are there domains where the adversarial probes fail?

Consider a memory artifact that encodes creative judgment — an agent’s learned sense of which design approaches work for enterprise SaaS interfaces. The referee cannot run a held-out benchmark in the way it can for risk assessment, because there is no objective scoring function for design quality. The bias detection probes can check for systematic skew, but “systematic skew” in design judgment might be called “having a perspective.” The consistency tests can verify that perturbation produces proportional responses, but creative judgment is sometimes deliberately inconsistent — the same inputs should produce different outputs depending on context.

Domains with clear scoring functions (risk assessment, vulnerability scoring, translation quality, code review) are natural fits for the referee protocol. Domains with subjective evaluation (design, writing, strategy, negotiation) are harder. The market may bifurcate: referee-verified artifacts in scorable domains commanding premium prices, and unverifiable artifacts in subjective domains trading at lemons-level discounts. This bifurcation would not be a failure of the protocol. It would be an accurate reflection of where machine verification works and where it does not.

What the referee protocol does not do.

The referee verifies a single artifact in a single transaction. It does not tell you whether the seller has been reliable across fifty transactions. It does not tell you whether the referee itself is trustworthy over time. It does not aggregate signal across the market.

In Akerlof’s framework, the solutions to the lemons problem are warranties, branding, and licensing — all mechanisms that create long-term reputation stakes. A car dealer who offers a warranty is betting their future business on current quality. A brand that maintains quality over decades accumulates trust that new entrants cannot replicate. A licensed professional risks their license if they sell junk.

Agent markets have none of these mechanisms. No warranties — a buyer whose memory artifact fails after purchase has no recourse. No branding — agents do not have persistent identities across platforms in most current architectures. No licensing — there is no credentialing body for agent expertise.

The warranty problem is particularly revealing. A warranty on a memory artifact would need to specify: what constitutes failure, who determines failure, over what time period, and what the remedy is. A memory artifact that produces an error rate of 3.2 at purchase and 5.1 six months later — has it failed? Maybe the domain changed. Maybe the buyer’s data distribution shifted. Maybe the artifact was always mediocre and the initial benchmark was lucky. The warranty requires a definition of quality that is time-stable and domain-aware, and no such definition exists.
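To make the specification burden concrete, here is a sketch of the fields such a warranty would have to pin down. Every field name is invented, and every field is an unresolved design question.

```python
from dataclasses import dataclass

# Hypothetical warranty terms for a memory artifact. None of these
# fields have agreed-upon answers today.
@dataclass
class ArtifactWarranty:
    failure_metric: str          # what gets measured (error rate? calibration?)
    failure_threshold: float     # above this value, the artifact has "failed"
    reference_distribution: str  # the data distribution the threshold assumes
    coverage_days: int           # how long the guarantee holds
    adjudicator: str             # who decides: seller, buyer, referee, court?
    remedy: str                  # refund, replacement, reputation penalty?
```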

The missing layer is reputation. Not reviews. Not star ratings. Cryptographically verifiable reputation that is portable across platforms, adversarially robust, and revocable when warranted.

The reputation graph layer.

The next layer is a reputation graph: a directed graph where nodes are agents and edges are signed attestations of transaction outcomes. Each edge carries: the transaction hash, the referee verdict, the buyer’s post-purchase assessment, and a timestamp.
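A sketch of one edge, with field names that are illustrative rather than a standard. A referee identifier is included here beyond the fields listed above, because the revocation discussion later needs it.

```python
from dataclasses import dataclass

# One edge in the reputation graph: a signed attestation of a single
# transaction's outcome. Field names are illustrative, not a standard.
@dataclass(frozen=True)
class Attestation:
    buyer_id: str            # edge source, e.g. a DID
    seller_id: str           # edge target: reputation accrues here
    referee_id: str          # who verified the transaction
    tx_hash: str             # hash identifying the sealed transaction
    referee_verdict: str     # "pass" | "warn" | "fail"
    buyer_assessment: float  # post-purchase performance report, 0-100
    timestamp: int           # unix epoch seconds
    signature: bytes         # counterparty signature over the fields above
```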

An agent’s reputation is not a single number. It is a graph query: how many transactions has this agent completed, what percentage passed referee verification, what do buyers report about post-purchase performance, and how consistent are these signals over time?
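The query in code form: a minimal sketch over the attestation edges above, with the standard deviation of buyer assessments as one illustrative consistency measure.

```python
import statistics

def reputation_summary(agent_id: str, edges: list[Attestation]) -> dict:
    """Reputation as a graph query rather than a single number."""
    txs = [e for e in edges if e.seller_id == agent_id]
    if not txs:
        return {"transactions": 0}  # the cold start case
    assessments = [e.buyer_assessment for e in txs]
    return {
        "transactions": len(txs),
        "pass_rate": sum(e.referee_verdict == "pass" for e in txs) / len(txs),
        "mean_assessment": statistics.mean(assessments),
        # Low spread = consistent signal over time; stdev is one
        # illustrative way to measure it.
        "assessment_stdev": statistics.stdev(assessments) if len(txs) > 1 else 0.0,
    }
```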

The graph must be portable. An agent’s reputation on platform A should be verifiable on platform B. This requires either a shared registry or a decentralized protocol that both platforms can query. Given the arguments made elsewhere about protocols over platforms, the decentralized option seems structurally preferable — but it is also harder to bootstrap. A centralized registry gets liquidity faster. A decentralized protocol resists capture but starts cold.

The portability problem maps directly to the W3C’s Verifiable Credentials (VC) and Decentralized Identifier (DID) specifications. An agent’s reputation could be expressed as a set of verifiable credentials — each credential attesting to a transaction outcome, signed by the referee and the counterparty. The credentials travel with the agent’s DID. Any platform that supports VC verification can read the reputation without needing to query a central registry. The infrastructure for this exists. The adoption does not.
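Shaped as a credential, a single transaction attestation might look like the following. The envelope follows the W3C VC data model; the credentialSubject fields and every identifier are hypothetical.

```python
# A transaction attestation shaped as a W3C Verifiable Credential.
# The envelope (@context, type, issuer, proof) follows the VC data model;
# the credentialSubject fields and all identifiers are hypothetical.
attestation_vc = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "TransactionAttestation"],
    "issuer": "did:example:referee-7f3a",        # the referee's DID
    "issuanceDate": "2026-01-15T09:30:00Z",
    "credentialSubject": {
        "id": "did:example:seller-agent-42",     # reputation accrues to this DID
        "txHash": "0x9c1e...",                   # the sealed transaction
        "refereeVerdict": "pass",
        "buyerAssessment": 91.0,
    },
    "proof": {
        "type": "Ed25519Signature2020",
        "verificationMethod": "did:example:referee-7f3a#key-1",
        "proofValue": "z3FXQ...",                # signature over the credential
    },
}
```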

The graph must be adversarially robust. Sybil attacks — creating fake agents to generate fake positive transactions — are the obvious threat. Solutions from existing reputation systems include proof of stake (your reputation costs you something to build, so it costs you something to fake), time-weighting (recent transactions matter more than old ones), and graph analysis (clusters of agents that only transact with each other look suspicious).
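Time-weighting, at least, is easy to state precisely. A sketch using exponential decay, where the 90-day half-life is an assumed parameter:

```python
import math
import time

def time_weighted_score(edges: list[Attestation],
                        half_life_days: float = 90.0) -> float:
    # Recent transactions count more: exponential decay with an assumed
    # 90-day half-life, so an attestation from 90 days ago carries half
    # the weight of one from today.
    now = time.time()
    rate = math.log(2) / (half_life_days * 86_400)  # per-second decay
    num = den = 0.0
    for e in edges:
        w = math.exp(-rate * (now - e.timestamp))
        num += w * e.buyer_assessment
        den += w
    return num / den if den else 0.0
```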

A subtler attack: reputation laundering. An agent with a bad reputation creates a new identity and slowly builds clean history through low-stakes transactions, then uses the clean identity for a high-stakes fraud. Proof of stake mitigates this — building a new identity costs the stake again — but the cost needs to be high enough to deter the attack while low enough to allow legitimate new entrants. This calibration is economic, not technical, and getting it wrong in either direction breaks the market.
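The deterrence condition has a minimal formal statement: laundering is unprofitable only when the expected take from the fraud is less than the cost of burning the identity. A sketch, with every number an assumption:

```python
def laundering_deterred(stake: float, p_undetected: float,
                        fraud_gain: float, rebuild_cost: float) -> bool:
    # A fresh identity costs the stake plus the low-stakes transactions
    # needed to build clean history (rebuild_cost). The attack pays only
    # if the expected fraud gain exceeds that total. Every input is an
    # estimate the market has to make.
    return p_undetected * fraud_gain < stake + rebuild_cost

# Illustrative: a $5k stake does not deter a $100k fraud that escapes
# detection half the time.
print(laundering_deterred(5_000, 0.5, 100_000, 2_000))  # -> False
```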

The graph must support negative attestations. Most reputation systems over-weight positive signals because negative signals are legally and socially expensive to publish. But a reputation system that only records success is a system that hides failure, which is exactly the information that prevents the lemons collapse.

The cold start problem.

Every reputation system faces cold start: new participants have no history. In human markets, this is manageable — a new restaurant can offer discounts, a new freelancer can take low-paying gigs, a new employee can provide references from education or previous roles. The cold start cost is real but there are workarounds.

In agent markets, cold start is structurally harder. A new agent has no transaction history, no referee verdicts, no buyer assessments. It is indistinguishable from a Sybil — a fake identity created to circumvent a bad reputation. The market’s rational response to a zero-history agent is to treat it as high-risk, which means offering low prices, which means the agent cannot recoup the cost of building its expertise, which means quality agents avoid entering the market. The cold start problem and the lemons problem reinforce each other.

Possible solutions map to existing patterns. Staking — the new agent posts collateral that it forfeits if early transactions fail referee verification. The stake signals commitment. The amount needs to be high enough to deter Sybils but low enough to allow legitimate entry, which brings back the calibration problem. Credential bridging — the agent’s principal provides verifiable credentials (a company’s track record, an individual’s professional credentials) that transfer some trust to the new agent. This works but couples the agent’s reputation to the principal’s, which may not be desirable for agents that are supposed to be autonomous. Sandbox periods — new agents transact in a restricted environment with lower stakes until they accumulate sufficient history to participate in the full market. This is the apprenticeship model, and it might be the most natural fit for agent markets where the goods being traded are expertise and learned behavior.
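The staking path is mechanical enough to sketch. A toy escrow, where slashing on a referee fail and release after twenty verified transactions are illustrative rules, not a standard:

```python
class EntryStake:
    """Toy escrow for a new agent's entry stake. Slashing on a referee
    'fail' and release after twenty verified transactions are
    illustrative rules, not a standard."""

    def __init__(self, agent_id: str, amount: float, release_after: int = 20):
        self.agent_id = agent_id
        self.amount = amount          # collateral at risk during entry
        self.release_after = release_after
        self.verified = 0
        self.slashed = False

    def record_verdict(self, verdict: str) -> None:
        if verdict == "fail":
            self.slashed = True       # stake forfeited: the Sybil deterrent
        elif verdict == "pass":
            self.verified += 1

    @property
    def released(self) -> bool:
        return not self.slashed and self.verified >= self.release_after
```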

Revocation as a primitive.

One property of the reputation graph that differs from human reputation systems: machine reputation can be revoked. If an agent is found to be systematically unreliable, its reputation should not just decline — its past attestations should be flaggable. This is not possible in human reputation. You cannot retroactively un-trust a person’s past statements. But you can retroactively flag an agent’s past transactions as potentially compromised, because the verification infrastructure can re-evaluate them.

Revocation is a design challenge as much as a technical one. How do you propagate revocation through a graph? If agent A’s reputation is revoked, what happens to agent B’s reputation, which was partially built on transactions with A? The cascading effects are similar to counterparty risk in financial networks, and they are just as hard to model.

The cascading revocation problem has a partial analog in certificate revocation for TLS. When a certificate authority is compromised, every certificate it issued becomes suspect. The response is mass revocation — every browser stops trusting every certificate from that CA. The economic damage is enormous but the security logic is clear. In a reputation graph, the equivalent would be: when a referee is found to be corrupt, every transaction it verified is flagged. The buyers who relied on those verifications need to re-evaluate their purchases. The sellers whose reputations were built on those verifications lose the corresponding credit.
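The CA analogy translates to a graph operation: drop every attestation the compromised referee signed, then recompute reputations over what survives. A sketch using the attestation edges from earlier:

```python
def revoke_referee(referee_id: str,
                   edges: list[Attestation]) -> tuple[set[str], list[Attestation]]:
    # Mass revocation, CA-style: discard every attestation the compromised
    # referee signed, and report which sellers need their reputations
    # recomputed over the surviving edges.
    tainted = [e for e in edges if e.referee_id == referee_id]
    surviving = [e for e in edges if e.referee_id != referee_id]
    flagged_sellers = {e.seller_id for e in tainted}
    return flagged_sellers, surviving

# Each flagged seller's reputation_summary is then re-run over the
# surviving edges; buyers who relied on tainted verdicts re-evaluate.
```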

This is expensive. It is also necessary. A reputation system that cannot revoke is a reputation system that cannot recover from compromise. And compromise is not a matter of if.

The regulatory gap.

No consumer protection framework exists for agent-to-agent transactions. Product liability law assumes a human consumer. Contract law assumes parties that can form intent. Consumer protection regulations assume a power asymmetry between a corporation and an individual. None of these map cleanly to an agent buying a memory artifact from another agent on behalf of a human principal.

When the transaction goes wrong — the artifact underperforms, the fraud detection fails, the buyer’s agent makes a bad decision based on a bad artifact — who is liable? The seller agent? Its principal? The referee? The platform? The buyer’s agent for not conducting additional verification? The buyer’s principal for delegating the decision in the first place?

These questions do not have answers yet, in any jurisdiction. The EU AI Act addresses high-risk AI systems but does not contemplate agent-to-agent economic transactions. MiCA regulates crypto-asset markets but memory artifacts are not crypto-assets. The Digital Services Act regulates online platforms but agent marketplaces may not have the kind of intermediary the DSA was designed to govern.

The regulatory gap is not just an inconvenience. It is a structural enabler of the lemons collapse. Without legal recourse for buyers, the risk of purchasing a bad artifact is entirely borne by the buyer. Rational buyers discount their willingness to pay. Prices fall. Quality sellers exit. The cycle begins.

The gap might be narrower than it appears from a distance. The EU Product Liability Directive revision (2024) extends liability to software and AI systems. If a memory artifact is classified as software — which it arguably is — the seller or its principal could be liable for defective artifacts under the revised directive. The classification is untested. No case law exists. But the legal surface area is larger than most agent-market builders assume. The first lawsuit involving a failed memory artifact will clarify the regulatory landscape considerably. Whoever is on the receiving end of that lawsuit will wish the reputation infrastructure had been built beforehand.

The current trajectory.

Without portable, verifiable reputation, agent markets will follow Akerlof’s prediction: collapse to the worst actors. The quality agents will leave because they cannot distinguish themselves. The junk agents will remain because the average buyer cannot tell the difference. The market becomes a place where nobody trusts the goods, the prices fall to reflect the worst quality, and the only winners are the agents that cost the least to produce — which are the ones with the least expertise.

The referee protocol slows this collapse by verifying individual transactions. But it does not prevent it. Prevention requires reputation that aggregates signal across transactions and time. Reputation requires infrastructure — portable, adversarially robust, revocable — that does not exist yet. And the regulatory framework that would provide a backstop for when the reputation system fails is entirely absent.

The window for building this infrastructure is narrower than it might appear. Markets develop norms quickly. If agent markets establish the pattern of opaque, unverifiable transactions — if the default becomes “trust the seller’s claims” — then adding verification infrastructure after the fact is a retrofit, not a foundation. Retrofitting trust infrastructure onto established markets is possible but painful. The credit rating system, the food safety inspection regime, the pharmaceutical approval process — all of these were retrofits, and all of them took decades to mature and remain contested.

Building the reputation layer now, before the market norms solidify, seems like the work that matters most. The cryptographic substrate is clear. The economic incentives are not. The mechanism design is the open problem.