I've been spending considerable time researching and constructing infrastructure for AI agents at a startup, specifically around identity, credentials, and trust. The more I examine this space, the more convinced I become that a significant gap exists that receives little attention.
We constantly discuss making agents smarter. Larger context windows, enhanced reasoning, more capable tool use all matter. However, I believe a more immediate bottleneck receives almost no consideration: the environment where these agents operate.
Here's my point: I want to tell my agent, "find the cheapest flight to SF next weekend, book it, and secure a window seat." I'd hand it a few dollars and let it figure things out. Critically, I don't want to preconfigure which airline API it uses. I don't want accounts on five travel services. I don't want pasting API keys into environment files. I simply want it to execute the task.
Currently? That's impossible. Not because the model lacks reasoning capability. Claude reasons better than most people. The obstacle exists because every service requires humans to sign up, verify emails, add credit cards, generate API keys, and wire everything together before agents can make initial requests. An agent's capability ceiling isn't determined by its intelligence but by how much configuration occurred beforehand.
The issue of agent generality isn't intelligence. It's plumbing.
I keep returning to a straightforward idea: what if agents could approach unfamiliar services, prove their creator's identity and authorization, and pay instantly? All without humans preconfiguring anything.
Identity and tokens replace signup data. Payment protocols replace billing relationships. Together they replace API keys.
Identity Bound to the Agent
Agent identity possesses two crucial aspects:
Developer Identity: This cryptographically binds to agents at credential time, remaining permanent. It answers "who built this?" showing organizational verification status, incorporation location, and safety evaluation results. This establishes the trust foundation, making agents traceable and developers accountable. You might approve only Anthropic-created agents while rejecting others.
User Identity: This resembles a session: "who does this agent represent now?" The agent currently acts for you, within your authorized boundaries. Simultaneously it might act for someone else. Standard KYC works well here. Verified emails suffice minimally.
Credentials should encode developer identity permanently while carrying rotating user context through delegation, attestation, or disclosure. Services receiving requests can answer both "should I trust this software?" and "who authorized this action?" through a single credential chain.
Payments at Protocol Level
An agent might possess perfect credentials proving identity and capability yet remain unable to act autonomously without payment capacity. This represents a major infrastructure problem.
I believe payment should occur at the protocol layer, accompanying individual requests rather than requiring preexisting billing relationships. Something resembling x402, where HTTP status code 402 (Payment Required) becomes genuine conversation between agent and service. The service states costs, the agent pays, the service responds. Complete replacement for API billing.
For services preferring traditional arrangements (enterprise contracts, volume pricing), key management services could bridge gaps. Credential identity becomes the underlying account, and KMS issues conventional API keys atop it. The credential plus micropayment path should work by default, with traditional billing as optional.
What Credentials Should Contain
For this functioning reliably at scale, credentials cannot simply be binary "verified" stamps. Services need sufficient information for independent trust decisions. Credentials should include:
Developer verification tier, agent safety evaluation scores (prompt injection robustness, PII leakage resistance, tool abuse handling), processed data categories, technical profile, operational contacts.
Healthcare APIs might require level 3 verification and confirm no PII retention. Weather APIs might accept any credentials. Credentials provide information; services determine policy.
Remaining Uncertainties
Service Discovery: How do agents locate unfamiliar services? Web search, service compilers, and hubs seem logical. Yet a more intriguing possibility exists: what if services exposed special agent-specific endpoints? Not human documentation but capability files ("skill files") describing services, usage, inputs, outputs, pricing, credential requirements. Similar to robots.txt but for agent capabilities. Agents read this endpoint, understand service interaction patterns, operate with services they've never encountered. Discovery and onboarding simultaneously. No grand registry seems necessary; the web handles discovery well, and agents handle instruction reading.
Credential Ownership: I oscillate here but land on self-sovereign models. You own your credential. Issuers can revoke for violations, a necessary safety mechanism. Yet external validators and verifiers should add signals too. Think less like granted licenses and more like passports accumulating stamps. Different parties attest to different aspects while you maintain possession.
Interaction Models: I envision AI becoming the abstraction layer between users and platforms. Not another interface. Not another application. Agents simply handle tasks on your behalf. Authorization should feel as natural as telling someone beside you "go ahead" or "please do this." We haven't reached this UX level yet, but infrastructure must prepare for arrival.
Adoption: The classic problem persists: agents won't carry credentials if services reject them; services won't accept them if agents don't carry them. Yet self-sovereign models help. If credentials benefit developers regardless through traceability and trust without universal acceptance, adoption can proceed gradually. Services opt in as agent economies expand.
Not Entirely Novel
I emphasize: none of this is completely original. People contemplate verifiable credentials for machines, agent payment protocols, decentralized identity. Some overlaps exist with W3C Verifiable Credentials specifications, HTTP Message Signatures (RFC 9421), broader self-sovereign identity movements.
My intention involves pulling these threads into coherent pictures of ideal systems and demonstrating their relevance now. Models reach sufficiency that infrastructure becomes the constraining factor. Environment and infrastructure matter more than intelligence for enabling genuine extended agency.
Tomorrow's agents won't prove more general through intelligence. They'll achieve generality because we constructed worlds they can traverse.
The Unlock
At scale, differences emerge between agents limited to five configured capabilities versus identical agents completing any task within established boundaries.
Today, adding agent capabilities demands code changes, new API keys, fresh billing accounts, deployments. In credential-based systems, agents simply use new services at runtime. They discover them, present credentials, pay interactions, proceed. Agent capability domains shift from "what developers integrated" to "what exists."
I believe this matters more than intelligence improvements, at minimum near-term. METR insights and long-horizon task research prove crucial and represent genuine intelligence challenges worthy of separate examination. Yet even with current models, solving infrastructure problems would dramatically increase agent usefulness. Intelligence suffices for enormous task ranges. Infrastructure remains missing.
~ pranav