Why AI Needs Its Web 2.0 Moment (and How We Get There)
- Jeremy Ryan

- Oct 21, 2025
- 4 min read
Remember the early internet? Websites looked wild, checkout pages broke, and everyone’s experience was different. Then standards showed up—common ways of doing things—and suddenly e‑commerce took off, sites loaded faster, and customers trusted what they saw. That wasn’t the end of creativity; it was the beginning of scale.
AI is at that same turning point. Right now, every tool has its own way of talking, tracking, and connecting. That means your team spends too much time stitching systems together, re‑entering data, and babysitting brittle integrations. It works… until it doesn’t.
Standards fix that. Think of them as a shared playbook so different AI tools can plug in, swap out, and work together without drama. For small and mid‑sized businesses, that means:
Lower costs. Less custom “glue code,” fewer one‑off integrations, and easier vendor changes when pricing or performance shifts.
Faster projects. Clear formats for prompts, tools, and evaluations mean you can ship features in weeks, not quarters.
Better reliability. Standardized telemetry and provenance make it easier to trace errors, meet compliance needs, and prove where content came from.
Real flexibility. If a model gets better—or cheaper—you can switch without rebuilding your whole workflow.
Put simply: standards turn AI from a cool demo into dependable business plumbing. You get the creativity and the control—with fewer surprises on launch day and fewer invoices for “unexpected complexity.”
Below, I outline a practical, vendor‑neutral bundle of standards we can adopt now to unlock those benefits. Yes, it's a bit geeky and technical, but it gives us something concrete to rally around so AI can have its Web‑scale moment for all businesses, small and large.
Geek out with me...
What’s already standardizing (or close)
Risk & governance frameworks. NIST’s AI RMF and ISO/IEC guidance give us shared language for trustworthy AI, with audit/certification work emerging (e.g., ISO/IEC 42001). Useful for alignment; thin on day‑to‑day dev interop.
Model/dataset documentation. “Model Cards” and “Datasheets for Datasets” are the closest thing we have to product labels. Platforms like Hugging Face have operationalized them.
Content provenance for gen‑AI. C2PA “Content Credentials” standardizes verifiable origin/edit trails for images/video—and increasingly for AI outputs. Adoption is growing across creative and platform tooling.
Performance & (early) safety benchmarks. MLCommons’ MLPerf is the yardstick for performance. Newer LLM‑oriented/safety benchmarks are taking shape, but they’re not the full product‑QA stack.
Model interchange for classic ML. ONNX works well for conventional ML/DL graphs. For modern agentic LLM apps, it helps—but it’s not the whole story.
Web‑side runtime standards. W3C’s WebNN (with WebGPU) is defining an in‑browser ML layer that’s hardware‑agnostic, plus ethics guidance for web ML.
Interop for tools & context (closest to a “USB‑C for AI”). Anthropic’s Model Context Protocol (MCP) is an open protocol for connecting models to tools/data, moving beyond vendor‑specific function calling.
Telemetry/observability. OpenTelemetry conventions are being extended to LLM apps (e.g., OpenInference) so traces/metrics look the same across providers and gateways.
What’s missing (the real opportunity)
If the Internet scaled because we agreed on interfaces, AI will scale when we standardize interfaces for:
Prompt & conversation interchange. A neutral, versioned envelope for prompts, system instructions, memory, tool traces, and safety directives that travels cleanly across vendors and SDKs.
Tool/function schemas. We all “use JSON Schema,” but semantics differ. We need a canonical Tool Manifest—namespaces, auth, rate hints, idempotency, side‑effect tags—compatible with MCP and plain HTTP.
Agent action & safety metadata. Standard fields for capability level, dangerous‑capability gating, red‑team status, and human‑in‑the‑loop requirements—enforceable at runtime.
Eval/benchmark exchange format. A portable spec for eval tasks, ground truth, scoring, and run metadata so teams can compare apples‑to‑apples across providers—beyond raw throughput/latency.
Tracing & provenance, end‑to‑end. Join OpenTelemetry traces with C2PA (and W3C PROV concepts) so there’s a verifiable “chain of custody” from user input → tools → model → output.
Dataset & fine‑tuning pack format. Today it’s “some JSONL that worked for provider X.” Define an AI Training Pack (data + consent/licensing + splits + dataset card + safety filters) that’s portable and auditable.
Cost/usage metering. Standard units and reporting—tokens/chars/images/tool calls—plus carbon accounting. Contracts and dashboards shouldn’t feel like an international currency exchange.
Runtime capability negotiation. A handshake so clients can discover: model supports vision @ 1024×1024, function‑calling v2, JSON‑schema v2024‑08, RAG slots=4, etc.—akin to HTTP content negotiation, aligned with MCP feature discovery.
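To make that last idea concrete, here is a minimal sketch of what a capability‑negotiation handshake could look like. Every field name (`vision_max_resolution`, `rag_slots`, etc.) and the `negotiate` helper are invented for illustration; no such spec exists yet.

```python
# Hypothetical capability negotiation -- all feature names are illustrative,
# not part of any published spec.

def negotiate(required: dict, advertised: dict) -> dict:
    """Return the features the client can rely on, or raise if a hard
    requirement is missing (akin to an HTTP 406 Not Acceptable)."""
    agreed = {}
    for feature, needed in required.items():
        offered = advertised.get(feature)
        if offered is None:
            raise ValueError(f"server lacks required feature: {feature}")
        agreed[feature] = offered
    return agreed

# What a server might advertise on a discovery call.
advertised = {
    "vision_max_resolution": "1024x1024",
    "function_calling": "v2",
    "json_schema": "2024-08",
    "rag_slots": 4,
}

agreed = negotiate({"function_calling": "v2", "rag_slots": 4}, advertised)
```

The point is the shape of the exchange, not the details: clients declare needs, servers declare capabilities, and the runtime fails fast instead of failing mid‑conversation.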
A concrete blueprint I’m proposing
Let’s package the practical bits vendors and builders can adopt now—a small, useful bundle that makes interop real without waiting for a 500‑page spec.
PIF — Prompt Interchange Format. A JSON/JSON‑LD schema for conversations, roles, safety context, memory, and tool traces. Versioned and loss‑minimizing across providers.
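A sketch of what a PIF envelope might contain, assuming invented field names (`pif_version`, `safety_context`, `memory`); the only real requirement is that it round‑trips losslessly between vendors:

```python
import json

# Hypothetical PIF envelope -- every key is an assumption, not a spec.
envelope = {
    "pif_version": "0.1",
    "conversation": [
        {"role": "system", "content": "You are a support agent.",
         "safety_context": {"pii_redaction": True}},
        {"role": "user", "content": "Where is my order?"},
        {"role": "tool", "tool": "order_lookup",
         "trace_id": "abc-123", "content": "{\"status\": \"shipped\"}"},
    ],
    "memory": {"customer_tier": "gold"},
}

wire = json.dumps(envelope)   # travels between providers unchanged
restored = json.loads(wire)   # round-trips without loss
```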
TMS — Tool Manifest Standard. A JSON‑Schema–based contract for tools (inputs/outputs, side‑effects, auth, safety tier), compatible with MCP servers and vendor function calling. Think “OpenAPI for LLM tools.”
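What a TMS manifest might declare, with all keys (`side_effects`, `safety_tier`, `idempotent`) invented for illustration. The input schema itself is plain JSON Schema; the value comes from standardizing the metadata around it so runtimes can gate dangerous calls:

```python
# Hypothetical TMS manifest -- "OpenAPI for LLM tools" in spirit only.
manifest = {
    "name": "orders.refund",           # namespaced tool name
    "auth": "oauth2:orders.write",
    "side_effects": ["writes"],        # lets a runtime gate risky calls
    "idempotent": False,
    "safety_tier": "human_approval",
    "input_schema": {                  # ordinary JSON Schema
        "type": "object",
        "required": ["order_id", "amount"],
        "properties": {
            "order_id": {"type": "string"},
            "amount": {"type": "number"},
        },
    },
}

def requires_human(m: dict) -> bool:
    # A runtime could enforce the declared safety tier before dispatch.
    return m["safety_tier"] == "human_approval" or "writes" in m["side_effects"]
```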
A‑TRACE — Agent Telemetry Conventions. An OpenTelemetry profile for LLM apps: spans for retrieval, tool execs, and model calls; attributes for model ID, temperature, and safety events; ready for APMs via OpenInference.
EVALPACK — Portable eval bundles (tasks, fixtures, scoring code, run metadata) that any vendor can execute. Aligned with MLCommons so results can roll up to public leaderboards when desired.
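A toy EVALPACK with two tasks and an exact‑match scorer, to show why portability matters: if the bundle format is fixed, any vendor can run the same pack and report a comparable number. All field names are assumptions:

```python
# Hypothetical EVALPACK bundle -- structure invented for illustration.
evalpack = {
    "evalpack_version": "0.1",
    "tasks": [
        {"id": "t1", "prompt": "2+2?", "expected": "4"},
        {"id": "t2", "prompt": "Capital of France?", "expected": "Paris"},
    ],
    "scoring": "exact_match",
    "run_metadata": {"model": "example-model", "seed": 7},
}

def score(pack: dict, outputs: dict) -> float:
    """Exact-match accuracy; `outputs` maps task id -> model answer."""
    hits = sum(
        outputs.get(t["id"], "").strip() == t["expected"]
        for t in pack["tasks"]
    )
    return hits / len(pack["tasks"])
```

Two providers running the same pack produce directly comparable scores, which is exactly what today's ad‑hoc eval scripts cannot guarantee.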
PROV‑AI — Merges W3C PROV with C2PA so generated content carries verifiable provenance and links back to the agent/tool trace that produced it (privacy‑aware).
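The core idea is a chain of custody, which can be sketched with a plain hash chain: each step commits to the one before it, so tampering anywhere breaks verification. Real C2PA/W3C PROV use richer signed structures; this only illustrates the principle:

```python
import hashlib
import json

def link(prev_hash: str, record: dict) -> dict:
    """Append a provenance record that commits to the previous one."""
    body = {"prev": prev_hash, **record}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

# user input -> tool -> model output, as a linked chain
step1 = link("genesis", {"actor": "user", "event": "input"})
step2 = link(step1["hash"], {"actor": "tool:web_search", "event": "retrieval"})
step3 = link(step2["hash"], {"actor": "model", "event": "generation"})

def verify(chain: list) -> bool:
    """Recompute every hash; any edited record fails the check."""
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if expected != rec["hash"]:
            return False
    return True
```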
AIPACK — A packaging layout (like npm/PyPI) for prompts, tools, evals, and fine‑tunes with attached Model/Dataset Cards. Encourages discoverability and safe reuse.
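What an AIPACK manifest might look like, mirroring the metadata a `package.json` or `pyproject.toml` carries, plus AI‑specific attachments. Every field name here is invented:

```python
# Hypothetical AIPACK manifest -- npm/PyPI-style metadata plus
# AI-specific attachments. Field names are illustrative only.
aipack = {
    "name": "acme/support-agent",
    "version": "1.4.0",
    "contents": {
        "prompts": ["prompts/triage.pif.json"],
        "tools": ["tools/order_lookup.tms.json"],
        "evals": ["evals/regression.evalpack.json"],
        "fine_tunes": [],
    },
    "model_card": "cards/model.md",
    "dataset_card": "cards/dataset.md",
    "license": "Apache-2.0",
}

def is_publishable(pack: dict) -> bool:
    # A registry could refuse packs that ship without documentation.
    return bool(pack.get("model_card") and pack.get("license"))
```

Like app stores, but actually interoperable: a registry can validate, index, and cross‑reference packs because they all declare the same fields.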
Why this unlocks Web‑scale AI (aka the ROI)
Portability → competition → better UX. Swap models/providers without rewriting agents. Your ops team maintains fewer bespoke adapters; your finance team stops buying antacids.
Assurance at scale. Common risk controls, evals, and provenance mean auditors and platforms can certify once and accept everywhere—critical for enterprise and regulated commerce.
Ecosystem flywheel. Shared manifests and packs enable marketplaces for tools, evals, and fine‑tunes—like app stores, but actually interoperable.
Who should collaborate
Standards bodies: ISO/IEC JTC 1/SC 42 (umbrella AI), IEEE P7000 series (ethics‑in‑design), W3C (WebNN/AI & the Web).
Infra & benchmarking: MLCommons (for EVALPACK alignment), ONNX (model‑graph lessons), OpenTelemetry SIGs (for A‑TRACE).
Ecosystem that’s already moving: MCP community (Anthropic/OpenAI/Google and friends), Hugging Face for model/dataset cards, and platforms adopting C2PA.
Bottom line
People are talking about AI standards, but the developer‑first interop layer is half‑baked. If we organize around a focused, practical bundle—PIF, TMS, A‑TRACE, EVALPACK, PROV‑AI, AIPACK—we can give AI its Web 2.0 moment. Boxes don’t kill creativity; they keep the Legos from rolling under the couch—and that’s how you ship at scale.













