Three weeks into March 2026 and the AI model landscape looks like someone kicked over a beehive. GPT-5.4 dropped on March 5th. Claude Opus 4.6 and Sonnet 4.6 arrived with 1M token contexts and serious benchmark numbers. NVIDIA used GTC to announce Nemotron 3 Super — an open-weights model that’s outperforming GPT-OSS on SWE-Bench by nearly 20 points. And Jensen Huang compared OpenClaw to Linux in front of a room full of enterprise partners. This isn’t a slow news cycle. This is the industry shipping faster than most people can read changelogs. Here’s what actually matters, what the numbers mean, and how to think about where things are headed.
GPT-5.4: OpenAI’s Efficiency Play
OpenAI launched GPT-5.4 on March 5th, and the headline isn’t raw capability — it’s efficiency. The model uses significantly fewer tokens to accomplish the same tasks as GPT-5.2, which means faster responses and lower API costs. For anyone running production workloads, that’s not a minor footnote. That’s the difference between a use case being economically viable and one that isn’t.
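To make the efficiency point concrete, here is a back-of-envelope cost model. The request volume, token counts, and the $10-per-million-token price are all hypothetical, chosen only to show the shape of the math, not actual GPT-5.x pricing:

```python
def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_mtok: float) -> float:
    """Rough monthly API spend: total tokens consumed times unit price."""
    monthly_tokens = requests_per_day * tokens_per_request * 30
    return monthly_tokens / 1_000_000 * price_per_mtok

# Hypothetical workload: 10k requests/day at $10 per million tokens.
# The newer model is assumed to need ~40% fewer tokens per task.
baseline = monthly_cost(10_000, 3_000, 10.0)   # older model
efficient = monthly_cost(10_000, 1_800, 10.0)  # newer model
print(f"${baseline:,.0f}/mo vs ${efficient:,.0f}/mo")  # $9,000/mo vs $5,400/mo
```

At that scale the token savings alone cover thousands of dollars a month, which is exactly the kind of margin that flips a borderline feature from "too expensive to ship" to viable.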
A 1M token context window is now standard, putting OpenAI on par with Anthropic’s Claude family. Native computer use inside Codex is the other major addition — the system can autonomously navigate interfaces and execute actions as part of a coding workflow, not just generate text. OpenAI Codex Security takes this further: it uses computer use for autonomous code security review, essentially letting the model act as a security analyst running through your codebase rather than just advising on it.
The model ships as “GPT-5.4 Thinking” and “GPT-5.4 Pro” in ChatGPT, with API access available. Alongside it, OpenAI launched a ChatGPT-for-Excel add-in — the same week Anthropic launched Claude in Excel. The office productivity arms race is very much real. GPT-5.1 models (Instant, Thinking, Pro) were retired on March 11th, which is a fast model lifecycle by any standard.
The bigger context: OpenAI is now reporting $25 billion in annualized revenue and 800 million weekly users. The company is reportedly taking early steps toward a public listing, potentially late 2026. Peter Steinberger, creator of OpenClaw, joined OpenAI on February 14th. That hire tells you something about where OpenAI is placing its bets on agentic infrastructure. The interactive math and science modules — 70+ topics with adjustable variables — are less flashy but matter for the education vertical, which is an enormous opportunity with 800 million weekly users.
Claude’s Quarter: Anthropic Builds Out the Entire Stack
Anthropic has been unusually busy. The story here isn’t one model launch — it’s a coordinated push across the model tier, the developer tooling, the enterprise surface, and the desktop experience simultaneously.
Claude Opus 4.6 is the flagship, and the benchmark that stands out most isn’t from a third party — it’s from Anthropic’s own Frontier Red Team, which used Opus 4.6 to find over 500 vulnerabilities in production open-source code. That’s a deliberate choice to demonstrate what the model can do in a real security context, not a synthetic benchmark. The model tops legal, financial, and coding benchmarks against competitors and runs with 1M token context by default on Max, Team, and Enterprise plans.
Claude Sonnet 4.6 launched February 17th at the same price as 4.5 with better performance, a 1M token context window in beta, improved agentic search, and reduced token usage. The older Opus 4 and 4.1 models have been removed from the model selector — Anthropic is consolidating fast.
Claude Cowork is the most conceptually interesting thing Anthropic shipped this cycle. It launched in research preview at the end of January as a desktop app (macOS first), running Claude in an isolated VM on your local machine with full access to local files and MCP integrations. The framing from Scott White, Anthropic’s Head of Product, Enterprise, is “vibe working” — knowledge workers directing AI through tasks the way a developer might do vibe coding, but across legal, financial, HR, and operations workflows. It was reportedly built with Claude Code in 10 days. Anthropic engineers are now using Claude for roughly 60% of their work and shipping 60 to 100 internal releases per day. That internal adoption number is one of the more honest signals in the industry about where these tools actually stand.
Claude Code has been getting daily releases, which is either impressive or exhausting depending on your tolerance for changelog management. Major additions include the Skills API — organized folders with SKILL.md files — with pre-built skills for PowerPoint, Excel, Word, and PDF. Voice mode, MCP tool improvements, and worktrees have also shipped. Claude Code is now included in every Team plan standard seat, which removes a significant adoption barrier for small engineering teams.
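The Skills format is worth seeing concretely. A skill is a folder whose SKILL.md carries YAML frontmatter (a name plus a description the model uses to decide when to load the skill) followed by free-form instructions. A minimal sketch, with the skill name and contents invented for illustration:

```markdown
---
name: quarterly-report
description: Formats raw financial figures into the company's quarterly report template. Use when the user asks for a quarterly report.
---

# Quarterly report skill

1. Read the figures the user provides (or the attached spreadsheet).
2. Apply the layout conventions described below.
3. Flag any quarter-over-quarter change larger than 10% in a summary table.
```

The pre-built PowerPoint, Excel, Word, and PDF skills follow the same shape, bundling supporting scripts and reference files alongside the SKILL.md.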
The browser angle: Claude in Chrome is a browser extension that reads console errors, DOM, and network requests. For frontend developers, that’s a meaningfully different workflow than copying errors into a chat window. Claude in Excel and Claude in PowerPoint round out the Microsoft Office surface. The $100M Partner Network, announced March 12th, and new self-serve Enterprise plans (no sales call required) suggest Anthropic is finally getting serious about enterprise distribution, not just enterprise capability.
NVIDIA GTC 2026: Jensen Huang’s Infrastructure Moment
Jensen Huang’s GTC keynote in March 2026 was less about chips than it was about NVIDIA’s software ambitions for the agentic AI era. The hardware announcement that matters most is the Vera Rubin platform (H300 GPUs) targeting trillion-parameter models — but the software story is what’s new.
NemoClaw is NVIDIA’s enterprise-grade AI agent platform built on top of OpenClaw. It adds enterprise security, privacy guardrails, and policy enforcement to the OpenClaw foundation. Crucially, it’s hardware agnostic — you don’t need NVIDIA GPUs to run it. That’s a deliberate strategic move. NVIDIA wants to be the enterprise AI infrastructure standard, not a GPU-dependent bottleneck. It integrates with NVIDIA NeMo, their broader AI agent software suite.
Huang’s comparison of OpenClaw to Linux and Kubernetes wasn’t incidental. “OpenClaw gave us exactly what we needed at exactly the right time” is the kind of language Jensen uses when he’s positioning something as foundational infrastructure. NVIDIA OpenShell is the open source runtime for what they’re calling “self-evolving agents and claws,” with safety and security built in. The NVIDIA Agent Toolkit provides open source models and software for enterprise agents. NVIDIA AI-Q Blueprint is their agentic search implementation, which tops the DeepResearch Bench accuracy leaderboards.
The partner list at GTC — Adobe, Atlassian, Cisco, CrowdStrike, SAP, Salesforce, ServiceNow, Siemens — tells you something about the enterprise integration roadmap. These aren’t experimental pilots. These are the software platforms that run large companies, and they’re all building on NVIDIA’s agent infrastructure.
Physical AI got meaningful attention: NVIDIA Cosmos models and GR00T open models for humanoid robots signal that the company’s ambitions extend well beyond the data center.
