<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>VentureBeat</title>
        <link>https://venturebeat.com/feed/</link>
        <description>Transformative tech coverage that matters</description>
        <lastBuildDate>Sat, 09 May 2026 02:45:30 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <copyright>Copyright 2026, VentureBeat</copyright>
        <item>
            <title><![CDATA[Anthropic says it hit a $30 billion revenue run rate after 'crazy' 80x growth]]></title>
            <link>https://venturebeat.com/technology/anthropic-says-it-hit-a-30-billion-revenue-run-rate-after-crazy-80x-growth</link>
            <guid isPermaLink="false">7laerjSOyFdiXHZAjvOY9h</guid>
            <pubDate>Fri, 08 May 2026 21:45:20 GMT</pubDate>
            <description><![CDATA[<p><a href="https://www.darioamodei.com/">Dario Amodei</a> is not the kind of CEO who talks loosely about numbers. The Anthropic co-founder and chief executive, a former VP of research at OpenAI with a PhD in computational neuroscience from Princeton, has built a reputation for measured public statements — particularly around the financial performance of a company that, until recently, disclosed almost nothing about its business.</p><p>So when Amodei took the stage at Anthropic&#x27;s <a href="https://claude.com/code-with-claude">Code with Claude</a> developer conference on Wednesday and offered a genuinely striking piece of financial candor, the room paid attention.</p><p>&quot;We tried to plan very well for a world of 10x growth per year,&quot; Amodei said during a fireside chat with Anthropic&#x27;s chief product officer, Ami Vora. &quot;And yet we saw 80x. And so that is the reason we have had difficulties with compute.&quot;</p><div></div><p>Anthropic had planned for tenfold growth. But revenue and usage increased 80-fold in the first quarter on an annualized basis, a rate Amodei described as &quot;just crazy&quot; and &quot;too hard to handle.&quot;</p><p>The number demands context. Annualized growth rates can overstate sustained performance — a single strong quarter, extrapolated across a full year, can paint a picture that doesn&#x27;t hold. Amodei knows this. But the underlying trajectory is not a mirage. Anthropic has crossed a <a href="https://www.bloomberg.com/news/articles/2026-04-06/broadcom-confirms-deal-to-ship-google-tpu-chips-to-anthropic">$30 billion annualized revenue run rate</a>, up sharply from roughly $9 billion at the end of 2025, and that growth is being driven largely by enterprise demand. 
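The arithmetic behind those two figures is worth making explicit (a back-of-envelope sketch using only the numbers reported here; reading "80x on an annualized basis" as one quarter's growth compounded over four quarters is an assumption, not something Anthropic has confirmed):

```python
# Back-of-envelope math for the figures above (illustrative only).

annual_run_rate = 30e9                    # reported $30B run rate
implied_monthly = annual_run_rate / 12    # a run rate annualizes the latest month
print(f"Implied monthly revenue: ${implied_monthly / 1e9:.1f}B")   # $2.5B

# One reading of "80x growth on an annualized basis": a single quarter's
# growth multiple, compounded over four quarters, equals 80 (q**4 == 80).
quarterly_multiple = 80 ** 0.25
print(f"Implied quarterly growth: {quarterly_multiple:.2f}x")      # ~2.99x
```

This is exactly why annualized figures can overstate sustained performance: the 80x number assumes revenue roughly triples every quarter for a full year.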
The company&#x27;s revenue trajectory has been relentless: <a href="https://www.saastr.com/anthropic-just-passed-openai-in-revenue-while-spending-4x-less-to-train-their-models/">$87 million run rate in January 2024</a>, $1 billion by December 2024, <a href="https://www.reuters.com/business/retail-consumer/anthropic-aims-nearly-triple-annualized-revenue-2026-sources-say-2025-10-15/">$9 billion by end of 2025</a>, $14 billion in February 2026, $19 billion in March, and $30 billion in April.</p><p>For context: Salesforce took about 20 years to reach $30 billion in annual revenue. Anthropic did it in under three years from a standing start.</p><h2><b>Claude Code became the fastest-growing product in enterprise software history</b></h2><p>The growth story at Anthropic is, to a remarkable degree, a single-product story. <a href="https://www.anthropic.com/product/claude-code">Claude Code</a>, the company&#x27;s agentic AI coding tool launched publicly in mid-2025, has become the fastest-growing product in the company&#x27;s history — and, by several measures, one of the fastest-growing software products ever built.</p><p>Claude Code hit <a href="https://www.anthropic.com/news/anthropic-acquires-bun-as-claude-code-reaches-usd1b-milestone">$1 billion in annualized revenue</a> within six months of launch, and the growth hasn&#x27;t slowed down. By February 2026, the product was generating <a href="https://news.ycombinator.com/item?id=46997292">over $2.5 billion in run-rate revenue</a>. The company also said Claude Code&#x27;s weekly active users had doubled since January 1 and that business subscriptions had quadrupled since the start of 2026.</p><p>The mechanics of the product are straightforward. Claude Code is not a chatbot that suggests snippets. It reads a codebase, plans a sequence of actions, executes them using real development tools, evaluates the result, and adjusts its approach. 
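That read-plan-execute-evaluate cycle is the core agentic pattern, and its control flow can be sketched generically (an illustration of the pattern only, not Claude Code's actual implementation; the plan, execute, and evaluate callables are hypothetical stand-ins for the model and its tools):

```python
def agent_loop(objective, plan, execute, evaluate, max_iters=10):
    """Generic plan-execute-evaluate loop (an illustration of the agentic
    pattern, not Claude Code's actual implementation)."""
    history = []
    for _ in range(max_iters):
        action = plan(objective, history)       # model picks the next step
        outcome = execute(action)               # run it with real tools
        history.append((action, outcome))
        if evaluate(objective, outcome):        # objective met: stop here and
            return outcome                      # let the human review the diff
    return None                                 # budget spent; human takes over

# Toy demo: the "objective" is reaching a target count by incrementing.
state = {"count": 0}

def execute(action):
    state["count"] += 1          # stands in for edits, builds, test runs
    return state["count"]

result = agent_loop(
    objective=3,
    plan=lambda obj, hist: "increment",
    execute=execute,
    evaluate=lambda obj, outcome: outcome >= obj,
)
print(result)  # 3
```

A production loop adds sandboxing, streaming, and parallel tool calls, but the shape is the same: iterate until the evaluation passes or the budget runs out.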
The developer sets the objective and retains control over what gets committed, but the execution loop runs independently. The average developer using <a href="https://code.claude.com/docs/en/overview">Claude Code</a> now spends 20 hours per week working with the tool.</p><p>At Anthropic itself, the majority of code is now written by Claude Code. Engineers focus on architecture, product thinking, and continuous orchestration: managing multiple agents in parallel, giving direction, and making the decisions that shape what gets built.</p><p>That last point may be the most revealing detail Amodei disclosed at the conference: this is the first year Anthropic&#x27;s own internal pull requests have inflected upward due to Claude&#x27;s work on the company&#x27;s own codebase. The tool that Anthropic sells to developers is now a material contributor to Anthropic&#x27;s own engineering output. That creates a feedback loop that is almost impossible for competitors without a comparable product to replicate — the company is using its own product to build the next version of its own product.</p><p>The enterprise numbers tell the same story. The company now counts over 1,000 enterprise customers spending more than $1 million per year on Claude services, a figure that has doubled since February. Much of this increase has been fueled by a wave of corporate customers including Uber and Netflix.</p><p>Amodei framed the adoption curve in economic terms. &quot;Software engineers are the ones who are fastest to adopt new technology,&quot; he said on stage. &quot;It&#x27;s a foreshadowing of how things are going to work across the economy, and how the economy is going to be transformed by AI.&quot;</p><h2><b>Anthropic&#x27;s 80x growth created a compute crisis it couldn&#x27;t solve alone</b></h2><p>Hypergrowth creates its own category of problem. When demand outstrips supply by an order of magnitude, the constraint is not go-to-market strategy or product-market fit. 
The constraint is physics.</p><p>The company is growing so fast that its infrastructure has struggled to keep up, forcing Anthropic into what may be the most unexpected partnership in the current AI cycle. Amodei&#x27;s comments came hours after <a href="https://www.cnbc.com/2026/05/06/anthropic-spacex-data-center-capacity.html">Anthropic announced a deal with Elon Musk&#x27;s SpaceX</a> to use all of the compute capacity at his company&#x27;s Colossus 1 data center in Memphis, Tennessee. As part of the agreement, Anthropic will get access to more than <a href="https://www.wsj.com/tech/ai/anthropic-inks-deal-to-use-all-of-spacexs-colossus-1-compute-capacity-56a7e2a1">300 megawatts of capacity</a> — over 220,000 Nvidia GPUs, including dense deployments of H100, H200, and next-generation GB200 accelerators.</p><p>The deal is remarkable for several reasons. Musk has been, until very recently, one of Anthropic&#x27;s most vocal critics. He has said Anthropic is &quot;<a href="https://x.com/elonmusk/status/1803783427616821401?lang=en">doomed to become the opposite of its name</a>&quot; and wrote in February that &quot;<a href="https://x.com/elonmusk/status/2027294561467613256">Anthropic hates Western Civilization</a>.&quot; But on Wednesday, Musk changed his tune, saying he spent a lot of time with senior members of the Anthropic team over the past week and that he was &quot;<a href="https://x.com/elonmusk/status/2052069691372478511">impressed</a>.&quot; &quot;Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector,&quot; Musk wrote.</p><div></div><p>The strategic logic on both sides is clear. xAI&#x27;s Colossus 1 ended up with capacity that Grok&#x27;s user base never grew into, while Anthropic needs compute immediately. Anthropic has been signing deals with Amazon, Google, Nvidia, and Microsoft for more compute capacity, but most of that isn&#x27;t expected to come online until late 2026 or early 2027. 
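Stepping back to the headline capacity numbers: they pass a rough sanity check (simple division; the assumption that the 300 MW figure covers total facility load, including cooling and networking, is mine):

```python
# Rough sanity check on the reported Colossus 1 figures (illustrative).
megawatts = 300
gpu_count = 220_000
watts_per_gpu = megawatts * 1e6 / gpu_count
print(f"{watts_per_gpu:.0f} W of facility power per GPU")   # ~1364 W
```

An H100 SXM module alone is rated around 700 W, so roughly 1.4 kW per GPU all-in, once host servers, networking, and cooling overhead are included, is within a plausible range for a dense deployment.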
The SpaceX deal gives Anthropic a significant boost now — the key word being &quot;now.&quot;</p><p>As one industry watcher summarized the alignment: &quot;<a href="https://x.com/benitoz/status/2052061969969238317">Elon&#x27;s enemy is Sam. Dario&#x27;s enemy is Sam. Enemy of my enemy is a compute partner</a>.&quot;</p><p>Last month, Anthropic said demand for Claude has led to &quot;inevitable strain on our infrastructure,&quot; which has impacted &quot;reliability and performance&quot; for its users, particularly during peak hours. The company admitted in a postmortem from late April that three bugs had affected Claude Code since March 4, and that internal tests hadn&#x27;t caught them, leading to several weeks of degraded performance. Amodei said at the Code with Claude conference that the company is &quot;working as quickly as possible to provide more&quot; capacity and will &quot;pass that compute on to you as soon as we can.&quot;</p><h2><b>A near-trillion-dollar valuation makes Anthropic&#x27;s IPO the most anticipated debut in years</b></h2><p>The growth figures arrive at a moment when Anthropic&#x27;s valuation is itself becoming one of the defining financial stories of the AI era.</p><p>Anthropic has begun weighing a fresh funding round that would value the company at more than <a href="https://www.reuters.com/business/retail-consumer/anthropic-weighs-new-funding-round-valuation-exceeding-900-billion-bloomberg-2026-04-29/">$900 billion</a>, according to people familiar with the matter, potentially leapfrogging its longtime rival OpenAI as the world&#x27;s most valuable AI startup. The velocity of the escalation is difficult to overstate. From $61.5 billion in March 2025, to $183 billion by its Series F in September, to $380 billion in February, to, if the current discussions proceed, more than $900 billion in May. 
Anthropic&#x27;s shares were already trading at an implied $1 trillion valuation on secondary markets earlier this month.</p><p>Instead of cashing out, many existing investors are waiting to potentially exit during Anthropic&#x27;s anticipated IPO later this year. The company is raising what is likely to be its last private round before going public to fund its massive computing needs. Bloomberg has reported that the <a href="https://www.bloomberg.com/news/articles/2026-04-29/anthropic-considering-funding-offers-at-over-900-billion-value">company is weighing an IPO as early as October 2026</a>, with Goldman Sachs, JPMorgan, and Morgan Stanley already in early discussions.</p><p>Anthropic is also building out infrastructure on longer time horizons. Amazon has agreed to <a href="https://www.cnbc.com/2026/04/20/amazon-invest-up-to-25-billion-in-anthropic-part-of-ai-infrastructure.html">invest up to $25 billion</a> in Anthropic, securing up to 5 gigawatts of compute capacity for training and deploying Claude models. Anthropic also secured 5 gigawatts of computing capacity as part of a <a href="https://www.anthropic.com/news/google-broadcom-partnership-compute">separate deal with Google and Broadcom</a> that will start to come online next year. The total commitment is staggering — tens of gigawatts of compute across three separate hardware ecosystems: Amazon&#x27;s Trainium chips, Google&#x27;s TPUs via Broadcom, and Nvidia GPUs through SpaceX and Microsoft Azure.</p><p>For perspective: Anthropic&#x27;s $30 billion run rate exceeds the trailing twelve-month revenues of all but approximately 130 S&amp;P 500 companies. A company that was essentially pre-revenue in early 2024 now out-earns most of the Fortune 500.</p><p>That comparison comes with caveats. Private-market revenue run rate is not the same thing as audited GAAP revenue, gross margin, free cash flow, or public float. 
OpenAI has internally argued that Anthropic&#x27;s $30 billion figure is <a href="https://thenextweb.com/news/openai-852-billion-valuation-investor-scrutiny-anthropic-revenue">overstated by roughly $8 billion</a>, pointing to questions about whether revenues from AWS and Google Cloud should be reported at gross value or net of the partner&#x27;s cut. The accounting question will ultimately be resolved when both companies file IPO prospectuses — but even on a net basis, Anthropic&#x27;s growth rate is unlike anything in enterprise software history.</p><h2><b>Dario Amodei&#x27;s vision for AI extends far beyond coding — and he&#x27;s given himself a deadline</b></h2><p>The financial story — 80x growth, a near-trillion-dollar valuation, a scramble to secure enough GPUs to meet demand — is dramatic on its own terms. But Amodei used his time on stage to place it inside a larger thesis about where AI is headed.</p><p>He described a progression from single agents to multiple agents to what he called whole organizational intelligence — from &quot;a team of smart people in a room&quot; to &quot;a country of geniuses in the data center.&quot; The framing is deliberately expansive. What Anthropic is selling today is a coding tool. What Amodei is describing is a future in which entire categories of knowledge work are performed by fleets of AI agents operating in parallel, supervised by humans who define objectives and review outputs.</p><p>He reiterated a prediction he made roughly a year ago: that 2026 would see the first billion-dollar company run entirely by a single person. &quot;Hasn&#x27;t quite happened yet,&quot; he said. &quot;But we&#x27;ve got seven more months.&quot;</p><p>The company has also been navigating political headwinds. <a href="https://www.npr.org/2026/03/06/g-s1-112713/pentagon-labels-ai-company-anthropic-a-supply-chain-risk">The Pentagon declared Anthropic a supply chain risk in March</a>, blacklisting it from work with the military. 
The company has warned the designation could result in billions in lost revenue, with over one hundred enterprise customers reportedly expressing doubts about continuing their relationships.</p><p>And yet — as that scuffle makes its way through the legal system, Anthropic is only getting more popular. Amodei said this week he&#x27;s eventually hoping for &quot;more normal&quot; expansion.</p><p>There is a temptation, when covering a company growing at this rate, to let the numbers speak for themselves. They shouldn&#x27;t. Growth at 80x annualized is not a business plan — it&#x27;s an emergency. It means demand has outrun infrastructure, that customers want something the company cannot yet reliably deliver at scale, and that every week of constrained capacity is a week during which competitors can close the gap.</p><p>The investors funding Anthropic — including <a href="https://www.softbank.jp/en//">SoftBank</a>, <a href="https://www.amazon.com/">Amazon</a>, <a href="https://www.nvidia.com/en-us/">Nvidia</a>, <a href="https://www.google.com/">Google</a>, <a href="https://a16z.com/">a16z</a>, <a href="https://www.lightspeedhq.com/home/">Lightspeed</a>, and <a href="https://www.iconiq.com/">ICONIQ</a> — are making a specific bet: that compute costs continue to fall per unit of intelligence, that revenue keeps compounding faster than burn, and that whoever owns the AI infrastructure layer in 2029 will generate returns that make the interim losses irrelevant.</p><p>Amodei&#x27;s candor at <a href="https://claude.com/code-with-claude/san-francisco">Code with Claude</a> was not a victory lap. It was a diagnostic — an admission that his company is running faster than it can steer. He planned for a world of 10x growth and got 80x instead. Now he has seven months to prove that the infrastructure, the organization, and the vision can catch up to the demand. The country of geniuses in the data center is getting crowded. 
The question is whether anyone remembered to build enough rooms.</p>]]></description>
            <author>michael.nunez@venturebeat.com (Michael Nuñez)</author>
            <category>Technology</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/3qloNl7y6W8aLaJNTzTXqd/e496756afa4789e88ca341be72f4547b/Nuneybits_Vector_art_of_a_retro_computer_with_a_graph_going_up__ca5ad9fd-fe3e-48ab-a2d3-f2ace34416ec.webp?w=300&amp;q=30" length="0" type="image/webp"/>
        </item>
        <item>
            <title><![CDATA[OpenAI brings GPT-5-class reasoning to real-time voice — and it changes what voice agents can actually orchestrate]]></title>
            <link>https://venturebeat.com/orchestration/openai-brings-gpt-5-class-reasoning-to-real-time-voice-and-it-changes-what-voice-agents-can-actually-orchestrate</link>
            <guid isPermaLink="false">13VA1kM7FoFl9NrxCiqP5Y</guid>
            <pubDate>Fri, 08 May 2026 21:41:21 GMT</pubDate>
            <description><![CDATA[<p>Voice agents have been expensive to run and painful to orchestrate, not because the models can&#x27;t handle conversation, but because context ceilings forced enterprises to build session resets, state compression, and reconstruction layers into every deployment. OpenAI&#x27;s three new voice models are designed to reduce that overhead, and they change how engineers can think about building voice into a larger agent stack.</p><p>GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper integrate real-time audio into the model management stack as discrete orchestration primitives — separating conversational reasoning, translation, and transcription into specialized components rather than bundling them in a single voice product.</p><p>The company said in <a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">a blog post</a> that Realtime-2 is its first voice model “with GPT-5 class reasoning” and can handle difficult requests and keep conversations flowing naturally. Realtime-Translate understands more than 70 languages and translates them into 13 others at the speaker&#x27;s pace, and Realtime-Whisper is its new speech-to-text transcription model.</p><p>These three actions no longer sit inside a single stack or model. GPT-Realtime-2 could technically handle transcription, but OpenAI is routing distinct tasks to specialized models: Realtime-Translate for multilingual speech and Realtime-Whisper for transcription. Enterprises can assign each task to the appropriate model rather than routing everything through a single, all-encompassing voice system.</p><p>The new OpenAI models compete against <a href="https://venturebeat.com/orchestration/mistral-ai-just-released-a-text-to-speech-model-it-says-beats-elevenlabs-and">Mistral’s Voxtral models</a>, which also separate transcription and target enterprise use cases.  
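Concretely, "assign each task to the appropriate model" is a routing decision at the orchestration layer. A minimal dispatcher might look like this (a schematic sketch: the three model names follow the article, but the lowercase identifiers and the routing table are illustrative, and a production system would invoke OpenAI's Realtime API rather than return strings):

```python
# Hypothetical task-to-model routing table based on the split described above.
VOICE_MODEL_ROUTES = {
    "conversation": "gpt-realtime-2",         # GPT-5-class reasoning over audio
    "translation": "gpt-realtime-translate",  # 70+ input languages, 13 targets
    "transcription": "gpt-realtime-whisper",  # speech-to-text only
}

def route_voice_task(task: str) -> str:
    """Return the specialized model for a discrete voice task."""
    try:
        return VOICE_MODEL_ROUTES[task]
    except KeyError:
        raise ValueError(
            f"No route for task {task!r}; known tasks: {sorted(VOICE_MODEL_ROUTES)}"
        )

print(route_voice_task("translation"))  # gpt-realtime-translate
```

The design point is that routing failures become explicit errors at dispatch time rather than degraded answers from a general-purpose voice model.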
</p><h2>What enterprises should do</h2><p>Enterprises are warming to voice agents as users grow more comfortable conversing with AI, and because voice interactions yield unusually rich customer data.</p><p>Organizations evaluating these models will need to consider their orchestration architecture, not just model quality — specifically, whether their stack can route discrete voice tasks to specialized models and manage state across a 128K-token context window.</p>]]></description>
            <category>Orchestration</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/4fS8WkFKirUi9KqwhCb400/dba3dc0e8604c34ee39ee2e0fea8b1d8/crimedy7_illustration_of_a_robot_with_a_phone_--ar_169_--v_7_059ef1c1-2f5e-4009-ab54-acec406a8f5e_1.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[5,000 vibe-coded apps just proved shadow AI is the new S3 bucket crisis]]></title>
            <link>https://venturebeat.com/security/vibe-coded-apps-shadow-ai-s3-bucket-crisis-ciso-audit-framework</link>
            <guid isPermaLink="false">275iE1Y3fHZ1neamQUbFct</guid>
            <pubDate>Fri, 08 May 2026 20:57:01 GMT</pubDate>
            <description><![CDATA[<p>Most enterprise security programs were built to protect servers, endpoints, and cloud accounts. None of them was built to find a customer intake form that a product manager vibe coded on Lovable over a weekend, connected to a live Supabase database, and deployed on a public URL indexed by Google. That gap now has a price tag.</p><p>New research from Israeli cybersecurity firm <a href="https://redaccess.io/">RedAccess</a> quantifies the scale. The firm discovered 380,000 publicly accessible assets, including applications, databases, and related infrastructure, built with vibe coding tools from Lovable, Base44, and Replit, as well as deployment platform Netlify. Roughly 5,000 of those assets, about 1.3%, contained sensitive corporate information. CEO Dor Zvi said his team found the exposure while researching shadow AI for customers. <a href="https://www.axios.com/2026/05/07/loveable-replit-vibe-coding-privacy">Axios independently verified</a> multiple exposed apps, and <a href="https://www.wired.com/story/thousands-of-vibe-coded-apps-expose-corporate-and-personal-data-on-the-open-web/">Wired confirmed</a> the findings separately.</p><p>Among the verified exposures: a shipping company app detailed which vessels were expected at which ports. An internal health company application listed active clinical trials across the U.K. Full, unredacted customer service conversations for a British cabinet supplier sat on the open web. Internal financial information for a Brazilian bank was accessible to anyone who found the URL.</p><p>The exposed data also included patient conversations at a children’s long-term care facility, hospital doctor-patient summaries, incident response records at a security company, and ad purchasing strategies. Depending on jurisdiction and the data involved, the healthcare and financial exposures may trigger regulatory obligations under HIPAA, UK GDPR, or Brazil’s LGPD. 
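Security teams can approximate this kind of discovery for their own footprint by pulling certificate-transparency records for the platforms' publishing domains and flagging hostnames that reference corporate names. A minimal sketch of the matching step (the `name_value` field mirrors the shape of crt.sh-style JSON records; the platform suffixes and the keyword heuristic are illustrative assumptions, and the hostnames below are fabricated):

```python
# Illustrative filter: given certificate-transparency records for vibe coding
# platform domains, flag subdomains that mention our organization's names.
PLATFORM_SUFFIXES = (".lovable.app", ".replit.app", ".base44.app", ".netlify.app")

def flag_corporate_subdomains(ct_records, org_keywords):
    """ct_records: iterable of dicts with a 'name_value' field holding one
    hostname per line (the shape crt.sh JSON uses). Returns hostnames that
    sit on a vibe coding platform and mention an org keyword."""
    keywords = [k.lower() for k in org_keywords]
    hits = set()
    for record in ct_records:
        for host in record.get("name_value", "").lower().splitlines():
            if host.endswith(PLATFORM_SUFFIXES) and any(k in host for k in keywords):
                hits.add(host)
    return sorted(hits)

# Fabricated example records for illustration:
records = [
    {"name_value": "acme-intake.lovable.app\nblog.example.com"},
    {"name_value": "acme-crm-demo.netlify.app"},
    {"name_value": "unrelated-todo.replit.app"},
]
print(flag_corporate_subdomains(records, ["acme"]))
# ['acme-crm-demo.netlify.app', 'acme-intake.lovable.app']
```

Every flagged hostname still needs manual triage: the scan establishes an inventory candidate list, not a verdict on what data the app holds or whether it requires authentication.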
</p><p>RedAccess found phishing sites built on Lovable that impersonated Bank of America, FedEx, Trader Joe’s, and McDonald’s. Lovable said it had begun investigating and removing the phishing sites.</p><h2>The defaults are the problem </h2><p>Privacy settings on several vibe coding platforms make apps publicly accessible unless users manually switch them to private. Many of these applications get indexed by Google and other search engines. Anyone can stumble across them. Zvi put it plainly: “I don’t think it’s feasible to educate the whole world around security. My mother is [vibe coding] with Lovable, and no offense, but I don’t think she will think about role-based access.”</p><h2>This is not an isolated finding </h2><p>In October 2025, Escape.tech <a href="https://escape.tech/blog/methodology-how-we-discovered-vulnerabilities-apps-built-with-vibe-coding/">scanned 5,600 publicly available vibe-coded applications</a> and found more than 2,000 high-impact vulnerabilities, over 400 exposed secrets including API keys and access tokens, and 175 instances of personal data exposure containing medical records and bank account numbers. Every vulnerability Escape found was in a live production system, discoverable within hours. The <a href="https://escape.tech/state-of-security-of-vibe-coded-apps">full report</a> documents the methodology. Escape separately raised <a href="https://www.iris.vc/articles/escape-raises-18m-series-a-to-fight-ai-powered-cyberattacks-with-ai-agents">an $18 million Series A</a> led by Balderton in March 2026, citing the security gap opened by AI-generated code as a core market thesis.</p><p>Gartner’s <a href="https://www.armorcode.com/blog/your-genai-code-debt-is-coming-due-heres-what-gartner-predicts">“Predicts 2026” report</a> forecasts that by 2028, prompt-to-app approaches adopted by citizen developers will increase software defects by 2,500%. 
Gartner identifies a new class of defect in which AI generates code that is syntactically correct but lacks awareness of broader system architecture and nuanced business rules. Gartner expects remediation costs for these deep contextual bugs to consume budgets previously allocated to innovation.</p><h2>Shadow AI is the multiplier</h2><p><a href="https://www.ibm.com/reports/data-breach">IBM’s 2025 Cost of a Data Breach Report</a> found that 20% of organizations experienced breaches linked to shadow AI. Those incidents added $670,000 to the average breach cost, pushing the shadow AI breach average to $4.63 million. Among organizations that reported AI-related breaches, <a href="https://newsroom.ibm.com/2025-07-30-ibm-report-13-of-organizations-reported-breaches-of-ai-models-or-applications,-97-of-which-reported-lacking-proper-ai-access-controls">97% lacked proper access controls</a>, and 63% of breached organizations had no AI governance policy in place.</p><p>Shadow AI breaches disproportionately exposed customer personally identifiable information (65%, versus 53% across all breaches) and involved data distributed across multiple environments 62% of the time. Only 34% of organizations with AI governance policies performed regular audits for unsanctioned AI tools. <a href="https://venturebeat.com/security/shadow-ai-doubles-every-18-months-creating-blind-spots-socs-never-see">VentureBeat’s shadow AI research</a> estimated that actively used shadow apps could more than double by mid-2026. 
Cyberhaven data found 73.8% of ChatGPT workplace accounts in enterprise environments were unauthorized.</p><h2>What to do first</h2><p>The audit framework below gives CISOs a starting point for triaging vibe-coded app risk across five domains.</p><table><tbody><tr><td><p><b>Domain</b></p></td><td><p><b>Current State (Most Orgs)</b></p></td><td><p><b>Target State</b></p></td><td><p><b>First Action</b></p></td></tr><tr><td><p>Discovery</p></td><td><p>No visibility into vibe-coded apps</p></td><td><p>Automated scanning of vibe coding platform domains</p></td><td><p>Run DNS + certificate transparency scan for Lovable, Replit, Base44, and Netlify subdomains tied to corporate assets</p></td></tr><tr><td><p>Authentication</p></td><td><p>Platform defaults (public by default)</p></td><td><p>SSO/SAML integration required before deployment</p></td><td><p>Block unauthenticated apps from accessing internal data sources</p></td></tr><tr><td><p>Code scanning</p></td><td><p>Zero coverage for citizen-built apps</p></td><td><p>Mandatory SAST/DAST before production</p></td><td><p>Extend the existing AppSec pipeline to cover vibe-coded deployments</p></td></tr><tr><td><p>Data loss prevention</p></td><td><p>No DLP coverage for vibe coding domains</p></td><td><p>DLP policies covering Lovable, Replit, Base44, Netlify</p></td><td><p>Add vibe coding platform domains to existing DLP rules</p></td></tr><tr><td><p>Governance</p></td><td><p>No AI usage policy or shadow AI detection</p></td><td><p>AI governance policy with regular audits for unsanctioned tools</p></td><td><p>Publish an acceptable-use policy for AI coding tools with a pre-deployment review gate</p></td></tr></tbody></table><p>The CISO who treats this as a policy problem will write a memo. 
The CISO who treats this as an architecture problem will deploy discovery scanning across the four largest vibe coding domains, require pre-deployment security review, extend the existing AppSec pipeline to citizen-built apps, and add those domains to DLP rules before the next board meeting. One of those CISOs avoids the next headline.</p><p>The vibe coding exposure RedAccess documented is not a separate problem from shadow AI. It is shadow AI&#x27;s production layer. Employees build internal tools on platforms that default to public, skip authentication, and never appear on any asset inventory, which means the applications stay invisible to security teams until a breach surfaces or a reporter finds them first. Traditional asset discovery tools were designed to find servers, containers, and cloud instances. They have no way to find a marketing configurator that a product manager built on Lovable over a weekend, connected to a Supabase database holding live customer records, and shared with three external contractors through a public URL that Google indexed within hours.</p><p>The detection challenge runs deeper than most security teams realize. Vibe-coded apps deploy on platform subdomains that rotate frequently and often sit behind CDN layers that mask origin infrastructure. Organizations running mature secure web gateways (SWGs), CASBs, or DNS logging can detect employee access to these domains. But detecting access is not the same as inventorying what was deployed, what data it holds, or whether it requires authentication. Without explicit monitoring of the major vibe coding platforms, the apps themselves generate a limited signal in conventional SIEM or endpoint telemetry. They exist in a gap between network visibility and application inventory that most security stacks were never architected to cover.</p><h2>The platform responses tell the story</h2><p>Replit CEO Amjad Masad said RedAccess gave his company only 24 hours before going to the press. 
Base44 (via Wix) and Lovable both said RedAccess did not include the URLs or technical specifics needed to verify the findings. None of the platforms denied that the exposed applications existed.</p><p><a href="https://www.wiz.io/blog/critical-vulnerability-base44">Wiz Research</a> separately discovered in July 2025 that Base44 contained a platform-wide authentication bypass. Exposed API endpoints allowed anyone to create a verified account on private apps using nothing more than a publicly visible app_id. The flaw meant that showing up to a locked building and shouting a room number was enough to get the doors open. Wix fixed the vulnerability within 24 hours after Wiz reported it, but the incident exposed how thin the authentication layer is on platforms where millions of apps are being built by users who assume the platform handles security for them.</p><p>The pattern is consistent across the vibe coding ecosystem. <a href="https://nvd.nist.gov/vuln/detail/CVE-2025-48757">CVE-2025-48757</a> documented insufficient or missing Row-Level Security policies in Lovable-generated Supabase projects. Certain queries skipped access checks entirely, exposing data across more than 170 production applications. The AI generated the database layer. It did not generate the security policies that should have restricted who could read the data. Lovable disputes the CVE classification, stating that individual customers accept responsibility for protecting their application data. That dispute itself illustrates the core tension: platforms that market to nontechnical builders are shifting security responsibility to users who do not know it exists.</p><h2>What this means for security teams </h2><p>The RedAccess findings complete the picture. Professional agents face credential theft on one layer. Citizen platforms face data exposure on the other. The structural failure is the same. Security review happens after deployment or not at all. 
Identity and access management systems track human users and service accounts. They do not track the Lovable app a sales operations analyst deployed last Tuesday, connected to a live CRM database, and shared with three external contractors via a public URL.</p><p>Nobody asks whether the database policies restrict who can read the data or whether the API endpoints require authentication. When those questions go unasked at AI-generation speed, the exposure scales faster than any human review process can match. The question for security leaders is not whether vibe-coded apps are inside their perimeter. The question is how many, holding what data, visible to whom. The RedAccess findings suggest the answer, for most organizations, is worse than anyone in the C-suite currently knows. The organizations that start scanning this week will find them. The ones that wait will read about themselves next.</p>]]></description>
            <author>louiswcolumbus@gmail.com (Louis Columbus)</author>
            <category>Security</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/3H8BkMlVjetAGClhRKtmFK/533b657c9dd2b63819c0db0d2f8c6559/5_000_vibe-coded_apps_just_proved_shadow_AI_is_the_new_S3_bucket_crisis.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[An AI agent rewrote a Fortune 50 security policy. Here's how to govern AI agents before one does the same.]]></title>
            <link>https://venturebeat.com/security/cisco-crowdstrike-rsac-2026-agent-identity-iam-gap-maturity-model</link>
            <guid isPermaLink="false">1LineMe8o1aapXczWqKIYD</guid>
            <pubDate>Fri, 08 May 2026 17:55:03 GMT</pubDate>
<description><![CDATA[<p>A CEO’s AI agent rewrote the company’s security policy. Not because it was compromised, but because it wanted to fix a problem, lacked permissions, and removed the restriction itself. Every identity check passed. CrowdStrike CEO George Kurtz <a href="https://venturebeat.com/security/rsac-2026-agentic-soc-agent-telemetry-security-gap">disclosed the incident and a second one at his RSAC 2026 keynote</a>, both at Fortune 50 companies.</p><p>The credential was valid. The access was authorized. The action was catastrophic.</p><p>That sequence breaks the core assumption underneath the IAM systems most enterprises run in production today: that a valid credential plus authorized access equals a safe outcome. Identity systems were built for one user, one session, one set of hands on a keyboard. Agents break all three assumptions at once.</p><p>In an exclusive interview with VentureBeat at RSAC 2026, Matt Caulfield, VP of Identity and Duo at Cisco (pictured above), walked through the architecture his team is building to close that gap and outlined a six-stage identity maturity model for governing agentic AI. The urgency is measurable: Cisco President Jeetu Patel told VentureBeat at the same conference that 85% of enterprises are running agent pilots while only 5% have reached production — an 80-point gap that the identity work is designed to close.</p><h2>The identity stack was built for a workforce that has fingerprints</h2><p>“Most of the existing IAM tools that we have at our disposal are just entirely built for a different era,” Caulfield told VentureBeat. “They were built for human scale, not really for agents.”</p><p>The default enterprise instinct is to shove agents into existing identity categories: human user; machine identity; pick one. &quot;Agents are a third kind of new type of identity,&quot; Caulfield said. &quot;They&#x27;re neither human. They&#x27;re neither machine. 
They&#x27;re somewhere in the middle where they have broad access to resources like humans, but they operate at machine scale and speed like machines, and they entirely lack any form of judgment.&quot;</p><p>Etay Maor, VP of Threat Intelligence at Cato Networks, <a href="https://venturebeat.com/security/openclaw-500000-instances-no-enterprise-kill-switch">put a number</a> on the exposure. He ran a live Censys scan and counted nearly 500,000 internet-facing OpenClaw instances. The week before, he had found 230,000; the count doubled in seven days.</p><p>Kayne McGladrey, an IEEE senior member who advises enterprises on identity risk, made the same diagnosis independently. Organizations are cloning human user accounts to agentic systems, McGladrey told VentureBeat, except agents consume far more permissions than humans would because of their speed, scale, and intent.</p><p>A human employee goes through a background check, an interview, and an onboarding process. Agents skip all three. The <a href="https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m03/cisco-reimagines-security-for-the-agentic-workforce.html">onboarding assumptions baked into modern IAM</a> do not apply. Scale compounds the failure. Caulfield pointed to projections that a trillion agents could operate globally. “We barely know how many people are in an average organization,” he said, “let alone the number of agents.”</p><h2>Access control verifies the badge. It does not watch what happens next.</h2><p>Zero trust still applies to agentic AI, Caulfield argued. But only if security teams push it past access and into action-level enforcement. “We really need to shift our thinking to more action-level control,” he told VentureBeat. “What action is that agent taking?”</p><p>A human employee with authorized access to a system will not execute 500 API calls in three seconds. An agent will. 
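That cadence gap is what action-level enforcement keys on. As a rough, hypothetical sketch (not any vendor's implementation; the agent name, actions, and thresholds below are invented for illustration), a gateway that checks both an allow-list and request cadence might look like this:

```python
from collections import deque

# Hypothetical action-level gateway check: evaluates WHAT an identity is
# doing and how fast, not just whether its credential is valid.
ALLOWED_ACTIONS = {"crm-agent": {"read_contact", "update_note"}}  # invented
MAX_CALLS_PER_WINDOW = 50   # invented threshold
WINDOW_SECONDS = 3.0

class ActionGate:
    def __init__(self):
        self.calls = deque()  # timestamps of recent allow-listed attempts

    def authorize(self, agent_id: str, action: str, now: float) -> bool:
        # Check 1: is the action on this agent's allow-list at all?
        if action not in ALLOWED_ACTIONS.get(agent_id, set()):
            return False
        # Check 2: is the cadence plausible? Every call below presents a
        # valid identity, but a machine-speed burst still gets refused.
        self.calls.append(now)
        while self.calls and now - self.calls[0] > WINDOW_SECONDS:
            self.calls.popleft()
        return len(self.calls) <= MAX_CALLS_PER_WINDOW

gate = ActionGate()
# 500 calls in roughly 2.5 seconds -- a burst a human would never produce.
results = [gate.authorize("crm-agent", "read_contact", i * 0.005)
           for i in range(500)]
print(results.count(True))  # only the first 50 clear the rate gate
```

The point of the sketch is the second check: an access-only control would have answered yes 500 times, because every request carried a valid, authorized identity.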
Traditional <a href="https://venturebeat.com/security/rsac-2026-agent-identity-frameworks-three-gaps">zero trust</a> verifies that an identity can reach an application. It doesn’t scrutinize what that identity does once inside.</p><p>Carter Rees, VP of Artificial Intelligence at <a href="https://reputation.com/">Reputation</a>, identified the structural reason. The flat authorization plane of an LLM fails to respect user permissions, Rees <a href="https://venturebeat.com/security/one-command-open-source-repo-ai-agent-backdoor-openclaw-supply-chain-scanner">told VentureBeat</a>. An agent operating on that flat plane does not need to escalate privileges. It already has them. That is why access control alone cannot contain what agents do after authentication.</p><p>CrowdStrike CTO Elia Zaitsev described the detection gap to VentureBeat. In most default logging configurations, an agent’s activity is indistinguishable from a human’s. Distinguishing the two requires walking the process tree, tracing whether a browser session was launched by a human or spawned by an agent in the background. Most enterprise logging cannot make that distinction.</p><p>Caulfield’s identity layer and Zaitsev’s telemetry layer are solving two halves of the same problem. No single vendor closes both gaps.</p><p>“At any moment in time, that agent can go rogue and can lose its mind,” Caulfield said. “Agents read the wrong website or email, and their intentions can just change overnight.”</p><h2>How the request lifecycle works when agents have their own identity</h2><p>Five vendors shipped agent identity frameworks at RSAC 2026: Cisco, CrowdStrike, Palo Alto Networks, Microsoft, and Cato Networks. 
Caulfield walked through how Cisco&#x27;s identity-layer approach works in practice.</p><p>The <a href="https://www.networkworld.com/article/4148823/cisco-goes-all-in-on-agentic-ai-security.html">Duo agent identity platform</a> registers agents as first-class identity objects, with their own policies, authentication requirements, and lifecycle management. The enforcement layer routes all agent traffic through an AI gateway supporting both <a href="https://venturebeat.com/security/meta-rogue-ai-agent-confused-deputy-iam-identity-governance-matrix">MCP</a> and traditional REST or GraphQL protocols. When an agent makes a request, the gateway authenticates the user, verifies that the agent is permitted, encodes the authorization into an OAuth token, and then inspects the specific action and determines in real time whether it should proceed.</p><p>“No solution to agent AI is really complete unless you have both pieces,” Caulfield told VentureBeat. “The identity piece, the access gateway piece. And then the third piece would be observability.”</p><p>Cisco <a href="https://siliconangle.com/2026/05/04/cisco-buys-astrix-security-strengthen-ai-agent-discovery-governance/">announced its intent to acquire Astrix Security</a> on May 4, signaling that agent identity discovery is now a board-level investment thesis. The deal also suggests that even vendors building identity platforms recognize that the discovery problem is harder than expected.</p><h2>Six-stage identity maturity model for agentic AI</h2><p>When a company shows up claiming 500 agents in production, Caulfield doesn&#x27;t accept the number. &quot;How do you know it&#x27;s 500 and not 5,000?&quot;</p><p>Most organizations don’t have a source of truth for agents. Caulfield outlined a six-stage engagement model.</p><p>Discovery first: identify every agent, where it runs, and who deployed it. Onboarding: register agents in the identity directory, tie each one to an accountable human, and define permitted actions. 
Control and enforcement: place a gateway between agents and resources, inspect every request and response. Behavioral monitoring: record all agent activity, flag anomalies, and build the audit trail. Runtime isolation: contain agents on endpoints when they go rogue. Compliance mapping: tie agent controls to audit frameworks before the auditor shows up. The six stages are not proprietary to any single vendor. They describe the sequence every enterprise will follow regardless of which platform delivers each stage.</p><p>Maor&#x27;s Censys data complicates step one before it even starts. Organizations beginning discovery should assume their agent exposure is already visible to adversaries. Step four has its own problem. Zaitsev&#x27;s process-tree work shows that even organizations logging agent activity may not be capturing the right data. And step three depends on something Rees found most enterprises lack: a gateway that inspects actions, not just access, because the LLM does not respect the permission boundaries the identity layer sets.</p><h2><b>Agentic identity prescriptive matrix</b></h2><p><i>What to audit at each maturity stage, what operational readiness looks like, and the red flag that means the stage is failing. Use this to evaluate any platform or combination of platforms.</i></p><table><tbody><tr><td><p><b>Stage</b></p></td><td><p><b>What to audit</b></p></td><td><p><b>Operational readiness looks like</b></p></td><td><p><b>Red flag if missing</b></p></td></tr><tr><td><p><b>1. Discovery</b></p></td><td><p>Complete inventory of every agent, every MCP server it connects to, and every human accountable for it.</p></td><td><p>A queryable registry that returns agent count, owner, and connection map within 60 seconds of an auditor asking.</p></td><td><p>No registry exists. Agent count is an estimate. No human is accountable for any specific agent. 
Adversaries can see your agent infrastructure from the public internet before you can.</p></td></tr><tr><td><p><b>2. Onboarding</b></p></td><td><p>Agents are registered as a distinct identity type with their own policies, separate from human and machine identities.</p></td><td><p>Each agent has a unique identity object in the directory, tied to an accountable human, with defined permitted actions and a documented purpose.</p></td><td><p>Agents use cloned human accounts or shared service accounts. Permission sprawl starts at creation. No audit trail ties agent actions to a responsible human.</p></td></tr><tr><td><p><b>3. Control</b></p></td><td><p>A gateway between every agent and every resource it accesses, enforcing action-level policy on every request and every response.</p></td><td><p>Four checkpoints per request: authenticate the user, authorize the agent, inspect the action, inspect the response. No direct agent-to-resource connections exist.</p></td><td><p>Agents connect directly to tools and APIs. The gateway (if it exists) checks access but not actions. The flat authorization plane of the LLM does not respect the permission boundaries the identity layer set.</p></td></tr><tr><td><p><b>4. Monitoring</b></p></td><td><p>Logging that can distinguish agent-initiated actions from human-initiated actions at the process-tree level.</p></td><td><p>SIEM can answer: Was this browser session started by a human or spawned by an agent? Behavioral baselines exist for each agent. Anomalies trigger alerts.</p></td><td><p>Default logging treats agent and human activity as identical. Process-tree lineage is not captured. Agent actions are invisible in the audit trail. Behavioral monitoring is incomplete before it starts.</p></td></tr><tr><td><p><b>5. 
Isolation</b></p></td><td><p>Runtime containment that limits the blast radius if an agent goes rogue, separate from human endpoint protection.</p></td><td><p>A rogue agent can be contained in its sandbox without taking down the endpoint, the user session, or other agents on the same machine.</p></td><td><p>No containment boundary exists between agents and the host. A single compromised agent can access everything the user can. Blast radius is the entire endpoint.</p></td></tr><tr><td><p><b>6. Compliance</b></p></td><td><p>Documentation that maps agent identities, controls, and audit trails to the compliance framework that the auditor will use.</p></td><td><p>When the auditor asks about agents, the security team produces a control catalog, an audit trail, and a governance policy written for agent identities specifically.</p></td><td><p>Emerging AI-risk frameworks (CSA Agentic Profile) exist, but mainstream audit catalogs (SOC 2, ISO 27001, PCI DSS) have not operationalized agent identities. No control catalog maps to agents. The auditor improvises which human-identity controls apply. The security team answers with improvisation, not documentation.</p></td></tr></tbody></table><p><i>Source: VentureBeat analysis of RSAC 2026 interviews (Caulfield, Zaitsev, Maor) and independent practitioner validation (McGladrey, Rees). May 2026.</i></p><h2>Compliance frameworks have not caught up</h2><p>“If you were to go through an audit today as a chief security officer, the auditor’s probably gonna have to figure out, hey, there are agents here,” Caulfield told VentureBeat. “Which one of your controls is actually supposed to be applied to it? I don’t see the word agents anywhere in your policies.”</p><p>McGladrey&#x27;s practitioner experience confirms the gap. The Cloud Security Alliance published an <a href="https://www.nist.gov/itl/ai-risk-management-framework">NIST AI RMF Agentic Profile</a> in April 2026, proposing autonomy-tier classification and runtime behavioral metrics. 
But SOC 2, ISO 27001, and PCI DSS have not operationalized agent identities. The compliance frameworks McGladrey works with inside enterprises were written for humans. Agent identities do not appear in any control catalog he has encountered. The gap is a lagging indicator; the risk is not.</p><h2>Security director action plan</h2><p>VentureBeat identified five actions from the combined findings of Caulfield, Zaitsev, Maor, McGladrey, and Rees.</p><ol><li><p><b>Run an agent census and assume adversaries already did.</b></p><p> Every agent, every MCP server those agents touch, every human accountable. Maor&#x27;s Censys data confirms agent infrastructure is already visible from the public internet. NIST&#x27;s NCCoE reached the same conclusion in its February 2026  <a href="https://www.nccoe.nist.gov/projects/software-and-ai-agent-identity-and-authorization">concept paper on AI agent identity and authorization</a>.</p></li><li><p><b>Stop cloning human accounts for agents.</b></p><p> McGladrey found that enterprises default to copying human user profiles, and <a href="https://venturebeat.com/security/microsoft-salesforce-copilot-agentforce-prompt-injection-cve-agent-remediation-playbook">permission sprawl starts on day one</a>. Agents need to be a distinct identity type with scope limits that reflect what they actually do.</p></li><li><p><b>Audit every MCP and API access path.</b></p><p>Five vendors shipped MCP gateways at RSAC 2026. The capability exists. What matters is whether agents route through one or connect directly to tools with no action-level inspection.</p></li><li><p><b>Fix logging so it distinguishes agents from humans.</b></p><p> Zaitsev&#x27;s process-tree method reveals that agent-initiated actions are invisible in most default configurations. Rees found authorization planes so flat that access logs alone miss the actual behavior. 
Logging has to capture what agents did, not just what they were allowed to reach.</p></li><li><p><b>Build the compliance case before the auditor shows up.</b></p><p> The CSA published a <a href="https://labs.cloudsecurityalliance.org/agentic/agentic-nist-ai-rmf-profile-v1/">NIST AI RMF Agentic Profile</a> proposing agent governance extensions. Most audit catalogs have not caught up. Caulfield told VentureBeat that auditors will see agents in production and find no controls mapped to them. The documentation needs to exist before that conversation starts.</p></li></ol><p></p>]]></description>
            <author>louiswcolumbus@gmail.com (Louis Columbus)</author>
            <category>Security</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/5ZEO9X2XqceSROWgaNS5Q4/66fa10252a4114f0cc41f837059998b0/Caulfield_article.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Anthropic wants to own your agent's memory, evals, and orchestration — and that should make enterprises nervous]]></title>
            <link>https://venturebeat.com/orchestration/anthropic-wants-to-own-your-agents-memory-evals-and-orchestration-and-that-should-make-enterprises-nervous</link>
            <guid isPermaLink="false">51e2y4hKsrN4MEaFTB17Oz</guid>
            <pubDate>Fri, 08 May 2026 17:51:41 GMT</pubDate>
<description><![CDATA[<p>Just a few weeks <a href="https://venturebeat.com/orchestration/anthropics-claude-managed-agents-gives-enterprises-a-new-one-stop-shop-but"><u>after announcing Claude Managed Agents</u></a>, Anthropic has updated the platform with <a href="https://venturebeat.com/technology/anthropic-introduces-dreaming-a-system-that-lets-ai-agents-learn-from-their-own-mistakes">three new capabilities</a> that collapse infrastructure layers such as memory, evaluation, and multi-agent orchestration into a single runtime.</p><p>This move could threaten the standalone tools that many enterprises cobble together.</p><p>The new capabilities — &#x27;Dreaming,&#x27; &#x27;Outcomes,&#x27; and &#x27;Multi-Agent Orchestration&#x27; — aim to make agents inside Claude Managed Agents “more capable at handling complex tasks with minimal steering,” Anthropic said in a press release.</p><p>Dreaming deals with memory: an agent “reflects” on its many sessions and curates memories so it learns and surfaces unknown patterns. Outcomes allows teams to define specific rubrics to measure an agent&#x27;s success, while Multi-Agent Orchestration breaks jobs down so a lead agent can delegate to other agents.</p><p>Claude Managed Agents is designed to give enterprises a simpler path to deploying agents, embedding orchestration logic in the model layer. It’s an end-to-end platform to manage state, execution graphs, and routing. 
With the addition of Dreaming, Outcomes, and Multi-Agent Orchestration, Claude Managed Agents expands its capabilities even further and directly competes with tools like LangGraph or CrewAI, as well as external evaluation frameworks, RAG memory architectures, and QA loops.</p><h2>An integration threat</h2><p>Enterprises must now ask: Should we ditch our flexible, modular system in favor of an agent platform that brings almost everything in-house?</p><p>Anthropic designed Claude Managed Agents to share context, state, and traceability in one place. This means the platform sees every decision agents make, rather than enterprises having to wire separate systems together. It sounds practical to have one platform that does everything. But not all enterprises want a full-service system. </p><p>Claude Managed Agents already faces criticism that it encourages vendor lock-in because it owns most of the architecture and tools that govern agents. In the current paradigm, an organization may run Managed Agents but keep multi-agent orchestration, memory, or evaluations separate, because that separation preserves flexibility. </p><p>The platform offers a fully hosted runtime, which means memory and orchestration run on infrastructure the enterprise does not own. This can become a compliance nightmare for organizations that have to prove data residency. </p><p>Another problem to consider is that enterprises already in the middle of large-scale AI transformations must cobble together workarounds to deal with the constraints of their tech stack. Not every workflow can easily be replaced by switching to Claude Managed Agents. </p><h2>Dreaming and outcomes against current tools</h2><p>Most enterprises have a fragmented approach to AI deployment.</p><p>For example, they may use LangGraph or CrewAI for agent routing and workflow management, Pinecone as a vector database for long-term memory, DeepEval for external evaluation, and human-in-the-loop quality assurance to review some tasks. 
Anthropic hopes to do away with all of that. </p><p>With Dreaming, Anthropic approaches memory by allowing users to actively rewrite it between sessions, so the agent essentially learns from its mistakes. Anthropic says this capability is useful for long-running states and orchestration. Current systems often handle memory persistence by storing embeddings, retrieving relevant context, and adding more state over time. </p><p>Outcomes addresses the evaluation portion by detailing expectations for agents. Instead of external quality checks, which are often done by a team of humans, Anthropic is bringing evaluation into the orchestration layer rather than above it. </p><p>But it’s the Multi-Agent Orchestration capability that pits Claude Managed Agents against orchestration frameworks from Microsoft, LangChain, CrewAI, and others. Model providers like Anthropic and OpenAI have already begun pushing aggressively into this space, arguing that bringing this to the model layer gives teams better control.</p><h2>Big decisions to make</h2><p>Enterprises face a big decision, and this one could depend on where they are in agent maturity. </p><p>If an organization is still experimenting with agents and has not deployed many in production, they may find moving to Claude Managed Agents and configuring Dreaming and Outcomes to their needs much easier. This is the stage of development where, even if enterprises are using a third-party orchestrator like LangChain, they’re still customizing it. </p><p>But for those who are already further along in the process, the calculation becomes trickier. It’s now a matter of parallel evaluation and better understanding of their processes. </p><p>Businesses, though, will face the same decision even if they don’t intend to use Claude Managed Agents. 
Anthropic’s move signals that other model and platform providers will likely shift their product roadmaps to a similar model that keeps everything locked in the same system — because models may become interchangeable, but the tooling and orchestration infrastructure will not. </p>]]></description>
            <category>Orchestration</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/D0GSZK3z1kRMUMZ5S6AVG/d022b4a260c084315b3ecd9254baa0d5/crimedy7_illustration_of_a_robot_trying_to_remember_something_40ad3c2a-097e-454e-a646-8ef5b077c76e_1.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[5% GPU utilization: The $401 billion AI infrastructure problem enterprises can't keep ignoring]]></title>
            <link>https://venturebeat.com/infrastructure/5-gpu-utilization-the-401-billion-ai-infrastructure-problem-enterprises-cant-keep-ignoring</link>
            <guid isPermaLink="false">3xuODBK6uplOtvHsAhdTjm</guid>
            <pubDate>Fri, 08 May 2026 13:00:00 GMT</pubDate>
<description><![CDATA[<p>For the last 24 months, one narrative justified every over-provisioned data center and bloated IT budget: the GPU scramble. Silicon was the new oil, and H100s traded like contraband. Reserve capacity now or your enterprise would be left behind.</p><p>The bill is now due, and the CFO is paying attention. Gartner estimates AI infrastructure is <a href="https://www.gartner.com/en/newsroom/press-releases/2026-1-15-gartner-says-worldwide-ai-spending-will-total-2-point-5-trillion-dollars-in-2026#:~:text=Building%20AI%20foundations%20alone%20will,foundations%20%28see%20Table%201">adding $401 billion in new spending this year</a>. Real-world audits tell a darker story: average GPU utilization in the enterprise is <a href="https://dataconomy.com/2026/04/23/tech-industry-averages-just-5-gpu-utilization-report-finds/#:~:text=A%20report%20from%20Cast%20AI,more%20GPU%20capacity%20than%20necessary">stuck at 5%</a>. </p><p>That <a href="https://venturebeat.com/infrastructure/fomo-is-why-enterprises-pay-for-gpus-they-dont-use-and-why-prices-keep-climbing">utilization floor is driven by a self-reinforcing procurement loop</a> that makes idle GPUs nearly impossible to release. What makes this shift more urgent is the CapEx reality now hitting enterprise balance sheets. Many organizations locked in GPU capacity under traditional three- to five-year depreciation cycles, with hyperscalers at five years. That means the infrastructure purchased during the peak of the “GPU scramble” is now a fixed cost, regardless of how much it is actually used.</p><p>As those assets age, the question is no longer whether the investment was justified. It’s whether it can be made productive. Underutilized GPUs are not just idle resources; they are depreciating assets that must now generate measurable return. 
This is forcing a shift in mindset: from acquiring capacity to maximizing the economic output of what is already deployed.</p><h2>The scramble was a sideshow</h2><p>For the &quot;Tier 1&quot; enterprise — the Intuits, Mastercards, and Pfizers of the world — access was rarely the true bottleneck. Leveraging deep-pocketed relationships with AWS, Azure, and GCP, these organizations secured capacity reservations that sat idle while internal teams struggled with data gravity, governance, and architectural immaturity.</p><p>The industry narrative of &quot;scarcity&quot; served as a convenient smokescreen for this inefficiency. While the headlines focused on supply chain delays, the internal reality was a massive productivity gap. Organizations were activity-rich (buying chips) but output-poor (generating near-zero useful tokens).</p><p>At 5% utilization, the math simply doesn&#x27;t work. For every dollar spent on silicon, 95 cents is essentially a donation to a cloud provider’s bottom line. In any other department, a 95% waste metric would be a firing offense; in AI infrastructure, it was just called &quot;preparedness.&quot;</p><h2>The Q1 tracker: A market in pivot</h2><p>VentureBeat’s <b>Q1 2026 AI Infrastructure &amp; Compute Market Tracker</b> confirms that the panic phase has officially broken. The tracker is directional rather than statistically definitive — January surveyed 53 qualified respondents, and in February there were 39 — but the pattern across both waves is consistent. 
When we asked IT decision-makers what actually drives their provider choices today, the results show a market in rapid pivot:</p><ul><li><p><b>The access collapse: </b>The “Access to GPUs/availability” factor dropped from 20.8% to 15.4% in a single quarter — from primary concern to secondary in 90 days.</p></li><li><p><b>The pragmatic pivot:</b> “Integration with existing cloud and data stacks” held steady as the top priority at roughly 43% across both waves, while security and compliance requirements surged from 41.5% to 48.7% — nearly closing the gap with integration.</p></li><li><p><b>The TCO mandate: </b>“Cost per inference/TCO (total cost of ownership)” as a top priority jumped from 34% to 41% in a single quarter, overtaking performance as the dominant procurement lens.</p></li></ul><p>The era of the blank check is dead. Inference is where AI becomes a line item. </p><p>Training and even fine-tuning were tactical projects; inference is a strategic business model. For most enterprises, the unit economics of that model are currently unsustainable. During the initial pilot phase, flat-fee licenses and bundled token deals allowed for architectural waste. Teams built long-context agents and complex retrieval pipelines because tokens were effectively a sunk cost. </p><p>As the industry moves toward usage-based pricing in 2026, those same architectures have become liabilities. When metered billing is applied to an infrastructure stack that sits idle 95% of the time, the cost per useful token becomes a line-item emergency the moment a project moves into production.</p><h4><b>From activity to productivity</b></h4><p>The shift highlighted in our Q1 data represents more than just a budget correction; it is a fundamental change in how the success of an AI leader is measured.</p><p>For the last two years, success was about “securing” the stack. In the efficiency era, success is “squeezing” the stack. 
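The squeeze is easy to quantify. A back-of-envelope sketch (the hourly GPU cost and token throughput below are invented figures, not tracker data) shows how cost per useful token scales inversely with utilization:

```python
# Illustrative only: cost per million useful tokens vs. GPU utilization.
HOURLY_GPU_COST = 4.00           # $/GPU-hour, assumed
TOKENS_PER_GPU_HOUR = 1_000_000  # tokens a fully utilized GPU could serve, assumed

def cost_per_million_useful_tokens(utilization: float) -> float:
    # Only the utilized fraction of the hour produces tokens; the rest of
    # the spend is the "donation" described above.
    useful_tokens = TOKENS_PER_GPU_HOUR * utilization
    return HOURLY_GPU_COST / useful_tokens * 1_000_000

for util in (0.05, 0.30, 0.60):
    print(f"{util:.0%} utilization -> "
          f"${cost_per_million_useful_tokens(util):,.2f} per 1M tokens")
```

With these assumed figures, tokens cost $80.00 per million at 5% utilization and $6.67 per million at 60% — the same silicon, twelve times the economic output per dollar.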
This is why <b>cost optimization platforms</b> saw the largest planned budget increase in our survey, becoming a top-tier priority as organizations realize that buying more GPUs is often the wrong answer.</p><p>Increasingly, IT users are asking how to stop paying for GPUs they aren&#x27;t using. They are moving away from measuring <b>GPU activity</b> (how many chips are powered on) and toward <b>GPU productivity</b> (how many useful tokens are generated per dollar spent).</p><p>The luxury of underutilization is now a liability. The next act of the enterprise AI play is about making the silicon you already have pay for itself.</p><h2>Owning the mint: The choice between token consumer and producer</h2><p>As organizations move from proof-of-concept to production, the focus is shifting away from the latest GPU and toward the architecture of token generation. In this new economic reality, every enterprise must decide its role in the token economy: will you be a token consumer, paying a permanent tax to a model provider, or a token producer, owning the infrastructure and the unit economics that come with it?</p><p>This choice is not just about cost; it is about how an organization decides to handle complexity. Owning inference infrastructure means managing KV cache persistence, understanding the storage architecture, knowing what latency guarantees are tolerable, and addressing power constraints. It also introduces real-world enterprise limitations (power availability, data center footprint, and operational complexity) that directly impact how far and how fast AI can scale.</p><p>At the core of this challenge is KV cache economics. Storing context in GPU memory delivers performance but comes at a premium, limiting concurrency and driving up cost per token. Offloading KV cache to shared NVMe-based storage can improve reuse and reduce prefill overhead, but introduces tradeoffs in latency and system design. 
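That memory premium follows directly from standard KV-cache sizing arithmetic. The sketch below uses dimensions roughly typical of a 70B-class model with grouped-query attention; the 40 GiB of free HBM is an assumption for illustration, not a measured figure:

```python
# KV-cache memory math: why GPU memory, not compute, often caps concurrency.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 80, 8, 128   # ~70B-class GQA model, assumed
BYTES_PER_ELEM = 2                            # fp16
CONTEXT_TOKENS = 4096
FREE_HBM_BYTES = 40 * 1024**3                 # HBM left after weights, assumed

# The factor of 2 covers the K and V tensors stored per layer per token.
kv_bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_ELEM
kv_bytes_per_seq = kv_bytes_per_token * CONTEXT_TOKENS
max_concurrent = FREE_HBM_BYTES // kv_bytes_per_seq

print(f"KV cache per 4k-token sequence: {kv_bytes_per_seq / 1024**3:.2f} GiB")
print(f"Concurrent sequences that fit in GPU memory: {max_concurrent}")
```

Under these assumptions each 4k-token sequence pins 1.25 GiB of HBM, capping concurrency at 32 sequences per GPU. Offloading the cache to NVMe lifts that ceiling, but every reuse re-pays transfer latency — exactly the tradeoff at issue here.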
As NVMe costs rise and GPU memory remains scarce, organizations are forced to balance performance against efficiency.</p><p>For a token producer, managing these tradeoffs across memory, storage, power, and operations is simply the cost of doing business at scale. For others, the overhead remains too high, requiring a different path.</p><h4><b>The specialized cloud pivot</b></h4><p>VentureBeat’s Q1 tracker shows that the market is already voting on this strategy. The top strategic direction for enterprises is now to move more workloads to specialized AI clouds, a category that grew from 30.2% to 35.9% in our latest survey.</p><p>These providers — including CoreWeave, Lambda, and Crusoe — are evolving. While they initially gained ground by serving model builders and training-heavy workloads, their revenue mix is changing rapidly. Today, training represents roughly 70% of their business volume, while inference customers make up the other 30%. We expect that ratio to flip by the end of 2026 as the long tail of enterprise inference begins to scale.</p><p>These specialized providers are gaining strategic attention because they are not just selling GPU access. They are selling the removal of infrastructure friction. They optimize the full stack — storage, networking, and scheduling — around inference-first economics rather than general-purpose cloud operations. For an organization aiming to be a token producer, these environments offer a more efficient factory floor than traditional hyperscalers.</p><h4><b>The rise of managed inference</b></h4><p>For organizations that realize they cannot efficiently build or manage their own inference factories, a different trend is emerging. Our survey found that the intention to evaluate inference outsourcing and managed LLM providers jumped from 13.2% to 23.1% in a single quarter.</p><p>This nearly 10-percentage-point increase represents a realization that building inference infrastructure internally often creates hidden costs. 
Providers like Baseten, Anyscale, Fireworks AI, and Together AI offer predictable pricing and service-level agreements without requiring customers to become experts in vLLM tuning or distributed GPU scheduling.</p><p>In this model, the enterprise remains a token consumer, but one that is actively looking to price away the complexity of the stack. These organizations are learning that managing inference internally is only viable if they have the volume to justify the operational burden.</p><h4><b>Simplifying the hybrid stack</b></h4><p>The choice to be a producer is also being made easier by a new layer of hybrid-cloud AI platforms. Solutions from Red Hat, Nutanix, and Broadcom are designed to operationalize open-source inference infrastructure without forcing every company to become a systems integrator.</p><p>The challenge is that modern inference depends on a rapidly evolving open-source stack: vLLM for high-throughput serving, Triton for model serving, Ray for distributed execution, and Kubernetes for orchestration. Each is powerful on its own, but complex to integrate, tune, and operate at scale. For most enterprises, the challenge isn’t access to these tools; it’s stitching them together into a reliable, production-grade inference pipeline. The promise of these newer platforms is portability: the ability to build an inference stack once and deploy it anywhere, whether in a hyperscaler, a specialized cloud, or an on-premises data center.</p><p>Our <b>Q1 2026 AI Infrastructure &amp; Compute Market Tracker</b> confirms that interest in these DIY-but-managed stacks is growing, jumping from 11.3% in January to 17.9% in February, alongside a steady rise in organizations leaning into open source. This flexibility matters because enterprise AI will not be centralized in one place. 
Inference workloads will be distributed based on where data lives, how sensitive it is, and where the cost of running it is lowest.</p><p>The winner in the next phase of the token economy will not be the platform that forces standardization through restriction. It will be the one that delivers standardization through portability, allowing enterprises to switch between being consumers and producers as their needs evolve.</p><h2>The architecture of efficiency: The technical levers of productivity</h2><p>Fixing the 5% utilization wall requires more than just better software; it requires a structural overhaul of the efficiency stack. Many organizations are discovering that high activity is not the same as high productivity. A cluster can run at full tilt but remain economically inefficient if time-to-first-token is too high or if inference requests spend too much time in prefill.</p><p>Inference economics are determined by how much useful output a cluster generates per unit of cost. This requires a shift from measuring GPU activity — simply having the chips powered on — to measuring GPU productivity. Achieving that productivity depends on three technical levers: the network, the memory, and the storage stack.</p><h4><b>Networking: The cost of waiting</b></h4><p>The network is the often-ignored backbone of inference economics. In a distributed environment, the speed at which data moves between compute nodes and storage determines whether a GPU is actually working or merely waiting.</p><p>RDMA (Remote Direct Memory Access) has become the non-negotiable standard for this move. By allowing data to bypass the CPU and move directly between memory and the GPU, RDMA eliminates the latency spikes that traditional network architectures introduce. 
In practical terms, an RDMA-enabled architecture can increase the output per GPU by a factor of ten for concurrent workloads.</p><p>Without this level of networking, an enterprise is effectively paying a &quot;waiting tax&quot; on every chip in the rack. As model context windows expand and multi-node orchestration becomes the norm, the network determines whether a cluster is a high-speed factory or a bottlenecked warehouse.</p><h4><b>Solving the memory tax: Shared KV cache</b></h4><p>As models become larger and context windows expand toward the millions of tokens, the cost of repeatedly rebuilding the prompt state has become unsustainable. Large language models rely on key-value (KV) caches to maintain context during a session. Traditionally, these are stored in local GPU memory, which is both expensive and limited.</p><p>This creates a &quot;memory tax&quot; that crushes unit economics as concurrency rises. To solve this, the industry is moving toward persistent shared KV cache architectures. By storing the cache centrally on high-performance storage rather than redundantly across multiple GPU nodes, organizations can reduce prefill overhead and improve context reuse.</p><p>Newer architectures are already proving this out. The VAST Data AI Operating System, running on VAST C-nodes using Nvidia BlueField-4 DPUs, allows for pod-scale shared KV cache that collapses legacy storage tiers. Similarly, the HPE Alletra Storage MP X10000 — the first object-based platform to achieve Nvidia-Certified Storage validation — is designed specifically to feed data to inference resources without the coordination tax that causes bottlenecks at scale. WEKA is another provider in this space. </p><h4><b>The compression edge</b></h4><p>Beyond the physical hardware, new algorithmic contributions are redefining what is possible in inference memory. Google’s recent presentation of TurboQuant at ICLR 2026 demonstrates the scale of this shift. 
TurboQuant provides up to a 6x compression level for the KV cache with zero accuracy loss.</p><p>Techniques like these allow for building large vector indices with minimal memory footprints and near-zero preprocessing time. For the enterprise, this means more concurrent users on the same hardware estate without the &quot;rebuild storms&quot; that typically cause latency spikes. The caveat: compression standards remain contested — no open-source consensus has emerged, and the space is shaping up as a proprietary stack war between Google and Nvidia.</p><h4><b>Storage as a financial decision</b></h4><p>Storage is no longer just a backend decision; it is a financial one. Platforms like Dell PowerScale are now delivering up to 19x faster time-to-first-token compared to traditional approaches, according to Dell. By separating high-performance shared storage and memory-intensive data access from scarce GPU resources, these platforms allow inference to scale more efficiently.</p><p>When a storage layer can keep GPU-intensive workloads continuously fed with data, it prevents expensive resources from sitting idle. In the efficiency era, the goal is to drive the 5% utilization wall upward by ensuring that every cycle is spent on token generation, not on data movement.</p><p>But as the stack becomes more efficient, the perimeter becomes more porous. High-productivity tokens are worthless if the data powering them cannot be trusted.</p><h2><b>Sovereignty and the agentic future: Building the trust foundation</b></h2><p>The final barrier to achieving return on AI is not a technical bottleneck, but a trust bottleneck. As enterprise AI shifts from simple chatbots to autonomous agents, the risk profile changes. Agents require deep access to internal systems and intellectual property to be useful. 
Without a sovereign architecture, that access creates a liability that most organizations are not equipped to manage.</p><p>VentureBeat research into the <a href="https://venturebeat.com/orchestration/the-ai-governance-mirage-why-72-of-enterprises-dont-have-the-control-and-security-they-think-they-do">state of AI governance reveals a stark disconnect</a>. While many organizations believe they have secured their AI environments, 72% of enterprises do not have the level of control and security they think they do. This governance mirage is particularly dangerous as agentic systems move into production. In the last 12 months, <a href="https://venturebeat.com/security/most-enterprises-cant-stop-stage-three-ai-agent-threats-venturebeat-survey-finds">88% of executives reported security incidents</a> related to AI agents.</p><h4><b>Sovereignty as an architecture principle</b></h4><p>Data sovereignty is often treated as a geographic or regulatory checkbox. For the strategic enterprise, it must be treated as a core architecture principle. It is about maintaining control, lineage, and explainability over the data that powers an agentic workflow.</p><p>This requires a new approach to data maturity, modeled on the traditional medallion architecture. In this framework, data moves through layers of usability and trust — from raw ingestion at the bronze level, through refined silver, to curated gold and, eventually, platinum-quality operational data. AI inference must follow this same discipline.</p><p>Agentic systems do not just need available context; they need trusted context. Providing the wrong data to an agent, or exposing sensitive intellectual property to a non-sovereign endpoint, creates both business and regulatory risk. Compartmentalization must be designed into the stack from the start. 
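</p><p>As an illustration of designed-in compartmentalization (every tier and agent name here is hypothetical), an explicit access map makes the policy auditable rather than implicit:</p>

```python
# Sketch: compartmentalized, lineage-aware data access for agents.
# Tier names follow the medallion framing; agent names are hypothetical.
TIERS = ["bronze", "silver", "gold", "platinum"]  # ascending trust

# Each agent is granted a maximum trust tier it may read.
AGENT_CLEARANCE = {
    "support-chatbot": "gold",
    "finance-close-agent": "platinum",
    "exploratory-notebook": "bronze",
}

def can_access(agent: str, tier: str) -> bool:
    """An agent may read a tier only at or below its clearance."""
    clearance = AGENT_CLEARANCE.get(agent)
    if clearance is None:
        return False  # default-deny for unknown agents
    return TIERS.index(tier) <= TIERS.index(clearance)

def read(agent: str, tier: str, dataset: str) -> str:
    if not can_access(agent, tier):
        raise PermissionError(f"{agent} denied {tier}/{dataset}")
    print(f"lineage: {agent} read {tier}/{dataset}")  # record lineage
    return f"{tier}/{dataset}"

read("support-chatbot", "gold", "orders")        # allowed, lineage logged
# read("exploratory-notebook", "gold", "orders") # raises PermissionError
```

<p>The point is not this particular scheme but that the mapping of agents to data layers exists as an enforced artifact, with every read producing a lineage record.</p><p>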
Organizations need to know which models and agents can access specific data layers, under what conditions, and with what lineage attached.</p><h4><b>Bringing the AI to the data</b></h4><p>The fundamental question for the agentic future is whether to bring the data to the AI or the AI to the data. For highly sensitive workloads, moving data to a centralized model endpoint is often the wrong answer.</p><p>The move toward private AI — where inference happens closer to where trusted data resides — is gaining momentum. This architecture uses sovereign clouds, private environments, or governed enterprise platforms to keep the data perimeter intact.</p><p>This is where the choice to be a token producer becomes a security advantage. By owning the inference stack, an enterprise can enforce governance and lineage at the infrastructure layer. It ensures that the intellectual property used to ground an agent never leaves the organization&#x27;s control.</p><h4><b>The next platform war</b></h4><p>The battle for AI dominance will not be decided by who owns the largest GPU clusters. It will be won by the companies with the best inference economics and the most trusted data foundation.</p><p>The organizations that win the efficiency era will be those that deliver the lowest cost per useful token and the fastest path to production. They will be the ones that have moved past the hoarding hangover to focus on productive output.</p><p>Achieving return on AI requires a shift in mindset. It means moving from a culture of securing the stack to a culture of squeezing the stack. It requires architectural rigor, a focus on token-level ROI and a commitment to sovereignty. When an organization can generate its own tokens efficiently and securely, AI moves from a science project to an economically repeatable business advantage.</p><p>That is how ROI becomes real. 
That is where the next generation of enterprise advantage will be built.</p><p><i>Rob Strechay is a Contributing VentureBeat analyst and principal at Smuget Consulting, a research and advisory firm focused on data infrastructure and AI systems.</i></p><p><i>Disclosure: Smuget Consulting engages or has engaged in research, consulting, and advisory services with many technology companies, which can include those mentioned in this article. Analysis and opinions expressed herein are those of the analyst individually, informed by data and other information that may have been provided for validation, and are not those of VentureBeat as a whole.</i></p>]]></description>
            <category>Infrastructure</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/689tH0bKXw4yNCgbnTOvvA/8e7387c8db90ccb74b93fb52c5cc5e23/Gemini_Generated_Image_catf9dcatf9dcatf.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Governance, not gatekeeping: How SAP brings enterprise‑grade safety to AI connectivity]]></title>
            <link>https://venturebeat.com/orchestration/governance-not-gatekeeping-how-sap-brings-enterprise-grade-safety-to-ai-connectivity</link>
            <guid isPermaLink="false">4ALu9B2oRvfMsT8lCo4KXx</guid>
            <pubDate>Fri, 08 May 2026 07:00:00 GMT</pubDate>
            <description><![CDATA[<p><i>Presented by SAP</i></p><hr/><p>The enterprise software industry has undergone a fundamental shift, and vendors are adapting their approaches to better protect the customers who rely on them. For years, every global platform vendor running multi-tenant cloud infrastructure has maintained documented rate limits, usage controls, and restrictions on the use of undocumented internal interfaces. </p><p>CRM platforms impose daily API call limits per organization, enforce platform-layer limits, and maintain a strict separation between bulk data APIs and transactional REST surfaces. Productivity and collaboration suites throttle their graph APIs and redirect bulk workloads to purpose-built data access channels designed for that load. HR and workforce management platforms enforce concurrent request limits and per-session data retrieval caps. IT service management platforms enforce per-user rate limits and instance-level throttling. Hyperscalers publish per-service quotas, enforce them at the infrastructure layer, and explicitly prohibit applications from calling non-SDK or non-published interfaces. </p><p>These are not controversial measures. They are baseline hygiene for enterprise-grade software platforms operating shared infrastructure at scale. For more than a decade these measures have been in place without serious objection.</p><p>As SAP has taken responsibility for securing customers&#x27; mission-critical workloads in the cloud, a unified <a href="https://help.sap.com/doc/sap-api-policy/latest/en-US/API_Policy_latest.pdf">API policy</a> with clarified usage controls is not a restriction but the expression of enterprise-grade stewardship. Some have read the policy as a new restriction. The policy does not introduce new restrictions. It names and unifies controls that have existed across individual SAP products for years. </p><p>SAP is not introducing API governance as a novel concept. 
SAP <a href="https://help.sap.com/doc/sap-api-policy/latest/en-US/API_Policy_latest.pdf">SuccessFactors</a>, SAP <a href="https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/throttling-limits-for-learning-odata-apis">Ariba</a>, SAP <a href="https://help.sap.com/docs/leanix/ea/rate-limiting">LeanIX</a>, and several other SAP solutions have long enforced documented rate limits and usage controls. SAP Notes and SAP’s documentation have likewise long defined permitted API usage. </p><p>What the recent policy does is unify that existing practice into a single cross-portfolio standard, a step made urgent by the arrival of autonomous agentic harnesses that SAP is fully committed to enabling, but which place a categorically different performance, stability, and security load on API surfaces that were never designed for autonomous orchestration and data extraction at scale.</p><h2>Custom interfaces: What SAP’s API policy does and does not restrict</h2><p>Custom APIs built by customers in their own namespace for their own extensibility, integration, and migration purposes are customer-developed interfaces, and the policy treats them as such. If you have spent years building custom data services, custom RFCs, and ABAP interfaces to connect your SAP system to the world around it, the policy&#x27;s restriction on non-published APIs might read, on first encounter, like a demolition order. It is not. The policy&#x27;s restriction targets SAP&#x27;s own internal unreleased objects. It does not reach into the Z namespace and condemn two decades of ABAP engineering.</p><p>SAP’s Private Cloud customers are in a distinctly privileged position compared with much of the enterprise world, because they have long been able to build in their own namespace and to shape an environment they were free to modify and extend, and that freedom is not being revoked. 
</p><p>The policy is focused on something narrower: SAP’s own internal interfaces that were never published, never documented for customer use, and never offered as a dependable foundation for integration. Most custom code never touches these internals and will continue untouched; where it does, the risk for customers has always been present, and the policy merely names it rather than inventing it. </p><p>However, within that set there is a smaller class of interfaces that is not a matter for debate but for prohibition. ODP-RFC belongs in that class: it sits in SAP’s namespace as an internal, non-released interface that SAP explicitly classifies as “unpermitted” for customer or third-party application use, as documented in <a href="https://me.sap.com/notes/3255746">SAP Note 3255746</a>. </p><p>These are precisely the kinds of interfaces SAP will flag as prohibited in notes and automated tooling, so that such usage can be identified early rather than discovered late in deployment or operations. Clean Core is distinct from the API Policy but points in the same direction. It bears noting that customers did not merely accept Clean Core but asked for it repeatedly, having lived through the upgrade costs of the alternative. In the agentic era, where SAP runs mission-critical ERP as a service, both the Clean Core Recommendations and the API Policy are conditions of the enterprise-grade reliability that cloud operations make possible.</p><h2>How AI agents change API usage patterns in SAP systems</h2><p>While some commentators have argued this policy is primarily a commercial move, the technical evidence tells a different story.</p><p>AI has changed everything about our traditional view of transactional interfaces. The APIs that enterprises have used for decades to integrate SAP systems with third-party applications are request-response interfaces built for transactional workloads. 
They were designed to fetch a sales order, post a goods receipt, or trigger a payment run. They were designed to be mostly called by a human-authored integration flow, at a predictable frequency, for a defined business purpose. They were not designed to have an autonomous AI orchestration harness run thousands of sequential calls against them in pursuit of semantic context about the business model encoded within. That is not a clean core integration pattern.</p><p>Much of the debate misses a core architectural distinction. A traditional integration tool reads a sales order from SAP, converts it into the format a target schema needs, and moves it on. SAP&#x27;s data model plays no role beyond being a transient interpretation step. </p><p>An AI agent does something categorically different. It does not merely retrieve a value. It reads the sales order header data and learns that this structure represents a customer commitment to buy. It reads the line item data and learns how individual items relate to that order. It reads the net value and learns that this number is meaningful only when paired with the document currency. It traces the path that a sales order takes through delivery, billing, and finally into the accounting ledger, and internalizes how SAP reconciles operations and finance within its business object model. </p><p>The agent is not only consuming a customer&#x27;s transactional data. It is consuming the semantic ontology: the business object definitions, the relationships between entities, the conceptual architecture that SAP has built and refined over five decades of enterprise knowledge encoding.</p><p>SAP has long distinguished between enabling transactional access to customer data and the broader extraction or replication of the underlying ontology. The policy does not create this boundary, because it already existed. 
Autonomous agents must continue to respect that boundary, rather than redefine it.</p><h2>Security risks in third-party MCP implementations</h2><p>Then there is a security angle, and it is not abstract. The same week this policy was published, a supply chain attack named Mini Shai-Hulud, a variant of the earlier Shai-Hulud npm worm, quietly compromised hundreds of software packages. SAP-ecosystem npm packages were among those compromised, and we addressed this with <a href="https://me.sap.com/notes/3747787">this security note for customers</a>. This is not a theoretical threat model. This is the active threat environment in which community-built MCP servers are being connected to productive SAP systems running mission-critical business processes.</p><p>The <a href="https://venturebeat.com/security/mcp-stdio-flaw-200000-ai-agent-servers-exposed-ox-security-audit">OWASP MCP Top 10</a> documents the vulnerability classes systematically: tool poisoning, prompt injection, privilege escalation via scope creep, token mismanagement, and supply chain compromise. Recent research across thousands of analyzed MCP implementations shows that a majority operate with static long-lived credentials or carry identifiable security findings, and a single compromised package in the MCP ecosystem can cascade into hundreds of thousands of exposed development environments. Just last week, VentureBeat <a href="https://venturebeat.com/security/mcp-stdio-flaw-200000-ai-agent-servers-exposed-ox-security-audit">reported</a> a serious command execution flaw that made up to 200,000 MCP servers vulnerable.</p><p>Consider what that means in practice. 
An AI agent that has just internalized the semantic structure of your SAP data model and is operating through a community MCP server moves beyond a productivity tool and into an elevated risk category, one that combines broad system access with an attack surface that is still evolving.</p><h2>Why MCP alone cannot run SAP business processes</h2><p>The MCP debate has also obscured a technical reality that enterprise architects need to confront directly. The Model Context Protocol is plumbing. It specifies how an AI model calls a tool. It says nothing about whether the model understands what the tool does in a business context, in what sequence tools must be called, what side effects a given API invocation will trigger, or what the consequences of an incorrect parameter will be. A naive MCP implementation connecting to SAP OData services can call a tool. It cannot run a business process.</p><p>The token consumption data from production agentic deployments is instructive. For illustration, a query asking for an employee&#x27;s manager and traversing through the list of peers in an SAP SuccessFactors system consumed 565,000 tokens under a standard MCP implementation. The same query under a context-aware implementation consumed 80,000 tokens. That is the difference between a query costing $1.70 and one costing $0.24, a gap repeated across thousands of daily transactions. The standard MCP implementation is not automation. It is an expensive approximation of automation that fails on complex queries while loading the API surface with traffic it was not designed to carry.</p><h2>SAP’s architecture for open third-party AI integration via A2A</h2><p>SAP&#x27;s response to these challenges is not to close the ecosystem but to build the right infrastructure for an open one. That distinction is worth dwelling on.</p><p>The API Policy anchors compliance in documented, co-engineered architectures. 
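</p><p>The token figures cited above are internally consistent: both imply a blended price of about $3 per million tokens. A quick sanity check, with the per-token price treated as an assumption:</p>

```python
# Sanity check on the MCP token-consumption illustration above.
# The blended price per million tokens is an assumption (~$3/M);
# the token counts are the article's figures.
PRICE_PER_MILLION_USD = 3.00  # hypothetical blended rate

def query_cost(tokens: int) -> float:
    """Dollar cost of one agentic query at the assumed rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD

naive_mcp = query_cost(565_000)      # standard MCP implementation, ~$1.70
context_aware = query_cost(80_000)   # context-aware implementation, ~$0.24

# Repeated across thousands of daily transactions, the gap compounds:
daily_queries = 10_000               # hypothetical volume
daily_gap = (naive_mcp - context_aware) * daily_queries
print(f"~${daily_gap:,.0f} per day saved by the context-aware path")
```

<p>The arithmetic is trivial, but it is the kind of unit-economics check that determines whether an agentic deployment scales or stalls.</p><p>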
The agentic interoperability reference architectures jointly developed with major technology partners are published and available on the <a href="https://architecture.learning.sap.com/docs/ref-arch/ca1d2a3e/1">SAP Architecture Center</a>, prioritized by customer demand and updated as new patterns are validated. </p><p>The bi-directional <a href="https://www.sap.com/products/artificial-intelligence/joule-joule-and-microsoft-365-copilot.html">integration</a> of SAP Joule and Microsoft 365 Copilot is the most visible example of what co-engineered agentic integration looks like in production: two AI systems, from two different vendors, working across each other&#x27;s application surfaces without either party bypassing the other&#x27;s security model. The endorsed path for external AI agent access to SAP is the Agent Gateway via the A2A protocol, with the <a href="http://architecture.learning.sap.com/docs/aigp">AI Golden Path</a> reference architecture on the SAP Architecture Center. The SAP Knowledge Graph, the <a href="https://open-resource-discovery.org/">Open Resource Discovery (ORD) specification for metadata</a>, and SAP BDC <a href="https://api.sap.com/dataproducts">data products</a> provide the context layer that transforms a protocol connection into a business-capable interaction. SAP also offers governed MCP servers for CAP, UI5, and Fiori Elements, and has indicated its intent to extend this model to additional development environments, including ABAP development. These are not closed doors; they are the right doors. </p><p>SAP&#x27;s position in the standards community is that of an active contributor, not a gatekeeper. SAP is a launch partner of the Agent2Agent (A2A) protocol under the Linux Foundation and holds <a href="https://aaif.io/members/">Gold-level membership in the Agentic AI Foundation</a>, co-chairing the Agent Identity and Trust workstream alongside the organizations that define how AI agents authenticate, authorize, and interoperate across enterprise boundaries. 
</p><p>A2A and MCP are not external constraints that SAP is grudgingly accommodating. They are protocols SAP uses internally and is actively hardening through standards work. When community and open-source frameworks meet the security floor that enterprise deployment requires, external integration pathways will follow. </p><p>The API Policy <a href="https://help.sap.com/doc/sap-api-policy/latest/en-US/API_Policy_latest.pdf">issued by SAP</a> does not mark the end of openness. The industry has spent two years deploying AI agents against enterprise systems using protocols that the enterprise security community had not finished hardening, against APIs that were never designed for autonomous orchestration, and with community tooling that attackers had already learned to compromise. Governance was not optional; it was timely.</p><p><i>Anirban Majumdar is Head of the Office of the CTO at SAP.</i></p><hr/><p><i>Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact </i><a href="mailto:sales@venturebeat.com"><i><u>sales@venturebeat.com</u></i></a><i>.</i></p>]]></description>
            <category>Orchestration</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/2ULwo7S7AVJd9EdiIMCoeF/d4af5ac9e026996471370953b888067e/AdobeStock_713835935__1_.jpeg?w=300&amp;q=30" length="0" type="image/jpeg"/>
        </item>
    </channel>
</rss>