Today, we are launching Operator, an agent that can browse the web and complete tasks for you. It uses its own web browser to view web pages and interact by typing, clicking, and scrolling. Right now, Operator is in a research preview, so it has some limitations and will improve as we get feedback. Operator is one of our first agents: AIs that can handle tasks independently when given instructions.

Ask Operator to automate browser tasks such as filling out forms, placing grocery orders, and generating memes. By working with familiar websites and tools, Operator streamlines daily workflows and creates new ways for businesses to engage customers.

We are starting with a small rollout to ensure a smooth launch. Operator is now available to Pro users in the US at operator.chatgpt.com. This research preview helps us learn and improve. As Operator develops, we plan to expand access to Plus, Team, and Enterprise users and add its features to ChatGPT. Now, let’s look at how Operator works.  

How Operator Works 

Operator runs on a new model called Computer-Using Agent (CUA). CUA combines GPT-4o’s vision capabilities with advanced reasoning to work with graphical user interfaces, such as the buttons, menus, and text fields you see on your screen.

Operator views your screen content through screenshots and interacts by performing mouse clicks and keyboard inputs within its browser. This enables it to execute web-based tasks without needing special API integrations.  
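
To make this concrete, here is a minimal sketch of the screenshot-in, action-out loop described above. It is purely illustrative: the browser and model helpers are hypothetical stand-ins, since Operator’s internals are not public.

```python
# Illustrative sketch of a screenshot-in, action-out loop; all helpers here are
# hypothetical stand-ins, since Operator's internals are not public.
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_task(model, browser, task: str, max_steps: int = 50) -> None:
    """Loop: screenshot -> model proposes a GUI action -> execute it in the browser."""
    for _ in range(max_steps):
        screenshot = browser.screenshot()             # hypothetical browser API
        action = model.next_action(task, screenshot)  # hypothetical vision-model call
        if action.kind == "done":
            break
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
```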

If Operator encounters problems or makes a mistake, it can use its reasoning skills to resolve them. If it gets stuck and needs help, it hands control back to you, keeping the experience smooth and collaborative.

CUA is new and has limitations, but it already sets new records on WebArena and WebVoyager, two key browser benchmarks. Read more about Operator’s research in our blog post. Now, let’s see how to use Operator.

How to Use 

To start, just tell Operator what you want it to do, and it will take care of the rest. You can take control of the remote browser at any time. Operator is also trained to ask you to take over for tasks that require a login, payment details, or CAPTCHA solving.

Customize Operator by adding instructions for its behavior across all sites or on specific ones, such as setting flight preferences on Booking.com. You can save prompts on the homepage for instant reuse, which is ideal for recurring tasks such as Instacart grocery restocks. Like browser tabs, you can run multiple Operator sessions in parallel, for example ordering a custom mug on Etsy while booking a campsite on Hipcamp.

Ecosystem and Users 

Operator changes AI from a passive tool into an active part of the digital world. It helps users get things done faster and gives companies new ways to improve customer experiences and boost conversions. We’re working with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator meets real needs and complies with industry standards. We also see many ways Operator can make certain tasks easier and more efficient, especially in the public sector. For example, we’re partnering with the City of Stockton to help people sign up for city services and programs more easily.

We’re releasing Operator to a small group to gather feedback and improve quickly. This approach balances new features, trust, and safety, ensuring Operator delivers value to users, creators, businesses, and public organizations.  

Safety and Privacy 

Protecting users is our top priority. Operator includes three layers of safeguards to prevent abuse and keep users in control. The first layer keeps you in control: Operator prompts for your input at key moments.

  • Takeover mode: When sensitive information, such as passwords or payment details, is required, Operator prompts you to take over. During this mode, Operator does not collect or record anything you type.  
  • User confirmations: Before actions such as placing an order or sending an email, Operator asks for your approval.  
  • Task limitations: Operator declines certain sensitive tasks, such as banking transactions or job applications.  
  • Watch mode: On sensitive sites, Operator works under close supervision so you can quickly address mistakes.

Operator provides simple controls for managing data privacy.  

  • Training opt-out: If you turn off “Improve the model for everyone” in ChatGPT settings, Operator will not use your data to train our models.  
  • Transparent data management: The privacy section of Operator’s settings lets you delete all browsing data and log out of all sites with one click. It also allows easy deletion of past conversations.

Protections are in place to prevent websites from misleading Operator with hidden prompts, harmful code, or phishing attempts.

  • Cautious navigation: Operator recognizes and ignores prompt injection attempts.  
  • Monitoring: A dedicated model watches for suspicious behavior and can pause a task if it detects something wrong.  
  • Detection pipeline: Automated systems and specialists review threats and update safeguards quickly.

Operator is designed to refuse harmful requests and block disallowed content. Moderation systems can warn users or revoke access if rules are repeatedly violated. Additional review steps help detect and address misuse, and we provide guidance on using Operator in accordance with our usage policies.

No system is perfect, and Operator remains in a research preview. Ongoing improvements are informed by real-world feedback and thorough testing. Visit the Operator research blog’s safety section for more information.  

Limitations 

Operator is in an early research preview. It can complete many tasks but may make errors, especially with complex user interfaces such as slideshows or calendars. User feedback will inform improvements in accuracy, reliability, and safety.  

What’s Next? 

We plan to make CUA, the model behind Operator, available in the API soon. This will let developers build agents based on CUA. We’ll share a release timeline as we get more feedback from the research preview.  

Enhanced capabilities: We’ll keep working to help Operator handle longer, more complex workflows. We’ll expand Operator to Plus, Team, and Enterprise users and integrate its capabilities directly into ChatGPT in the future, once we are confident in its safety and usability at scale, unlocking seamless, real-time, and asynchronous task execution.

Source: Introducing Operator 

NemoClaw can be installed with a single command, making it easy to add security and privacy for always‑on OpenClaw agents. It runs in the cloud, on premises, and on NVIDIA GeForce RTX PCs, NVIDIA DGX Station, and NVIDIA DGX Spark.  

At GTC, NVIDIA announced the NVIDIA NemoClaw stack for the OpenClaw agent platform. With NemoClaw, users can install NVIDIA Nemotron models and the new NVIDIA OpenShell runtime in one step. This update adds privacy and security controls. As a result, self-evolving autonomous AI agents, called Claws, are now more trustworthy, scalable, and accessible.

“OpenClaw opened the next frontier of AI to everyone and became the fastest-growing open source project in history,” said Jensen Huang, founder and CEO of NVIDIA. “Mac and Windows are operating systems for the personal computer. OpenClaw is the operating system for personal AI. This is the moment the industry has been waiting for: the beginning of a new renaissance in software.”

“OpenClaw brings people closer to AI and helps build a world where everyone has their own agents,” said Peter Steinberger, creator of OpenClaw. “With NVIDIA and the wider ecosystem, we are building the Claws and guardrails that let everyone create powerful, secure AI assistants.”

NemoClaw uses the NVIDIA Agent Toolkit to optimize OpenClaw with one command. It installs OpenShell, which provides open models and a secure sandbox that protects data privacy and security for autonomous agents. NemoClaw adds an important infrastructure layer: it gives agents the access they need to work well while enforcing security, network, and privacy rules.

NemoClaw works with any coding agent. Agents can use open models, including NVIDIA Nemotron, running locally on a user’s system through a privacy router, and can also combine local and cloud models to access advanced models in the cloud. Agents can develop new skills and complete tasks while still following the set privacy and security rules.

Always-on agents need dedicated computing to build software and tools and complete tasks. NemoClaw for OpenClaw can run on any dedicated platform, such as NVIDIA GeForce RTX PCs and laptops, NVIDIA RTX PRO workstations, and NVIDIA DGX Station or DGX Spark AI supercomputers. This setup enables local computation, allowing autonomous agents to run continuously. Stop by NVIDIA’s Build a Claw event in the GTC park, March 16 to 19 (1 to 5 p.m. on Monday and 8 a.m. to 5 p.m. Tuesday through Thursday), to customize and deploy an active, always-on AI assistant with NemoClaw for OpenClaw.

Source: NVIDIA Announces NemoClaw for the OpenClaw Community 

Operator is an AI agent, built on a model called Computer-Using Agent (CUA), that completes tasks by controlling a computer via its screen, mouse, and keyboard, automating browser tasks for users.

Below are some important details about the Operator release:

  • Availability: Currently available to ChatGPT Pro subscribers ($200 a month).  
  • Functionality: Operator uses GPT-4o’s vision to interact with computer interfaces.  
  • Future scope: OpenAI plans to expand Operator to Plus, Team, and Enterprise users and integrate it into ChatGPT.

The current research preview focuses on browser-based actions, aiming to let AI use computers as a human would.  

Operator is a web-based agent that navigates the internet and completes tasks for users. It operates within its own browser environment, allowing it to view web pages and interact by typing, clicking, and scrolling. Currently in a research preview phase, Operator has certain limitations that are expected to be addressed with further user feedback. As one of OpenAI’s first agents, Operator enables users to delegate tasks, which it then executes autonomously.

Operator can manage repetitive browser tasks on behalf of users, such as filling out forms, ordering groceries, or generating memes. Because it interacts with websites and tools in the same way users do, Operator enhances the practicality of AI: it streamlines routine activities and creates new opportunities for businesses to engage customers.

We are starting with a small roll-out for safety and manageability. Pro users in the US can access Operator at operator.chatgpt.com. This limited release helps us learn from users and improve Operator over time.  

How Operator Works 

Operator runs on a new model called Computer-Using Agent (CUA). CUA combines GPT-4o’s vision capabilities with advanced reasoning and reinforcement learning, and is trained to work with graphical user interfaces, such as the buttons, menus, and text fields you see on your screen.

Operator sees what is on the screen by taking screenshots and interacts with the browser using mouse and keyboard actions. This means it works on the web without needing special API integrations.

If Operator runs into problems or makes a mistake, it uses its reasoning skills to try to fix things on its own. If it can’t resolve the issue, it gives control back to you, ensuring the experience remains smooth and collaborative.

CUA is still new and has some limitations, but it has already set new records in important browser benchmarks. More details about our evaluations and the research behind Operator are on our blog post.  

How to Use 

To start, tell Operator what to do, and it will handle the rest. You can take control of the browser at any time. Operator asks you to step in for tasks that need a login, payment details, or a CAPTCHA.

You can personalize Operator with custom instructions for all sites or specific ones. For example, you might set airline preferences on Booking.com. Operator also lets you save prompts for quick access, which is useful for frequent tasks like restocking groceries on Instacart. Like browser tabs, Operator can handle several tasks at once by starting new conversations, like ordering a mug from Etsy while booking a campsite on Hipcamp.

Ecosystem & Users 

Operator changes AI from a passive tool into an active helper in the digital world. It makes tasks easier for users and helps companies offer better experiences and improve conversion rates. We’re working with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator meets real needs and follows industry standards. We also see many ways Operator can make certain workflows easier and more effective, especially in the public sector. For example, we are partnering with the City of Stockton to help people enroll in city services and programs more easily.

By initially introducing Operator to a select audience, OpenAI aims to learn and refine its capabilities through real-world feedback, while maintaining a focus on innovation, trust, and safety. This approach supports meaningful value delivery to users, creators, businesses, and public sector organizations.  

Safety and Privacy 

Ensuring Operator is safe to use remains our top priority. We have added three layers of safeguards to prevent abuse and keep users in control.

Operator keeps users in control by prompting for input at key moments.  

  • Takeover mode: When sensitive information like passwords or payment details must be entered, Operator prompts you to take over. In this mode, Operator does not collect or record any input.  
  • User confirmations: Before completing actions such as placing an order or sending an email, Operator requests your approval.  
  • Task limitations: Operator declines certain sensitive tasks, such as banking transactions or job application decisions.  
  • Watch mode: On sensitive sites, such as email and financial services, Operator works under close supervision, letting you promptly identify and correct any issues.

Operator is designed to make data privacy and management straightforward.

  • Training Opt-Out: If you turn off “Improve the model for everyone” in your ChatGPT settings, your data in Operator will not be used to train our models.  
  • Transparent Data Management: You can delete all browsing data and log out of every site with one click. In Operator’s settings, you can also easily delete past conversations.  

We have added protections to stop websites from manipulating Operator with hidden prompts, malicious code, or phishing attempts.  

  • Cautious navigation: Operator can detect and ignore prompt injection attempts.  
  • Monitoring: A dedicated monitor detects suspicious behavior and can pause tasks if necessary.  
  • Detection pipeline: Automated systems and human reviewers spot new threats, and we update safeguards quickly.  

Operator is built to refuse harmful requests and block disallowed content. Our moderation systems can warn users or revoke access if rules are broken. Extra review steps help catch misuse, and we provide guidance on using Operator in line with our usage policies.

Even with safeguards, no system is perfect, and Operator is still in a research preview. We will improve it with feedback and testing. To learn more, visit the safety section of the Operator research blog.

Limitations 

Operator is currently in an early research preview, and while it’s already capable of handling a wide range of tasks, it’s still learning and evolving and may make mistakes. For instance, it currently struggles with complex interfaces, such as creating slideshows or managing calendars. Early user feedback will play a vital role in improving its accuracy, reliability, and safety, helping us make Operator better for everyone.

What’s Next? 

CUA in the API: The model behind Operator, CUA, will soon become available via the API, enabling developers to build their own computer-using agents.

Enhanced capabilities: We’ll keep working to help Operator handle longer, more detailed workflows.

Access: We plan to expand Operator to Plus, Team, and Enterprise users, and to integrate its capabilities directly into ChatGPT in the future, once we are certain of its safety and usability at scale, unlocking seamless, real-time, and asynchronous task execution.

Source: Introducing Operator 

Right now, we are moving from models that excel at specific tasks to agents that can handle more complex workflows. When you prompt a model on its own, you only get what it can produce directly from its training; but if you give it a computer environment, it can do much more, like run services, request data from APIs, and create useful artifacts like spreadsheets and reports.

When building agents, some practical problems come up. For example:  

  • Deciding where to store intermediate files.  
  • Avoiding pasting large tables into prompts.  
  • Giving workflows network access without causing security issues.  
  • Handling timeouts and retries without building your own workflow system.

To address these agent-specific challenges, we built the components needed to give the Responses API a computer environment. By doing this, we enable reliable management of real-world tasks, freeing developers from having to create their own execution setups. This sets the stage for tackling the broader practical problems faced in agent development.

The Responses API’s shell tool and hosted container workspace address these challenges. The model suggests steps and commands that run in a separate environment with its own filesystem, optional storage (e.g., SQLite), and limited network access.

With this foundation in place, let’s explore how we build a computer environment for agents and discuss early lessons from using it to accelerate, standardize, and improve safety in production workflows.  

The Shell Tool 

A good agent workload needs a tight execution loop:  

  1. The model suggests an action.  
  2. The platform executes it.  
  3. The result informs the next step.

We’ll start with the shell tool to illustrate this loop, then discuss the container workspace, networking, reusable skills, and context compaction.

To understand the shell tool, it helps to know how a model uses tools: it suggests tool calls after seeing step-by-step examples during training. The model proposes tool use but can’t execute the calls itself.

The shell tool gives the model command-line access to perform tasks like text search or API requests using familiar Unix utilities such as grep, curl, and awk.  

Unlike our current code interpreter, which runs only Python, the shell tool supports a much broader range of use cases. You can run Go or Java programs or start a Node.js server. This flexibility enables the model to handle more complex tasks.
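
As a rough illustration of what invoking such a hosted tool could look like from the Responses API, here is a minimal Python sketch. The `{"type": "shell"}` tool declaration and its options are assumptions, not a confirmed API surface; consult the current API reference for the real names.

```python
# Minimal sketch of asking a model to use a hosted shell tool via the Responses API.
# The {"type": "shell"} tool spec is an assumption; check the API reference for the
# real tool name and options.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.2",  # the article notes GPT-5.2 and later propose shell commands
    input="Count the lines in every *.csv file in the workspace and summarize the totals.",
    tools=[{"type": "shell"}],  # hypothetical hosted tool declaration
)

# With a hosted tool, the API runs the proposed commands in the container and
# loops until the model produces a final answer.
print(response.output_text)
```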

Orchestrating The Agent Loop 

On its own, a model can only propose shell commands, but how are these commands executed? We need an orchestrator to retrieve model output, invoke tools, and return the tools’ response to the model in a loop until the task is complete.  

The Responses API is how developers interact with OpenAI models. When used with custom tools, the Responses API returns control to the client, who must provide their own harness to run the tools. However, the API can also orchestrate between the model and hosted tools out of the box.
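
For the custom-tool case, where the client supplies its own harness, a minimal sketch of that loop might look like the following. The `run_shell` tool definition and the local executor are illustrative inventions; only the general Responses API function-calling flow is assumed here.

```python
# Sketch of the client-side harness needed for *custom* tools: the client runs each
# proposed tool call and feeds the output back until no calls remain. The run_shell
# tool and local executor are illustrative, not an official example.
import json
import subprocess

from openai import OpenAI

client = OpenAI()

SHELL_TOOL = {
    "type": "function",
    "name": "run_shell",
    "description": "Run a shell command and return its combined output.",
    "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}


def run_local_shell(command: str) -> str:
    # A real harness would sandbox this instead of running commands directly.
    proc = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=60)
    return proc.stdout + proc.stderr


response = client.responses.create(
    model="gpt-5.2",
    input="List the files in the current directory and tell me which is largest.",
    tools=[SHELL_TOOL],
)

while True:
    calls = [item for item in response.output if item.type == "function_call"]
    if not calls:
        break  # the model returned a final answer instead of more tool calls
    outputs = [
        {
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": run_local_shell(json.loads(call.arguments)["command"]),
        }
        for call in calls
    ]
    response = client.responses.create(
        model="gpt-5.2",
        previous_response_id=response.id,
        input=outputs,
        tools=[SHELL_TOOL],
    )

print(response.output_text)
```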

When the Responses API receives a prompt, it assembles the model context: the user prompt, prior dialog state, and tool instructions. For shell execution to work, the prompt must mention using the shell tool, and the selected model must be trained to propose shell commands; GPT-5.2 and later models are trained to do so.

The model then decides the next action. If it chooses shell execution, it returns one or more shell commands to the Responses API service. The API service forwards those commands to the container runtime, streams the shell output back, and feeds it to the model in the next request’s context. The model can inspect the results, issue follow-up commands, or produce a final answer. The Responses API repeats this loop until the model returns a completion without additional shell commands.

When the Responses API runs a shell command, it keeps a streaming connection to the container service open. As output appears, the API sends it to the model almost immediately. This lets the model decide whether to wait for more output, run another command, or issue a final response.

The model can suggest several shell commands at once. The Responses API can run these commands concurrently in separate container sessions, with each session streaming its output separately. The API then combines these streams into structured tool outputs for the model’s context. This allows the agent loop to run tasks such as searching files, fetching data, and checking results in parallel.
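
A local stand-in for that concurrent execution pattern might look like the sketch below; it is not the platform’s actual implementation, just an illustration of running several commands in parallel and collecting structured results.

```python
# Local stand-in for the concurrent-execution pattern: run several proposed commands
# in parallel and collect structured results. Illustrative only, not platform code.
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(command: str) -> dict:
    proc = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=60)
    return {"command": command, "exit_code": proc.returncode, "output": proc.stdout + proc.stderr}


commands = [
    "grep -rl TODO src/ || true",                 # search files
    "wc -l data/*.csv",                           # inspect data
    "curl -sI https://example.com | head -n 1",   # fetch a header as a check
]

with ThreadPoolExecutor(max_workers=len(commands)) as pool:
    results = list(pool.map(run_command, commands))

for result in results:
    print(result["command"], "->", result["exit_code"])
```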

Commands that handle files or process data may generate lots of shell output, which can fill the context window without adding much value. To manage this, the model sets an output limit for each command. The Responses API enforces the limit and returns a result that keeps both the start and end of the output, marking skipped content. For example, the model might set a 1,000-character limit and keep the beginning and end of the output.

By combining concurrent execution and output limits, the agent loop maintains speed and context efficiency. The agent loop controls which tool outputs are included in the context, helping the model focus on important results rather than being overwhelmed by raw terminal logs.  
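
The head-and-tail truncation described above can be sketched in a few lines; the 1,000-character limit here is only an example value.

```python
# Sketch of the head-and-tail truncation described above: keep the start and end of
# long command output and mark how much was skipped. The limit is an example value.
def truncate_output(output: str, limit: int = 1000) -> str:
    if len(output) <= limit:
        return output
    head = output[: limit // 2]
    tail = output[-(limit // 2):]
    skipped = len(output) - len(head) - len(tail)
    return f"{head}\n... [{skipped} characters skipped] ...\n{tail}"


print(truncate_output("x" * 5000, limit=1000))
```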

When The Context Window Gets Full: Compaction 

A challenge with agent loops is that some tasks run for a long time, and these long tasks can fill up the context window that tracks information across turns and tool calls. For example, an agent might call a skill, get a response, and then make further tool calls and summaries; the limited context window can fill up quickly. Compaction addresses this by keeping important details while removing extraneous information. We built native compaction into the Responses API, so developers don’t need to create their own summarization or state systems, and the feature matches model training.

Our latest models are trained to review prior dialog states and generate a compaction item that stores key information in an encrypted, token-efficient format. After compaction, the context window includes this compaction item and the most important parts of the earlier window. This makes workflow progress smooth across window boundaries, even in long, multi-step, or tool-driven sessions. Codex uses this system to handle long programming tasks and repeated tool use without losing quality.  

You can use compaction either as a built-in server feature or through a separate /compact endpoint. With server-side compaction, you can set a threshold, and the system takes care of compaction timing for you, so you don’t need complex client logic. This setup allows a slightly larger input context window, so small overages just before compaction are handled rather than rejected. As models improve, the native compaction feature updates with every OpenAI model release.  
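
Because the exact compaction parameters are not spelled out here, the following is only a hypothetical sketch of what opting into server-side compaction might look like; the `compaction` field name and threshold are placeholders, not documented API names.

```python
# Hypothetical sketch only: the exact compaction parameters are not given here, so
# the "compaction" field and its threshold below are placeholders, not documented
# API names. extra_body simply forwards extra JSON fields in the request.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.2",
    input="Continue the long-running analysis from where we left off.",
    extra_body={"compaction": {"threshold": 0.8}},  # placeholder field name and value
)
print(response.output_text)
```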

Codex played a key role in building the compaction system and was one of its first users. If one Codex instance hit a compaction error, we started another instance to investigate. Working through the problem this way helped make the built-in compaction system robust. Codex’s ability to examine and improve itself has become a distinctive strength at OpenAI: while most tools just need users to learn them, Codex learns with us.

Container Context 

Now let’s talk about state and resources. The container is more than merely a place to run commands; it’s also the model’s working environment. Inside the container, the model can read files, query databases, and reach external systems, all under network policy controls.

File Systems 

The first part of the container context is the file system, which is used to upload, organize, and manage resources. We created container and file APIs to give the model a clear view of available data and help it select specific file operations rather than run broad directory scans.

One approach is to place all inputs directly into the prompt context, but as inputs grow, filling the prompt becomes more expensive and harder for the model to navigate. A better approach is to stage resources in the container’s file system and let the model decide which to open, parse, or run via shell commands, much as humans do. Models work better with organized information.

Databases 

The second part of the container context is databases. We recommend storing structured data in SQLite and querying it directly rather than copying a spreadsheet into the prompt. Describe the tables and columns and explain their meanings so the model can pull only the needed rows.

For example, if you ask which products had declining sales this quarter, the model can look up only the relevant rows rather than search the entire spreadsheet. This approach is faster, cheaper, and better suited to large data sets.  
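
A minimal sketch of that pattern, with an invented `quarterly_sales` table and quarter labels, might look like this:

```python
# Sketch of the pattern: stage structured data in SQLite and query only the rows
# needed. The quarterly_sales table, quarters, and columns are invented.
import sqlite3

conn = sqlite3.connect("sales.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS quarterly_sales (
    product TEXT,
    quarter TEXT,
    revenue REAL
);
""")

# "Which products had declining sales this quarter?" becomes a targeted query
# instead of pasting the whole spreadsheet into the prompt.
rows = conn.execute("""
    SELECT cur.product, prev.revenue AS previous, cur.revenue AS current
    FROM quarterly_sales AS cur
    JOIN quarterly_sales AS prev
      ON prev.product = cur.product AND prev.quarter = '2025-Q3'
    WHERE cur.quarter = '2025-Q4' AND cur.revenue < prev.revenue
""").fetchall()

for product, previous, current in rows:
    print(f"{product}: {previous:.0f} -> {current:.0f}")
```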

Network Access 

The third part of the container context is network access, which is essential for agent workloads. Agents may need to fetch live data, call external APIs, or install packages. Giving containers full internet access can be risky: it allows them to send information to outside sites, access sensitive systems, or make leaks harder to prevent.

To solve these problems without limiting what agents can do, we set up hosted containers to use a central egress policy proxy. All outgoing network requests go through a central policy layer that enforces allow lists and access controls and keeps traffic visible. For credentials, we use domain-scoped secret injection at egress: the model and container only see placeholders, while the real secret values remain hidden and are used only for approved destinations. This reduces the risk of leaks while still allowing secure external calls.
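
Conceptually, the egress layer behaves something like the sketch below. This is an illustration of allow lists plus domain-scoped secret injection, not OpenAI’s actual proxy code; the domains, placeholder format, and secrets are made up.

```python
# Conceptual sketch of an egress policy layer with an allow list and domain-scoped
# secret injection. The domains, placeholder token, and secrets are made up; this is
# not OpenAI's actual proxy implementation.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.example.com", "pypi.org"}
SECRETS_BY_DOMAIN = {"api.example.com": {"Authorization": "Bearer real-secret-value"}}
PLACEHOLDER = "{{SECRET:API_TOKEN}}"


def resolve_request(url: str, headers: dict) -> dict:
    """Reject disallowed destinations and swap placeholders for real credentials."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_DOMAINS:
        raise PermissionError(f"egress to {host!r} is not on the allow list")
    resolved = dict(headers)
    for key, value in resolved.items():
        if value == PLACEHOLDER:
            resolved[key] = SECRETS_BY_DOMAIN.get(host, {}).get(key, value)
    return resolved


# The agent's request only ever carries the placeholder; the proxy injects the secret.
print(resolve_request("https://api.example.com/v1/data", {"Authorization": PLACEHOLDER}))
```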

Agent Skills 

Shell commands are powerful, but many tasks follow similar multi-step patterns, and agents often must replan and relearn them, leading to inconsistent results. Agent skills package these patterns into reusable building blocks. A skill is simply a folder with a SKILL.md file and resources such as API specs and UI assets.

This structure maps naturally to the runtime architecture we described earlier. The container provides persistent files and an execution context, and the shell tool provides the execution interface. With both in place, the model can discover skill files using shell commands (ls, cat, etc.) when needed, interpret instructions, and run skill scripts within the same agent loop.

We provide an API to manage skills on the OpenAI platform. Developers upload and store skill folders as versioned bundles, which can later be retrieved by skill ID before sending the prompt to the model. The Responses API loads the skill and includes it in the model context. The sequence is deterministic:

  1. Fetch skill metadata, including name and description.  
  2. Fetch the skill bundle, copy it into the container, and unpack it.  
  3. Update the model context with skill metadata and the container path.

When deciding if a skill is relevant, the model reviews its instructions step by step and runs its scripts using shell commands in the container.
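
A tiny sketch of what a staged skill might look like inside the container, and how an agent could discover it, is shown below; the folder name, SKILL.md contents, and script are invented for illustration.

```python
# Tiny sketch of a staged skill inside the container and how an agent could discover
# it. The folder name, SKILL.md contents, and script are invented for illustration.
from pathlib import Path

skill_dir = Path("skills/quarterly-report")
skill_dir.mkdir(parents=True, exist_ok=True)
(skill_dir / "SKILL.md").write_text(
    "# Quarterly report skill\n"
    "1. Load sales.db\n"
    "2. Run report.py to produce report.csv\n"
)
(skill_dir / "report.py").write_text("print('generating report.csv')\n")

# The model would normally do this with shell commands (ls, cat) in the container.
for path in sorted(skill_dir.iterdir()):
    print(path.name)
print((skill_dir / "SKILL.md").read_text())
```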

How Agents are Made 

To put all the pieces together: the Responses API handles orchestration, the shell tool runs actions, the container provides a durable workspace and runtime context, skills add reusable workflow logic, and compaction lets an agent run for a long time with the context needed for an end-to-end workflow.

With these pieces, an agent can discover the right skill, fetch data and transform it into local structured state, query it efficiently, and generate durable artifacts.

Make Your Own Agent 

For a step-by-step example using the shell tool and computer environment, see our developer blog post and cookbook. These resources show how to package and run a skill with the Responses API.

We’re eager to see what developers build. Language models go beyond creating text, images, and audio. We’ll continue to enhance our platform for complex real-world tasks at scale.

Source: From model to agent: Equipping the Responses API with a computer environment 

The Buzz 

  • Microsoft has introduced Azure Local Disconnected Operations, Microsoft 365 Local, and Foundry Local—allowing large AI models and productivity tools to run fully offline, as detailed on Microsoft’s official blog.  
  • Organizations can now run multi-modal AI models on NVIDIA hardware within their own secure environments, ensuring full compliance and security without any cloud connection.  
  • The Microsoft 365 productivity suite, including Exchange, SharePoint, and Skype for Business, can now run completely offline, with support promised through at least 2035.  
  • Defense, government, and other regulated sectors that were previously restricted by compliance rules can now access enterprise AI infrastructure, improving accessibility and enabling innovation.  

Microsoft’s three new sovereign cloud updates let enterprises run large AI models, productivity software, and cloud infrastructure fully offline, enabling strict privacy and total data control for regulated sectors. Azure Local, Microsoft 365 Local, and Foundry Local with NVIDIA GPU support empower organizations to securely deploy advanced AI on-premises.

Microsoft is transforming secure Enterprise AI by allowing large language models and productivity suites to run offline, ensuring strong data privacy and operational control without a cloud connection.  

Azure Local Disconnected Operations are now available, allowing organizations to set up critical infrastructure using Azure’s management tools without any external connections. All management, policy enforcement, and workload execution occur within the customer’s environment. This is a significant change for defense contractors handling classified work or for financial institutions operating in jurisdictions with strict data residency laws.  

The availability of Azure Local Disconnected Operations represents a breakthrough for organizations that need control over their data without sacrificing the power of the Microsoft Cloud. Gerard Hoffman, CEO of Proximus Luxembourg, told Microsoft in a statement, “For Luxembourg, where digital sovereignty is not simply a principle but a key necessity, this model offers the strength, autonomy, and trust our market expects.”

Productivity is just as important as infrastructure. Microsoft 365 Local Disconnected now provides Exchange, SharePoint, and Skype for Business servers fully within customers’ own environment, with promised support through at least 2035. Teams can collaborate, share files, and communicate without any data leaving their network. Customers have full control over access, compliance, and data protection.  

The biggest news is that Foundry Local can now run large-scale AI models on-site. Foundry Local is a set of AI tools that operate entirely within an organization’s own network. Microsoft is adding NVIDIA GPUs so organizations can run computationally intensive AI tasks without connecting to external networks. This update means highly secure organizations can have local, advanced AI capabilities while ensuring strict privacy and compliance.  

The technical architecture is straightforward: In connected mode, a central management component, the control plane, runs in a Microsoft Cloud region and sends configuration and monitoring commands to local, customer-owned servers. In disconnected mode, the control plane itself runs as a virtual machine on the customer’s infrastructure, directly managing Foundry Local, Microsoft 365 Local, and Azure Local without any data or communication reaching Microsoft’s external clouds. The user experience for configuring, monitoring, and updating these services stays the same, whether systems are online, offline, or fully air-gapped.  

Azure Local and Microsoft 365 Local are now globally available in disconnected mode, with Foundry Local’s large AI models offered to qualified compliance-driven customers.  

Digital sovereignty rules are proliferating worldwide. Microsoft’s approach is designed to meet real customer needs, operate independently of external connections, and ensure guaranteed continuity.

Foundry Local is built to handle large models and GPU needs. Microsoft provides support for setup, updates, and operation, while customers retain full control over their data and hardware.

The competitive landscape is shifting. While Amazon Web Services offers Outposts and Google Cloud provides Google Distributed Cloud, neither lets customers run large AI models in completely disconnected, secure environments as Microsoft does. Microsoft leverages its experience with on-premises products like Exchange and SharePoint to give organizations an offline operations and scalable AI advantage.

In industries such as defense, intelligence, health care, and critical infrastructure, and in certain jurisdictions, this opens up AI capabilities that were previously off-limits. A defense contractor can now run the same multimodal models used in commercial settings, just air-gapped inside a classified facility. A European bank can deploy large language models for internal tools without data crossing borders or touching external networks.  

This setup also addresses operational complexity. Organizations no longer need separate management systems, different governance rules, or split architectures for online and offline workloads; they can manage everything consistently whether systems are online, sometimes connected, or always offline, simplifying operations and enhancing efficiency.

Douglas Phillips, President and CTO of Microsoft Specialized Clouds, leads the engineering effort behind these capabilities. His team is responsible for bringing Azure, Microsoft’s adaptive cloud portfolio, and the Microsoft 365 Collaboration Suite to customers with sovereignty, security, edge, and compliance requirements that standard cloud offerings can’t address.  

These changes affect more than just Microsoft customers. This level of secure AI sets a new standard for what businesses can expect from cloud providers. It shows that advanced AI can be used without giving up data control, regulatory compliance, or operational independence.  

Microsoft’s sovereign cloud expansion fundamentally changes what’s possible for enterprises operating under strict compliance regimes. By enabling large AI model deployment, full productivity suites, and cloud infrastructure to run completely disconnected from external networks, the company is opening up AI capabilities to sectors that were previously locked out by regulatory limitations. The question now isn’t whether sovereign AI is technically feasible; Microsoft just proved it is. The question is how quickly competitors respond and how fast regulated industries adopt these capabilities to close the AI gap with their commercial counterparts.

Source: Microsoft Sovereign Cloud Goes Fully Offline With AI Support