Today, we are launching Operator, an agent that can browse the web and complete tasks for you. It uses its own browser to view web pages and interacts with them by typing, clicking, and scrolling. Operator is currently a research preview, so it has limitations and will improve as we gather feedback. Operator is one of our first agents: AI systems that can handle tasks independently once given instructions.

You can ask Operator to handle repetitive browser tasks, such as filling out forms, ordering groceries, or creating memes, using the same websites and tools people already use. Operator saves time and opens new ways for businesses to connect with customers.  

We’re starting with a small rollout to Pro users in the US at operator.chatgpt.com. This research preview lets us gather feedback and improve Operator over time. We plan to expand access to Plus, Team, and Enterprise users and to integrate these features into ChatGPT.

How Operator Works 

Operator runs on a new model called Computer-Using Agent (CUA). CUA combines GPT-4o’s vision capabilities with advanced reasoning trained through reinforcement learning. It is trained to work with graphical user interfaces (GUIs): the buttons, menus, and text fields you see on a screen.

Operator takes screenshots of its browser and interacts with websites using mouse and keyboard actions, so it can perform web tasks without requiring custom API integrations.

If Operator encounters a problem or makes a mistake, it can use its reasoning capabilities to self-correct. If it gets stuck and needs help, it hands control back to you, keeping the experience smooth and collaborative.
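The loop described above (look at a screenshot, choose a mouse or keyboard action, retry after a mistake, and hand control back when stuck) can be sketched roughly as follows. This is an illustrative sketch only, not Operator's actual implementation: the names `Action`, `run_agent`, `take_screenshot`, `choose_action`, and `perform`, and the limits `max_steps` and `max_failures`, are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    # Hypothetical action kinds: "click", "type", "scroll", "done", "handoff".
    kind: str
    detail: str = ""


def run_agent(
    take_screenshot: Callable[[], str],
    choose_action: Callable[[str, int], Action],
    perform: Callable[[Action], bool],
    max_steps: int = 20,
    max_failures: int = 3,
) -> str:
    """Run a screenshot -> action loop; return "done" or "handoff"."""
    failures = 0
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # The model sees the current screen and how many attempts just failed,
        # which gives it a chance to try a different approach.
        action = choose_action(screenshot, failures)
        if action.kind in ("done", "handoff"):
            return action.kind
        if perform(action):
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                # Stuck: give control back to the user.
                return "handoff"
    # Ran out of steps without finishing: also hand control back.
    return "handoff"
```

In this sketch, passing stub functions for the three callables is enough to exercise the loop without a real browser.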

CUA is still new and has some limitations, but it has already set new records on key browser benchmarks such as WebArena and WebVoyager. You can read more about the evaluations and research behind Operator in our blog post.

How to Use 

To start, tell Operator what you want it to do. You can take control of the remote browser at any time. Operator will ask you to take over for steps that require a login, payment details, or a CAPTCHA.
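As a rough illustration of the takeover rule, the decision about who controls the browser for a given step might look like the sketch below. The step labels and the `next_controller` function are hypothetical; the source states only that logins, payment details, and CAPTCHAs are handed to the user, and that the user can take control at any time.

```python
# Step kinds that the text says are handed to the user (labels are assumed).
SENSITIVE_STEPS = {"login", "payment", "captcha"}


def next_controller(step_kind: str, user_requested_control: bool = False) -> str:
    """Return "user" or "agent" for the upcoming step."""
    # The user can take over at any time, regardless of the step.
    if user_requested_control:
        return "user"
    # Sensitive steps are always handed to the user.
    if step_kind.lower() in SENSITIVE_STEPS:
        return "user"
    return "agent"
```

For example, `next_controller("payment")` yields `"user"`, while an ordinary step such as `next_controller("search")` stays with the agent unless the user asks for control.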

You can personalize Operator with custom instructions that apply to all sites or only to specific ones, such as airline preferences on Booking.com. You can also save quick-access prompts for frequent tasks and run multiple tasks at once by starting new conversations.
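One way to picture instructions that apply to all sites or only to specific ones is a lookup keyed by hostname, as in the sketch below. The function `instructions_for`, the data layout, and the sample instruction text are assumptions for illustration; the source does not describe how Operator stores or applies these settings.

```python
from urllib.parse import urlparse


def instructions_for(url, global_instructions, site_instructions):
    """Combine global instructions with any that match the URL's site."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    return list(global_instructions) + list(site_instructions.get(host, []))


# Hypothetical settings: one global rule and one rule for a single site.
global_rules = ["Ask before submitting any form"]
site_rules = {"booking.com": ["Prefer nonstop flights"]}

print(instructions_for("https://www.booking.com/flights", global_rules, site_rules))
```

Listing the global rules first and the site rules last is a design choice of this sketch, so that site-specific preferences read as refinements of the general ones.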

Ecosystem and Users 

Operator changes AI from a passive tool into an active helper in the digital world. It makes tasks easier for users and helps companies offer better customer experiences and improve conversion rates. We are working with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator meets real needs and complies with industry standards. We also see many ways Operator can make certain workflows more efficient and accessible, especially in the public sector. For example, we are partnering with the City of Stockton to help people enroll in city services and programs more easily.

As we continue to evaluate Operator during its research period, we aim to identify and expand on ways AI can simplify civic engagement for residents. —Jamil Niazi, Director of Information Technology, City of Stockton 

Source: Introducing Operator
