Microsoft pushed Copilot Studio's computer-use agents to general availability on May 13. The marketing pitch is that agents can now use a browser, a screen, and a keyboard, adapting when layouts shift or fields move. That part is accurate. The part missing from most of the writeups is what it changes for the procurement and integration decisions already on your team's plate this quarter.
Computer-use is the first hyperscaler-grade feature that lets an agent operate a UI directly instead of calling an API. Until now, every serious automation project started with the question "does this system have an API?" and stalled when the answer was "yes, but it costs $40K to license." Computer-use changes the question. Some integrations you do not have to build anymore. Some you absolutely still do. The interesting work is knowing the difference.
What computer-use actually does
Copilot Studio's GA build hands the agent a browser, the ability to read the rendered page with a vision model, and a keyboard and mouse it can drive. The agent reasons over what is on screen, decides the next click, and takes it. The supported models are OpenAI's computer-using agent and Claude Sonnet 4.5. Credentials live in Azure Key Vault, audit logging flows through Microsoft Purview, and you can wire in human-in-the-loop checkpoints at any step.
That is the whole shape of it. No SDK. No connector. No vendor partnership required. If a person can do the work in a browser, the agent can do it in a browser too. That is the pitch and it is mostly true.
Three connector jobs you can stop building
Vendor portals with no real API. Every team has at least one. A carrier portal where shipping rates have to be re-pulled every Tuesday. An insurance broker portal that lists this month's policy changes. A municipal portal where permits get downloaded one at a time. Building a scraper against any of these is a per-quarter maintenance bill. Computer-use turns those jobs into a Copilot Studio agent and a screenshot.
Internal apps without a webhook layer. Most companies own at least one homegrown internal tool that nobody is going to extend. A legacy work-order system, an inventory app from 2014, a FileMaker database the finance team still uses. Adding an integration point to those is a quarter of engineering work, half of which is convincing the system owner it is worth doing. Computer-use sits on top of the existing UI and reads what is there.
Third-party SaaS where you are not the admin. Sales has access to a tool you are not paying for. Marketing logs into a vendor's reporting dashboard with shared credentials. Procurement uses a supplier's portal for orders. You cannot enable an API, you cannot install a webhook, you sometimes cannot even create a real account. Computer-use clicks through the same workflow a person would.
The pattern across all three: the cost of building a proper integration is dominated by the absence of an API surface, not by the workflow itself. Computer-use ignores the API question.
Three connector jobs you cannot skip
Anything that touches money. Payment processors, ledgers, ERP modules with journal entries, anything that posts an irreversible transaction. A computer-use agent driving a UI is fast, confident, and hard to fully reproduce. If finance has to explain a posting six months from now, "the agent clicked submit" is the wrong audit story. Build the API integration even if it costs more upfront.
Anything with structured webhooks already. If the system on the other side will push you an event when something happens, take the event. A poll loop with a vision model is slower, more expensive, and less reliable than a webhook handler that runs in 40 milliseconds. The instinct to let the agent watch the screen because it sounds simpler is a cost trap.
Anything regulated. SOC 2, HIPAA, PCI, GDPR-sensitive transactions. The audit log you are going to be asked for in twelve months needs to show what action was taken, with what data, by what identity, against what record. A screenshot timeline is not that audit log. The API call is.
The cost shape that catches people
Computer-use is metered as agent actions at five Copilot Studio credits per action. Credits cost a cent each pay-as-you-go, or $200 for a 25,000-credit pack. Five cents per action sounds cheap until you count the actions.
A worked example. Pull a carrier rate update from a vendor portal. The agent logs in (4 actions). Navigates to the rates section (3). Filters by lane (4). Selects a date range (3). Reads, copies, and posts the data into your system (8). Twenty-two actions, $1.10 per run at pay-as-you-go rates. Run it daily across three carriers and you land at roughly $1,200 a year. Fine.
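The arithmetic in that example can be checked with a few lines of Python. This is a minimal sketch using only the prices quoted above; the function and variable names are ours, not anything from Copilot Studio.

```python
# Pricing as quoted in this article; verify against your own contract.
CREDITS_PER_ACTION = 5
USD_PER_CREDIT_PAYG = 0.01              # pay-as-you-go: one cent per credit
USD_PER_CREDIT_PACK = 200 / 25_000      # prepaid pack: $200 for 25,000 credits


def run_cost(actions: int, usd_per_credit: float = USD_PER_CREDIT_PAYG) -> float:
    """Credit cost in USD for one run of a workflow."""
    return actions * CREDITS_PER_ACTION * usd_per_credit


def annual_cost(actions: int, runs_per_day: int,
                usd_per_credit: float = USD_PER_CREDIT_PAYG,
                days: int = 365) -> float:
    """Credit cost in USD for a year of daily runs."""
    return run_cost(actions, usd_per_credit) * runs_per_day * days


# Step counts from the carrier-portal example.
steps = {
    "log in": 4,
    "navigate to rates": 3,
    "filter by lane": 4,
    "select date range": 3,
    "read, copy, post": 8,
}
actions = sum(steps.values())

print(f"actions per run: {actions}")                                    # 22
print(f"cost per run:    ${run_cost(actions):.2f}")                     # $1.10
print(f"three carriers, daily: ${annual_cost(actions, 3):,.2f}/year")   # $1,204.50
```

On the prepaid pack the per-credit price drops to 0.8 cents, so the same run costs $0.88. Neither figure includes token costs on the underlying model.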
Now drop the same approach on a daily reconciliation that takes 200 actions per account across five accounts. That is 1,000 actions a day: $50 a day, roughly $18,000 a year, before token costs on the underlying model. The classic API connector for the same job is roughly $15,000 to build once and a few hundred a year to maintain.
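To see how quickly the two options diverge, here is a rough cumulative comparison. It is a sketch under stated assumptions: the $15,000 build cost and the five-cent action price come from the text, while the $300 a year for maintenance is our stand-in for "a few hundred" and should be replaced with your own estimate.

```python
USD_PER_ACTION = 0.05            # 5 credits at one cent each, pay-as-you-go
CONNECTOR_BUILD_USD = 15_000     # one-time build cost quoted above
CONNECTOR_MAINTAIN_USD = 300     # ASSUMPTION: stand-in for "a few hundred a year"


def computer_use_cumulative(actions_per_day: int, years: int, days: int = 365) -> float:
    """Total credit spend after `years` of daily runs (excludes model token costs)."""
    return actions_per_day * USD_PER_ACTION * days * years


def connector_cumulative(years: int) -> int:
    """Total spend on a hand-built connector after `years`."""
    return CONNECTOR_BUILD_USD + CONNECTOR_MAINTAIN_USD * years


actions_per_day = 200 * 5  # 200 actions for each of five accounts

for years in (1, 2, 3):
    agent = computer_use_cumulative(actions_per_day, years)
    connector = connector_cumulative(years)
    print(f"year {years}: computer-use ${agent:,.0f} vs connector ${connector:,.0f}")
```

Under these assumptions the connector is already cheaper by the end of the first year ($15,300 against $18,250), and the gap widens by about $18,000 every year after that.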
The line where computer-use stops being cheaper than a real API integration is around 50 to 80 actions per run when the workflow runs daily. Past that point, the credit math bends the wrong way fast.
The audit-trail question
Purview captures the metadata for every computer-use session by default. Timestamp, user, organization, resource IDs, transcript thread ID. What it does not capture by default is the full content of what the agent saw and what it did. The transcript is referenced by ID. The content is not stored.
Microsoft's Data Security Posture Management for AI add-on attempts to retrieve the chat text and resource references after the fact. Read the documentation carefully. The phrase "attempts to retrieve" is doing real work in that sentence. DSPM coverage is good for compliance reporting in aggregate. It is not a guarantee that a specific session's screen reads and clicks are reconstructible six months later in front of an auditor.
If the use case requires that level of reconstruction, plan to build a parallel logging layer that captures the screenshots and the action stream into your own store. Not optional.
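A parallel logging layer does not need to be elaborate. The sketch below is one possible shape, not a Copilot Studio API: it assumes your agent runtime can hand you each screenshot as bytes along with a description of the action taken. The `ActionLog` class and every field name are hypothetical.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class ActionLog:
    """Append-only JSONL action stream plus content-addressed screenshots."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.shots = self.root / "screenshots"
        self.shots.mkdir(parents=True, exist_ok=True)
        self.log_path = self.root / "actions.jsonl"

    def record(self, session_id: str, identity: str, action: str,
               target: str, screenshot: bytes) -> dict:
        # Name the screenshot by its SHA-256 so the log entry pins the exact image.
        digest = hashlib.sha256(screenshot).hexdigest()
        shot_path = self.shots / f"{digest}.png"
        if not shot_path.exists():
            shot_path.write_bytes(screenshot)

        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "session_id": session_id,
            "identity": identity,      # who the agent acted as
            "action": action,          # e.g. "click", "type"
            "target": target,          # what was acted on
            "screenshot_sha256": digest,
        }
        # One JSON object per line, appended, never rewritten.
        with self.log_path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
        return entry


# Usage with placeholder data; a real screenshot would be PNG bytes.
with tempfile.TemporaryDirectory() as tmp:
    log = ActionLog(Path(tmp))
    log.record("session-001", "svc-agent@example.com", "click",
               "Rates > Export", b"fake-png-bytes")
    print(log.log_path.read_text(encoding="utf-8"))
```

In production you would point this at write-once storage with a retention policy rather than a local directory, and decide up front who is allowed to read screenshots that may contain sensitive data.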
The decision rule
Reach for computer-use when the system has no API or webhook surface, the workflow runs under 50 actions per execution, the data is not regulated, and you can tolerate a roughly 10x cost premium over a hand-built connector in exchange for not building one.
Build the API connector when the system has an API and you will run the workflow more than a few times a week, when money or regulated data is involved, or when the audit log has to stand up in a regulator's review.
Wait for MCP when the vendor has signaled a roadmap. We covered the renewal-call version of this question two weeks ago. When an MCP server arrives, it beats both computer-use and a hand-built REST connector on cost per integration. Computer-use is the right bridge for the months between now and that arrival. It is not the final shape of the integration.
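The rule above is simple enough to write down as a function, which helps when triaging a backlog of candidate integrations. This is an illustrative sketch: the field names are ours, and the threshold of three runs a week is our reading of "more than a few times a week". Cases the rule does not cover are returned as a judgment call rather than guessed.

```python
from dataclasses import dataclass

MAX_ACTIONS_PER_RUN = 50       # from the rule above
FREQUENT_RUNS_PER_WEEK = 3     # ASSUMPTION: our reading of "a few times a week"


@dataclass
class Workflow:
    has_api_or_webhook: bool
    actions_per_run: int
    runs_per_week: float
    touches_money: bool = False
    regulated: bool = False
    audit_must_survive_review: bool = False
    accepts_cost_premium: bool = True
    vendor_mcp_on_roadmap: bool = False


def decide(w: Workflow) -> str:
    # Money, regulated data, or a regulator-grade audit log always wins.
    if w.touches_money or w.regulated or w.audit_must_survive_review:
        return "build the API connector"
    # An API exists and the workflow runs often: use it.
    if w.has_api_or_webhook and w.runs_per_week > FREQUENT_RUNS_PER_WEEK:
        return "build the API connector"
    # No API surface, small workflow, cost premium accepted.
    if (not w.has_api_or_webhook
            and w.actions_per_run < MAX_ACTIONS_PER_RUN
            and w.accepts_cost_premium):
        if w.vendor_mcp_on_roadmap:
            return "computer-use as a bridge until MCP"
        return "computer-use"
    # Everything else falls outside the stated rule.
    return "judgment call"


carrier_portal = Workflow(has_api_or_webhook=False, actions_per_run=22, runs_per_week=7)
reconciliation = Workflow(has_api_or_webhook=True, actions_per_run=200,
                          runs_per_week=7, touches_money=True)

print(decide(carrier_portal))   # computer-use
print(decide(reconciliation))   # build the API connector
```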
The vendors selling computer-use will tell you it is the new default. It is not. It is a useful tool for the integration jobs that were genuinely uneconomic to build before. For the rest, the boring API connector is still the right answer this year.
If you are looking at a stack with two or three integrations that never got built because the vendor would not give you an API, and trying to figure out which of them are now worth handing to a computer-use agent, reach out. We run this triage for clients in a few hours.