Early-Access AI Agent ‘Computer Use’ Feature in Copilot Studio Now Open for Applications

News
Microsoft has unveiled an early-access preview of a Copilot Studio feature called “computer use.” The move is aimed at creating more autonomous AI agents that can visually interact with any application or website: the agents can click, type, and navigate through digital interfaces just as humans do.
What is Copilot Studio?
Copilot Studio is a low-code development platform provided by Microsoft. It empowers users—including both business professionals and developers—to build, customize, and launch AI-driven agents capable of automating tasks across various applications and workflows. This tool integrates seamlessly with the Power Platform, allowing users to create agents that function independently, within Power Platform apps, or embedded in other applications such as Microsoft Teams or websites.
Expanded Capabilities of Copilot Studio
Microsoft has continuously enhanced Copilot Studio’s functionality. Recently, the platform gained significant features, including deep reasoning abilities for its agents, support for the popular Model Context Protocol (MCP), and the general availability of agent flows. The latest addition allows these agents to utilize websites and desktop applications to complete tasks efficiently.
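For context on that MCP support: the Model Context Protocol is an open standard that lets agents discover and call external tools over a common interface. The snippet below is a minimal, hypothetical MCP tool server written with the protocol's official Python SDK; the server name, tool, and data are invented for illustration and say nothing about how Copilot Studio itself consumes MCP servers.

```python
# Minimal, illustrative MCP tool server (hypothetical example, not Copilot Studio code).
# Requires the official Python SDK:  pip install mcp
from mcp.server.fastmcp import FastMCP

# Server name shown to connecting agents; purely illustrative.
mcp = FastMCP("invoice-lookup")

@mcp.tool()
def get_invoice_total(invoice_id: str) -> float:
    """Return the total for an invoice (stubbed data for demonstration)."""
    fake_invoices = {"INV-1001": 249.99, "INV-1002": 87.50}
    return fake_invoices.get(invoice_id, 0.0)

if __name__ == "__main__":
    # Serve the tool over stdio so an MCP-capable agent can discover and call it.
    mcp.run()
```

An MCP-capable agent connecting to a process like this can list the available tools and invoke them by name, which is the general pattern that MCP support enables.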
Highlights of the ‘Computer Use’ Feature
According to a blog post from Charles Lamanna, a Microsoft executive focused on Business & Industry Copilot, “Computer use enables agents to interact with websites and desktop apps by clicking buttons, selecting menus, and typing into fields on the screen.” This lets agents handle tasks even when no API is available for a direct system connection: if a human can use a software application, the AI agent can too.
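Microsoft has not published implementation details, but the general shape of a screenshot-driven UI agent can be sketched abstractly. Everything in the snippet below, including the UiDriver protocol, the Action type, and the plan_next_action stub, is hypothetical and meant only to illustrate the observe-decide-act loop the quote describes; it is not Copilot Studio's API.

```python
# Conceptual observe-decide-act loop for a UI-driving agent.
# All names here are hypothetical illustrations, not Copilot Studio APIs.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0         # screen coordinates for clicks
    y: int = 0
    text: str = ""     # text to type, if any


class UiDriver(Protocol):
    """Whatever actually drives the browser or desktop application."""
    def screenshot(self) -> bytes: ...
    def click(self, x: int, y: int) -> None: ...
    def type_text(self, text: str) -> None: ...


def plan_next_action(task: str, screen: bytes) -> Action:
    """Stub for the model call that looks at the screen and picks an action."""
    raise NotImplementedError("a vision-capable model would decide here")


def run_task(task: str, driver: UiDriver, max_steps: int = 50) -> None:
    """Observe the screen, let the model choose an action, apply it, repeat."""
    for _ in range(max_steps):
        screen = driver.screenshot()          # observe the current UI state
        action = plan_next_action(task, screen)
        if action.kind == "done":
            return
        if action.kind == "click":
            driver.click(action.x, action.y)
        elif action.kind == "type":
            driver.type_text(action.text)
```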
Key Features of the Announcement
- Early Access Preview: The computer use feature allows Copilot Studio agents to interact with software interfaces through standard UI actions like clicking, typing, and navigating, even without APIs.
- Real-Time UI Adaptability: Agents can adapt to real-time changes in user interfaces to maintain continuity in their workflows.
- Cross-Browser and Desktop Support: Automation is supported across various platforms, including popular browsers like Edge, Chrome, and Firefox, as well as desktop applications.
- Hosted Infrastructure: All operations are hosted on Microsoft's infrastructure, ensuring that enterprise data stays within the Microsoft Cloud boundary and is not used to train models.
- Highlighted Use Cases:
  - Automating data entry into company systems.
  - Streamlining market research by extracting data from web sources.
  - Facilitating invoice data collection and integration with accounting systems.
- Improved Robotic Process Automation: The system offers more intelligent automation than traditional RPA, allowing agents to reason, react, and make decisions based on dynamic user interfaces (a conceptual sketch contrasting this with coordinate-based replay follows this list).
- No-Code UI Programming: Users can describe their tasks in natural language and watch the automation run in real time, with visual traces and UI previews.
- Full Activity Visibility: Users can view a history of activities performed by the computer use feature, including reasoning pathways and screenshots.
- Future Plans: Microsoft plans to showcase more features and updates at the upcoming Microsoft Build event in May 2025.
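To make the contrast with traditional, coordinate- or selector-based RPA concrete (see the “Improved Robotic Process Automation” item above), here is a small hypothetical sketch: rather than replaying a recorded click at a fixed position, the agent re-reads the current screen on every attempt and looks the control up by its visible label, so a shifted layout triggers a retry instead of a failure. The find_element_by_label helper and the driver object are invented for illustration.

```python
# Hypothetical illustration of UI-adaptive automation versus fixed-coordinate replay.
import time
from typing import Optional, Tuple


def find_element_by_label(screen: bytes, label: str) -> Optional[Tuple[int, int]]:
    """Stub: a vision model would locate the labeled control in the screenshot."""
    raise NotImplementedError


def click_submit(driver, retries: int = 3) -> bool:
    """Re-locate the 'Submit' button from a fresh screenshot on every attempt,
    so a moved or re-rendered button triggers another look instead of the
    broken flow a hard-coded (x, y) click in traditional RPA would produce."""
    for _ in range(retries):
        position = find_element_by_label(driver.screenshot(), "Submit")
        if position is not None:
            driver.click(*position)
            return True
        time.sleep(1)  # the UI may still be loading or re-rendering; look again
    return False
```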
For those interested in trying out this new feature, applications for the preview can be made here. Please note that a preview environment hosted in the U.S. is required, and you’ll need to provide your Tenant ID and Environment ID.
About the Author
David Ramel is a seasoned editor and writer at Converge 360.