Anthropic recently released a major update to Claude 3.5 Sonnet: it is no longer limited to text conversations. It can now view your screen, move the cursor, and press keys much as a person would, operating the computer on your behalf. If you still fill out forms by hand or copy and paste data between applications, this upgrade could change how you work. Let's look at how capable this new "computer use" feature really is and which scenarios it can handle.
How Does Claude Control a Computer Like a Human?
Anthropic built a dedicated API that lets Claude "see" the computer interface. In essence, Claude receives screenshots, works out where buttons and input fields are, and then generates commands to move the mouse, click, and type. Developers who integrate this API can ask Claude to perform tasks such as: "Open the Excel spreadsheet on my desktop, copy the numbers in column B into the web form, and submit it." Claude then inspects the screen step by step, moves the cursor, and operates the browser, much like an intern being directed remotely.
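The screenshot, decide, act cycle described above can be illustrated with a small simulation. This is only a sketch of the general loop, not Anthropic's API: `FakeDesktop`, `Action`, `run_agent`, and `scripted_model` are hypothetical names invented for this example, and the "model" is a fixed script standing in for Claude.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One command the model asks for (hypothetical format, for illustration)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop: records actions instead of driving a GUI."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real integration would return image data; a string is enough here.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x}, {action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def run_agent(desktop: FakeDesktop,
              choose_action: Callable[[str], Action],
              max_steps: int = 10) -> List[str]:
    """Repeat: take a screenshot, ask the model for an action, execute it."""
    for _ in range(max_steps):
        shot = desktop.screenshot()
        action = choose_action(shot)
        if action.kind == "done":
            break
        desktop.apply(action)
    return desktop.log


def scripted_model(script: List[Action]) -> Callable[[str], Action]:
    """Assumption: replays a fixed list of actions in place of a real model call."""
    remaining = iter(script)

    def choose(_screenshot: str) -> Action:
        return next(remaining, Action("done"))

    return choose


script = [Action("click", 120, 240), Action("type", text="42"), Action("done")]
log = run_agent(FakeDesktop(), scripted_model(script))
print(log)  # ['click(120, 240)', "type('42')"]
```

The `max_steps` cap matters in practice: an agent that misreads the screen can loop indefinitely, so the number of steps allowed per task is bounded here.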
In the OSWorld benchmark, which evaluates an AI's ability to use a computer, the updated Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, nearly double the 7.8% of the second-place Cradle BAAI. When allowed more steps per task, its score rises to 22%. That is still well behind the human baseline of over 70%, but on this benchmark it is currently the most "computer-literate" AI available.
Significant Coding Improvements: More Reliable Code Writing
Beyond computer control, the new Claude 3.5 Sonnet also shows clear gains in programming. On SWE-bench Verified, a benchmark measuring an AI's ability to solve real-world software problems, its score rose from 33.4% to 49%, surpassing all publicly available models, including OpenAI o1-preview. In its own testing, GitLab found that Claude's reasoning in multi-step software development tasks improved by up to 10%, with no increase in latency. In other words, asking it to write a complete web application module or debug complex code logic is now more dependable than before.

