Claude Computer Use Prompting: UI Targets & Action Sequences

Master Claude's computer use capability. Learn to describe UI targets, structure action sequences, specify error recovery, and prompt for reliable autonomous GUI operation.

January 14, 2026
Claude, Computer Use, GUI Automation, Autonomous Agents, Anthropic

Claude's computer use capability lets it view screenshots, move the mouse, click buttons, type text, scroll, and navigate interfaces, operating a computer much as a person would. This makes automation possible where traditional scripting struggles: legacy apps without APIs, complex multi-step workflows, and interfaces that change between runs.

But computer use prompting is a distinct discipline. You're not describing desired output — you're describing UI targets, action sequences, error conditions, and recovery strategies.

The Computer Use Prompt Structure

Every computer use prompt needs:

  1. Goal — What should be accomplished
  2. UI description — What the interface looks like (landmarks)
  3. Action sequence — Step-by-step operations
  4. Verification — How to confirm each step succeeded
  5. Error recovery — What to do when things go wrong
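
One way to make sure none of the five parts is forgotten is to assemble the prompt from a small structure instead of free-form text. The sketch below is a minimal illustration; the `ComputerUsePrompt` class, its field names, and the section headings it prints are assumptions made for this example and are not part of any Anthropic SDK.

```python
from dataclasses import dataclass, field


@dataclass
class ComputerUsePrompt:
    """Hypothetical container for the five parts of a computer use prompt."""
    goal: str
    ui_description: list[str] = field(default_factory=list)
    action_sequence: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)
    error_recovery: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that were left empty."""
        sections = {
            "goal": self.goal.strip(),
            "ui_description": self.ui_description,
            "action_sequence": self.action_sequence,
            "verification": self.verification,
            "error_recovery": self.error_recovery,
        }
        return [name for name, value in sections.items() if not value]

    def render(self) -> str:
        """Build the prompt text, refusing to render an incomplete prompt."""
        missing = self.missing_sections()
        if missing:
            raise ValueError(f"missing sections: {', '.join(missing)}")
        steps = [f"{i}. {s}" for i, s in enumerate(self.action_sequence, 1)]
        parts = [f"GOAL: {self.goal}", ""]
        parts += ["INTERFACE CONTEXT:", *[f"- {x}" for x in self.ui_description], ""]
        parts += ["ACTION SEQUENCE:", *steps, ""]
        parts += ["VERIFICATION:", *[f"- {x}" for x in self.verification], ""]
        parts += ["ERROR RECOVERY:", *[f"- {x}" for x in self.error_recovery]]
        return "\n".join(parts)
```

Calling `render()` on a prompt with an empty section raises a `ValueError` that names the gap, which is a cheap way to catch a missing error-recovery block before the prompt reaches the model.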

Basic Computer Use Prompt

GOAL: Download the Q3 financial report from the company intranet.

INTERFACE CONTEXT:
- You're starting at the intranet home page (https://intranet.company.com)
- The navigation bar is at the top with links: Home, Documents, HR, Finance, IT
- The Finance section contains quarterly reports

ACTION SEQUENCE:
1. Look at the navigation bar. Click "Finance."
2. On the Finance page, find the "Quarterly Reports" section.
3. Look for "Q3 2025 Financial Report" — it should be a PDF link.
4. Click the download icon next to the report name.
5. Wait for the download to start (look for browser download indicator).

VERIFICATION:
- After step 1: The URL should change to /finance
- After step 4: A download notification should appear in the browser

ERROR RECOVERY:
- If "Finance" link is not visible: scroll back to the top of the page and look again
- If "Q3 2025" report is not listed: check if it's under "Archived Reports" or "2025 Reports"
- If download doesn't start: check if a popup blocker notification appeared
- If you're asked to log in: STOP and report — you don't have credentials

Describing UI Targets

Claude works from screenshots, so it sees rendered pixels rather than the page's underlying markup or element IDs. Describe targets by what is visible: label text, color, shape, and position.

Good UI Target Descriptions

"Click the blue 'Submit' button in the bottom-right corner of the form."
"Type in the input field labeled 'Email address' — it has a gray placeholder text."
"Click the checkbox next to 'I agree to terms' — it's below the main form."
"Click the dropdown menu that currently says 'Select country' and choose 'Canada'."
"Scroll down until you see the section titled 'Payment History' with a table of transactions."

Poor UI Target Descriptions

"Click the submit button." (Which one? Where?)
"Click element #submit-btn." (Claude doesn't see CSS selectors)
"Type the email." (Where? What field?)
"Click the third button." (Which order? Left to right? Top to bottom?)

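The difference between the good and poor descriptions above can be approximated with a rough lint pass over each target description before it goes into a prompt. This is a heuristic sketch only: the regular expressions, the word lists, and the `lint_target` function are assumptions chosen for illustration, and they will miss cases or flag acceptable wording.

```python
import re

# Heuristic patterns. These rules are illustrative assumptions, not a standard.
SELECTOR = re.compile(r"(?<!\w)[#.][A-Za-z][\w-]*")
ORDINAL = re.compile(
    r"\b(first|second|third|fourth|fifth|\d+(st|nd|rd|th))\s+"
    r"(button|link|field|icon|tab|checkbox)\b",
    re.IGNORECASE,
)
QUOTED = re.compile(r"""["'][^"']{2,}["']""")
LOCATION = re.compile(
    r"\b(top|bottom|left|right|corner|above|below|next to|beside|"
    r"labeled|titled|sidebar|header|footer)\b",
    re.IGNORECASE,
)


def lint_target(description: str) -> list[str]:
    """Return short problem codes for a UI target description."""
    problems = []
    if SELECTOR.search(description):
        problems.append("selector")   # CSS-style selector, not a visible cue
    if ORDINAL.search(description):
        problems.append("ordinal")    # "third button" with no stated order
    if not (QUOTED.search(description) or LOCATION.search(description)):
        problems.append("no-anchor")  # no visible label and no position
    return problems
```

An empty result does not prove a description is good; it only means none of these three common problems was detected.
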
UI Landmarking

Before a complex task, have Claude identify landmarks:

First, scan this page and identify the following landmarks:
- Navigation areas (top bar, sidebar, tabs)
- Main content area
- Any forms or input fields
- Buttons visible on screen (list their text and approximate location)
- Any error messages or notifications currently showing

Then report what you see before taking any action.

Action Sequence Patterns

The Checkpoint Pattern

Insert verification after every significant action:

1. Navigate to https://app.example.com/login
   VERIFY: Login page is displayed with email and password fields

2. Type the account's email address in the email field
   VERIFY: Email appears in the field

3. Type the password in the password field
   VERIFY: Password field shows dots (not plaintext)

4. Click "Sign In" button
   VERIFY: Either dashboard loads (success) OR error message appears

5. If error: [recovery steps]
   If success: [continue]
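
If your own code drives the screenshot-and-action loop, the same checkpoint idea can be mirrored on the harness side so that a failed verification halts the run instead of relying on the model alone. The sketch below assumes such a harness exists; `Checkpoint`, `run_with_checkpoints`, and the `act` and `verify` callables are hypothetical names, and the callables stand in for whatever your environment uses to perform an action and inspect the resulting screen.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Checkpoint:
    description: str
    act: Callable[[], None]     # performs the action (hypothetical harness hook)
    verify: Callable[[], bool]  # True when the expected state is visible


def run_with_checkpoints(steps: list[Checkpoint]) -> tuple[bool, list[str]]:
    """Run steps in order and stop at the first failed verification."""
    log = []
    for number, step in enumerate(steps, start=1):
        step.act()
        if not step.verify():
            log.append(f"{number}. {step.description}: VERIFY FAILED")
            return False, log
        log.append(f"{number}. {step.description}: ok")
    return True, log
```

Stopping at the first failure keeps later steps from running against an unexpected screen, which is the point of checking after every significant action.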

The Branching Pattern

Navigate to the user settings page and change the theme to "Dark."

PATH A — Normal flow:
1. Click the user avatar in the top-right corner
2. Click "Settings" in the dropdown
3. Click "Appearance" tab
4. Select "Dark" from the theme options
5. Click "Save changes"
6. VERIFY: "Settings saved" confirmation appears

PATH B — If "Settings" is not in dropdown:
- The user might not have settings access
- Report: "Settings option not available for this account"

PATH C — If "Appearance" tab is not visible:
- Scroll down in the settings page
- If still not found, check if it's under "Display" or "Theme" instead

PATH D — If "Save changes" button doesn't appear:
- The theme might auto-save — look for "Theme updated" toast/indicator
- If no indicator, take a screenshot and report the ambiguity

Error Recovery Patterns

The Stale Screenshot Problem

A screenshot can be stale by the time Claude acts on it, because the interface may have changed after the capture:

Before EVERY action:
1. Take a fresh screenshot
2. Compare it to what you expect to see
3. If the screen looks different from expected, describe the difference
4. If the difference is minor (e.g., a notification appeared), proceed
5. If the difference is major (e.g., you're on the wrong page), pause and reassess
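
A harness can back up this instruction by refusing to act when the screen has changed since the action was planned. The sketch below assumes screenshots are available as raw bytes and that `capture` and `act` are callables supplied by your own environment; those names, like `act_if_fresh`, are hypothetical. Hashing is byte-exact, so any pixel change (a blinking cursor, a clock) counts as a change, and judging whether a difference is minor or major is still left to the model as described in the steps above.

```python
import hashlib
from typing import Callable


def fingerprint(screenshot_bytes: bytes) -> str:
    """Stable digest of a screenshot's raw bytes."""
    return hashlib.sha256(screenshot_bytes).hexdigest()


def is_stale(planned_against: bytes, fresh: bytes) -> bool:
    """True when the screen changed since the action was planned."""
    return fingerprint(planned_against) != fingerprint(fresh)


def act_if_fresh(
    planned_against: bytes,
    capture: Callable[[], bytes],  # hypothetical: returns a new screenshot
    act: Callable[[], None],       # hypothetical: performs the planned action
) -> bool:
    """Take a fresh screenshot and act only if the screen is unchanged."""
    fresh = capture()
    if is_stale(planned_against, fresh):
        return False  # caller should show the fresh screenshot to the model
    act()
    return True
```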

The Infinite Loop Prevention

If you attempt the same action 3 times without success:
1. STOP attempting that action
2. Describe what you're trying to do and what's happening
3. Suggest 3 alternative approaches
4. WAIT for human guidance before trying any alternative
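
The three-attempt rule can also be enforced outside the prompt with a simple counter, so a loop is cut off even if the model ignores the instruction. This is a minimal sketch; `RetryGuard` and its method names are hypothetical, and how you identify "the same action" (here, a plain string key) is an assumption you would adapt to your own action format.

```python
from collections import Counter


class RetryGuard:
    """Counts failed attempts per action and signals when to stop."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.failures = Counter()

    def record_failure(self, action: str) -> bool:
        """Record a failed attempt; return True when the action must stop."""
        self.failures[action] += 1
        return self.failures[action] >= self.max_attempts

    def record_success(self, action: str) -> None:
        """Clear the failure count once the action has worked."""
        self.failures.pop(action, None)
```

When `record_failure` returns `True`, the harness would stop retrying and hand control to a person, matching steps 1 and 4 above.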

The Recovery Prompt Template

Something went wrong. The expected result was: [expected].
The actual result was: [actual — describe what you see].

Analyze what might have happened:
- Wrong page/state? Navigate back to [correct page]
- Element not found? Look for similar elements with different text/labels
- Popup/overlay blocking? Try pressing Escape or looking for close buttons
- Loading/processing? Wait 3 seconds and re-check

If you can identify the likely cause, attempt ONE fix. If the fix doesn't work,
report the situation and WAIT.
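
Filling this template by hand invites half-completed prompts, so it can help to generate it from the three values that change each time. The sketch below restates the template as a Python format string; `build_recovery_prompt` and its parameter names are hypothetical.

```python
RECOVERY_TEMPLATE = """Something went wrong. The expected result was: {expected}.
The actual result was: {actual}.

Analyze what might have happened:
- Wrong page/state? Navigate back to {correct_page}
- Element not found? Look for similar elements with different text/labels
- Popup/overlay blocking? Try pressing Escape or looking for close buttons
- Loading/processing? Wait 3 seconds and re-check

If you can identify the likely cause, attempt ONE fix. If the fix doesn't work,
report the situation and WAIT."""


def build_recovery_prompt(expected: str, actual: str, correct_page: str) -> str:
    """Fill the recovery template, rejecting empty values."""
    values = {"expected": expected, "actual": actual, "correct_page": correct_page}
    for name, value in values.items():
        if not value.strip():
            raise ValueError(f"{name} must not be empty")
    return RECOVERY_TEMPLATE.format(**values)
```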

Full Computer Use Prompt Template

TASK: [One-sentence description of what to accomplish]

STARTING STATE:
- Current URL: [if known]
- What should be visible: [describe initial screen]
- You are logged in as: [role/permissions if relevant]

ACTION PLAN:
1. [Step 1]
   EXPECT: [what you should see after this step]
2. [Step 2]
   EXPECT: [what you should see after this step]
3. [Step 3]
   EXPECT: [what you should see after this step]

VERIFICATION:
- Success looks like: [concrete success indicators]
- Take a screenshot of the final state

ERROR HANDLING:
- If [common problem]: [recovery action]
- If screen doesn't match expectations at any step: PAUSE and describe what you see
- After 3 failed attempts at the same step: STOP and report

CONSTRAINTS:
- DO NOT: [actions to avoid — e.g., delete, submit without confirmation]
- STOP if: [conditions that require human intervention]
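
Because every slot in this template is written as bracketed text, a quick check for leftover brackets can catch a draft that was sent before it was fully filled in. This is a small sketch under that assumption; `unfilled_placeholders` is a hypothetical helper, and it will also flag any legitimate bracketed text in your prompt.

```python
import re

# Matches single-line bracketed text such as "[Step 1]" or "[if known]".
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\]")


def unfilled_placeholders(prompt: str) -> list[str]:
    """Return bracketed placeholders still present in a prompt draft."""
    return PLACEHOLDER.findall(prompt)
```

An empty list means no bracketed slots remain; it says nothing about whether the filled-in content is specific enough.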

Note:

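Common Pitfall: Relying on keyboard shortcuts without saying what should happen on screen. Claude's computer use tool can send key presses and key combinations, but a shortcut such as Ctrl+F only works when the right window has focus and the application binds that shortcut, and Claude cannot tell whether it worked until the next screenshot. When you include a shortcut, also describe the visible result to expect (for example, "a find bar appears in the browser window") and give a click-based fallback. Describe UI interactions and their visible outcomes, not abstract commands.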

  • Human-in-the-Loop Patterns — Before deploying any computer use workflow, implement safety checkpoints. The risk-based autonomy model is essential reading.