v1.0.0

Desktop Control

matagul

Advanced desktop automation with mouse, keyboard, and screen control

Downloads: 7.0k | Stars: 49 | Versions: 1 | Updated: 2026-02-23

Install

npx clawhub@latest install desktop-control

Documentation

---

description: Advanced desktop automation with mouse, keyboard, and screen control

---

Desktop Control Skill

A desktop automation skill for OpenClaw. It provides mouse control at exact pixel coordinates, keyboard input at configurable speed, screen capture, window management, and clipboard operations.

🎯 Features

Mouse Control

  • Absolute positioning - Move to exact coordinates
  • Relative movement - Move from current position
  • Smooth movement - Natural, human-like mouse paths
  • Click types - Left, right, middle, double, triple clicks
  • Drag & drop - Drag from point A to point B
  • Scroll - Vertical and horizontal scrolling
  • Position tracking - Get current mouse coordinates

Keyboard Control

  • Text typing - Fast, accurate text input
  • Hotkeys - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
  • Special keys - Enter, Tab, Escape, Arrow keys, F-keys
  • Key combinations - Multi-key press combinations
  • Hold & release - Manual key state control
  • Typing speed - Configurable WPM (instant to human-like)

Screen Operations

  • Screenshot - Capture entire screen or regions
  • Image recognition - Find elements on screen (via OpenCV)
  • Color detection - Get pixel colors at coordinates
  • Multi-monitor - Support for multiple displays

Window Management

  • Window list - Get all open windows
  • Activate window - Bring window to front
  • Window info - Get position, size, title
  • Minimize/Maximize - Control window states

Safety Features

  • Failsafe - Move mouse to corner to abort
  • Pause control - Emergency stop mechanism
  • Approval mode - Require confirmation for actions
  • Bounds checking - Prevent out-of-screen operations
  • Logging - Track all automation actions

---

🚀 Quick Start

Installation

First, install required dependencies:

```bash
pip install pyautogui pillow opencv-python pygetwindow
```

Basic Usage

```python
from skills.desktop_control import DesktopController

# Initialize controller
dc = DesktopController(failsafe=True)

# Mouse operations
dc.move_mouse(500, 300)  # Move to coordinates
dc.click()  # Left click at current position
dc.click(100, 200, button="right")  # Right click at position

# Keyboard operations
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c")  # Copy
dc.press("enter")

# Screen operations
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```

---

📋 Complete API Reference

Mouse Functions

#### move_mouse(x, y, duration=0, smooth=True)

Move mouse to absolute screen coordinates.

Parameters:
  • x (int): X coordinate (pixels from left)
  • y (int): Y coordinate (pixels from top)
  • duration (float): Movement time in seconds (0 = instant, 0.5 = smooth)
  • smooth (bool): Use a Bezier curve for natural movement

Example:

```python
# Instant movement
dc.move_mouse(1000, 500)

# Smooth 1-second movement
dc.move_mouse(1000, 500, duration=1.0)
```
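The smooth parameter is described as using a Bezier curve, but the source does not say how the curve is built. As a rough illustration only, the sketch below samples a quadratic Bezier curve between two points. The names bezier_path, control, and steps are hypothetical and are not part of the skill's API.

```python
def bezier_path(start, end, control, steps=20):
    """Return steps + 1 integer points along a quadratic Bezier curve.

    start, end, control are (x, y) tuples; control bends the path.
    This is an illustrative helper, not the skill's implementation.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1), (cx, cy) = start, end, control
    points = []
    for i in range(steps + 1):
        t = i / steps
        # Quadratic Bezier weights for start, control, and end points
        a = (1 - t) ** 2
        b = 2 * (1 - t) * t
        c = t ** 2
        points.append((round(a * x0 + b * cx + c * x1),
                       round(a * y0 + b * cy + c * y1)))
    return points


path = bezier_path((100, 100), (1000, 500), control=(400, 50), steps=10)
```

Each point in path could then be passed to an instant move_mouse(x, y) call with a short delay between calls, which approximates a curved path of the requested duration.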

#### move_relative(x_offset, y_offset, duration=0)

Move mouse relative to current position.

Parameters:
  • x_offset (int): Pixels to move horizontally (positive = right)
  • y_offset (int): Pixels to move vertically (positive = down)
  • duration (float): Movement time in seconds

Example:

```python
# Move 100px right, 50px down
dc.move_relative(100, 50, duration=0.3)
```

#### click(x=None, y=None, button='left', clicks=1, interval=0.1)

Perform mouse click.

Parameters:
  • x, y (int, optional): Coordinates to click (None = current position)
  • button (str): 'left', 'right', 'middle'
  • clicks (int): Number of clicks (1 = single, 2 = double)
  • interval (float): Delay between multiple clicks

Example:

```python
# Simple left click
dc.click()

# Double-click at specific position
dc.click(500, 300, clicks=2)

# Right-click
dc.click(button='right')
```

#### drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')

Drag and drop operation.

Parameters:
  • start_x, start_y (int): Starting coordinates
  • end_x, end_y (int): Ending coordinates
  • duration (float): Drag duration in seconds
  • button (str): Mouse button to use

Example:

```python
# Drag file from desktop to folder
dc.drag(100, 100, 500, 500, duration=1.0)
```

#### scroll(clicks, direction='vertical', x=None, y=None)

Scroll mouse wheel.

Parameters:
  • clicks (int): Scroll amount (positive = up/left, negative = down/right)
  • direction (str): 'vertical' or 'horizontal'
  • x, y (int, optional): Position to scroll at

Example:

```python
# Scroll down 5 clicks
dc.scroll(-5)

# Scroll up 10 clicks
dc.scroll(10)

# Horizontal scroll
dc.scroll(5, direction='horizontal')
```

#### get_mouse_position()

Get current mouse coordinates.

Returns: (x, y) tuple

Example:

```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```

---

Keyboard Functions

#### type_text(text, interval=0, wpm=None)

Type text with configurable speed.

Parameters:
  • text (str): Text to type
  • interval (float): Delay between keystrokes in seconds (0 = instant)
  • wpm (int, optional): Words per minute (overrides interval)

Example:

```python
# Instant typing
dc.type_text("Hello World")

# Human-like typing at 60 WPM
dc.type_text("Hello World", wpm=60)

# Slow typing with 0.1s between keys
dc.type_text("Hello World", interval=0.1)
```
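The source says wpm overrides interval but does not state the conversion. One plausible reading, assuming the common convention of five characters per word, is sketched below. The function name wpm_to_interval and the chars_per_word default are assumptions, not documented behavior of the skill.

```python
def wpm_to_interval(wpm, chars_per_word=5):
    """Return the per-keystroke delay in seconds for a target typing speed.

    Assumes one "word" is chars_per_word keystrokes (5 is a common
    convention; the skill's actual conversion is not documented).
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60.0 / (wpm * chars_per_word)


delay = wpm_to_interval(60)  # 60 WPM at 5 chars per word is 300 keystrokes per minute
```

Under this assumption, type_text(text, wpm=60) would behave roughly like type_text(text, interval=0.2).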

#### press(key, presses=1, interval=0.1)

Press and release a key.

Parameters:
  • key (str): Key name (see Key Names section)
  • presses (int): Number of times to press
  • interval (float): Delay between presses

Example:

```python
# Press Enter
dc.press('enter')

# Press Space 3 times
dc.press('space', presses=3)

# Press Down arrow
dc.press('down')
```

#### hotkey(*keys, interval=0.05)

Execute keyboard shortcut.

Parameters:
  • *keys (str): Keys to press together
  • interval (float): Delay between key presses

Example:

```python
# Copy (Ctrl+C)
dc.hotkey('ctrl', 'c')

# Paste (Ctrl+V)
dc.hotkey('ctrl', 'v')

# Open Run dialog (Win+R)
dc.hotkey('win', 'r')

# Save (Ctrl+S)
dc.hotkey('ctrl', 's')

# Select All (Ctrl+A)
dc.hotkey('ctrl', 'a')
```

#### key_down(key) / key_up(key)

Manually control key state.

Example:

```python
# Hold Shift
dc.key_down('shift')
dc.type_text("hello")  # Types "HELLO"
dc.key_up('shift')

# Hold Ctrl and click (for multi-select)
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```

---

Screen Functions

#### screenshot(region=None, filename=None)

Capture screen or region.

Parameters:
  • region (tuple, optional): (left, top, width, height) for partial capture
  • filename (str, optional): Path to save image

Returns: PIL Image object

Example:

```python
# Full screen
img = dc.screenshot()

# Save to file
dc.screenshot(filename="screenshot.png")

# Capture specific region
img = dc.screenshot(region=(100, 100, 500, 300))
```
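Note that region uses (left, top, width, height), whereas Pillow's Image.crop expects a box of (left, upper, right, lower). If a full screenshot needs to be cropped afterwards, a small conversion like the following may help. The helper name region_to_box is hypothetical.

```python
def region_to_box(region):
    """Convert (left, top, width, height) to a (left, upper, right, lower) box."""
    left, top, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (left, top, left + width, top + height)


box = region_to_box((100, 100, 500, 300))  # (100, 100, 600, 400)
```

With a PIL Image named img, img.crop(box) would then return the same area that the region tuple describes.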

#### get_pixel_color(x, y)

Get color of pixel at coordinates.

Returns: RGB tuple (r, g, b)

Example:

```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```
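Exact color comparisons can be brittle when anti-aliasing or compression shifts a channel by a few units. The sketch below compares two RGB tuples with a per-channel tolerance. color_matches is a hypothetical helper and is not part of the skill.

```python
def color_matches(actual, expected, tolerance=0):
    """Return True if every RGB channel differs by at most tolerance."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


# Near-red pixel compared against pure red with a tolerance of 5 per channel
is_red = color_matches((250, 3, 0), (255, 0, 0), tolerance=5)
```

It could be used as color_matches(dc.get_pixel_color(500, 300), (255, 0, 0), tolerance=5).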

#### find_on_screen(image_path, confidence=0.8)

Find image on screen (requires OpenCV).

Parameters:
  • image_path (str): Path to template image
  • confidence (float): Match threshold (0-1)

Returns: (x, y, width, height) or None

Example:

```python
# Find button on screen
location = dc.find_on_screen("button.png")
if location:
    x, y, w, h = location
    # Click center of found image
    dc.click(x + w // 2, y + h // 2)
```
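Because find_on_screen returns None when nothing matches, a UI element that is still loading will be missed by a single call. A minimal polling sketch is shown below. wait_for_match and center_of are hypothetical helpers, and the finder is passed in as a callable so the sketch does not depend on the skill itself.

```python
import time


def wait_for_match(find, timeout=5.0, poll_interval=0.25):
    """Call find() until it returns a non-None location or timeout elapses.

    find is any zero-argument callable returning (x, y, w, h) or None.
    Returns the location, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        location = find()
        if location is not None:
            return location
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


def center_of(location):
    """Return the center point of an (x, y, width, height) box."""
    x, y, w, h = location
    return (x + w // 2, y + h // 2)
```

A call might look like wait_for_match(lambda: dc.find_on_screen("button.png"), timeout=10), followed by dc.click(*center_of(location)) when a location is returned.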

#### get_screen_size()

Get screen resolution.

Returns: (width, height) tuple

Example:

```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```

---

Window Functions

#### get_all_windows()

List all open windows.

Returns: List of window titles

Example:

```python
windows = dc.get_all_windows()
for title in windows:
    print(f"Window: {title}")
```

#### activate_window(title_substring)

Bring window to front by title.

Parameters:
  • title_substring (str): Part of window title to match

Example:

```python
# Activate Chrome
dc.activate_window("Chrome")

# Activate VS Code
dc.activate_window("Visual Studio Code")
```
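The source does not say whether title matching is case-sensitive or what happens when several windows match. To check in advance which windows a substring would hit, the titles from get_all_windows() can be filtered with a sketch like the one below. match_titles and its case_sensitive flag are hypothetical, and the case-insensitive default is an assumption.

```python
def match_titles(titles, title_substring, case_sensitive=False):
    """Return the titles that contain title_substring, in original order."""
    if case_sensitive:
        return [t for t in titles if title_substring in t]
    needle = title_substring.lower()
    return [t for t in titles if needle in t.lower()]


titles = ["Inbox - Google Chrome", "main.py - Visual Studio Code", "Calculator"]
matches = match_titles(titles, "chrome")  # ["Inbox - Google Chrome"]
```

If matches contains more than one title, passing a longer substring to activate_window makes the target unambiguous.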

#### get_active_window()

Get currently focused window.

Returns: Window title (str)

Example:

```python
active = dc.get_active_window()
print(f"Active window: {active}")
```

---

Clipboard Functions

#### copy_to_clipboard(text)

Copy text to clipboard.

Example:

```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```

#### get_from_clipboard()

Get text from clipboard.

Returns: str

Example:

```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```

---

⌨️ Key Names Reference

Alphabet Keys

'a' through 'z'

Number Keys

'0' through '9'

Function Keys

'f1' through 'f24'

Special Keys

  • 'enter' / 'return'
  • 'esc' / 'escape'
  • 'space' / 'spacebar'
  • 'tab'
  • 'backspace'
  • 'delete' / 'del'
  • 'insert'
  • 'home'
  • 'end'
  • 'pageup' / 'pgup'
  • 'pagedown' / 'pgdn'

Arrow Keys

  • 'up' / 'down' / 'left' / 'right'

Modifier Keys

  • 'ctrl' / 'control'
  • 'shift'
  • 'alt'
  • 'win' / 'winleft' / 'winright'
  • 'cmd' / 'command' (Mac)

Lock Keys

  • 'capslock'
  • 'numlock'
  • 'scrolllock'

Punctuation

  • '.' / ',' / '?' / '!' / ';' / ':'
  • '[' / ']' / '{' / '}'
  • '(' / ')'
  • '+' / '-' / '*' / '/' / '='

---

🛡️ Safety Features

Failsafe Mode

Move mouse to any corner of the screen to abort all automation.

```python
# Enable failsafe (enabled by default)
dc = DesktopController(failsafe=True)
```

Pause Control

```python
# Pause all automation for 2 seconds
dc.pause(2.0)

# Check if automation is safe to proceed
if dc.is_safe():
    dc.click(500, 500)
```

Approval Mode

Require user confirmation before actions:

```python
dc = DesktopController(require_approval=True)

# This will ask for confirmation
dc.click(500, 500)  # Prompt: "Allow click at (500, 500)? [y/n]"
```
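How the skill implements the prompt is not shown in the source. Purely as an illustration of the pattern, the sketch below gates an arbitrary action behind a yes/no question. with_approval and its ask parameter are hypothetical names; ask defaults to the built-in input so the prompt can be replaced in tests.

```python
def with_approval(action, description, ask=input):
    """Run action() only if the user answers yes to the prompt.

    Returns the action's result, or None if the user declines.
    """
    answer = ask(f"Allow {description}? [y/n] ")
    if answer.strip().lower() in ("y", "yes"):
        return action()
    return None


# Example with a stubbed prompt that always approves
log = []
with_approval(lambda: log.append("clicked"), "click at (500, 500)",
              ask=lambda prompt: "y")
```

Passing the prompt function as a parameter keeps the gate testable without a terminal and lets a GUI dialog be swapped in later.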

---

🎨 Advanced Examples

Example 1: Automated Form Filling

```python
dc = DesktopController()

# Click name field
dc.click(300, 200)
dc.type_text("John Doe", wpm=80)

# Tab to next field
dc.press('tab')
dc.type_text("john@example.com", wpm=80)

# Tab to password
dc.press('tab')
dc.type_text("SecurePassword123", wpm=60)

# Submit form
dc.press('enter')
```

Example 2: Screenshot Region and Save

```python
import datetime

# Capture specific area
region = (100, 100, 800, 600)  # left, top, width, height
img = dc.screenshot(region=region)

# Save with timestamp
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
img.save(f"capture_{timestamp}.png")
```

Example 3: Multi-File Selection

```python
# Hold Ctrl and click multiple files
dc.key_down('ctrl')
dc.click(100, 200)  # First file
dc.click(100, 250)  # Second file
dc.click(100, 300)  # Third file
dc.key_up('ctrl')

# Copy selected files
dc.hotkey('ctrl', 'c')
```

Example 4: Window Automation

```python
import time

# Activate Calculator
dc.activate_window("Calculator")
time.sleep(0.5)  # Give the window time to take focus

# Type calculation
dc.type_text("5+3=", interval=0.2)
time.sleep(0.5)

# Take screenshot of result
dc.screenshot(filename="calculation_result.png")
```

Example 5: Drag & Drop File

```python
# Drag file from source to destination
dc.drag(
    start_x=200, start_y=300,  # File location
    end_x=800, end_y=500,      # Folder location
    duration=1.0,              # Smooth 1-second drag
)
```

---

⚡ Performance Tips

1. Use instant movements for speed: duration=0

2. Batch operations instead of individual calls

3. Cache screen positions instead of recalculating

4. Disable failsafe for maximum performance (use with caution)

5. Use hotkeys instead of menu navigation

---

⚠️ Important Notes

  • Screen coordinates start at (0, 0) in the top-left corner
  • Multi-monitor setups may have negative coordinates for secondary displays
  • Windows DPI scaling may affect coordinate accuracy
  • Failsafe corners are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)
  • Some applications may block simulated input (games, secure apps)
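The corner list and the coordinate origin described above can be expressed as small pure functions. This is a sketch for a single display only, and the helper names are hypothetical. It does not handle the negative coordinates that secondary monitors may use, so clamping as written would wrongly pull such points onto the primary screen.

```python
def failsafe_corners(width, height):
    """Return the four failsafe corner points for a width x height screen."""
    return [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]


def in_failsafe_corner(x, y, width, height):
    """Return True if (x, y) is exactly one of the failsafe corners."""
    return (x, y) in failsafe_corners(width, height)


def clamp_to_screen(x, y, width, height):
    """Clamp (x, y) into [0, width - 1] x [0, height - 1] (single display)."""
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))


# On a 1920x1080 display the top-right pixel is a failsafe corner
hit = in_failsafe_corner(1919, 0, 1920, 1080)
```

With dimensions from get_screen_size(), a check like in_failsafe_corner can help a script avoid clicking targets that would trip the failsafe.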

---

🔧 Troubleshooting

Mouse not moving to correct position

  • Check DPI scaling settings
  • Verify screen resolution matches expectations
  • Use get_screen_size() to confirm dimensions

Keyboard input not working

  • Ensure target application has focus
  • Some apps require admin privileges
  • Try increasing interval for reliability

Failsafe triggering accidentally

  • Increase screen border tolerance
  • Move mouse away from corners during normal use
  • Disable if needed: DesktopController(failsafe=False)

Permission errors

  • Run Python with administrator privileges for some operations
  • Some secure applications block automation

---

📦 Dependencies

  • PyAutoGUI - Core automation engine
  • Pillow - Image processing
  • OpenCV (optional) - Image recognition
  • PyGetWindow - Window management

Install all:

```bash
pip install pyautogui pillow opencv-python pygetwindow
```

---

Built for OpenClaw - The ultimate desktop automation companion 🦞
