peekaboo

macOS screenshot capture, AI visual analysis, and GUI automation CLI

Description

Peekaboo is a macOS command-line tool and MCP server that enables AI agents to interact with the GUI through screen capture, visual question-answering, and programmatic automation. It supports pixel-accurate screenshots, structured menu extraction, window management, click/type/scroll actions, and natural language task execution via multiple AI providers.

Install

homebrew: brew install steipete/tap/peekaboo
npx: npx -y @steipete/peekaboo

AI Summary

macOS GUI automation tool providing pixel-accurate screen capture, AI-powered visual analysis, and 25+ automation commands for clicks, typing, window management, and natural language task execution.

Capabilities

  • Capture pixel-accurate screenshots of windows, screens, and menu bars with optional Retina 2x scaling
  • Visual question-answering using multiple AI providers (OpenAI, Anthropic, Google, xAI, Ollama)
  • GUI automation: click, type, press keys, hotkey combos, scroll, swipe, drag-and-drop
  • Window management: list, move, resize, focus windows and Spaces
  • Application control: launch, quit, relaunch, switch apps
  • Structured JSON extraction of menus and menu bar items without clicking
  • Natural language task execution for complex multi-step workflows
  • MCP server mode for integration with Claude Desktop, Cursor, and other MCP clients
  • JSON automation script execution for reproducible workflows

Use When

  • An AI agent needs to see and interact with the macOS GUI
  • Automating multi-step GUI workflows that cannot be done via CLI alone
  • Capturing screenshots for documentation or testing
  • Querying screen content with visual AI (e.g., reading dialog text, finding buttons)
  • Integrating macOS GUI control into MCP-based agent workflows

Avoid When

  • Running on non-macOS platforms
  • The task can be accomplished through existing CLI tools without GUI interaction
  • Running in headless CI environments without a display
  • Running on a macOS version older than 15.0 (Sequoia)

Usage Patterns

Capture a screenshot

peekaboo image --mode screen --retina --path ~/Desktop/screen.png

Takes a Retina-resolution screenshot of the full screen and saves it
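
When captures are taken repeatedly (for documentation or testing), it can help to generate a distinct output path per run. A minimal sketch, assuming a hypothetical helper named capture_path that is not part of peekaboo; the peekaboo flags in the usage comment are only the ones shown above:

```shell
# capture_path is a hypothetical helper, not a peekaboo feature.
# Prints DIR/screen-YYYYmmdd-HHMMSS.png; DIR defaults to /tmp.
capture_path() {
  local dir="${1:-/tmp}"
  printf '%s/screen-%s.png\n' "$dir" "$(date +%Y%m%d-%H%M%S)"
}

# Usage on macOS with peekaboo installed:
#   peekaboo image --mode screen --retina --path "$(capture_path "$HOME/Desktop")"
```

The timestamp has one-second resolution, so two captures within the same second would still share a path.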

Click a button by label

peekaboo click --on "OK" --snapshot latest

Clicks the button labeled OK using the latest UI snapshot

Execute a natural language task

peekaboo "Open Notes and create a TODO list with three items"

Uses AI to interpret and execute a multi-step GUI workflow

List all windows

peekaboo windows list --json

Returns JSON listing of all open windows with positions and sizes

Input / Output

stdin: Not typically used
stdout: JSON snapshots, screenshots (file paths), execution results
stderr: Status and diagnostic messages
Exit codes:
0 Success
1 Error
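
Because the exit code only distinguishes success (0) from error (1), a script that chains several steps may want to label which step failed. A minimal sketch, assuming a hypothetical wrapper named run_step that is not part of peekaboo; it runs any command, reports a non-zero exit on stderr, and passes the code through:

```shell
# run_step is a hypothetical helper, not part of peekaboo.
# Usage: run_step LABEL COMMAND [ARGS...]
run_step() {
  local label="$1"
  shift
  "$@"
  local rc=$?
  if [ "$rc" -ne 0 ]; then
    # Report on stderr so stdout stays clean for JSON or file paths.
    echo "step failed (${label}): exit ${rc}" >&2
  fi
  return "$rc"
}

# Usage on macOS with peekaboo installed (commands taken from this page):
#   run_step "capture" peekaboo image --mode screen --path /tmp/screen.png &&
#     run_step "click OK" peekaboo click --on "OK" --snapshot latest
```

Chaining with && stops the workflow at the first failing step.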

Typical Pipelines

peekaboo image --mode screen --path /tmp/screen.png && summarize /tmp/screen.png
peekaboo windows list --json | jq ".[] | select(.app == \"Safari\")"
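
The jq filter above embeds the app name inside escaped quotes. A sketch that passes the name through jq's --arg option instead, assuming (as that filter implies) the JSON is a top-level array of objects with an app field; windows_for_app is a hypothetical helper, not a peekaboo command:

```shell
# windows_for_app is a hypothetical helper; it reads window JSON on stdin.
# Assumes a top-level array of objects that each carry an "app" field.
windows_for_app() {
  jq --arg app "$1" '.[] | select(.app == $app)'
}

# Usage on macOS with peekaboo and jq installed:
#   peekaboo windows list --json | windows_for_app "Safari"
```

Passing the name as a jq variable avoids nested shell quoting and keeps app names with spaces or quotes intact.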
