Jay Taylor's notes

steipete/Peekaboo: Peekaboo is a macOS CLI & optional MCP server that enables AI agents to capture screenshots of applications, or the entire system, with optional visual question answering through local or remote AI models.

Original source (github.com)
Tags: macOS computer-use github.com
Clipped on: 2025-12-04

steipete/Peekaboo

Peekaboo - Mac automation that sees the screen and does the clicks.

Peekaboo brings high-fidelity screen capture, AI analysis, and complete GUI automation to macOS. Version 3 adds native agent flows and multi-screen automation across the CLI and MCP server.

Note: v3 is currently in beta (3.0.0-beta1) and has a few known issues; see the changelog for details.

What you get

  • Pixel-accurate captures (windows, screens, menu bar) with optional Retina 2x scaling.
  • Natural-language agent that chains Peekaboo tools (see, click, type, scroll, hotkey, menu, window, app, dock, space).
  • Menu and menubar discovery with structured JSON; no clicks required.
  • Multi-provider AI: GPT-5.1 family, Claude 4.x, Grok 4-fast (vision), Gemini 2.5, and local Ollama models.
  • MCP server for Claude Desktop and Cursor plus a native CLI; the same tools in both.
  • Configurable, testable workflows with reproducible sessions and strict typing.
  • Requires macOS Screen Recording + Accessibility permissions (see docs/permissions.md).

Install

  • macOS app + CLI (Homebrew):
    brew install steipete/tap/peekaboo
  • MCP server (Node 22+, no global install needed):
    npx -y @steipete/peekaboo

Quick start

# Capture full screen at Retina scale and save to Desktop
peekaboo image --mode screen --retina --path ~/Desktop/screen.png

# Click a button by label (capture with see, then resolve and click in that session)
# $(...) keeps SID in the current shell; piping into `read` loses it in bash
SID=$(peekaboo see --app Safari --json-output | jq -r '.data.session_id')
peekaboo click --on "Reload this page" --session "$SID"

# Run a natural-language automation
peekaboo "Open Notes and create a TODO list with three items"

# Run as an MCP server (Claude/Cursor)
npx -y @steipete/peekaboo

# Minimal Claude Desktop config snippet (Developer → Edit Config):
# {
#   "mcpServers": {
#     "peekaboo": {
#       "command": "npx",
#       "args": ["-y", "@steipete/peekaboo"],
#       "env": {
#         "PEEKABOO_AI_PROVIDERS": "openai/gpt-5.1,anthropic/claude-opus-4"
#       }
#     }
#   }
# }
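
The see-then-click pattern from the quick start can be wrapped in a small shell function. This is a minimal sketch, not part of Peekaboo: the name click_label is hypothetical, and it assumes only the flags and the .data.session_id JSON path shown above, plus jq on the PATH.

```shell
# Hypothetical helper: capture an app's UI, then click an element by label.
# Usage: click_label <app-name> <label>
click_label() {
  local app="$1" label="$2" sid

  # Command substitution keeps the session ID in the current shell.
  sid=$(peekaboo see --app "$app" --json-output | jq -r '.data.session_id')

  # jq -r prints the literal string "null" when the field is missing.
  if [ -z "$sid" ] || [ "$sid" = "null" ]; then
    echo "click_label: no session ID returned for $app" >&2
    return 1
  fi

  peekaboo click --on "$label" --session "$sid"
}
```

For example, click_label Safari "Reload this page" would run the same two steps as the quick start, but stop with an error if the capture returned no session.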

Command      Key flags / subcommands                               What it does
see          --app, --mode screen/window, --retina, --json-output  Capture and annotate UI, return session + element IDs
click        --on <id/query>, --session, --wait, coords            Click by element ID, label, or coordinates
type         --text, --clear, --delay-ms                           Enter text with pacing options
press        key names, --repeat                                   Special keys and sequences
hotkey       combos like cmd,shift,t                               Modifier combos (cmd/ctrl/alt/shift)
scroll       --on <id>, --direction up/down, --ticks               Scroll views or elements
swipe        --from/--to, --duration, --steps                      Smooth gesture-style drags
drag         --from/--to, modifiers, Dock/Trash targets            Drag-and-drop between elements/coords
move         --to <id/coords>, --screen-index                      Position the cursor without clicking
window       list, move, resize, focus, set-bounds                 Move/resize/focus windows and Spaces
app          launch, quit, relaunch, switch, list                  Launch, quit, relaunch, switch apps
space        list, switch, move-window                             List or switch macOS Spaces
menu         list, list-all, click, click-extra                    List/click app menus and extras
menubar      list, click                                           Target status-bar items by name/index
dock         launch, right-click, hide, show, list                 Interact with Dock items
dialog       list, click, input, file, dismiss                     Drive system dialogs (open/save/etc.)
image        --mode screen/window/menu, --retina, --analyze        Screenshot screen/window/menu bar (+analyze)
list         apps, windows, screens, menubar, permissions          Enumerate apps, windows, screens, permissions
tools        --source native|mcp, --server <name>                  Inspect native + MCP tools
config       init, show, add, login, models                        Manage credentials/providers/settings
permissions  status, grant                                         Check/grant required macOS permissions
run          .peekaboo.json, --output, --no-fail-fast              Execute .peekaboo.json automation scripts
sleep        --duration (ms)                                       Millisecond delays between steps
clean        --all-sessions, --older-than, --session               Prune sessions and caches
agent        --model, --dry-run, --resume, --max-steps, audio      Natural-language multi-step automation
mcp          serve, list, add, enable/disable, test                Manage external MCP servers and serve Peekaboo
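
Because captures and clicks depend on the Screen Recording and Accessibility permissions, an automation script may want to check its setup before doing anything else. The sketch below assumes only the permissions status subcommand listed above. The function name peekaboo_preflight is hypothetical, and whether the exit status of permissions status reflects a missing permission is an assumption left untested here, so the function simply passes it through.

```shell
# Hypothetical preflight helper; not part of Peekaboo itself.
peekaboo_preflight() {
  # Fail early with a clear message if the CLI is not on PATH.
  if ! command -v peekaboo >/dev/null 2>&1; then
    echo "peekaboo not found; install it with: brew install steipete/tap/peekaboo" >&2
    return 1
  fi

  # Print the permission report; the exit status is passed through unchanged.
  peekaboo permissions status
}
```

Calling peekaboo_preflight at the top of a script gives a readable failure when the CLI is missing, instead of a "command not found" error halfway through a run.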

Models and providers

  • OpenAI: GPT-5.1 (default) and GPT-4.1/4o vision
  • Anthropic: Claude 4.x
  • xAI: Grok 4-fast reasoning + vision
  • Google: Gemini 2.5 (pro/flash)
  • Local: Ollama (llama3.3, llava, etc.)

Set providers via PEEKABOO_AI_PROVIDERS or peekaboo config add.
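
As a rough illustration of the environment-variable route, the snippet below reuses the comma-separated provider/model value from the Claude Desktop config above. The list_providers helper is hypothetical and only prints the entries for a quick visual check; it does not validate them against Peekaboo.

```shell
# Same provider/model list format as the Claude Desktop config snippet.
export PEEKABOO_AI_PROVIDERS="openai/gpt-5.1,anthropic/claude-opus-4"

# Hypothetical helper: print one provider/model entry per line.
list_providers() {
  printf '%s\n' "$PEEKABOO_AI_PROVIDERS" | tr ',' '\n'
}
```

Running list_providers after the export should show each entry on its own line, which makes a stray space or a missing slash easier to spot.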

Development basics

  • Requirements: macOS 15+, Xcode 16+/Swift 6.2. Node 22+ only if you run the pnpm docs/build helper scripts (core CLI/app/MCP are Swift-only).
  • Install deps: pnpm install, then pnpm run build:cli or pnpm run test:safe.
  • Lint/format: pnpm run lint && pnpm run format.

License

MIT
