
Desktop Agent

A macOS menu bar app that lets you hold a hotkey to speak to your AI agents with screen context.

Hunter Hodnett, CPTO at Chipp
# desktop # macos # voice # screen-context # enterprise

The Chipp Desktop Agent is a native macOS menu bar app. Hold a hotkey to speak to your AI agents, which can see your screen and act on your computer. Typical uses include smart dictation, form filling, and cross-app workflows.

⚠️ Desktop Agent is an Enterprise-only feature. Contact sales for access.

What It Can Do

| Capability | Description |
| --- | --- |
| Hold-to-talk voice | Hold Fn (or custom hotkey) to speak, release to send |
| Screen context | Agent sees your active app, window title, focused element, and selected text |
| Type text | Smart dictation into any app with agent intelligence |
| Press keys | Execute keyboard shortcuts (Cmd+Enter, Tab, etc.) |
| Paste text | Insert larger blocks while preserving formatting |
| Open apps/URLs | Launch applications or navigate to websites |
| Take screenshots | Request visual context when needed |
| Multi-agent switching | Switch between agents from the menu bar |
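
To make the "Press keys" capability more concrete, here is a minimal Python sketch of how a shortcut string such as Cmd+Enter could be split into modifiers and a final key before being executed. The names `Shortcut`, `parse_shortcut`, and the modifier set are hypothetical and chosen for illustration. They are not part of the Desktop Agent's documented interface.

```python
from dataclasses import dataclass

# Hypothetical modifier vocabulary; the Desktop Agent's real one is not documented here.
MODIFIERS = {"cmd", "ctrl", "alt", "shift", "fn"}


@dataclass(frozen=True)
class Shortcut:
    modifiers: frozenset
    key: str


def parse_shortcut(text: str) -> Shortcut:
    """Split a string such as 'Cmd+Enter' into modifiers and a final key."""
    parts = [p.strip().lower() for p in text.split("+") if p.strip()]
    if not parts:
        raise ValueError("empty shortcut")
    *mods, key = parts  # everything before the last part is a modifier
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    return Shortcut(frozenset(mods), key)
```

Under these assumptions, `parse_shortcut("Cmd+Enter")` yields one modifier (`cmd`) and the key `enter`, while a bare `Tab` yields no modifiers.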

System Requirements

  • macOS 14.0 (Sonoma) or later
  • Intel or Apple Silicon (native support)
  • Permissions: Accessibility, Microphone, Input Monitoring
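
If you want to check a machine against these requirements before installing, a small script along the following lines may help. It is a sketch that uses only Python's standard `platform` module. The helper name `meets_requirements` is made up for this example, and it checks only the OS version and CPU architecture, not permissions.

```python
import platform


def meets_requirements(mac_version: str, machine: str) -> bool:
    """Return True for macOS 14.0+ on Intel (x86_64) or Apple Silicon (arm64)."""
    if not mac_version:
        return False  # platform.mac_ver() reports '' on non-macOS systems
    try:
        major = int(mac_version.split(".")[0])
    except ValueError:
        return False
    return major >= 14 and machine in {"x86_64", "arm64"}


if __name__ == "__main__":
    version, _, _ = platform.mac_ver()
    print(meets_requirements(version, platform.machine()))
```

Note that a Python interpreter running under Rosetta reports `x86_64` even on Apple Silicon, which still passes this check.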

Getting Started

1. Install the App

Download the Desktop Agent from your Chipp dashboard. Drag it into your Applications folder.

2. Sign In

Launch the app and sign in with your Chipp builder account. Your session is stored securely in the macOS Keychain.

3. Grant Permissions

The app guides you through three macOS permissions:

  • Accessibility — Lets the agent read UI elements and simulate keystrokes in other apps
  • Microphone — For voice input while holding the hotkey
  • Input Monitoring — For global hotkey capture outside the app

Each permission prompt opens System Settings at the relevant pane.
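
If you need to reach one of these panes yourself, the sketch below builds an `open` command for each of the three permissions. The deep-link anchors are ones commonly used by macOS developers. They are not formally documented by Apple, may change between macOS releases, and are an assumption here rather than a description of what the Desktop Agent does internally. `settings_command` is a hypothetical helper.

```python
# Commonly used, but not formally documented, System Settings deep links (assumption).
BASE = "x-apple.systempreferences:com.apple.preference.security"
PRIVACY_PANES = {
    "Accessibility": "Privacy_Accessibility",
    "Microphone": "Privacy_Microphone",
    "Input Monitoring": "Privacy_ListenEvent",
}


def settings_command(permission: str) -> list:
    """Build an `open` command that targets the pane for one permission."""
    return ["open", f"{BASE}?{PRIVACY_PANES[permission]}"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe anywhere.
    for name in PRIVACY_PANES:
        print(" ".join(settings_command(name)))
```

On a Mac, passing one of these lists to `subprocess.run` should bring up the matching pane, provided the anchors are still valid on your macOS version.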

4. Configure Your Hotkey

The default hotkey is Fn: hold to start speaking, release to stop. You can change it to any key combination from the menu bar dropdown.
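
The hold-to-talk behaviour can be pictured as a two-state machine: key down starts a recording, key up ends it and sends the audio. The following Python sketch illustrates that idea only. The class, its method names, and the `on_send` callback are hypothetical, and real hotkey capture and audio input are left out.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"


class HoldToTalk:
    """Illustrative state machine: hold to record, release to send."""

    def __init__(self, on_send):
        self.state = State.IDLE
        self.on_send = on_send  # called with the recorded bytes on release
        self._chunks = []

    def key_down(self):
        if self.state is State.IDLE:  # ignore key-repeat events while held
            self._chunks = []
            self.state = State.RECORDING

    def audio(self, chunk: bytes):
        if self.state is State.RECORDING:  # audio outside a hold is dropped
            self._chunks.append(chunk)

    def key_up(self):
        if self.state is State.RECORDING:
            self.state = State.IDLE
            self.on_send(b"".join(self._chunks))
```

Guarding each transition on the current state keeps repeated key-down events or a stray key-up from sending empty or duplicate recordings.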

Use Cases

Smart Dictation

Hold Fn in any app and speak naturally. The agent understands context — in Gmail, it drafts emails; in Slack, it composes messages; in VS Code, it writes code.

Form Filling

On a web form, say “Fill this with Acme Corp’s details.” The agent reads the form fields, recalls information from memory, and fills each field.

Code Review

In VS Code, select code and say “Refactor this to use async/await.” The agent sees the selected text, rewrites it, and replaces the selection.

Cross-App Workflows

“Take what’s in this spreadsheet and draft a summary email.” The agent captures visible data, drafts the email, opens Mail, and pastes it.

Quick Research

“What was the conversion rate Sarah mentioned last week?” The agent speaks the answer aloud using memory — no typing needed.

The Command Bar

Press Cmd+K to open a Spotlight-style floating overlay where you can:

  • Type commands to the active agent
  • View recent actions and history
  • Switch agents without opening the menu bar

Screen Context Details

When you hold the hotkey, the agent can see:

| Context | What It Reads |
| --- | --- |
| Active app | Which application is focused (Gmail, VS Code, Slack, etc.) |
| Window title | The current window name (e.g., “Compose - Gmail”) |
| Focused element | Text fields, buttons, and UI elements via Accessibility APIs |
| Selected text | Highlighted content in the active app |
| Screenshots | Full window or screen region on request |

Context is captured on demand, only while the hotkey is held. There is no background recording.
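
As a rough mental model, the context fields above could be represented as a single record that is only read while the hotkey is held. The Python sketch below shows that shape. `ScreenContext`, `ContextCapturer`, and the injected `read_screen` function are hypothetical names for illustration and do not reflect the app's actual data format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScreenContext:
    active_app: str
    window_title: str
    focused_element: Optional[str] = None
    selected_text: Optional[str] = None
    screenshot_png: Optional[bytes] = None  # filled in only when requested


class ContextCapturer:
    """Reads the screen only while the hotkey is held (no background capture)."""

    def __init__(self, read_screen: Callable[[], ScreenContext]):
        self._read_screen = read_screen  # stand-in for the real platform reader
        self.hotkey_held = False

    def capture(self) -> Optional[ScreenContext]:
        if not self.hotkey_held:
            return None  # nothing is read outside a hold
        return self._read_screen()
```

Because the reader is only invoked inside `capture`, and `capture` returns early when the hotkey is not held, the sketch never touches the screen in the background.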

Troubleshooting

Hotkey not working?

  • Check Input Monitoring permission in System Settings → Privacy & Security → Input Monitoring
  • If using a custom hotkey, verify it doesn’t conflict with system or app shortcuts

Agent can’t see the screen?

  • Verify Accessibility permission is granted in System Settings → Privacy & Security → Accessibility
  • Some Electron apps (Slack, VS Code) expose limited accessibility data — the agent may request a screenshot instead

Text not typing into apps?

  • The app uses multiple methods (character-by-character, clipboard paste, accessibility APIs) with automatic fallback
  • If direct typing fails in a specific app, the agent falls back to clipboard paste
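
The automatic fallback described above can be sketched as an ordered list of strategies that are tried until one succeeds. This Python example is illustrative only. The function name, the method names, and the order of attempts are assumptions, and each strategy is a stand-in callable rather than a real typing backend.

```python
from typing import Callable, Iterable, Tuple

# A (name, function) pair; the function returns True when the text was entered.
TypingMethod = Tuple[str, Callable[[str], bool]]


def type_with_fallback(text: str, methods: Iterable[TypingMethod]) -> str:
    """Try each method in order and return the name of the first that succeeds."""
    for name, method in methods:
        try:
            if method(text):
                return name
        except Exception:
            continue  # treat an error like a failed attempt and move on
    raise RuntimeError("all typing methods failed")
```

For example, with a keystroke method that fails and a clipboard-paste method that succeeds, the function returns the paste method's name, which mirrors the fallback behaviour in the second bullet.
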
ℹ️ Desktop Agent includes all voice capabilities. For voice configuration details, see the Voice Agents guide.