Computer Control

MeghaOS can take the wheel. The computer_control plugin runs an agentic loop: it takes a screenshot, decides the next action, performs it, then looks again — repeating until the task is done. This is how it operates apps, runs terminal commands, and drives AI coding tools like Cursor or Claude Code.
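To make the observe, decide, act cycle concrete, here is a minimal sketch of such a loop. It is not MeghaOS's actual implementation: the `Action` type, the three callables, and the `max_steps` safety cap are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step chosen by the agent (hypothetical shape, not the plugin's real type)."""
    tool: str                      # a tool name such as "CLICK_AT" or "TYPE_TEXT"
    args: dict = field(default_factory=dict)


def run_control_loop(
    task: str,
    take_screenshot: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Optional[Action]],
    perform: Callable[[Action], None],
    max_steps: int = 25,           # assumed safety cap; the source does not mention one
) -> List[Action]:
    """Look at the screen, pick an action, perform it, and repeat until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()               # observe
        action = decide(task, screen, history)   # decide the next action
        if action is None:                       # None means "the task is done"
            break
        perform(action)                          # act, then loop to look again
        history.append(action)
    return history
```

Passing the screenshot, decision, and execution steps in as callables keeps the loop itself easy to test with stubs before anything touches a real mouse or keyboard.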

Warning

Computer control physically clicks, types, and runs commands on your machine. It's powerful and genuinely useful, but treat each request like handing over your keyboard. Start with small tasks to build trust.

How it works

The loop is exposed at POST /api/computer/execute and can be halted with POST /api/computer/stop.
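As a rough illustration, the two endpoints could be called from a script like this. Only the paths come from this page. The base URL, the JSON request body, and its `task` field are assumptions, so check them against your installation before relying on them.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical address: the host and port are not documented on this page.
BASE_URL = "http://localhost:8000"


def build_post(path: str, payload: Optional[dict] = None,
               base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a prepared request and return the raw response text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Assumed body shape: a single "task" field holding the instruction.
execute_request = build_post(
    "/api/computer/execute",
    {"task": "Open the terminal and run the test suite."},
)
stop_request = build_post("/api/computer/stop")
```

Calling `send(execute_request)` would start a run and `send(stop_request)` would halt it, assuming the server accepts requests in this shape.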

What it can do

Capability	Tool
Open an application	`OPEN_APP`
Click / double-click at a point	`CLICK_AT` · `DOUBLE_CLICK_AT`
Type text	`TYPE_TEXT`
Press keys / shortcuts	`PRESS_KEY`
Scroll	`SCROLL_AT`
See the screen	`SCREENSHOT` · `GET_SCREEN_SIZE`
Run a terminal command	`RUN_TERMINAL_COMMAND`
Drive an IDE's AI panel	`START_IDE_AGENT`
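
If you log or script against these tools, it can help to treat the names as a closed set. The sketch below does that with an enum. The tool names are taken from the table above, while the `Tool` class and `parse_tool` helper are hypothetical and not part of MeghaOS.

```python
from enum import Enum


class Tool(str, Enum):
    """Tool names from the capability table (the enum itself is illustrative)."""
    OPEN_APP = "OPEN_APP"
    CLICK_AT = "CLICK_AT"
    DOUBLE_CLICK_AT = "DOUBLE_CLICK_AT"
    TYPE_TEXT = "TYPE_TEXT"
    PRESS_KEY = "PRESS_KEY"
    SCROLL_AT = "SCROLL_AT"
    SCREENSHOT = "SCREENSHOT"
    GET_SCREEN_SIZE = "GET_SCREEN_SIZE"
    RUN_TERMINAL_COMMAND = "RUN_TERMINAL_COMMAND"
    START_IDE_AGENT = "START_IDE_AGENT"


def parse_tool(name: str) -> Tool:
    """Normalize a tool name and reject anything outside the table."""
    try:
        return Tool(name.strip().upper())
    except ValueError:
        raise ValueError(f"unknown computer-control tool: {name!r}") from None
```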

Triggering it

The /chat handler detects computer-control intent before treating a request as a UI-composition query, so build-style phrasings route to the control loop. Recognized cues include:

Mentioning an IDE: Cursor, VS Code, Antigravity, Windsurf, Zed
Saying "claude code" / "use claude to…"
An action verb + IDE/build keyword: "open…", "build…", "scaffold…"
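
The cues above could be checked with something like the following sketch. It is not the actual /chat handler: the function name is hypothetical, and the `BUILD_KEYWORDS` list is an assumption, since this page does not say which keywords are paired with the action verbs.

```python
import re

# Cues listed on this page.
IDE_NAMES = ["cursor", "vs code", "antigravity", "windsurf", "zed"]
CLAUDE_CUES = ["claude code", "use claude to"]
ACTION_VERBS = ["open", "build", "scaffold"]

# Assumption: illustrative stand-ins for the undocumented "IDE/build keywords".
BUILD_KEYWORDS = ["terminal", "project", "app", "ide"]


def _has_any(text: str, phrases: list) -> bool:
    """True if any phrase appears in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def looks_like_computer_control(message: str) -> bool:
    """Rough intent check, meant to run before UI-composition routing."""
    text = message.lower()
    if _has_any(text, IDE_NAMES) or _has_any(text, CLAUDE_CUES):
        return True
    return _has_any(text, ACTION_VERBS) and _has_any(text, BUILD_KEYWORDS)
```

Whole-word matching keeps "zed" from firing inside words like "organized". A plain match on "cursor" would still fire on a sentence about the text cursor, so a real handler needs more context than this sketch uses.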

Examples:

"Open Cursor in my ~/projects/site folder and use Claude Code to add a contact form."

"Open the terminal and run the test suite."

"Use VS Code to create a new Python project with a virtualenv."

Driving AI coding tools

START_IDE_AGENT opens an IDE, navigates to your project folder, and activates its built-in AI panel (or a Claude Code terminal), then types your instructions. From there the agent and the IDE's own assistant collaborate on the task. This is the path behind requests like "use Cursor to build X."

Screen awareness (without taking control)

If you only want the agent to understand what's on screen — not operate it — use the screen-reader path instead:

"what's on my screen right now?" → /api/screen/analyze (READ_SCREEN)
"insert this text where my cursor is" → /api/screen/insert (INSERT_TEXT)
Read the focused selection → GET_FOCUSED_TEXT

Stopping a run

If a control session goes somewhere you didn't intend, stop it immediately with POST /api/computer/stop (or the stop control in the UI). The loop checks for cancellation between actions, so an action already in progress finishes before the run halts.
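
A between-actions cancellation check can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `run_actions` and the `threading.Event` flag are hypothetical stand-ins for however the stop request is actually signalled to the loop.

```python
import threading
from typing import Callable, Iterable


def run_actions(actions: Iterable[Callable[[], None]],
                stop_requested: threading.Event) -> int:
    """Run actions in order, checking the stop flag before each one.

    Returns how many actions completed. An action that has already
    started always runs to the end; only the following ones are skipped.
    """
    completed = 0
    for act in actions:
        if stop_requested.is_set():   # cancellation is only checked here
            break
        act()
        completed += 1
    return completed
```

In this model a stop request never interrupts a click or keystroke halfway through. It takes effect at the next check, before the following action starts.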