OpenAI has launched a new macOS application for its Codex coding tool, incorporating agentic practices that allow AI agents to handle coding tasks independently. This release follows the trend of agentic software development, seen in tools like Claude Code and Cowork, where swarms of agents and subagents perform much of the programming grunt work.

The company first introduced Codex as a command-line interface in April, followed by a web interface one month later. The macOS app, unveiled on Monday, supports running multiple agents in parallel. It integrates agent skills and advanced workflows that have gained popularity over the past year.

The launch comes less than two months after OpenAI released GPT-5.2-Codex, described as its most powerful coding model. OpenAI hopes that pairing the model with the new app will draw users away from competitors like Claude Code.

CEO Sam Altman addressed the model’s capabilities during a press call. “If you really want to do sophisticated work on something complex, 5.2 is the strongest model by far,” he said. “However, it’s been harder to use, so taking that level of model capability and putting it in a more flexible interface, we think is going to matter quite a bit.”

Coding benchmarks present a mixed picture. GPT-5.2 holds the top position on TerminalBench, which evaluates AI performance on command-line programming tasks. However, Gemini 3 and Claude Opus score lower only by an amount that falls within the benchmark’s margin of error. On SWE-bench, which tests AI ability to fix real-world software bugs, results show no clear advantage for GPT-5.2.

Agentic use cases remain challenging to benchmark accurately. User experiences with state-of-the-art models can vary significantly.

The Codex macOS app introduces several new features. It enables background automations that run on a schedule, with results queued for review when the user returns. Users can also select agent personalities, ranging from pragmatic to empathetic, to suit their working style.

Altman emphasized how quickly software can be built with the app. “You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours,” he stated. “As fast as I can type in new ideas, that is the limit of what can get built.”

