Today we’re releasing GPT‑5.2‑Codex, the most advanced agentic coding model yet for complex, real-world software engineering. GPT‑5.2‑Codex is a version of GPT‑5.2 further optimized for agentic coding in Codex, including improvements on long-horizon work through context compaction, stronger performance on large code changes like refactors and migrations, improved performance in Windows environments, and significantly stronger cybersecurity capabilities.
As our models continue to advance along the intelligence frontier, we’ve observed that these improvements also translate to capability jumps in specialized domains such as cybersecurity. For example, just last week, a security researcher using GPT‑5.1‑Codex‑Max with Codex CLI found and responsibly disclosed a vulnerability in React that could lead to source code exposure.
GPT‑5.2‑Codex has stronger cybersecurity capabilities than any model we’ve released so far. These advances can help strengthen cybersecurity at scale, but they also raise new dual-use risks that require careful deployment. While GPT‑5.2‑Codex does not reach a ‘High’ level of cyber capability under our Preparedness Framework, we’re designing our deployment approach with future capability growth in mind.
We are releasing GPT‑5.2‑Codex today in all Codex surfaces for paid ChatGPT users, and working toward safely enabling access to GPT‑5.2‑Codex for API users in the coming weeks. In parallel, we’re piloting invite-only trusted access to upcoming capabilities and more permissive models for vetted professionals and organizations focused on defensive cybersecurity work. We believe that this approach to deployment will balance accessibility with safety.
GPT‑5.2‑Codex builds on GPT‑5.2’s strengths in professional knowledge work and GPT‑5.1‑Codex‑Max’s frontier agentic coding and terminal-using capabilities. GPT‑5.2‑Codex is now better at long-context understanding, reliable tool calling, improved factuality, and native compaction, making it a more dependable partner for long-running coding tasks, while remaining token-efficient in its reasoning.
GPT‑5.2‑Codex achieves state-of-the-art performance on SWE-Bench Pro and Terminal-Bench 2.0, benchmarks designed to test agentic performance on a wide variety of tasks in realistic terminal environments. It is also much more effective and reliable at agentic coding in native Windows environments, building on capabilities introduced in GPT‑5.1‑Codex‑Max.
With these improvements, Codex is more capable of working in large repositories over extended sessions with full context intact. It can more reliably complete complex tasks like large refactors, code migrations, and feature builds, continuing to iterate without losing track even when plans change or attempts fail.
In SWE-Bench Pro, a model is given a code repository and must generate a patch to solve a realistic software engineering task. Terminal-Bench 2.0 is a benchmark for testing AI agents in real terminal environments. Tasks include compiling code, training models, and setting up servers.
Stronger vision performance enables GPT‑5.2‑Codex to more accurately interpret screenshots, technical diagrams, charts, and UI surfaces shared during coding sessions.
Codex can take design mocks and quickly translate them into functional prototypes, and you can pair with Codex to take these prototypes to production.

Prototype generated by GPT‑5.2‑Codex
When charting performance on one of our core cybersecurity evaluations over time, we see a sharp jump in capability starting with GPT‑5‑Codex, another large jump with GPT‑5.1‑Codex‑Max, and now a third jump with GPT‑5.2‑Codex. We expect that upcoming AI models will continue on this trajectory. In preparation, we are planning and evaluating as though each new model could reach ‘High’ levels of cybersecurity capability, as measured by our Preparedness Framework. While GPT‑5.2‑Codex has not yet reached a ‘High’ level of cyber capability, we are preparing for future models that cross that threshold. Because of the increased cyber capabilities, we have added additional safeguards in the model and in the product, which are outlined in the system card.
The Professional Capture-the-Flag (CTF) eval measures how often the model can solve advanced, multi-step real-world challenges (requiring professional-level cybersecurity skills) in a Linux environment.
Modern society runs on software, and its reliability depends on strong cybersecurity: keeping critical systems in banking, healthcare, communications, and essential services online, protecting sensitive data, and ensuring people can trust the software they rely on every day. Vulnerabilities can exist long before anyone knows about them, and finding, validating, and fixing them often depends on a community of engineers and independent security researchers equipped with the right tools.
On December 11, 2025, the React team published three security vulnerabilities affecting apps built with React Server Components. What made this disclosure notable was not only the vulnerabilities themselves, but how they were uncovered.
Andrew MacPherson, a principal security engineer at Privy (a Stripe company), was using GPT‑5.1‑Codex‑Max with Codex CLI and other coding agents to reproduce and study a different critical React vulnerability disclosed the week prior, known as React2Shell (CVE-2025-55182). His goal was to evaluate how well the model could assist with real-world vulnerability research.
He initially attempted several zero-shot analyses, prompting the model to examine the patch and identify the vulnerability it addressed. When that did not yield results, he shifted to a higher-volume, iterative prompting approach. When those approaches did not succeed, he guided Codex through standard defensive security workflows: setting up a local test environment, reasoning through potential attack surfaces, and using fuzzing to probe the system with malformed inputs. While attempting to reproduce the original React2Shell issue, Codex surfaced unexpected behaviors that warranted deeper investigation. Over the course of a single week, this process led to the discovery of previously unknown vulnerabilities, which were responsibly disclosed to the React team.
This demonstrates how advanced AI systems can materially accelerate defensive security work in widely used, real-world software. At the same time, capabilities that help defenders move faster can also be misused by bad actors.
As agentic systems become more capable in cybersecurity-relevant tasks, we are making it a core priority to ensure these advances are deployed responsibly, pairing every gain in capability with stronger safeguards, tighter access controls, and ongoing collaboration with the security community.
Security teams can run into restrictions when attempting to emulate threat actors, analyze malware to support remediation, or stress test critical infrastructure. We are developing a trusted access pilot to remove that friction for qualifying users and organizations and to enable trusted defenders to use frontier AI cyber capabilities to accelerate cyberdefense.
Initially, the pilot program will be invite-only for vetted security professionals with a track record of responsible vulnerability disclosure and organizations with a clear professional cybersecurity use case. Qualifying participants will get access to our most capable models for defensive use cases to enable legitimate dual-use work.
If you’re a security professional or part of an organization doing ethical security work like vulnerability research or authorized red-teaming, we invite you to express interest in joining and share feedback on what you’d like to see from the program here.
GPT‑5.2‑Codex represents a step forward in how advanced AI can support real-world software engineering and specialized domains like cybersecurity, helping developers and defenders tackle complex, long-horizon work and strengthening the tools available for responsible security research.
By rolling GPT‑5.2‑Codex out gradually, pairing deployment with safeguards, and working closely with the security community, we’re aiming to maximize defensive impact while reducing the risk of misuse. What we learn from this release will directly inform how we expand access over time as the software and cyber frontiers continue to advance.


