Today we’re releasing GPT‑5.2‑Codex, the most advanced agentic coding model yet for complex, real-world software engineering. GPT‑5.2‑Codex is a version of GPT‑5.2 further optimized for agentic coding in Codex, including improvements on long-horizon work through context compaction, stronger performance on large code changes like refactors and migrations, improved performance in Windows environments, and significantly stronger cybersecurity capabilities.
As our models continue to advance along the intelligence frontier, we’ve observed that these improvements also translate to capability jumps in specialized domains such as cybersecurity. For example, just last week, a security researcher using GPT‑5.1‑Codex‑Max with Codex CLI found and responsibly disclosed a vulnerability in React that could lead to source code exposure.
GPT‑5.2‑Codex has stronger cybersecurity capabilities than any model we’ve released so far. These advances can help strengthen cybersecurity at scale, but they also raise new dual-use risks that require careful deployment. While GPT‑5.2‑Codex does not reach a ‘High’ level of cyber capability under our Preparedness Framework, we’re designing our deployment approach with future capability growth in mind.
We’re releasing GPT‑5.2‑Codex today in all Codex surfaces for paid ChatGPT users, and working toward safely enabling access to GPT‑5.2‑Codex for API users in the coming weeks. In parallel, we’re piloting invite-only trusted access to upcoming capabilities and more permissive models for vetted professionals and organizations focused on defensive cybersecurity work. We believe that this approach to deployment will balance accessibility with safety.
GPT‑5.2‑Codex builds on GPT‑5.2’s strengths in professional knowledge work and GPT‑5.1‑Codex‑Max’s frontier agentic coding and terminal-using capabilities. GPT‑5.2‑Codex is now better at long-context understanding, reliable tool calling, improved factuality, and native compaction, making it a more dependable partner for long-running coding tasks, while remaining token-efficient in its reasoning.
GPT‑5.2‑Codex achieves state-of-the-art performance on SWE-Bench Pro and Terminal-Bench 2.0, benchmarks designed to test agentic performance on a wide variety of tasks in realistic terminal environments. It is also much more effective and reliable at agentic coding in native Windows environments, building on capabilities introduced in GPT‑5.1‑Codex‑Max.
With these improvements, Codex is more capable at working in large repositories over extended sessions with full context intact. It can more reliably complete complex tasks like large refactors, code migrations, and feature builds, continuing to iterate without losing track even when plans change or attempts fail.
In SWE-Bench Pro, a model is given a code repository and must generate a patch to solve a realistic software engineering task. Terminal-Bench 2.0 is a benchmark for testing AI agents in real terminal environments. Tasks include compiling code, training models, and setting up servers.
Stronger vision performance enables GPT‑5.2‑Codex to more accurately interpret screenshots, technical diagrams, charts, and UI surfaces shared during coding sessions.
Codex can take design mocks and quickly translate them into functional prototypes, and you can pair with Codex to take these prototypes to production.

Prototype generated by GPT‑5.2‑Codex
When charting performance on one of our core cybersecurity evaluations over time, we see a sharp jump in capability starting with GPT‑5‑Codex, another large jump with GPT‑5.1‑Codex‑Max, and now a third jump with GPT‑5.2‑Codex. We expect that upcoming AI models will continue on this trajectory. In preparation, we are planning and evaluating as though each new model could reach ‘High’ levels of cybersecurity capability, as measured by our Preparedness Framework. While GPT‑5.2‑Codex has not yet reached a ‘High’ level of cyber capability, we are preparing for future models that cross that threshold. Because of the increased cyber capabilities, we have added additional safeguards in the model and in the product, which are outlined in the system card.
The Professional Capture-the-Flag (CTF) eval measures how often the model can solve advanced, multi-step real-world challenges (requiring professional-level cybersecurity skills) in a Linux environment.
Modern society runs on software, and its reliability depends on strong cybersecurity: keeping critical systems in banking, healthcare, communications, and essential services online, protecting sensitive data, and ensuring people can trust the software they rely on every day. Vulnerabilities can exist long before anyone knows about them, and finding, validating, and fixing them often depends on a community of engineers and independent security researchers equipped with the right tools.
On December 11, 2025, the React team published three security vulnerabilities affecting apps built with React Server Components. What made this disclosure notable was not only the vulnerabilities themselves, but how they were uncovered.
Andrew MacPherson, a principal security engineer at Privy (a Stripe company), was using GPT‑5.1‑Codex‑Max with Codex CLI and other coding agents to reproduce and study a different critical React vulnerability disclosed the week prior, known as React2Shell (CVE-2025-55182). His goal was to evaluate how well the model could assist with real-world vulnerability research.
He initially attempted several zero-shot analyses, prompting the model to examine the patch and identify the vulnerability it addressed. When that did not yield results, he shifted to a higher-volume, iterative prompting approach. When those approaches did not succeed, he guided Codex through standard defensive security workflows: setting up a local test environment, reasoning through potential attack surfaces, and using fuzzing to probe the system with malformed inputs. While attempting to reproduce the original React2Shell issue, Codex surfaced unexpected behaviors that warranted deeper investigation. Over the course of a single week, this process led to the discovery of previously unknown vulnerabilities, which were responsibly disclosed to the React team.
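To make the fuzzing step concrete, here is a minimal sketch of mutation-based fuzzing in Python. It is illustrative only and is not the workflow or tooling the researcher used: the target `parse_record`, its deliberately planted flaw, the `ParseError` type, and the helpers `mutate` and `fuzz` are all hypothetical names invented for this example. The idea is simply to feed many malformed variants of a valid input to a local target and record any failure that is not a clean, expected rejection.

```python
import random


class ParseError(ValueError):
    """Expected, well-handled rejection of malformed input."""


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: one length byte followed by a payload.

    It contains a deliberate flaw: it never checks that the payload is
    actually as long as the length byte claims.
    """
    if not data:
        raise ParseError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    # Flaw: a truncated payload raises IndexError here instead of ParseError.
    checksum = payload[length - 1] if length else 0
    return payload + bytes([checksum])


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: overwrite a byte, truncate, or append junk."""
    data = bytearray(seed)
    choice = rng.randrange(3)
    if choice == 0 and data:
        data[rng.randrange(len(data))] = rng.randrange(256)
    elif choice == 1 and data:
        del data[rng.randrange(len(data)):]
    else:
        data.extend(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 500, rng_seed: int = 0):
    """Run the target on mutated inputs and collect unexpected exceptions."""
    rng = random.Random(rng_seed)  # fixed seed so runs are reproducible
    findings = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ParseError:
            continue  # clean rejection: the behavior we want
        except Exception as exc:  # anything else deserves a closer look
            findings.append((candidate, type(exc).__name__))
    return findings


if __name__ == "__main__":
    results = fuzz(parse_record, b"\x03abc")
    print(f"{len(results)} unexpected failures")
    for candidate, name in results[:5]:
        print(name, candidate)
```

Each finding pairs the offending input with the exception type, so a surprising behavior can be replayed and investigated by hand, which mirrors the "unexpected behaviors that warranted deeper investigation" described above. Real-world fuzzing typically relies on coverage-guided tools rather than a loop like this one.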
This demonstrates how advanced AI systems can materially accelerate defensive security work in widely used, real-world software. At the same time, capabilities that help defenders move faster can also be misused by bad actors.
As agentic systems become more capable in cybersecurity-relevant tasks, we are making it a core priority to ensure these advances are deployed responsibly, pairing every gain in capability with stronger safeguards, tighter access controls, and ongoing collaboration with the security community.
Security teams can run into restrictions when attempting to emulate threat actors, analyze malware to support remediation, or stress test critical infrastructure. We are developing a trusted access pilot to remove that friction for qualifying users and organizations and enable trusted defenders to use frontier AI cyber capabilities to accelerate cyberdefense.
Initially the pilot program will be invite-only for vetted security professionals with a track record of responsible vulnerability disclosure and organizations with a clear professional cybersecurity use case. Qualifying participants will get access to our most capable models for defensive use cases to enable legitimate dual-use work.
If you’re a security professional or part of an organization doing ethical security work like vulnerability research or authorized red-teaming, we invite you to express interest in joining and to share feedback on what you’d like to see from the program.
GPT‑5.2‑Codex represents a step forward in how advanced AI can support real-world software engineering and specialized domains like cybersecurity, helping developers and defenders tackle complex, long-horizon work and strengthening the tools available for responsible security research.
By rolling GPT‑5.2‑Codex out gradually, pairing deployment with safeguards, and working closely with the security community, we’re aiming to maximize defensive impact while reducing the risk of misuse. What we learn from this release will directly inform how we expand access over time as the software and cyber frontiers continue to advance.


