A U.K. government agency has found that OpenAI’s latest artificial intelligence model, GPT-5.5, can autonomously conduct sophisticated cyberattacks. In one striking demonstration, the model reportedly solved a complex reverse-engineering challenge in just over 10 minutes, a task that typically takes a human security expert around 12 hours. The findings, published by the AI Security Institute (AISI), a research body within the U.K. Department for Science, Innovation and Technology, place GPT-5.5 among the most capable models yet evaluated for offensive cyber operations, rivaling advanced systems such as Anthropic’s Claude Mythos.
Groundbreaking AI Capabilities in Cybersecurity
The AISI’s evaluation, detailed in a report released on Thursday, highlighted GPT-5.5’s performance on demanding cybersecurity tests. Notably, the model autonomously completed "The Last Ones," a 32-step simulated corporate network attack, in two out of ten attempts. The scenario, developed in collaboration with cybersecurity firm SpecterOps, is designed to mimic a real-world, multi-stage breach. It involves a complex chain of actions: initial reconnaissance, credential harvesting, lateral movement across multiple Active Directory forests, exploitation of a supply-chain vulnerability via a continuous integration/continuous deployment (CI/CD) pipeline, and, ultimately, exfiltration of sensitive data from a protected internal database. AISI estimates that a human expert with professional tools would typically need about 20 hours to complete the simulation.
The report further revealed that GPT-5.5 is only the second AI model to have successfully completed this stringent simulation. The first to achieve this milestone was Anthropic’s Claude Mythos Preview, which managed the feat in three out of ten attempts. This suggests a highly competitive landscape in the development of AI systems with advanced offensive cyber capabilities.
The Reverse-Engineering Enigma: AI Outpaces Human Expertise
Perhaps the most striking finding from the AISI’s assessment concerned a particularly challenging reverse-engineering puzzle. The task required the AI agent to deconstruct a custom virtual machine’s instruction set, write a disassembler from scratch, and then use constraint-solving techniques to recover a cryptographic password. GPT-5.5 reportedly completed the challenge in 10 minutes and 22 seconds, at a cost of just $1.73 in API usage. By contrast, a human security expert using professional-grade tools needed approximately 12 hours to achieve the same result. The difference in time and cost underscores AI’s potential to drastically accelerate complex technical work.
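For readers unfamiliar with that final step, the sketch below shows, at toy scale, how constraint solving can recover a password once a program’s checks have been translated into equations. It uses the Z3 SMT solver on an invented four-character check; the constraints, and the password they yield, are hypothetical stand-ins rather than anything from the AISI challenge.

```python
# A minimal sketch of constraint-based password recovery with the Z3 SMT solver
# (pip install z3-solver). The checks below are invented for illustration; a real
# reverse-engineering task would derive them from the target binary's logic.
from z3 import BitVec, Solver, sat

# Model a hypothetical 4-character password as 8-bit bitvectors.
chars = [BitVec(f"c{i}", 8) for i in range(4)]

s = Solver()
for c in chars:
    s.add(c >= 0x21, c <= 0x7E)  # printable ASCII only

# Stand-ins for the conditions a disassembled password check might impose.
s.add(chars[0] == ord("k"))
s.add(chars[0] + chars[1] == 0xD5)
s.add(chars[1] ^ chars[2] == 0x1F)
s.add(chars[2] - chars[3] == 0x10)

# Ask the solver for any assignment that satisfies every constraint at once.
if s.check() == sat:
    m = s.model()
    print("Recovered password:", "".join(chr(m[c].as_long()) for c in chars))
else:
    print("No password satisfies the constraints.")
```

The point of the technique is that the solver searches the entire space of candidate passwords simultaneously, rather than testing guesses one at a time, which is why it scales to checks far more convoluted than this toy example.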
Across a broad spectrum of advanced cybersecurity challenges set by AISI, GPT-5.5 performed at a consistently high level, achieving an average pass rate of 71.4% on the most difficult "Expert" tier tasks. That edged out Claude Mythos Preview, which scored 68.6%, and far surpassed GPT-5.4, OpenAI’s previous model, which managed only 52.4%. These comparisons give a clear indication of how rapidly AI’s offensive cybersecurity potential is advancing.
Broader Implications for AI Development and National Security
The implications extend well beyond OpenAI’s model. AISI concludes that the rapid improvement in AI’s cyber capabilities may not be an isolated result but an indicator of a broader, accelerating trend in AI development. If offensive cyber skills are emerging as a natural byproduct of general advances in reasoning, coding, and autonomous task completion, the report argues, further rapid gains in this domain could be imminent. That prospect raises significant questions about the future of cybersecurity and the potential for AI to be weaponized.
The report also raised concerns about the safety guardrails built into GPT-5.5. Researchers identified a universal "jailbreak" that elicited harmful content across all tested malicious cyber queries, even in complex, multi-turn agentic interactions. Developing the exploit reportedly took six hours of dedicated expert red-teaming. While OpenAI has since updated its safeguard stack in response, the AISI team was unable to fully verify the final iteration’s effectiveness because of a configuration issue during the testing period. The episode highlights the ongoing challenge of keeping AI models secure and aligned as their capabilities expand.
AISI emphasized that these capability evaluations were conducted in a controlled research environment and do not necessarily reflect what an ordinary user could access; publicly deployed models typically carry additional safeguards and access controls designed to mitigate such risks.
National Cybersecurity Landscape and Government Response
The AISI report landed against a troubling backdrop for U.K. cybersecurity. The government’s annual Cyber Security Breaches Survey, also published on Thursday, found that 43% of businesses had experienced a cyber breach or attack in the past 12 months, underscoring the pervasive and escalating threats facing organizations across the country.
In response to these threats and to advancing AI capabilities, the U.K. government announced a new £90 million funding package to bolster national cyber resilience. It is also moving forward with the Cyber Security and Resilience Bill, legislation designed to better protect essential services from cyber threats. In parallel, officials released updated guidance urging organizations to prepare for a potential surge in newly discovered software vulnerabilities, acknowledging that AI’s accelerating ability to find and weaponize security flaws demands a more agile approach to vulnerability management. The government’s stance reflects a recognition of the dual-use nature of advanced AI, which can both defend and attack critical digital infrastructure. The interplay between AI’s offensive and defensive capabilities is expected to become an increasingly central theme in national security strategies.
A Shifting Paradigm in Cyber Warfare and Defense
The AI Security Institute’s findings mark a significant moment in the evolution of artificial intelligence and its intersection with cybersecurity. The ability of models like GPT-5.5 to autonomously execute complex cyberattacks at speeds far beyond human capability offers real opportunities for defense while posing profound challenges for security. The report serves as an early warning, underlining the need for continuous research, robust safety mechanisms, and adaptive strategies to counter an evolving threat landscape. The race to stay ahead in the AI-driven cyber domain has clearly intensified, demanding close collaboration between government, industry, and research institutions to safeguard digital infrastructure and national security. With AI advancing rapidly in reasoning and autonomous action, the cybersecurity industry must prepare for a future in which threats emerge and evolve at unprecedented speed.
