The Next Generation of SAST: Checkmarx Unveils LLM-Enhanced Engine Amidst Industry Shift

Static Application Security Testing (SAST) is changing quickly as major vendors integrate Large Language Models (LLMs) into their established scanning engines. The shift has produced a wave of tools marketed as "next-generation," raising the question of whether they represent genuine innovation or merely repackage long-standing problems under an AI label. Checkmarx has now entered the field with a new SAST engine designed to address the growing complexity of modern software development, particularly the complexity amplified by AI-powered coding assistants.

The core of Checkmarx’s latest offering is a three-engine architecture. It combines a deterministic, rules-based scanner, the cornerstone of traditional SAST that enterprises have relied upon for two decades, with an LLM trained on extensive security data. A third component, the Findings Analysis Engine (FAE), classifies each finding as a true or false positive before it is presented to development teams. This layered approach targets the escalating "noise" problem in application security testing, which has become particularly acute with the widespread adoption of AI coding tools.

Checkmarx has published figures to support its claims. The company reports an F1 score of 0.499 for its new engine, compared with a category average of 0.20. In head-to-head testing across four production codebases, the engine reportedly identified 327 true positives that were missed by a leading "frontier model," a term that usually denotes a state-of-the-art general-purpose LLM. Checkmarx declined to name the competitor, but the results point to a potential gap in detection capabilities between purely LLM-driven approaches and a hybrid model. The F1 score, a widely used metric in machine learning, is the harmonic mean of precision and recall: it rewards a tool for finding real vulnerabilities only to the extent that it also keeps false alarms down.
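To make the metric concrete, the short Python sketch below computes F1 from raw finding counts. The counts are hypothetical and chosen only to illustrate the formula; they are not Checkmarx's data, and the company has not published the precision and recall behind its 0.499 figure.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct findings at all: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scanner A: half of its findings are real, and it finds half
# of the real vulnerabilities (precision 0.5, recall 0.5).
balanced = f1_score(true_positives=50, false_positives=50, false_negatives=50)

# Hypothetical scanner B: finds 90% of real vulnerabilities but buries them
# in noise (precision 0.1, recall 0.9).
noisy = f1_score(true_positives=90, false_positives=810, false_negatives=10)

print(f"balanced scanner F1: {balanced:.3f}")
print(f"noisy scanner F1:    {noisy:.3f}")
```

The second case shows why the metric matters for noisy scanners: a tool can catch most real vulnerabilities and still score poorly if most of what it reports is a false positive.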

Jonathan Rende, Chief Product Officer at Checkmarx, elaborated on the integrated nature of their solution: "Three engines run together to deliver unified protection: our deterministic rules foundation enterprises have relied on for two decades, AI-powered coverage for every language developers and AI coding assistants write today, and the Findings Analysis Engine (FAE) that classifies true and false positives before a single result reaches your team." This statement highlights a strategic emphasis on seamless integration and automated workflow management, a key differentiator in a market where complexity can be a significant barrier to adoption.

Orchestration: The Core Innovation

While the inclusion of an LLM might appear to be the headline innovation, the true product being advanced by Checkmarx, and potentially by other vendors following suit, is the orchestration layer. This refers to the sophisticated management and integration of disparate scanning technologies into a cohesive, automated process. The challenge for many organizations has been the difficulty in assembling and managing multi-engine workflows manually, a task that requires specialized expertise and considerable effort. Checkmarx’s approach seeks to abstract this complexity away from the end-user, offering a single-trigger scan that leverages the strengths of each component engine without requiring the customer to piece together their own solutions.
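A rough sketch of what such an orchestration layer does is shown below. Everything in it is an assumption made for illustration: the `Finding` type, the two toy scanners, and the classifier stub are hypothetical stand-ins, not Checkmarx's engines or API. The point is only the shape of the flow, in which one call runs every engine, merges duplicate findings, and filters the results before anyone sees them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    rule: str
    source: str  # which engine reported the finding

Files = Dict[str, str]                    # file path -> source text
Scanner = Callable[[Files], List[Finding]]
Classifier = Callable[[Finding], bool]    # True means "treat as a true positive"

def rules_scanner(files: Files) -> List[Finding]:
    """Toy deterministic engine: flags every call to eval()."""
    findings = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if "eval(" in line:
                findings.append(Finding(path, number, "dangerous-eval", "rules"))
    return findings

def llm_scanner_stub(files: Files) -> List[Finding]:
    """Stand-in for an LLM-backed engine; a real one would call a model."""
    findings = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if "SELECT" in line and "+" in line:
                findings.append(Finding(path, number, "string-built-sql", "llm"))
    return findings

def analysis_stub(finding: Finding) -> bool:
    """Stand-in for a findings classifier: ignores anything under tests/."""
    return not finding.file.startswith("tests/")

def run_scan(files: Files, scanners: List[Scanner], is_true_positive: Classifier) -> List[Finding]:
    """Single entry point: run all engines, de-duplicate, then filter."""
    seen = set()
    results = []
    for scanner in scanners:
        for finding in scanner(files):
            key = (finding.file, finding.line, finding.rule)
            if key in seen:
                continue
            seen.add(key)
            if is_true_positive(finding):
                results.append(finding)
    return results

files = {
    "app.py": 'query = "SELECT * FROM users WHERE id=" + user_id\nresult = eval(expr)\n',
    "tests/test_app.py": "eval('1 + 1')\n",
}
report = run_scan(files, [rules_scanner, llm_scanner_stub], analysis_stub)
for finding in report:
    print(finding)
```

In this toy run the `eval` call in the test file is detected by the rules engine but suppressed by the classifier, so only the two findings in `app.py` reach the report.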

Frank Emery, Director of Product Management at Checkmarx, articulated the limitations of isolated approaches: "Neither of these solutions is good enough on its own," he stated, referring to the inherent trade-offs between traditional query-based scanners and purely LLM-based tools. He elaborated on the synergy of their hybrid model: "Our approach leverages both deterministic and non-deterministic LLM-based engines, so end users have a high degree of configurability and determinism, but they’re also able to support languages very rapidly and cover more of their codebases." This duality aims to provide the best of both worlds: the auditability and predictability of deterministic scanning, combined with the broad language support and rapid adaptability of LLMs.

A Category Converging on a Unified Solution

The SAST market is experiencing a discernible convergence. Legacy query-based tools, while robust in their determinism and auditability, often struggle with the rapid pace of language evolution in modern development, particularly with the advent of new programming languages and frameworks favored by AI coding assistants. They are also notoriously prone to generating a high volume of false positives, which can significantly disrupt developer workflows and divert valuable engineering resources. Conversely, purely LLM-based scanners offer immediate language support and can adapt quickly to new coding paradigms. However, their non-deterministic nature introduces challenges for compliance, governance, and predictable security assurance, making it difficult to rely on them for stringent regulatory requirements.

The key differentiator among vendors now appears to be the sophistication of their integration strategy and the level of management they provide. Checkmarx’s proposition hinges on the idea that by abstracting the complexity of these combined engines behind a unified scanning interface, they are delivering a more valuable and user-friendly product. This model promises developers the determinism they require for critical components, broader language coverage where traditional tools fall short, and, crucially, a "noise filter" in the form of their FAE to proactively suppress false positives before they burden development teams.

The Escalating "Noise Problem"

The persistent issue of "noise," meaning the overwhelming number of false positives generated by security scans, is arguably the most compelling argument for Checkmarx’s approach. The proliferation of AI coding tools has dramatically increased the volume of code being generated and committed. Emery estimates that organizations are now committing one to one-and-a-half times more code than they were just a few years ago. At this scale, the burden of triaging false positives, already a significant drain on Application Security (AppSec) teams, grows with every additional finding and becomes far harder to sustain.

"If you run a scan and get 10 findings, a handful could be false positives, but to find that out, you have to manually assess each one, and that pulls developers or security professionals out of their flow," Emery explained. "As development pace increases and backlogs grow, that kind of noise is becoming much worse for teams to handle." This sentiment was echoed by Checkmarx CEO Sandeep Johri, who stated, "Our research found that 75% of code shipped today is vulnerable, because the speed at which AI creates code has far outpaced the speed needed to keep it safe." This stark statistic underscores the urgency for solutions that can not only detect vulnerabilities but also efficiently manage the output of modern development practices.

Attackability: A New Metric for Prioritization

Beyond the engine architecture, Checkmarx is introducing a new conceptual framework for vulnerability prioritization: "Attackability." This metric goes beyond simply counting vulnerabilities and instead assigns an exploitability score. It aims to trace potential attack paths from the source of a vulnerability, evaluating factors such as data sanitization, vector accessibility, and business relevance. The objective is to shift AppSec reporting away from raw vulnerability counts towards a more actionable understanding of what truly needs to be fixed. Emery believes this will provide security teams with a defensible metric suitable for discussions at the board level, where strategic risk assessment and resource allocation are paramount.
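Checkmarx has not published how Attackability is calculated, so the following Python sketch is purely illustrative. The field names, the 0.3 weight for internal-only paths, and the scoring rule are all invented here to show how the three named factors (sanitization, accessibility, business relevance) could combine into a single ranking.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class AttackPath:
    finding_id: str
    input_is_sanitized: bool    # is tainted data cleaned before it reaches the sink?
    externally_reachable: bool  # can an outside attacker reach the entry point?
    business_weight: float      # 0.0 (low relevance) to 1.0 (critical), set by the team

def attackability(path: AttackPath) -> float:
    """Hypothetical score in [0, 1]; higher means fix sooner."""
    if not 0.0 <= path.business_weight <= 1.0:
        raise ValueError("business_weight must be between 0.0 and 1.0")
    if path.input_is_sanitized:
        # Sanitized input breaks the attack path in this simplified model.
        return 0.0
    # Assumed weighting: internal-only paths count for less, but not zero.
    exposure = 1.0 if path.externally_reachable else 0.3
    return round(exposure * path.business_weight, 2)

def prioritize(paths: List[AttackPath]) -> List[AttackPath]:
    """Order findings from most to least attackable."""
    return sorted(paths, key=attackability, reverse=True)

paths = [
    AttackPath("A", input_is_sanitized=False, externally_reachable=True, business_weight=0.9),
    AttackPath("B", input_is_sanitized=True, externally_reachable=True, business_weight=1.0),
    AttackPath("C", input_is_sanitized=False, externally_reachable=False, business_weight=0.5),
]
for path in prioritize(paths):
    print(path.finding_id, attackability(path))
```

Under these assumptions, finding B sits on the most critical system but ranks last because its input is sanitized. That is the kind of reordering a raw vulnerability count cannot express.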

The success of Checkmarx’s "orchestration-first" strategy will depend on how well it resonates with enterprise buyers. The real test is how the engine performs against diverse, complex real-world codebases, and how it compares with competitors that are bringing their own LLM integrations to market.

Checkmarx’s enhanced SAST capabilities, including the new engine and the Findings Analysis Engine, are now integrated into the Checkmarx One platform. Existing subscribers are upgraded automatically, giving them immediate access to the new capabilities. The release comes as the industry works through the challenges and opportunities that AI presents for software development and security, and as SAST vendors compete to deliver more intelligent, efficient, and actionable vulnerability management.