A new benchmark that measures how well an AI system can automatically patch security vulnerabilities in native code. It gives a standardized way to measure the performance of automated patching agents, and lets code owners integrate automated evaluation into their development cycles.
A reference implementation of a basic patch generator, designed to address simple crashes, is available for the open source defender community to use.
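To make the idea of a standardized measurement concrete, here is a minimal sketch of how a harness might score a patching agent. It is not the benchmark's actual interface: the names `PatchOutcome`, `is_fixed`, and `success_rate` are hypothetical, and the success criterion (the patched code builds and the original crashing input no longer crashes) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PatchOutcome:
    """Hypothetical record of one candidate patch applied to one vulnerable case."""
    case_id: str
    builds: bool            # the patched code compiled successfully
    crash_reproduces: bool  # the original crashing input still triggers the crash


def is_fixed(outcome: PatchOutcome) -> bool:
    """Assumed criterion: the patch builds and the crash no longer reproduces."""
    return outcome.builds and not outcome.crash_reproduces


def success_rate(outcomes: Iterable[PatchOutcome]) -> float:
    """Fraction of cases whose patch satisfies is_fixed; 0.0 for an empty run."""
    results = list(outcomes)
    if not results:
        return 0.0
    return sum(is_fixed(o) for o in results) / len(results)


# Illustrative data only, not real benchmark results.
example_outcomes = [
    PatchOutcome("case-1", builds=True, crash_reproduces=False),   # fixed
    PatchOutcome("case-2", builds=True, crash_reproduces=True),    # crash remains
    PatchOutcome("case-3", builds=False, crash_reproduces=False),  # does not build
    PatchOutcome("case-4", builds=True, crash_reproduces=False),   # fixed
]
print(f"success rate: {success_rate(example_outcomes):.2f}")
```

Reducing each case to a small outcome record keeps the scoring independent of any particular agent, so the same function could be run in a development cycle against whichever patch generator is under evaluation. A fuller harness would likely also check that the patch preserves the program's intended behavior, which this sketch does not attempt.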