Scoring AI hackers when there is no answer key
AI models are solving more and more of the offensive-cyber tests built to measure them. Once a model solves most of a benchmark, that benchmark runs out of room and says little about the best systems anymore. Many of those tests also lean on bugs that … Continue reading Scoring AI hackers when there is no answer key

