Abstract: This paper explores ways to make penetration testing more effective as cyber threats grow more complex. It focuses on leveraging artificial intelligence (AI) ...
CTI-REALM is Microsoft’s open-source benchmark that evaluates AI agents on real-world detection engineering. It measures ...