Lakera Unveils Open Source Security Benchmark for AI Agents Focusing on LLM Integrity

Lakera Launches Open Source Security Benchmark

Lakera, a leading AI-native security platform under Check Point Software Technologies, has announced the release of an innovative security evaluation tool designed specifically for Large Language Models (LLMs) within AI agents. This initiative is made in collaboration with researchers from the UK AI Security Institute (AISI) and marks an important step in ensuring the integrity and security of AI technologies. The new benchmark, called the Backbone Breaker Benchmark (b3), is tailored to systematically identify vulnerabilities at crucial points where they are most likely to occur.

The Backbone Breaker Benchmark is based on the concept of a “threat snapshot.” Rather than simulating the entire workflow of an AI agent from start to finish, b3 focuses on key moments when vulnerabilities in LLMs are most prevalent. By conducting tests at these critical junctures, developers can assess how resilient their systems are against realistic adversarial attacks without being bogged down by the complexity and overhead of modeling the entire agent workflow.

Mateo Rojas-Carulla, co-founder and Chief Scientist of Lakera, emphasized the benchmark's significance, stating, "Today’s AI agents depend on the security level of the underlying LLM. We developed the b3 benchmark to leverage the 'threat snapshot' concept, allowing for a structured discovery of vulnerabilities that have previously been hidden within complex agent workflows. By releasing this benchmark as open source, we aim to provide developers and model providers with the tools to measure and improve their security levels more practically."

The b3 benchmark comprises ten representative threat snapshots, combined with a gamified red teaming exercise tool known as Gandalf: Agent Breaker. This tool has generated a high-quality dataset of 19,433 attack scenarios, evaluating vulnerabilities against a range of potential threats including prompt leakage, phishing link insertion, malicious code injection, denial of service attacks, and unauthorized tool invocation.

Initial research findings from validating 31 leading LLMs through the b3 benchmark have uncovered several key insights:

- Enhancements in inference capabilities lead to substantial improvements in security.
- There is no correlation between model size and security performance.
- Closed-source models typically outperform open-source ones; however, elite open models are narrowing this gap.

The b3 benchmark is now available under an open-source license here.

Gandalf: Agent Breaker

Alongside b3, Lakera has introduced Gandalf: Agent Breaker, a hacking simulator game that challenges users to exploit AI agents within realistic scenarios. The game features ten different generative AI applications that replicate actual AI agent behaviors, each offering varied difficulty levels and defense capabilities. Players are tasked with utilizing a range of skills, from prompt engineering to red teaming, to navigate multiple attack surfaces against these agents, which may involve chat-based outputs, code-level reasoning, memory usage, and external tool functionalities.

Originally conceived during an internal hackathon at Lakera, Gandalf emerged from competitive efforts between blue and red teams aiming to build the most effective defenses and attacks against an LLM safeguarding a secret password. Since its release in 2023, Gandalf has blossomed into one of the world’s largest red teaming communities, generating over 80 million data points. Initially developed as a game, it now plays a crucial role in revealing the real vulnerabilities of generative AI applications and raising awareness of AI-first security's importance.

About Lakera

Lakera stands as a premier AI-native security platform dedicated to agent-based AI applications. It safeguards Fortune 500 companies and major tech firms from cutting-edge AI cyber risks. With a defense system powered by one of the world’s top red teaming communities, Gandalf, combined with proprietary AI advancements, Lakera evolves in real-time to protect enterprises from emerging threats. Founded in 2021 by David Haber, Mateo Rojas-Carulla, and Matthias Kraft, the company has its headquarters in Zurich and San Francisco. For more information, visit Lakera.ai or follow them on LinkedIn.

About Check Point

Check Point Software Technologies is a leading provider of digital trust and protects over 100,000 organizations worldwide with AI-driven cybersecurity solutions. Its Infinity Architecture, incorporating a hybrid mesh network model centered around SASE, integrates the management of on-premises, cloud, and workspace environments, delivering flexibility, simplicity, and scalability to enterprises and service providers. Established in October 1997, Check Point Software Technologies Japan Office operates out of Minato-ku, Tokyo. For further details, visit CheckPoint.com or explore their social media channels.