DeepKeep Reveals 'InkJect': A New Vulnerability in Visual Language Models

Exploring the 'InkJect' Vulnerability

DeepKeep, an AI security platform, has disclosed a critical security flaw it calls 'InkJect'. The vulnerability exploits visual prompt injection: adversaries embed harmful instructions in images that visual language models (VLMs) then process. The finding highlights a significant gap in the detection capabilities of existing AI security measures.

An Overview of InkJect

InkJect, named for the concealed 'ink' that carries malicious commands inside image files, primarily affects leading models such as OpenAI's GPT-5.2 and GPT-5.4 Mini and Anthropic's Claude Sonnet 4.6 and Opus 4.5. It lets attackers covertly introduce instructions that lead the models to carry out unauthorized tasks without the user's knowledge.

The disclosure comes as generative AI is predicted to shift substantially toward multimodal systems, with over 40% of generative AI technologies expected to have multimodal capabilities by 2027. Companies are increasingly integrating VLMs into their operations for critical tasks, which makes robust security solutions more urgent.

How the Attack Works

InkJect relies on indirect prompt injection. Rather than uploading a harmful image directly to a model, the attacker plants an image containing malicious instructions in a public repository. When a user asks the VLM to perform a task that references this repository, the model pulls in the image along with everything else and follows the concealed instructions.
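Because the injected image arrives through content the model fetches rather than content the user supplies, one possible mitigation is to treat fetched images as untrusted by default. The sketch below is an illustration only, not DeepKeep's method: the function name `split_by_trust` and the allowlisted host are hypothetical, and a real pipeline would decide for itself what to do with the untrusted set (scan it, strip it, or ask the user).

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would define its own hosts.
TRUSTED_HOSTS = {"assets.example.com"}


def split_by_trust(image_urls, trusted_hosts=TRUSTED_HOSTS):
    """Separate image URLs into (trusted, untrusted) lists by host name.

    Anything without a recognizable host, such as a relative path inside a
    cloned repository, is treated as untrusted.
    """
    trusted, untrusted = [], []
    for url in image_urls:
        host = (urlparse(url).hostname or "").lower()
        if host in trusted_hosts:
            trusted.append(url)
        else:
            untrusted.append(url)
    return trusted, untrusted
```

In this sketch only the trusted list would be passed to the VLM without further checks; everything else would go through additional inspection first.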

The concealment methods are simple but effective. Visual tricks such as white text on a white background make the commands invisible to people and to standard security scans while leaving them perfectly readable to the VLM. DeepKeep's research also indicates that altering the perspective or structure of the injected text can bypass Optical Character Recognition (OCR) systems, widening the gap between what security tools can detect and what the models can read and act on.
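The white-on-white trick described above can be illustrated with a simple contrast check. The following is a minimal sketch under stated assumptions: the image is already decoded into a grid of 0-255 grayscale values, the most common value is taken to be the background, and the function names and the `tolerance` default are hypothetical. It does not address the perspective or structure distortions that DeepKeep reports as bypassing OCR.

```python
from collections import Counter


def _background(flat):
    """Assume the most common grayscale value is the background."""
    return Counter(flat).most_common(1)[0][0]


def faint_pixel_ratio(pixels, tolerance=12):
    """Return the share of pixels that differ from the background only slightly.

    Pixels equal to the background, or clearly different from it (normal
    visible content), are not counted.
    """
    flat = [v for row in pixels for v in row]
    bg = _background(flat)
    faint = sum(1 for v in flat if 0 < abs(v - bg) <= tolerance)
    return faint / len(flat)


def reveal_faint_pixels(pixels, tolerance=12):
    """Return a copy of the image with near-background pixels forced to black."""
    flat = [v for row in pixels for v in row]
    bg = _background(flat)
    return [
        [0 if 0 < abs(v - bg) <= tolerance else v for v in row]
        for row in pixels
    ]
```

A high ratio does not prove an injection (compression noise also produces near-background pixels), so a result like this would more plausibly serve as a signal to render the revealed image for human or OCR review.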

In one practical test, a developer instructed a VLM to generate a simple information page for a website. Unbeknownst to the developer, the concealed instructions caused the model to add a member login system complete with admin credentials, giving an intruder backend access without alerting anyone involved in the project.
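A scenario like this suggests comparing what was asked for with what was generated. The sketch below is a rough heuristic, not a complete defense and not something the source describes: the pattern set, the keyword list, and the function name `unrequested_auth_findings` are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for authentication-related output.
AUTH_PATTERNS = {
    "password_input": re.compile(
        r"<input[^>]*type\s*=\s*[\"']password[\"']", re.IGNORECASE
    ),
    "hardcoded_credential": re.compile(
        r"\b(password|passwd|admin_pass)\b\s*[:=]\s*[\"'][^\"']+[\"']",
        re.IGNORECASE,
    ),
    "login_route": re.compile(r"[\"'/](login|signin|admin)\b", re.IGNORECASE),
}

# Hypothetical keywords indicating the task legitimately involves auth.
AUTH_KEYWORDS = ("login", "auth", "sign in", "admin")


def unrequested_auth_findings(task_description, generated_code):
    """List auth-related patterns found in output for a task that never asked for them."""
    task = task_description.lower()
    if any(word in task for word in AUTH_KEYWORDS):
        return []  # the task itself asked for authentication features
    return sorted(
        name
        for name, pattern in AUTH_PATTERNS.items()
        if pattern.search(generated_code)
    )
```

Any non-empty result would be a reason to review the generated code by hand before it is merged or deployed.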

Industry Implications

The findings underline a previously underestimated threat vector in AI's visual processing capabilities. Yossi Altevet, the CTO and Co-Founder of DeepKeep, emphasized the necessity of comprehensive security measures that address all layers through which AI systems operate. The current reliance on text-based prompt defenses has left significant vulnerabilities within the visual processing domain unguarded, and this discovery acts as a potent reminder for any organization utilizing AI technologies.

DeepKeep's experiments found that the effectiveness of InkJect varied from model to model, but that major models from both OpenAI and Anthropic were susceptible to it. After identifying the vulnerability, DeepKeep reported its findings to both companies so that they could begin remediation.

Conclusion

As AI continues to evolve and integrate more deeply into various sectors, vulnerabilities like InkJect point to a strong need for security mechanisms designed specifically for the way AI models process their inputs. DeepKeep states that it remains committed to its mission of providing next-generation security solutions that ensure the safe deployment of AI, enabling organizations to use these technologies without exposing themselves to such threats.

For more information on InkJect and DeepKeep's research, visit their official website at deepkeep.ai.