Prompt Injection Emerges as Top AI Security Threat for Enterprises

193cc Agency CouncilJune 29, 20263 Mins read99 Views

AI prompt injection cybersecurity attack concept

Prompt injection attacks are rapidly becoming one of the biggest cybersecurity risks facing organizations deploying artificial intelligence, according to CrowdStrike’s 2026 Global Threat Report. The report found that more than 90 organizations were targeted by prompt injection attacks during 2025, with malicious prompts used to steal login credentials and cryptocurrency. CrowdStrike said prompts are now functioning much like malware, reflecting a major shift in the cyber threat landscape.

The report also revealed that AI-assisted adversary operations increased by 89% year over year, while 82% of recorded intrusions involved no traditional malicious code. These findings come as businesses increasingly adopt AI agents, copilots and automated browser tools with access to sensitive corporate systems, including email, source code, payment platforms and shared files.

Prompt injection continues to rank as the most critical security issue in the OWASP Top 10 for Large Language Model applications, holding the LLM01 position for a second consecutive edition. OWASP attributes the problem to a fundamental limitation of language models, which cannot consistently distinguish developer instructions from content retrieved through emails, web pages or documents. What began as a research concern has evolved into a real-world security vulnerability, with documented attacks, assigned CVE identifiers and acknowledgments from leading AI developers.

Security researchers distinguish between direct and indirect prompt injection. Direct attacks occur when users deliberately attempt to override a model’s system instructions. Indirect prompt injection is considered more dangerous because attackers hide malicious instructions inside emails, uploaded documents, web pages, calendar invitations or collaboration platforms. AI systems later process that content and unknowingly execute the embedded commands without either the user or attacker interacting directly with the model.

Several publicly disclosed incidents demonstrate the growing risk. In August 2024, PromptArmor reported that an attacker with workspace access to Slack AI could extract sensitive information, including API keys, from private channels by embedding malicious instructions in public channels or uploaded files. In 2025, Aim Security disclosed EchoLeak, tracked as CVE-2025-32711 with a CVSS score of 9.3, describing it as the industry’s first documented zero-click prompt injection attack against a production AI system. The attack enabled a specially crafted email to cause Microsoft 365 Copilot to retrieve internal files and transmit them to an attacker-controlled server without any user interaction. Although both vulnerabilities were patched, experts noted that the broader class of prompt injection attacks remains unresolved.

The attack surface has expanded as organizations deploy increasingly capable AI agents. Systems that send emails, modify cloud infrastructure, execute code or access retrieval-augmented generation (RAG) pipelines can unknowingly process poisoned documents, malicious web content or compromised long-term memory, allowing attackers to manipulate future AI actions. Enterprises using multiple AI models may also be tricked into routing requests through less secure systems.

Major AI developers acknowledge the difficulty of solving the problem. In December 2025, OpenAI publicly stated that prompt injection, much like scams and social engineering, is unlikely to be completely eliminated. The company also revealed it developed reinforcement-learning attackers to identify new injection techniques before they appear in real-world attacks. Anthropic reported in its Claude Opus 4.6 system card that a graphical-interface AI agent was successfully compromised 17.8% of the time with a single injection attempt. Across 200 attempts, the success rate increased to 78.6% without safeguards and remained 57.1% even with published defenses enabled. Google separately reported that its strongest documented attack against a Gemini deployment continued to succeed 53.6% of the time despite adversarial fine-tuning.

The growing threat has prompted new guidance from cybersecurity organizations. Gartner advised CISOs in December 2025 to block AI browsers such as ChatGPT Atlas and Perplexity Comet because of concerns over indirect prompt injection, credential theft and immature security controls. The recommendation followed Cyberhaven research showing that 27.7% of organizations already had at least one user running Atlas. Similar warnings have also been issued by the UK’s National Cyber Security Centre and Germany’s BSI.

Security experts say conventional defenses are proving insufficient because large language models process both instructions and data through the same text channel, making it difficult to separate trusted commands from malicious content. Techniques such as input validation, output filtering, signature-based detection and routine patching address only common attack patterns, while sophisticated, multilingual or image-based prompt injections often bypass existing protections. Even a low failure rate can translate into numerous successful attacks when enterprise AI agents perform thousands of actions each day.

Frameworks designed to improve AI security are still evolving. NIST AI 600-1 identifies prompt injection as an information security risk but primarily focuses on governance rather than technical controls. OWASP’s Top 10 for Agentic Applications, released in December 2025, introduced new categories covering Agent Goal Hijack and Memory and Context Poisoning, although the guidance remains advisory.

The report also found that 65.3% of organizations have no dedicated defenses against prompt injection, relying instead on vendor safeguards, policy documents and employee awareness training. Researchers argue that effective protection requires security controls beyond the AI model itself, including least-privilege access, mandatory human approval for sensitive actions, restrictions on data retrieval, network allowlists and detailed auditing of AI decision-making. The report concludes that organizations should assume AI models will occasionally follow malicious instructions and build external safeguards accordingly, rather than treating the model itself as a trusted security boundary.