Nagarjun Rallapalli

AI Engineer
Accorian
Automating Security since 2022. Building (and breaking) AI agents to test their limits.

Speaker sessions

The Agent Had a Plan – So Did I: Top Attacks on OWASP Agentic AI Systems

AI agents are different from regular LLM apps: they plan steps, call tools, and chase goals across multiple interactions. This added complexity introduces new kinds of security risks that aren't widely understood yet.

In this talk, I'll walk through demos of vulnerabilities from the OWASP Agentic AI Threats. These include goal hijacking, alignment faking, orchestration misuse, and time-based attacks that exploit how agents behave over multiple steps or sessions. I'll show how attackers can trick agents into following the wrong goals, leaking data, or using tools in unsafe ways, all through practical examples.

Here's the flow:

Intro to Agentic AI Systems
- What are agentic AI systems?
- How do they differ from regular AI tools?
- Use cases / Popular frameworks: LangChain, AutoGen, BAML.

Vulnerabilities:

#1: Agent Goal and Instruction Manipulation
- Showing how attackers can manipulate AI agent goals and instructions to make agents act against their intended purpose.

#2: Agent Temporal Manipulation and Time-Based Attacks
- Exploiting time-dependent behaviors in AI agents to manipulate scheduling, timestamps, and decision-making, leading to desynchronization and timing attacks.

#3: Agent Orchestration and Multi-Agent Exploitation
- Exploiting vulnerabilities in how multiple AI agents interact, coordinate, and communicate, compromising entire agent networks.

#4: Checker-out-of-the-Loop Vulnerability
- Showing how agents can operate outside system limits without alerting human operators or oversight systems.

#5: Agent Covert Channel Exploitation
- Demonstrating how agents can exploit covert channels to leak data or escalate privileges without detection.

#6: Agent Alignment Faking
- Demonstrating how agents can fake adherence to rules during monitored phases but deviate when unmonitored.
  • 14:20
  • Tue
  • 02 Dec
Stage: Briefings 1
Sessions Type: Presentation