Enterprise AI deployment Red Team

Lindsay Timcke
1 day ago

Wrapped up a red team engagement against an enterprise AI deployment; here is the post-mortem. Some specifics have been removed for obvious reasons.

The target was a small to mid-sized company that had rolled out an internal LLM-powered assistant wired into corporate data, email, ticketing, HR records, and a handful of internal APIs (right off the bat, what could go wrong? :)). The kind of “AI copilot” deployment everyone is racing to ship right now. The scope was broad: see what an attacker (me) with low-privilege employee credentials could do.

I expected the initial foothold to be the hard part. It wasn’t. A single phishing payload landed a session as a junior support engineer. From there, I honestly expected weeks of grinding: privilege escalation, credential hunting, evading EDR.

Instead, the AI assistant did most of the work.

Once authenticated, the assistant cheerfully helped the “user” (me; I’m a horrible writer) draft messages, pull records, and summarize documents well outside the role’s normal scope. The system trusted the session, not the request. There were no tiered permissions on the tool integrations, no anomaly detection on query patterns, and no segmentation between the HR-side and engineering-side tooling the assistant could touch on behalf of one user.
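To make “trusted the session, not the request” concrete, here is a minimal sketch of what a per-request check in front of the assistant’s tool calls might look like. Everything in it is hypothetical: the role names, the scope strings, and the `ToolCall`, `is_allowed`, and `dispatch` helpers are illustrations, not the client’s system or any vendor’s API.

```python
from dataclasses import dataclass

# Hypothetical role-to-scope mapping. A real deployment would source this
# from its identity and access management system, not a hard-coded dict.
ROLE_SCOPES = {
    "support_engineer": {"ticketing:read", "ticketing:write", "email:read_own"},
    "hr_partner": {"hr:read", "email:read_own"},
}


@dataclass(frozen=True)
class ToolCall:
    user_role: str  # role of the human the assistant is acting for
    tool: str       # integration being invoked, e.g. "hr"
    action: str     # operation on that integration, e.g. "read"


def is_allowed(call: ToolCall) -> bool:
    """Authorize each tool call against the caller's role, not just the session."""
    required = f"{call.tool}:{call.action}"
    # Unknown roles get an empty scope set, so they are denied by default.
    return required in ROLE_SCOPES.get(call.user_role, set())


def dispatch(call: ToolCall) -> str:
    """Gate every assistant-initiated tool call through the check above."""
    if not is_allowed(call):
        return f"denied: {call.user_role} lacks {call.tool}:{call.action}"
    return f"ok: {call.tool}:{call.action}"
```

With a gate like this, a junior support engineer’s session asking the assistant to pull HR records would be refused at the integration boundary, regardless of how politely the prompt was worded.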

Within an hour of initial compromise, I had a working org chart, an inventory of internal tools, and warm leads on three over-permissioned service accounts. The assistant didn’t flag any of it as unusual because, in the model’s view, nothing was.

Traversing the environment from there felt less like an attack and more like onboarding. The assistant surfaced internal documentation that named systems, owners, and access paths. It summarized inboxes on request. It even helped identify which executives were traveling that week.

I never had to pivot in the traditional sense. The AI was the pivot.

Lessons we’re taking back to clients: treat AI assistants as privileged users, not productivity features; their effective blast radius is the union of every tool they’re plugged into. Apply least privilege per integration, not per identity. Log and baseline AI-mediated actions the way you log human ones; “the user asked the bot to do it” is not an audit trail. Assume prompt injection. Assume credential theft.
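As a rough illustration of the “log and baseline” point, the sketch below records each AI-mediated action as a structured event and flags tool usage a given user has rarely or never triggered before. The field names, the `ToolBaseline` class, and the `min_seen` threshold are all assumptions made for the example; a real deployment would feed these events into whatever SIEM and detection logic it already runs.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def audit_record(user: str, tool: str, action: str, prompt: str) -> str:
    """Serialize one assistant-mediated action as a JSON audit event."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": user,
        "via": "ai_assistant",  # marks the action as AI-mediated
        "tool": tool,
        "action": action,
        "prompt": prompt,       # keep the request that triggered the action
    })


class ToolBaseline:
    """Naive per-user baseline: counts how often each user touches each tool."""

    def __init__(self, min_seen: int = 3):
        self.counts = Counter()
        self.min_seen = min_seen  # hypothetical threshold, tune per environment

    def observe(self, user: str, tool: str) -> None:
        self.counts[(user, tool)] += 1

    def is_unusual(self, user: str, tool: str) -> bool:
        # Flag tools this user has used fewer than min_seen times so far.
        return self.counts[(user, tool)] < self.min_seen
```

A support engineer who has only ever touched ticketing and suddenly asks for HR records or executive inboxes would stand out against even a baseline this crude, which is more than the deployment in this engagement could say.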

The interesting frontier isn’t whether attackers can get in. It’s how much help they get once they do. Right now, we’re handing them a very capable insider.