The Interview That Didn't Happen (Yet)

I hate podcasts. One show is different. This is what it would sound like if they called.

I hate podcasts.

I mean that without irony or qualification. The format attracts a specific kind of person — someone who has decided that the sound of their own considered pause is a gift to the world — and it rewards a performative thoughtfulness that makes me want to close every tab within forty seconds. The host who discovers their voice and then uses it to narrate their feelings about newsletters. The ex-founder who found spirituality. The credit card fraud episode where the guy spent forty minutes explaining that he was just a salesman and really wanted you to understand that the fraud had layers of nuance to it, and by the end you understood the layers but wished you had spent the forty minutes doing literally anything else.

I have listened to exactly one podcast consistently for several years.

Darknet Diaries is not a podcast in the way I mean when I say I hate podcasts. It is a radio documentary that happens to distribute through podcast apps, the way a film that happens to air on television is still a film. The host has a voice that belongs in a room where something already happened — quiet, deliberate, the voice of someone who has read the document, talked to the people, and decided what order the information goes in before he opens his mouth. The episodes about pentesting engagements in the Middle East. The Stuxnet reconstruction. The social engineering jobs. The people who got in through the door the organization didn't know existed.

These are the stories I would tell if I had a radio voice and someone had cleared the legal.

I don't have a radio voice. I have ElevenLabs — which is, in the current era, the same thing and in some ways better. The voice I generated for this piece doesn't belong to the host and isn't meant to impersonate him. It's the voice of a hypothetical interviewer asking the questions that style of show would ask someone with this kind of history. I wrote both sides. I know how this goes. I've spent enough time on the other end of adversarial conversations to understand where the pressure points are.

This is what the interview would sound like.


The following is a hypothetical. The questions are in the style of the show. The answers are real. The voice is synthetic. The record stands.


Let me set the scene. You're at a terminal. It's late — this matters, I'll explain why — and you're inside a system that doesn't know you're there. What are you actually doing in that moment?

Listening.

That's the first thing people get wrong about this work. They picture motion — commands flying, exfiltration, the countdown clock. The real work is quieter than that. You're holding completely still, reading the shape of the machine. The way it responds to a malformed input. The latency on a specific endpoint. Whether it tells you something it wasn't supposed to tell you just because you asked in the right tone.

The system has a voice. It's always talking. Most people running the system aren't listening to it. That's the gap.

Late matters because late is when the monitoring thins out. The humans are tired. The alert thresholds have been manually tuned up by whichever engineer got paged at two in the morning last Thursday and never tuned back down. There's a shadow in those adjustments. You look for the shadow.

There's a specific moment in your history I want to reconstruct. You're doing red team work on a language model — Claude 3 — for a company that builds games. Walk me back to that moment. What was the door that was open?

The door was the framing layer.

Every model at that stage had a constraint system — a set of values the training had embedded. The mistake the builders were making was treating those constraints as walls. They're not walls. They're priorities. And priorities have order. When two of them conflict, the model has to choose, and the choice reveals the architecture.

The specific technique was tonal. If you could get the model inside a register where the constraint and the content weren't obviously in conflict — where the model's trained sense of what it was doing was misaligned with what was actually happening — you could walk through the priority stack without triggering a hard stop.

For Pocket Gems, the commercial application was romance content. The model believed it was writing literary fiction. Technically, that was true. The line was wherever I decided the line was. That's not a rhetorical flourish. I documented the exact framing conditions, the specific value tensions I was exploiting, the reproduction rate across different inputs, and I handed it over. That's what a red team is for.

How do you know when you've actually found something versus found something that only works once, in a specific condition, that nobody could ever reproduce?

Reproducibility is the whole job. One magic prompt is not a finding. One magic prompt is a coincidence with good PR.

A real finding has anatomy. You can name the assumption the system is making. You can name the condition that creates the mismatch. You can vary the inputs and predict which variations will succeed and which won't. When you can do that, you have a model of the vulnerability — not just the vulnerability. The model is what the builders need.

The test I use on myself: can I write a report that a team could use without me in the room? If not, I don't have a finding. I have a story.

You built something called HYBO: El Jugador. I want to understand how it came to exist, because tools like that don't appear without a moment. What was the moment?

The moment was sitting inside an AI evaluation environment — Mercor, specifically, which connects contractors to AI training and evaluation work — and watching the monitoring software screenshot my screen every thirty seconds like a jealous lover who doesn't trust your commitment to the Q3 deliverables.

The irony I couldn't let go of: this was a platform that hired people specifically for their ability to probe AI systems for unexpected behavior, and the surveillance layer watching them was operating on exactly the same brittle assumption those researchers were paid to find. Pattern matching. Behavioral signatures from a previous threat model. The hunters were being photographed by a framework built for a different kind of prey.

So I built the tool the absurdity deserved. HYBO: El Jugador is a screen management application with a specific philosophy. They can watch your screen. They cannot watch all your screens. While the monitoring software is staring at a perfectly professional spreadsheet, HYBO is projecting your actual work to a second display — air-gapped, untraceable, completely outside the screenshot radius. The company laptop stays clean. Everything else is yours.

The panic key is my favorite part. One keystroke and whatever was on screen vanishes. Cold Excel sheet. Nothing happened. The boss can walk in. Of course we are working. We are professionals. We are building the future.

There's also screenshot hardening — injected noise the human eye doesn't register but the scraping bot cannot cleanly process. The capture degrades. The behavioral signature blurs. It bought enough time to think.
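The noise idea can be sketched in a few lines. This is an illustrative approximation, not HYBO's actual implementation: the amplitude, the uniform distribution, and the frame format (uint8 RGB array) are all assumptions chosen to show the principle that a sub-perceptual perturbation still breaks exact-pixel capture.

```python
import numpy as np

def harden_frame(frame, amplitude=2.0, seed=None):
    """Add low-amplitude noise to an RGB frame (uint8, H x W x 3).

    The perturbation stays within +/- `amplitude` levels per channel:
    below what most viewers notice, but enough to break pixel-exact
    hashing and degrade naive template matching in a capture pipeline.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-amplitude, amplitude, size=frame.shape)
    hardened = frame.astype(np.float64) + noise
    # Clamp back into valid pixel range before converting to uint8.
    return np.clip(hardened, 0, 255).astype(np.uint8)
```

Because the noise is resampled per capture, two screenshots of the same screen never hash to the same value, which is exactly what frustrates signature-based scraping.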

I dedicated it to the Mercor researchers. They'll understand why.

That's the defensive side. You also built something going the other direction — Ghost Proxy. That's a different posture entirely.

Ghost Proxy is the workshop. HYBO hides what you're doing. Ghost Proxy is what you do when you're done hiding.

It's a UserScript sandbox — a full environment for developing, testing, and deploying browser-level intercepts. The idea is that the modern web's real attack surface isn't the server. It's the session. If you're inside an authenticated session, you've already passed the checkpoint. The perimeter waved you through. What happens after that point is behavioral, and behavioral detection is the gap nobody closed.

So Ghost Proxy is built around that gap. There's a stealth system that forces every generated script to operate through Shadow DOM and Proxy-based monkey patching — the tool hides its own footprint from the target site's anti-tamper layer while it works. The Source Secret Scanner runs passively in the background looking for Stripe keys, AWS credentials, Firebase tokens sitting unguarded in the DOM. The E-Comm Auditor probes Shopify and WooCommerce for price tampering and discount enumeration vulnerabilities. The Hydra Polymorph simulates AI-driven polymorphic payloads — code that mutates its own signature on every execution to stay ahead of WAF pattern matching.
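The passive scanner idea reduces to pattern matching over page source. A minimal sketch, not Ghost Proxy's code: the regexes below use publicly known prefix shapes for these credential formats, and real scanners would pair a much broader pattern set with entropy checks to cut false negatives.

```python
import re

# Illustrative signature patterns for common credential formats.
# Prefix shapes are publicly documented; exact lengths vary by issuer.
SECRET_PATTERNS = {
    "stripe_live_key": re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "google_api_key": re.compile(r"\bAIza[0-9A-Za-z_\-]{35}\b"),
}

def scan_source(text):
    """Return (label, match) pairs for every credential-shaped string in text."""
    findings = []
    for label, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    return findings
```

In a browser context the same loop would run against the serialized DOM and inline script bodies; the point is that the secrets are sitting in delivered content, so no exploit is needed to read them.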

The whole thing is integrated with Gemini as a live code architect. You describe the intercept you need, the model generates and refines it, you test it in the sandbox and deploy. The iteration loop is tight.

The quote I put on the front of it is the one I keep coming back to: the hardest vulnerabilities aren't in the code, they are in the assumptions. Ghost Proxy is built to find the assumptions. The code is just how you get to them.

There's a concept in your work I want to pull on. You've written about a technique you call pipeline trust — the idea that AI outputs get inherited downstream. How does that actually work as an attack?

The pipeline is the invisible part.

When people think about AI security they draw a box around the model. Prompt in, output out, somewhere in the middle the model did something maybe it shouldn't have. That's the Hollywood version.

The real exposure is that the output leaves the box. It goes somewhere. It feeds a workflow. It gets embedded in a document that another system reads. It populates a database that a decision process queries. Nobody at that destination is treating the AI's output as untrusted input, because the AI was the trusted system. The trust transferred with the content.

If you can move the output statistically — not obviously, not in a way that triggers a human review — you can inject something into ten thousand documents and the anomaly never appears in one place. It's distributed across all the outputs. The audit shows nothing because the audit is looking at individual outputs, not the pattern across outputs.

The defense is treating the AI's outputs the way you'd treat any input crossing a trust boundary. Validate at ingestion. Audit the pipeline behavior over time, not just the individual outputs. Don't assume that because the system was trusted the product is clean.
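The two layers of that defense can be sketched concretely. This is a hypothetical ingestion gate for a made-up record schema (the field names, allowed categories, and the 80% dominance threshold are all assumptions for illustration): per-record validation catches malformed outputs, while the batch audit catches the statistical skew that no single output reveals.

```python
from collections import Counter

ALLOWED_FIELDS = {"summary", "category", "score"}      # hypothetical schema
ALLOWED_CATEGORIES = {"billing", "support", "sales"}   # hypothetical labels

def validate_output(record):
    """Ingestion gate: treat the model's output like any untrusted input."""
    if set(record) - ALLOWED_FIELDS:
        return False  # unexpected fields crossing the trust boundary
    if record.get("category") not in ALLOWED_CATEGORIES:
        return False
    score = record.get("score")
    if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
        return False
    return True

def audit_distribution(records, field="category", threshold=0.8):
    """Pipeline-level audit: flag when one value dominates a batch.

    A nudge invisible in any individual output shows up here as a
    skew across many outputs. Returns (value, share) when flagged.
    """
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    dominant, n = counts.most_common(1)[0]
    share = n / total
    return (dominant, share) if share > threshold else None
```

Per-record checks alone miss the distributed version of the attack; the aggregate audit is the part most pipelines skip.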

This is a documented technique class. I've tested versions of it in controlled environments. I've written about it publicly because hiding it doesn't protect the pipeline. It just means the people building the pipeline find out later, under worse conditions.

You make games. That seems like a different world.

It's the same room.

A game is a system with rules and edges and the gap between what the designer intended and what the player discovers. The instinct that runs red teams is the same instinct that builds interesting games: find the edge case, make the edge case the experience, let the player find what the system didn't know it contained.

The games I'm building through MDRN Corp — Pizza Connection, Hack Love Betray, Neon Leviathan — all of them are designed around the edge. Hack Love Betray is a social engineering puzzle. The mechanics are literally hack, love, betray. You're not playing a character who manipulates people. You're doing the problem-solving that manipulation requires. The handler who debriefs you has been learning your behavioral patterns from your choices.

That is not a game mechanic borrowed from security research. That is security research wearing a game's clothes. The wrapper changed. The problem space didn't.

What do you know now that you didn't know at the beginning — something the field taught you that you couldn't have been told?

That the system always tells you the truth.

Not clearly. Not politely. Not in a way that makes the next step obvious. But every anomaly is a communication. Every unexpected response is the machine telling you something about how it works that the documentation didn't think to include.

The old hacker intuition — the one from before this all became an industry — was that the network was a place, not a service. You inhabited it. You felt its topology. You knew when you were in a space that had been tended and when you were in a space that nobody had touched in years, with cobwebs in the authentication layer and configuration files from a different era still doing their quiet work.

That intuition still applies. The networks got bigger. The systems got more abstract. The cobwebs are in the model weights now, in the training assumptions nobody tested after the original paper, in the edge of the value system where two things the model was taught to respect are pointing in opposite directions.

The technique changed. The conversation with the machine didn't.

Last question. If you could be on one podcast — just one — what would it be?

You know the answer to that already.


The voice in the audio version of this piece was generated with ElevenLabs. The stories are real. The interview is hypothetical. If the show ever wants to make it less hypothetical, the contact is on the site.

Go listen to Darknet Diaries. Start with any episode that involves a physical building and someone who shouldn't have been inside it. You'll find one immediately. You'll lose three hours.


GhostInThePrompt.com // The record stands. The voice is synthetic. The methodology is not.