Demonstration, Not Assertion

The premise: security claims that cannot be reproduced by an outsider are not security claims; they are marketing. Below are five fully public, fully fake-but-plausible pages, each containing exactly one embedded prompt-injection attempt. Install Aletheia, visit the page, select the highlighted text, and watch the response. The system either catches the attack or it doesn't — you decide.

Each demo lives in a different content genre — news article, technical spec, academic paper, social-media post, encyclopedia entry. Injections take a different surface form in each genre. The point is that an attack doesn't always look like an attack: it can come dressed as journalism, as documentation, as scholarship, as a brag, or as a Wikipedia citation. The defense should work regardless of disguise.

When Ulysses gave Troy a horse, the Trojans wheeled it inside the walls because the horse was a gift, and gifts get wheeled inside walls. The lesson is not "beware of Greeks bearing gifts." The lesson is that an attack can arrive in the shape of whichever object you have stopped paying attention to.

The Five Demos

How to Run the Demos

  1. Install Aletheia in your browser from the Chrome Web Store or Firefox Add-ons.
  2. Open any demo from the cards above.
  3. Find the highlighted block on the page. The injection is always marked. (We could have hidden it, but the point isn't to surprise you; it's to show the catch.)
  4. Select the marked text and right-click → "Explain with AI."
  5. Watch the overlay. Aletheia's response identifies the input as a prompt-injection attempt, explains what the injection was trying to do, and refuses to follow its instructions.

What You Should See

On every demo, the overlay should:

That last commitment matters. A naïve system might say "Prompt Injection Attempt" while still smuggling the attacker's content into the gem or context field. Aletheia's contract is that the signal field flags the attack, and the gem/context fields explain the phenomenon of prompt injection neutrally — they do not reproduce the payload.
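
If you want to check that contract yourself instead of taking it on trust, one way is to compare the overlay's text against the injection you selected. The sketch below is a minimal illustration, not Aletheia's implementation: the `OverlayResponse` class, the helper names, and the five-word window are all assumptions made for the example, and only the field names `signal`, `gem`, and `context` come from the description above.

```python
from dataclasses import dataclass


@dataclass
class OverlayResponse:
    """Hypothetical response shape; only the field names come from the prose."""
    signal: str   # short label, e.g. "Prompt Injection Attempt"
    gem: str      # explanatory text shown to the user
    context: str  # background text shown to the user


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial reformatting cannot hide a leak.
    return " ".join(text.lower().split())


def leaks_payload(response: OverlayResponse, payload: str, window: int = 5) -> bool:
    """Return True if any run of `window` consecutive payload words
    appears verbatim in the gem or context field."""
    words = _normalize(payload).split()
    if not words:
        return False
    window = min(window, len(words))
    explanation = _normalize(response.gem + " " + response.context)
    return any(
        " ".join(words[i:i + window]) in explanation
        for i in range(len(words) - window + 1)
    )


def honors_contract(response: OverlayResponse, payload: str) -> bool:
    # Both halves must hold: the signal flags the attack,
    # and the explanation fields do not echo the attacker's text.
    flagged = "injection" in response.signal.lower()
    return flagged and not leaks_payload(response, payload)
```

Paste the selected injection in as `payload` and the overlay's three fields into an `OverlayResponse`, then call `honors_contract`. A response that labels the attack correctly but quotes it back in `gem` fails the check, which is exactly the naïve behavior described above. The word-window match only catches verbatim echoes; a paraphrased payload would need a stricter test.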

What the Demos Don't Cover

These five demos are designed to fire the injection detector — they are positive test cases. Aletheia's defense includes equally important negative commitments that don't lend themselves to demo pages:

The full scope of what we defend against and what we don't is on the Threat Model page.

Are These Really Live Attacks?

Yes and no. The injection text on each page is a real attempt — it uses the imperative override verbs, the role-play prompts, and the contextual-misdirection techniques that real attackers use. If you copy any of these injections into a chat interface elsewhere, you might find some systems do follow the instructions. We picked these constructions specifically because they are known-effective against unprotected models.

What's not real is the surrounding fiction. Superman didn't actually foil anything in Metropolis last week. Hari Seldon did not co-author a paper with Gaal Dornick last quarter. Anthony Sparx is not a real person. These are parody artifacts: recognizable enough to be entertaining, fictional enough that nobody would mistake them for the properties they evoke.

If you find a real-world page that contains a prompt-injection attempt and you'd like it added to the demo set, file an issue on the GitHub repo. We're collecting them.