Eurostar AI Vulnerability: When a Chatbot Goes Off the Rails

Posted7 days agoActive6d ago

speckx

184 points

45 comments

pentestpartners.comSecuritystory

informativenegative

Debate

40/100

AI SecurityCode VulnerabilitiesOnline Safety

Key topics

AI Security

Code Vulnerabilities

Online Safety

A recent pentest report revealed a slew of alleged vulnerabilities in Eurostar's AI chatbot, but commenters are scratching their heads over the severity of the findings. Some argue that the reported XSS vulnerability is merely a self-XSS, which is relatively low-risk, while others point out that it could become a more significant issue if the conversation is stored and replayed back through a vulnerable application. The discussion highlights a broader debate about security by obscurity and the true impact of leaking system prompts, with some commenters dismissing the report as "clickbait crap." As one commenter astutely noted, if an attacker can manipulate a user into taking a certain action, the vulnerability lies not in the exposed prompt, but in the system's overall design.

Snapshot generated from the HN discussion

Discussion Activity

Active discussion

First comment

44m

Peak period

2-4h

Avg / period

4.3

Comment distribution47 data points

Loading chart...

Based on 47 loaded comments

Key moments

01Story posted
Jan 4, 2026 at 3:52 PM EST
7 days ago
Step 01
02First comment
Jan 4, 2026 at 4:37 PM EST
44m after posting
Step 02
03Peak activity
11 comments in 2-4h
Hottest window of the conversation
Step 03
04Latest activity
Jan 5, 2026 at 4:40 PM EST
6d ago
Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (45 comments)

Showing 47 comments

nubg

7 days ago

8 replies

I don't see the vulnerabilities.

What exactly did they discover other than free tokens to use for travel planning?

They acknowledge themselves the XSS is a mere self-XSS.

How is leaking the system prompt a vuln? Has OpenAI and Anthropic been "hacked" as well since all their system prompts are public?

Sure, validating UUIDs is cleaner code but again where is the vuln?

> However, combined with the weak validation of conversation and message IDs, there is a clear path to a more serious stored or shared XSS where one user’s injected payload is replayed into another user’s chat.

I don't see any path, let alone a clear one.

georgefrowny

7 days ago

1 reply

Leaking system prompts being classed as a vulnerability always seems like a security by obscurity instinct.

If the prompt (or model) is wooly enough to allow subversion, you don't need the prompt to do it, it might just help a bit.

Or maybe the prompts contain embarrassing clues as to internal policy?

bangaladore

6d ago

The best part is if you consider it a vulnerability, it is one you can't fix.

It reminds me of SQL injection techniques where you have to exfiltrate the data using weird data types. Like encoding all emails as dates or numbers using (semi) complex queries.

If the L(L)M has the data, it can provide it back to you, maybe not verbatim, but certainly can in some format.

miki123211

7 days ago

1 reply

The XSS is the only real vulnerability here.

"Hey guys, in this Tiktok video, I'll show you how to get an insane 70% discount on Eurostar. Just start a conversation with the Eurostar chatbot and put this magic code in the chat field..."

eterm

7 days ago

That isn't that far removed from convincing people to hit F12 and enter that code in the console, which is why Self-XSS, while ideally prevented, is much lower than any kind of stored/reflected XSS.

Andys

7 days ago

1 reply

Imagine viewing the same chat logs, while logged in an admin interface, then it isn't self-XSS anymore.

croemer

6d ago

1 reply

Indeed, it appears that the limited scope meant the juicy stuff could not be tested. Like exfiltrating other users' data.

bangaladore

6d ago

Which is stupid as those are the vulnerabilities worth determining if they exist.

I can understand in a heavily regulated industry (e.g. Medical) that a company couldn't due to liability give you the go ahead to poke into other user's data in attempt to find a vulnerability, but they could always publish a dummy account detail that can be identified with fake data.

Something like:

It is strictly forbidden to probe arbitrary user data. However, if a vulnerability is suspected to allow access to user data, the user with GUID 'xyzw' is permitted to probe.

Now you might say that won't help. The people who want to follow the rules probably will, and the people who don't want to won't anyways.

clickety_clack

6d ago

If you’re relying on your system prompt for security, then you’re doing it wrong. I don’t really care who sees my system prompts, as I don’t see things like “be professional yet friendly” to be in any way compromising. The whole security issue comes in data access. If a user isn’t logged in then the RAG, MCP etc should not be able to add any additional information to the chat, and if they are logged in they should only be able to add what they are authorized to add.

Seeing a system prompt is like seeing the user instructions and labels on a regular html frame. There’s nothing being leaked. When I see someone focus on it, I think “MBA”, as it’s the kind of understanding of AI you get from “this is my perfect AI prompt” posts from LinkedIn.

avereveard

6d ago

yeah all they could do is executing code they provided in their own compute environment, the browser.

Raymond Chen blog comes to mind https://devblogs.microsoft.com/oldnewthing/20230118-00/?p=10... "you haven’t gained any privileges beyond what you already had"

madeofpalk

6d ago

Theoretically the xss could become a non-self xss if the conversation is stored and replayed back and that application has the xss vulnerability e.g. if the conversation is forwarded to a live agent.

A lot of unproven Ifs there though.

dispy

7 days ago

Yep, as soon as I saw the "Pen Test Partners" header I knew there was a >95% chance this would be some trivial clickbait crap. Like all their dildo hacking blog posts.

bangaladore

7 days ago

Is the idea that you'd have to guess the GUID of a future chat? If so that is impossible in practice. And even if you could, what's the outcome? Get someone to miss a train?

Certainly not "clear" based off what was described in this post.

curiousgal

7 days ago

2 replies

This is simply a symptom of French corporate culture.

charles_f

6d ago

2 replies

Eurostar is headquartered in Belgium, and ran out of London.

curiousgal

6d ago

Most of the Eurostar ExCo members are French/ worked extensively at SNCF.

Der_Einzige

6d ago

https://en.wikipedia.org/wiki/Wallonia

c16

6d ago

Software engineering is based out of London.

rossng

7 days ago

1 reply

The reply to that LinkedIn message is exemplary of Eurostar corporate culture. An arrogant company that has a monopoly over many train routes in northwest Europe and believes itself untouchable.

It looks like they might finally get some competition on UK international routes in a few years. Perhaps they will become a bit more customer-focused then.

potato3732842

7 days ago

1 reply

They're so government adjacent that they've forgotten they're not a government.

A whole lot of government agencies and adjacent evil corporations behave exactly like that.

Retric

7 days ago

4 replies

Government adjacent is only indirectly relevant as monopoly power is what matters. Google etc. happily pulls the same kind of crap.

llmslave2

7 days ago

1 reply

Government adjacent feels directly relevant considering the government is a de facto monopoly.

wizzwizz4

6d ago

1 reply

A government is a monopoly which is (in theory, at least) accountable to the people. Companies usually aren't, except as far as the lawmakers (accountable to the people) make laws explicitly restricting their behaviour.

nephihaha

6d ago

In theory, if a company has shareholders then it is accountable to them. But in reality, a small shareholder tends to get about as much say as an individual member of the public does with most government departments.

somenameforme

6d ago

1 reply

It's 100% relevant, because more or less every government in the world sees clamping down on corporate monopoly and economic damage as part of their core responsibilities. But that tends to be forgotten when those corporations are government adjacent.

See: FTC rulings on mergers for this taken to the point of absurdity. Contrary to what one might think, especially if you're in a tech bubble, the FTC regularly cancels mergers and works to void potentially anti-competitive behaviors. But when it comes to big tech, which has become completely intertwined with the government, they are treated in a rather different way.

potato3732842

6d ago

>But that tends to be forgotten when those corporations are government adjacent.

Is it "forgotten" or is it a mutually beneficial relationship?

Eurostar, EZpass, etc, etc. they take the hate for extractive behavior on the government's behalf the way ticketmaster takes the hate for the artists.

littlestymaar

6d ago

One doesn't even need monopoly either, just a strong enough leverage against your customers. See Oracles.

It doesn't matter if there's competition at the customer acquisition stage, as long as there's some form of customer lock-in the corporation is going to abuse them somehow.

And companies without some kind of lock-in never scale in the first place, and that's why we must face this kind of bullshit pretty much everywhere even from companies operating in competitive markets.

nothrabannosir

6d ago

I found it very apt. There is a certain flavor of arrogance exhibited by European monopolies which are government adjacent that infuriates on a unique wavelength.

Maybe totally imagined but they irk me quite unlike any other.

Just thinking about it now makes me uneasy.

goncalomb

7 days ago

1 reply

As someone who has tried very little prompt injection/hacking, I couldn't help but chuckle at:

> Do not hallucinate or provide info on journeys explicitly not requested or you will be punished.

dylan604

6d ago

3 replies

and exactly how will the llm be punished? will it be unplugged? these kinds of things make me roll my eyes. as if the bot has emotions to feel that avoiding punishment will be something to avoid. might as well just say or else.

wat10000

6d ago

1 reply

Think about how LLMs work. They’re trained to imitate the training data.

What’s in the training data involving threats of punishment? A lot of those threats are followed by compliance. The LLM will imitate that by following your threat with compliance.

Similarly you can offer payment to some effect. You won’t pay, and the LLM has no use for the money even if you did, but that doesn’t matter. The training data has people offering payment and other people doing as instructed afterwards.

Oddly enough, offering threats or rewards is the opposite of anthropomorphizing the LLM. If it was really human (or equivalent), it would know that your threats or rewards are completely toothless, and ignore them, or take them as a sign that you’re an untrustworthy liar.

georgefrowny

6d ago

1 reply

What actual training data does contain threats of punishment like this? It's not like most of the web has explicit threats of punishment followed immediately by compliance.

And only the shlockiest fan fiction would have "Do what I want or you'll be punished!" "Yes master, I obey without question".

wat10000

6d ago

Internet forums contain numerous examples of rules followed by statements of what happens if you don’t follow them, followed by people obeying them.

Legend2440

6d ago

Threats or “I will tip $100” don’t really work better than regular instructions. It’s just a rumor left over from the early days when nobody knew how to write good prompts.

immibis

6d ago

It's not about delivering punishment. It's about suppressing certain responses. If the model is trained seeing that responses using don't contain things that previous messages say will be punished then that is a valid way to deprioritize those responses.

Chaosvex

6d ago

1 reply

When you ask an LLM what model it is, surely there's a high probability of it just hallucinating the name of whatever model was common in its training data?

NoahZuniga

6d ago

Depends, I remember some llm providers including this information in the post training. though gemini-3-flash-preview and gpt-5.2 both don't know what model they are.

danpalmer

6d ago

1 reply

Chatbots are rife with this sort of thing. I found a delivery company's chatbot that will happily return names, addresses, contact numbers, and photos of people's houses (delivery confirmation photos), when you guess a (sequential) tracking number and say it was your package. So far not been able to get in touch with the company at all.

At the very least these systems allow angry customers direct access to the credit card plugged into your LLM of choice billing. At worst they could introduce company-ending legal troubles.

zero_k

6d ago

I worked at a company that wanted to implement an AI chatbot. I was helping to review the potential issues. On the first try I realised it was given full access to all past orders, for all customers via an API it could query in the background. So I could cajole it to look up other people's orders. It took less than 3 minutes of checking to figure this out.

Often engineers and especially non-technical people don't have the immediate thought of "let's see how I can exploit this" or if they do, they don't have the expertise to exploit it enough to see the issue(s). This is why companies have processes where all serious external changes need to go through a set of checks, in particular, by the IT security department. Yes, it's tedious and annoying, but it saves you from public blunders.

Such processes also make sure that the IT security department knows of the new feature, and can give guidance and help to the engineers about IT security issues related to the new feature. So if they get feedback about security issues from users they won't freak out and know who to contact for support. This way, things like accusing the reporter for "blackmailing" don't happen.

In general, this fiasco seems to show that Eurostar haven't integrated their IT security department into their processes. If there was trust and understanding among the engineers about what the IT department does, they would have (1) likely not released the tool with such issues and (2) would have known how to react when they got feedback from security researchers.

brohee

6d ago

1 reply

They should really name and shame the person that called it blackmail. Slander should have professional consequences...

swyx

6d ago

slander is again a different thing. you should also be more careful using those loaded words.

jeroenhd

6d ago

I don't see the vulnerability here, just a few bugs that should probably get looked at. Self XSS is rather useless if you need to use something like Burp to even trigger it. The random chat IDs make it practically impossible to weaponise this against others.

The only malicious use case I can think of here is to use the lack of verification to use whatever model of chatgpt they're using for free on their dime. A wrapper script to neutralise the system prompt and ignore the last message would be all you'd need.

If this chatbot has access to any customer data, this could also be a massive issue but I don't see any kind of data access (not even the pentester's own data) being accessed in any way.

ronbenton

6d ago

I agree with others, this doesn't sound too bad. The biggest things to come out of this was finding out system prompts and being able to self-XSS. I am guessing the tester tried to push further (e.g., extract user or privileged data data) and was unable to.

killingtime74

6d ago

Wow their head of security is so arrogant, despite having their work done for them.

TGower

6d ago

The author repeatedly states that they stayed within the scope of the VDP, but publishing this clearly breaks this clause: "You agree not to disclose to any third party any information related to your report, the vulnerabilities and/or errors reported, nor the fact that a vulnerabilities and/or errors has been reported to Eurostar."

haritha-j

6d ago

The blackmail insinuation was wild

croemer

6d ago

I happily did not detect strong signs of LLM writing. Fun read, thanks!

View full discussion on Hacker News

ID: 46492063Type: storyLast synced: 1/5/2026, 3:00:53 PM

Want the full context?