Proposal: AI Content Disclosure Header
Key topics
A proposal to introduce an "AI Content Disclosure Header" for flagging content modified or generated by artificial intelligence has sparked a lively debate about the granularity and practicality of such a system. Commenters weigh in on the challenges of categorizing content, with some pointing out that even simple grammar checkers could count as "AI-modified." While some argue that disclosure is essential, others question the effectiveness of a header-based approach and suggest alternatives, such as in-document markup or more comprehensive disclosure regulations. The rough consensus: the need for transparency is clear, but the proposed solution may be too simplistic for the complexities of modern content creation.
Snapshot generated from the HN discussion
Discussion Activity (based on 48 loaded comments)
- First comment: 38m after posting
- Peak period: 11 comments in the 0-2h window
- Average per period: 6 comments
Key moments
- Story posted: Aug 26, 2025 at 5:08 PM EDT (4 months ago)
- First comment: Aug 26, 2025 at 5:46 PM EDT (38m after posting)
- Peak activity: 11 comments in the 0-2h window, the hottest stretch of the conversation
- Latest activity: Aug 27, 2025 at 3:45 PM EDT (4 months ago)
Probably ai-modified -- the core content was first created by humans, then modified (translated into another language). Translating it back would hopefully return the original human-generated content (or at least something as close as possible to the original).
For those, my instinct is to fall back to markup, which would seem to work quite well. There is the pesky issue of AI content in non-markup formats - think JSON, which doesn't have the same orthogonal flexibility for annotating metadata.
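To make the JSON point concrete, here is a minimal Python sketch. In markup you could hang the disclosure off an attribute or a meta element, out of band; in a plain JSON payload the only real option is an in-band field that every consumer has to agree on. The field name `_ai_disclosure` is made up for illustration and is not part of any schema.

```python
import json

# Hypothetical in-band disclosure field -- JSON has no out-of-band metadata
# slot, so consumers must know to look for (and ignore) this extra key.
payload = {
    "title": "Quarterly report",
    "body": "Human-written text, grammar-checked with an AI tool.",
    "_ai_disclosure": "ai-modified",  # made-up field name, not part of any standard
}

print(json.dumps(payload, indent=2))
```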
> ai-modified Indicates AI was used to assist with or modify content primarily created by humans. The source material was not AI-generated. Examples include AI-based grammar checking, style suggestions, or generating highlights or summaries of human-written text.
I'd love to browse without that.
It does not bother me that someone used a tool to help them write if the content is not meant to manipulate me.
Let's solve the actual problem.
Doing it in an HTTP header is furthermore extremely lossy; files get copied around and that header ain't coming with them. It's not a practical place to put that info, especially when we have Exif inside the images themselves.
The proper way to handle this is to mark authentic content and keep a trail of how it was edited, since that's the rare thing you might want to highlight in a sea of slop; https://contentauthenticity.org/ is trying to do that.
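For the Exif point above, a rough sketch of what "inside the file itself" could look like, using Pillow (third-party) and the generic ImageDescription tag as a stand-in slot; there is no standard AI-disclosure Exif tag, so the value written here is purely illustrative.

```python
from PIL import Image  # Pillow; pip install pillow

IMAGE_DESCRIPTION = 0x010E  # standard Exif tag, used here only as a stand-in slot

img = Image.open("photo.jpg")           # placeholder filename
exif = img.getexif()
exif[IMAGE_DESCRIPTION] = "ai-modified"  # illustrative value, not a real standard
img.save("photo_tagged.jpg", exif=exif)  # recent Pillow accepts an Exif object here

# Unlike an HTTP header, this annotation travels with the file when it is copied.
print(Image.open("photo_tagged.jpg").getexif().get(IMAGE_DESCRIPTION))
```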
While this doesn't invalidate the proposal, it does suggest we'd see similar abuse patterns emerge, once this header becomes a ranking factor.
Most web servers use mtime for the Last-Modified header.
It would be crazy for Google to treat that as authorship date, and I cannot believe that they do.
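As an aside on how little that header conveys: a static file server typically derives Last-Modified from nothing but the file's mtime, so the value changes whenever the file is copied or redeployed and says nothing about when a human wrote the content. A minimal sketch of that derivation (the helper name is mine):

```python
import os
from email.utils import formatdate

def last_modified_header(path: str) -> str:
    """Roughly what a static file server does: format the file's mtime as an
    HTTP date. Copy or redeploy the file and this value changes, regardless
    of when a human actually authored the content."""
    return formatdate(os.path.getmtime(path), usegmt=True)

print(last_modified_header(__file__))
```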
I'm not sure what Google uses for authorship date, but if you do date-range-based web searches, the actual dates of the content no longer have any meaningful relationship to what was set in the search criteria (news seems mostly better, with some problems, but actual web search is hopeless). It fails in both directions -- searching for recent stuff gets plenty of very old stuff mixed in, and searching for stuff from a period well in the past gets lots of stuff from yesterday, too.
Still I believe MIME would be the right place to say something about the Media, rather than the Transport protocol.
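A rough sketch of the two placements being contrasted; both spellings below are invented for illustration, since the draft's exact header name isn't quoted here and no such MIME parameter is registered.

```python
# Both names are made up for illustration.
transport_placement = ("AI-Disclosure", "ai-modified")  # HTTP response header: lost once the file is saved or copied
media_type_placement = "text/html; charset=utf-8; ai-disclosure=ai-modified"  # hypothetical parameter on the media type itself

print(transport_placement)
print(media_type_placement)
```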
On a lighter note: we should consider second-order consequences. The EU Commission will demand its own EU-AI-Disclosure header be sent to EU citizens, and will require consent from the user before showing them AI-generated stuff. The UK will require age validation before showing AI stuff, to protect the children's brains. France will use the header to compute a new tax on AI-generated content, due from every online platform that wants to show AI-generated content to French citizens.
That's a Pandora's box I wouldn't even talk about, much less open...
if this takes off I'll:
win win! (of course, it was never going to be useful)
There are dedicated headers for other properties, e.g. language.
Feels weird to me that encoding is part of MIME, but language isn't, although I understand why.
Though FWIW, I think the Content-Encoding header is basically a mistake; it should have been Content-Transform.
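For reference, a stdlib-only sketch showing the existing per-response metadata headers mentioned here (Content-Language, Content-Encoding) next to a hypothetical disclosure header; the AI-Disclosure spelling is an assumption, not necessarily what the draft proposes.

```python
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # A minimal WSGI app that attaches per-response metadata headers.
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Language", "en"),        # dedicated header for language
        ("Content-Encoding", "identity"),  # the header the comment would rather call Content-Transform
        ("AI-Disclosure", "ai-modified"),  # hypothetical disclosure header
    ])
    return [b"<p>Human-written text, lightly AI-edited.</p>"]

if __name__ == "__main__":
    make_server("127.0.0.1", 8000, app).serve_forever()
```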
I think the recent drama related to the UK's Online Safety Act has shown that people are getting sick of country-specific laws simply for serving content. The most likely outcome is that sites either block those regions or ignore the laws, realizing there is no practical enforcement avenue.
Yes, there is a huge problem with AI content flooding the field, and being able to identify/exclude it would be nice (for a variety of purposes).
However, the issue isn't that content was "AI generated"; as long as the content is correct, and is what the user was looking for, they don't really care.
The issue is content that was generated en masse, is largely not correct/trustworthy, and serves only to game SEO/clicks/screentime/etc.
A system where the content you are actually trying to avoid has to opt in is doomed to failure. Is the purpose/expectation here that search/CDN companies attempt to classify and identify "AI content"?
The current approach is that the content served is the same for humans and agents (i.e., a site serves consistent content regardless of the client), so who a specific header is "meant for" is a moot point here.
https://www.ietf.org/rfc/rfc3514.txt
Note the date it was published.
Potential flaw: I'm concerned that attackers may be slow to update their malware to achieve compliance with this RFC. I suggest a transitional API: intrusion detection systems respond to suspected-evil packets that have the evil bit set to 0 with a deprecation notice.
Maybe better to define an RDF vocabulary for that instead, so that individual DIVs and IMGs can be correctly annotated in HTML. ;)
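As a sketch of what that might look like (toy namespace and property names, invented here; rdflib is a third-party package): define a tiny vocabulary and state provenance about individual page fragments, which could then be surfaced in the page itself via RDFa attributes on those DIVs and IMGs.

```python
from rdflib import Graph, Literal, Namespace, URIRef  # pip install rdflib

# Toy vocabulary -- the namespace and the "provenance" property are invented for illustration.
AIDISC = Namespace("https://example.org/ai-disclosure#")

g = Graph()
g.bind("aidisc", AIDISC)

page = "https://example.org/article"
g.add((URIRef(page + "#intro"), AIDISC.provenance, Literal("none")))             # human-written paragraph
g.add((URIRef(page + "#summary"), AIDISC.provenance, Literal("ai-modified")))    # AI-assisted summary
g.add((URIRef(page + "#hero-img"), AIDISC.provenance, Literal("ai-generated")))  # generated image

print(g.serialize(format="turtle"))
```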