X-Ray: a Python Library for Finding Bad Redactions in PDF Documents

Posted 12 days ago, active 9 days ago

rendx

702 points

121 comments

github.com | story | High profile

Key topics

PDF Analysis

Redaction

Client-Side Encryption

Information Extraction

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

19m

Peak period

0-6h

Avg / period

12.9

Based on 129 loaded comments

Key moments

01 Story posted: Dec 23, 2025 at 4:54 PM EST (12 days ago)
02 First comment: Dec 23, 2025 at 5:13 PM EST (19m after posting)
03 Peak activity: 63 comments in 0-6h (hottest window of the conversation)
04 Latest activity: Dec 27, 2025 at 8:45 AM EST (9 days ago)

Discussion (121 comments)

Showing 129 comments

seanw444

12 days ago

3 replies

The context for OP posting this is that many of the recently-released Epstein documents were PDFs "redacted" by being drawn on top of.

agumonkey

12 days ago

2 replies

I wasn't sure of this, even though sometimes you'd see remains of the original characters near the rectangles' edges. Does this mean the leaked documents have been de-redacted?

kstrauser

12 days ago

1 reply

At least some, yes: https://daringfireball.net/linked/2025/12/23/trump-doj-pdf-r...

agumonkey

12 days ago

2 replies

yeah i expected every political team, even the low level ones, to be fully aware of naive pdf "editing"... alas, incompetence often does that

arthurcolle

12 days ago

1 reply

Checks and balances for a more technological era.

airstrike

12 days ago

[delayed]

zahlman

12 days ago

2 replies

I'm actually surprised not to have yet heard widespread conspiracy theorization that this is deliberate for some inscrutable reason or other.

kstrauser

12 days ago

Something something "chess, not checkers, this proves he has them on the run!"

dcollect

12 days ago

its funny how some people invent ways to be cynical at all times lmao

why don't you come up with one of those instead of just crying about it? lmao.

k1t

12 days ago

1 reply

Yes, in some cases, eg. https://news.ycombinator.com/item?id=46364121

agumonkey

12 days ago

1 reply

oh that's a beautiful sight

hopefully this is the straw that breaks the camel's back

XorNot

12 days ago

2 replies

Why would that be the case? The government isn't redacting "yes we contacted aliens" they're redacting information about military capabilities that might be of use to adversaries.

agumonkey

12 days ago

1 reply

sorry the title mentioned epstein files, so i was hoping incriminating facts that would accelerate trump's fall

jibal

12 days ago

No reason to be sorry ... you are right and the other person seems quite confused about the context.

iberator

12 days ago

How do you know if it's redacted?

arthurcolle

12 days ago

Also good for UFO/UAP/"anomalous phenomena" documents and remote viewing PDFs for what it's worth :)

formerly_proven

12 days ago

Is there a good free tool to properly redact PDFs? My workflow is to place black annotation rectangles on top and then print as PDF with "force rasterization" on. The resulting PDF files then just consist of pages with one image each. But this tends to be really suboptimal, because it's usually a grayscale or color rasterization, so file sizes are very large vs. monochrome PDFs with CCITT G3/G4 compression. Post-processing PDFs to convert them to CCITT is rather annoying and I only know of CLI ways.

IceHegel

12 days ago

1 reply

Given recent high profile redaction events, I think one simple use of AI would be to have it redact documents according to an objective standard.

That should in theory prevent overly redacted documents for political purposes.

An approach that could be rolled out today would be redacting with human review, but showing what % of redactions the AI would have done, and also showing the prompt given to the AI to perform redactions.

mmazing

12 days ago

1 reply

Honestly, it doesn't take any inference or need for AI, there's simply data in the documents that can be extracted.

bogtog

12 days ago

I don't think the commenter above is saying that an AI should necessarily apply the redaction. Rather, an AI can serve as an objective-ish way of determining what should be redacted. This seems somewhat analogous to how (non-AI) models can be used to evaluate how gerrymandered a map is.

unfocused

12 days ago

2 replies

Adobe Pro, when used properly, will redact anything in a PDF permanently.

Whoever did these "bad" redactions doesn't even know how to use a PDF Editor.

We have paralegals and lawyers "mark for redaction", then review the documents, then "apply redactions". It's literally been done by thousands of lawyers/paralegals for decades. This is just someone not following the process and procedure, and making mistakes. It's actually quite amateurish. You should never, ever screw up redactions if you follow the proper process. Good on the X-ray project for trying to find errors.

I just want to add, applying black highlights on top of text is in fact, the "old" way of redaction, as it was common to do this, and then simply print the paper with the black bars, and send the paper as the final product.

Whoever did it is probably old, and may have done it thinking they were going to print it on paper afterwards!! Just guessing as to why someone would do this.

tgsovlerkhgsel

12 days ago

2 replies

Or they may not understand how PDF works and think that it's the same as paper.

Especially with the "draw a black box over it" method, the text also stops being trivially mouse-selectable (even if CTRL+A might still work).

Another possibility is, of course, that whoever was responsible for this knew exactly what they were doing, but this way they can claim an honest mistake rather than intentionally leaking the data.

zahlman

12 days ago

> Or they may not understand how PDF works and think that it's the same as paper.

Yes; that's presumably included in being "amateurish" and "not following proper process".

aidos

12 days ago

A while back I did a little work with a company that were meant to help us improve our security posture. I terminated the contract after they sent me documents in which they’d redacted their own AWS keys using this method.

selectodude

12 days ago

2 replies

Any attorney or law enforcement that works for the US Federal Government receives very, very comprehensive instructions on how to redact information on basically the first day of training. There is absolutely zero doubt among any of my DOGE'd friends that this was 100 percent on purpose malicious compliance.

hsbauauvhabzb

12 days ago

1 reply

So you think it was trump supporters as opposed to in spite of trump? Genuine question - Who stands to gain? I don’t follow this enough to know.

selectodude

12 days ago

1 reply

No, it’s lifelong civil servants who received the order to wholesale redact these documents from the Trump toady leading the department.

hsbauauvhabzb

11 days ago

1 reply

I’m not sure what’s unredactable, but naming victims isn’t something I imagine either party is particularly interested in doing. I imagine the HN malicious // ineptitude rule is in play here, rather than some sub conspiracy conspiracy.

selectodude

11 days ago

1 reply

Donald Trump is not a victim.

hsbauauvhabzb

11 days ago

For sure there’s something going on there, but do the unredacted pdfs prove that?

unfocused

12 days ago

Agreed. I worked on the Canadian side of the legal side and there is a very comprehensive process for redaction. Nobody does redaction unless they follow the process. Never seen anyone 15+ years do something silly like this in the office.

mlissner

12 days ago

2 replies

Cool to see this here. It’s funny because we do so many huge, complex, multiyear projects at Free Law Project, but this is the most viral any of our work has ever gone!

Anyway, I made X-ray to analyze the millions of documents we have in CourtListener so that we can try to educate people about the issue.

The analysis was fun. We used S3 batch jobs, but we haven’t done the hard part of looking at the results and reporting them out. One day.

thangalin

12 days ago

1 reply

https://www.argeliuslabs.com/deep-research-on-pdf-redaction-...

> Information Leaking from Redaction Marks: Even when content is properly removed, the redaction marks themselves can leak some information if not done carefully. For example, if you have a black box exactly covering a word, the length of that black box gives a clue to the word’s length (and potentially its identity).

Does X-ray employ glyph spacing attacks and try to exploit font metric leaks?

mlissner

12 days ago

4 replies

No, we worked with researchers that developed that kind of system, but didn't broadcast our work b/c the research was too sensitive. Seems the cat is out of the bag now though.

I think the combination of AI and font-metrics is going to be wild though. You ought to be able to make a system that can figure out likely words based on the unredacted ones and the redaction's size. I haven't seen any redaction system yet that protects against this.

thangalin

12 days ago

1 reply

> I haven't seen any redaction system yet that protects against this.

The linked article suggests widening redacted areas more than needed with some randomization applied to the width. Strikes me that that wouldn't do much except add a few more possible solutions.

vlovich123

12 days ago

1 reply

Yeah, the more robust protection is to widen to a constant. But in the general case that could require reflowing the pdf. But honestly single word redactions are really probably useless with cheap AI that can highly accurately fill in the gaps
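
To make the trade-off above concrete, here is a minimal sketch of what an attacker can still infer from the box width under each padding scheme. The names and widths in `WIDTHS` are made up for illustration, and `consistent_names` is a hypothetical helper, not part of any redaction tool.

```python
# Hypothetical rendered widths (font units) of names that might sit under a box.
WIDTHS = {"Ann": 1500, "Bob": 1560, "Carol": 2400, "Dmitri": 2750, "Eleanor": 3300}


def consistent_names(box_width: int, max_padding: int) -> list:
    """Names whose width could have produced a box of this width.

    Assumes the box is the text width plus a padding between 0 and max_padding.
    """
    return sorted(
        name for name, w in WIDTHS.items()
        if box_width - max_padding <= w <= box_width
    )


exact = consistent_names(2400, 0)          # box hugs the text exactly
jittered = consistent_names(2800, 400)     # random widening of up to 400 units
constant = consistent_names(3500, 3500)    # every box widened to one fixed size

print(exact)
print(jittered)
print(constant)
```

In this toy data the exact box pins down one name, the jittered box leaves two, and only the constant-width box leaves every name in play, which matches the intuition that randomization "adds a few more possible solutions" while a constant width removes the signal.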

rgmerk

12 days ago

1 reply

Depends what you're trying to hide.

If the redaction is a person's name, and there's nothing else to give the person's identity away, single word redaction probably works reasonably well, AI or no AI.

godelski

12 days ago

1 reply

  > If the redaction is a person's name

I'm not sure if you're aware, but peoples names are variable in length. We are talking about a system that can identify single character differences. So that does reduce the search space, especially since names are not all possible letter permutations. Combine that with the fact that it isn't uncommon to see partial first letters show up. You can even see some instances in the Epstein files.

Of course, you can also take this further. Even if you can't recover names you can get meta information about how many parties are involved by recognizing different length redactions correspond to different entities. While same length redaction doesn't guarantee same entity it is a hint.

mycall

12 days ago

2 replies

It is also common for authors to misspell names (proper nouns) in an attempt to determine who leaks docs (and to force non-matches for FOIA requests).

godelski

11 days ago

Random side fact but this was also a thing map makers did back in the day. Including fake towns. In that way they could identify who was stealing their work.

mhast

12 days ago

If you want to fingerprint text you can also do it with small, insignificant changes to the text which don't change the meaning.

If you have a number of such locations with alternatives then you can make a number of identifiable versions by combining alternates.
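
A small sketch of that combinatorial idea, assuming two interchangeable phrasings per location so that each location carries one bit of a recipient id. The template, the alternates in `SLOTS`, and both function names are invented for illustration.

```python
# Hypothetical wording alternates; each slot offers two equivalent phrasings.
SLOTS = [
    ("on Monday", "this Monday"),
    ("roughly", "about"),
    ("e-mail", "email"),
    ("will begin", "will start"),
]
TEMPLATE = "The review {3} {0}, takes {1} two weeks, and updates go out by {2}."


def fingerprint(recipient_id: int) -> str:
    """Pick one alternate per slot from the bits of recipient_id."""
    if not 0 <= recipient_id < 2 ** len(SLOTS):
        raise ValueError("not enough slots for this many recipients")
    choices = [SLOTS[i][(recipient_id >> i) & 1] for i in range(len(SLOTS))]
    return TEMPLATE.format(*choices)


def identify(text: str) -> int:
    """Recover the recipient id from a leaked copy."""
    recipient_id = 0
    for i, (_, second) in enumerate(SLOTS):
        if second in text:
            recipient_id |= 1 << i
    return recipient_id


copies = {rid: fingerprint(rid) for rid in range(2 ** len(SLOTS))}
print(len(set(copies.values())))   # number of distinct versions
print(identify(copies[11]))        # id recovered from one copy
```

With n two-way slots this gives 2^n distinguishable copies, so four slots already cover sixteen recipients.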

vlovich123

12 days ago

1 reply

I thought glyph spacing attacks are an old idea; like I recall reading about such ideas 10-20 years ago unless I’m misremembering. Can you clarify why it was considered “too sensitive” if the whole point of this effort is to showcase these attacks?

mlissner

12 days ago

2 replies

It’s a fine line. Most redactions are for the good, to protect someone or something. For example even in the Epstein files, where some redactions are being abused, most redactions are protecting victims.

If there’s a way to undo huge amounts of redactions, that’d certainly be a net negative. Sort of like if encryption were suddenly broken, you wouldn’t publish a paper saying so.

Our goal has always been to educate about the problem so that it can be addressed. We didn’t have resources to push on the font metrics approach, so we stayed mostly quiet about it.

btreecat

12 days ago

1 reply

> If there’s a way to undo huge amounts of redactions, that’d certainly be a net negative. Sort of like if encryption were suddenly broken, you wouldn’t publish a paper saying so.

I can't state emphatically enough how this is not the right mental playbook.

If you have found a vulnerability, it's likely someone else has too. By sitting on it, you only create more future victims.

Disclosure will lead to fixing this issue, mitigating its presence, or switching tools/workflows, possibly a combination of these. Sitting on it only ensures that folks who think they are protected actually aren't.

mlissner

12 days ago

2 replies

We’re familiar with vulnerability disclosure philosophies, but what if the problem can’t be fixed because there’s no forward secrecy for the hundreds of millions of documents that are already out there?

It’s tricky stuff and we have limited resources, unfortunately.

opello

11 days ago

So what is the state of the art in redaction? Re-publish the document with an insert that says [redaction] so that no (or maybe minimal) length side-channel exists? I imagine someone thinks about clever ideas and it would be fun to read about them and the trade-offs.

btreecat

9 days ago

>, but what if the problem can’t be fixed because there’s no forward secrecy for the hundreds of millions of documents that are already out there?

What if you are not the only folks who have found and exploited this vulnerability?

You can play the "what if" game to justify not doing the right thing all day long, when really it should be one "if" that guides you. What if someone else found this?

vlovich123

11 days ago

Given that hiding among and behind victims is how abusers continue, I’m not so sure redactions really are all that beneficial when you count future victims in the pool of interested parties. And the public interest certainly isn’t helped by secrecy and redactions and selective release.

While protecting victims is noble, something like this really needs the light of day and a truth and reconciliation commission so that everyone associated with the crime ring is punished and accounted for.

And no, if you do find somehow all encryption is mathematically broken, it’s your duty to publicize it even if existing secrets are jeopardized (you mitigate as best you can obviously in the short term) because it’s likely people more powerful than you might have that knowledge anyway and are engaged in asymmetric warfare.

NoboruWataya

12 days ago

1 reply

This is going to be a disaster IMO because AI will just hallucinate what it thinks is the most probable redacted word and people will take that as gospel.

PunchyHamster

12 days ago

1 reply

"don't redact or we will hallucinate something worse and make people believe it as gospel" is nice deterrent

rafram

12 days ago

We don’t need a “deterrent” against things being redacted in publicly released documents. We can have transparency without the whole world finding out the names of victims and witnesses, people’s phone numbers and SSNs, etc., every time a document is released.

hahn-kev

12 days ago

Maybe we should all just use mono-space fonts for everything

hsbauauvhabzb

12 days ago

1 reply

Presumably with font kerning and pixel perfect recreation of the source, it would be possible to guess the word very accurately.

The strings oioioi and oooiii will have different widths in some fonts because character organisation matters a lot.
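
A toy illustration of why letter order changes the width, assuming a font with pair kerning. The numbers in `ADVANCE` and `KERNING` are invented; real values would come from the font file.

```python
# Hypothetical metrics in font units; real values come from the font file.
ADVANCE = {"o": 550, "i": 250}
KERNING = {("o", "o"): -10, ("i", "i"): 15, ("o", "i"): -5, ("i", "o"): 0}


def text_width(text: str) -> int:
    """Sum glyph advances plus the kerning adjustment for each adjacent pair."""
    width = sum(ADVANCE[ch] for ch in text)
    width += sum(KERNING.get(pair, 0) for pair in zip(text, text[1:]))
    return width


print(text_width("oioioi"))
print(text_width("oooiii"))
```

Both strings contain the same glyphs, so their advance totals match; the difference comes entirely from which pairs end up adjacent.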

setopt

12 days ago

1 reply

I suppose it gets a bit more complex again if you enable stuff like microtype, but even then you can probably measure how much inter-letter and inter-word spacing has been adjusted by just scanning other text in the same line.

I think the conclusion is honestly that PDF is an outdated format for keeping records that might have to be redacted in the future, like court documents. Something reflowable like epub could have the text replaced with constant-space black squares instead, with no hints leaked, as someone mentioned in a parallel comment.

hsbauauvhabzb

12 days ago

I’ve never heard anyone suggest PDF is a good format, and while I don’t know the spec, I imagine based on the acrobat cve list it’s an absolute clusterfuck.

gigatexal

12 days ago

1 reply

Hilarious that DOJ didn’t flatten the layers so you can unredact stuff. What a clown show of incompetent idiots. Or… a skillful one over on the powers that be internally from someone who knew better but knew that they wouldn’t know … and did this to help us all

gigatexal

12 days ago

Only MAGA weirdos would downvote this ;-)

brotchie

12 days ago

3 replies

You'd think the go-to workflow for releasing redacted PDFs would be to draw black rectangles and then rasterize to image-only PDFs :shrug:

shbooms

12 days ago

2 replies

often times you will have requirements that the documents you release be digitally searchable and so in these cases, this would not be an option

8note

12 days ago

1 reply

run some ocr on them after to recreate the text layer?

albert_e

12 days ago

1 reply

With the aggressive push of LLMs and Generative AI, I am expecting a lot of OCR features to become "smarter" by default, namely go beyond mechanical OCR and start inserting hallucinations and semantically/contextually "more correct" information in OCR output.

It's not hard to imagine some powerful LLMs being able to undo some light redactions that are deducible based on context

blharr

11 days ago

Or worse, making up names or information instead of writing the redaction.

pottertheotter

12 days ago

2 replies

This made me think of something I came across recently that’s almost the opposite problem of requiring PDFs to be searchable. A local government would publish PDFs where the text is clearly readable on screen, but the selectable text layer is intentionally scrambled, so copy/paste or search returns garbage. It's a very hostile thing to do, especially with public data!

eviks

12 days ago

Hostile indeed, and also happens in user-facing documents like product manuals!

2ICofafireteam

9 days ago

I have encountered PDFs that would exhibit this behavior in one browser but not in another.

One fun thing I encountered from local government is releasing files with potato quality resolution and not considering the page size.

I had a FOI request that returned mainly Arch D sized drawings but they were in a 94 DPI PDF rendered as letter sized. It was a fun conversation trying to explain to an annoyed city employee that putting those large drawings in a 94 DPI letter size page effectively made it 30-ish DPI.
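
The arithmetic behind that "30-ish" figure, as a quick sketch. It assumes Arch D is 24 x 36 inches, the drawing was scaled to fit a full letter page with no margins, and the variable names are mine.

```python
ARCH_D = (24.0, 36.0)   # inches, short edge then long edge
LETTER = (8.5, 11.0)    # inches, short edge then long edge
RENDER_DPI = 94

# The drawing must shrink until both edges fit, so the tighter ratio wins.
scale = min(LETTER[0] / ARCH_D[0], LETTER[1] / ARCH_D[1])

# Pixels available per inch of the original drawing.
effective_dpi = RENDER_DPI * scale
print(round(effective_dpi, 1))
```

Under those assumptions the long edge is the limiting one (11/36), and the result lands a little under 30 DPI relative to the original sheet.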

selinkocalar

12 days ago

1 reply

As someone who's built an entire business on "anti-screenshots" this is brilliant.

PDF redaction fails are everywhere and it's usually because people don't understand that covering text with a black box doesn't actually remove the underlying data.

I see this constantly in compliance. People think they're protecting sensitive info but the original text is still there in the PDF structure.
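
A minimal sketch of that failure mode, assuming the text spans and filled rectangles have already been pulled out of a page by some PDF parser. The `Box` type, the sample coordinates, and `find_covered_text` are hypothetical names for illustration; this is not X-ray's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates, (x0, y0) to (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        # Two boxes overlap only when they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def find_covered_text(spans, rectangles):
    """Return the text of every span that sits under at least one rectangle.

    spans: list of (Box, str) pairs, the text layer of a page.
    rectangles: list of Box, the filled shapes drawn over the page.
    """
    return [text for box, text in spans
            if any(box.intersects(rect) for rect in rectangles)]


# Hypothetical page: the second span is hidden by a black box but still stored.
spans = [
    (Box(72, 700, 200, 712), "The witness stated that"),
    (Box(205, 700, 290, 712), "John Example"),
    (Box(295, 700, 400, 712), "was present."),
]
rectangles = [Box(203, 698, 292, 714)]

leaked = find_covered_text(spans, rectangles)
print(leaked)
```

The point is that the rectangle and the text are separate objects in the page; drawing one over the other changes what is visible, not what is stored.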

embedding-shape

12 days ago

Not to mention some PDF editors preserve previous edits in the PDF file itself, which people also seem unaware of. A bit more user friendly description of the feature without having to read the specification itself: https://developers.foxit.com/developer-hub/document/incremen...

postalcoder

12 days ago

The rasterizing step seems to be the hardest part for most

context: https://www.resetera.com/threads/sega-sammy-fiscal-report-mi...

the pdf: https://www.segasammy.co.jp/cms/wp-content/uploads/pdf/en/ir...

embedding-shape

12 days ago

5 replies

I haven't gone through more than just 10% of the files released today, but noticed that at least EFTA00037069.pdf for example has a `/Prev` pointer, meaning the previous revision of the file is available inside of the PDF itself. In this case, the difference is minor, but I'm guessing if it's in one file, it could be more. You can run `qpdf --show-object=trailer EFTA00037069.pdf` on a PDF file to see for yourself if it's there.
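
For screening a large batch of files without qpdf, a rough byte-level check is possible. This is only a sketch, assuming the trailer dictionary appears as plain bytes in the file; `count_revisions` is a hypothetical helper that counts markers and does not parse the PDF.

```python
import re


def count_revisions(pdf_bytes: bytes) -> dict:
    """Count markers that hint at incremental updates in a PDF.

    Each incremental save normally appends a new cross-reference section
    ending in %%EOF, and its trailer points at the previous one via /Prev.
    """
    eof_markers = len(re.findall(rb"%%EOF", pdf_bytes))
    prev_pointers = len(re.findall(rb"/Prev\s+\d+", pdf_bytes))
    return {
        "eof_markers": eof_markers,
        "prev_pointers": prev_pointers,
        "likely_has_history": prev_pointers > 0,
    }


# Synthetic stand-in for a file that was saved once and then updated once.
sample = (
    b"%PDF-1.7\n...original objects...\n"
    b"trailer\n<< /Size 10 /Root 1 0 R >>\nstartxref\n1234\n%%EOF\n"
    b"...appended objects...\n"
    b"trailer\n<< /Size 12 /Root 1 0 R /Prev 1234 >>\nstartxref\n5678\n%%EOF\n"
)

report = count_revisions(sample)
print(report)
```

Linearized ("fast web view") files can also contain a `/Prev` entry, so a hit is a reason to look closer with a real parser rather than proof of an earlier revision.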

I'm almost fully convinced that someone did this badly on purpose, together with the bad redactions, as surely people tasked with redacting a bunch of files receive some instructions on what to do/not to do?

dcollect

12 days ago

1 reply

went through them all nothing of note misdirection speculation fuel is all this is

JaneDoe2 is redacted 150 times

for example

titaniumtown

12 days ago

2 replies

bot

dcollect

12 days ago

No, I built a robot to go through 11,077 documents and determined JaneDoe2 was 'redacted' 702 times.

Shouldn't you be dilating?

dcollect

12 days ago

dork https://boards.4chan.org/pol/thread/524231967

throwawaysleep

12 days ago

2 replies

All the reporting I have read suggests that they are roping anyone and everyone they can into doing redactions. So I suspect many simply lack the experience to do it well.

embedding-shape

12 days ago

1 reply

Ok, so say someone says "We're overloaded, we need more people" so someone else says "Ok, department Q, R and T changes priority to doing redaction" then at least one person somewhere in this chain has to at least consider that every person from Q, R and T must go through at least a 3 slide powerpoint or whatever saying what's happening, this is what to do, this is what to not do, right?

throwaway173738

12 days ago

1 reply

Lol you’re assuming anyone in the management chain believes there’s any nuance or thought to the task beyond the superficial. I can assure you that lots of managers lack the humility to appreciate how little they might actually know.

sawjet

12 days ago

It depends on which administration you support if the redactions have been completed in good faith.

blitzar

12 days ago

They should all have been using the same redaction tooling.

If I were to hazard a guess, pure speculation, I would say the unretrievable parts were court / previously redacted and the retrievable parts are the latest round of panicked rushed redactions.

xhevahir

12 days ago

> as surely people tasked with redacting a bunch of files receive some instructions on what to do/not to do?

You've phrased this as a question; I gather that you know better than to assume a modicum of competence from these people.

victor9000

12 days ago

I looked into this specific file, and the history doesn't contain anything too interesting. The root file is already the fully redacted and flattened document, and the edit in question is the addition of a numbered footer to each page.

mmmlinux

11 days ago

Give a room full of high school students instructions for a 3 step process. I guarantee at least 10% are going to screw it up somehow.

jmward01

12 days ago

4 replies

Hmmm.. The more I think about this the more any font kerning is likely a major leak for redaction. Even if the boxes have randomness applied to them, the words around a blacked out area have exact positioning that constrains the text within so that only certain letter/space combinations could fit between them. With a little knowledge of the rendering algorithm and some educated guessing about the text, a brute-force search may be able to do a very credible job of discovering the actual text. This isn't my field. Anyone out there that has actually worked on this problem?

mlissner

12 days ago

1 reply

Really depends on the length and predictability of the redaction, but yes. If it's short and contextually it's only likely to be either "yes" or "no", you've got it. If it's longer and could contain an unknown person's name along with some other words, well, that's harder.
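
The "yes" or "no" case can be sketched in a few lines, assuming the gap between the surrounding words can be measured and the font's advance widths are known. The widths in `ADVANCE`, the tolerance, and `candidates_that_fit` are hypothetical, and kerning is ignored for simplicity.

```python
# Hypothetical advance widths in font units for a proportional font.
ADVANCE = {"y": 500, "e": 444, "s": 389, "n": 500, "o": 500}


def width(word: str) -> int:
    """Rendered width of a word as the sum of its glyph advances."""
    return sum(ADVANCE[ch] for ch in word)


def candidates_that_fit(gap_width: int, candidates, tolerance: int = 10):
    """Keep the candidates whose rendered width matches the measured gap."""
    return [w for w in candidates if abs(width(w) - gap_width) <= tolerance]


# Gap measured between the end of the preceding word and the start of the next.
print(candidates_that_fit(1335, ["yes", "no"]))
print(candidates_that_fit(1000, ["yes", "no"]))
```

With a short candidate list the width alone decides it; with an open-ended name the same filter only narrows the field, which is the "harder" case described above.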

jmward01

12 days ago

1 reply

I feel like this creates a hash value and the real question is how unique of a value does it represent and how easy it is to narrow it down given throwing a dictionary at it. Similarly, unknown names could likely be teased out like a one-time pad. If they appear in multiple sentences then their randomness quickly repeats and becomes something that potentially could be isolated from the rest of the words around them. This would probably be a fun problem for a cryptography class to work on.

skykooler

12 days ago

If so, then finding the redacted string would be similar to trying to brute-force a hash (though presumably slower, since text layout algorithms are probably more complex than a single hash invocation).

dylan604

12 days ago

1 reply

> the more any font kerning is likely a major leak for redaction

Now I want a font that randomly adjusts the kerning automagically to be used by people in standard word processors not some graphics app. In this way, every time the same word appears in the document, the kerning is different between each one.

chews

12 days ago

1 reply

My autism wants that idea straight into a dumpster fire.

dylan604

12 days ago

not really sure what this means.

most people cannot detect differences in kerning, and it takes extreme adjustments to get people to notice. even then, the words would need to be aligned above/below each other for people to see the differences. however, a computer program analyzing the size of a bounding box would notice single pixel differences. so randomly adjusting the kerning per word by pixels between each letter would go unnoticed by the vast majority of readers, but could play absolute havoc with algos trying to decipher possible word combos based on bounding box size.

IshKebab

12 days ago

1 reply

Unlikely to be possible except for the smallest redactions, like if you have a single name redacted and a list of candidates. But I think kerning wouldn't help you much more than just knowing the rough length anyway.

ComplexSystems

11 days ago

Kerning and perplexity together could probably solve quite a few of these.

worewood

12 days ago

There was a recent vulnerability, where researchers were able to extract information from an encrypted chat session from an LLM, by analyzing packet size/timings of the underlying SSL connection. A classic side-channel attack. Seems possible to draw a parallel between the two.

shrubble

12 days ago

3 replies

Shockingly, you can see redaction info from within your browser's PDF viewer. I am using Brave on Linux, and went here:

https://www.justice.gov/multimedia/Court%20Records/Matter%20...

As a test, select with your mouse the entire first line of paragraph number 90, and then paste it into a text editor or a shell. The unredacted text appears!

ktpsns

12 days ago

2 replies

This is exactly the type of bad redactions which the X-ray software will also find.

sawjet

12 days ago

1 reply

You can X-ray a PDF?

stressback

12 days ago

Unsure if you are serious but the commenter is referring to the tool name that this post links to

Fnoord

11 days ago

[delayed]

belter

11 days ago

There is no way, looking at most of the now unredacted text, that it should have been redacted.

It's clear that the DOJ was paying overtime, based on the number of redactions, so the agents and lawyers just roamed free...

sroussey

12 days ago

Why would “Financial Strategy Group, Ltd” be redacted?

blitz_skull

12 days ago

5 replies

Explain like I’m stupid: what is the most gracious interpretation of redaction when releasing files like this?

Why should anyone involved retain any anonymity?

I’m asking in good faith because naively it seems like this should not even exist. All of it should be exposed.

krapp

12 days ago

1 reply

Protecting the identity of victims, eyewitnesses or informants.

sawjet

12 days ago

1 reply

Don't forget the co-conspirators!

krapp

12 days ago

The weirdest part about that is this administration was clearly willing to just stall and could have done what the CIA and FBI does all the time and just "disappear" all of the documents.

What would be the fallout? The Democrats are complicit, the regime all but controls the judiciary (at least the Supreme Court.) And a lot of these guys are billionaires and untouchable anyway unless someone does a Luigi on them. They have the ability to just brute force past the controversy and yet they've chosen to attempt the most ridiculously inept coverup possible.

On the one the sheer stupidity of this administration and its incompetence at implementing fascism means that as bad as things are they could be much worse. On the other hand I fear that once JD Vance or someone just as evil but without Trump's instability takes power we're going to wish we'd done something more when we had the chance.

OsrsNeedsf2P

12 days ago

2 replies

Iirc WikiLeaks took the position of redacting any information that would directly lead to the bodily harm of an individual (or something to that effect). The rationale being, "Yes, group A did something horrible that warrants investigation, but if we publish their GPS coordinates they will be blown to smithereens"

vlovich123

12 days ago

1 reply

Unless those people impacted were friendly to US interests? If I recall correctly, they published the names of collaborators and informants in Iraq. They also published military tactics that would help those trying to kill US soldiers. GPS coordinates by comparison generally go stale very quickly.

PoignardAzur

12 days ago

1 reply

No, that was the 2010 "diplomatic cables" release. Basically, they disseminated an encrypted version of the data cache, and gave the decryption key to a few key people, including Guardian journalist David Leigh, with the expectation he'd report on the info without sharing sensitive intel.

David Leigh then published the decryption key in his 2011 book about Wikileaks (for some reason) and the info became publicly available. Everyone pinned the blame on Assange.

Moral of the story: journalists can and will disclose ridiculously sensitive info you give them for a bit of fame and you should be extremely careful about covering your tracks.

vlovich123

12 days ago

So no, they don’t redact things even when it can put people in harm’s way.

dragonwriter

12 days ago

There was, to say the least, not a specific law mandating release of the material held by WikiLeaks and specifying what was to be, and what was not to be, redacted, so I don't see that as much of a guide here.

empath75

12 days ago

1 reply

The files of a high profile and long running investigation are going to be full of false leads, hoaxes and other bullshit. The reason they don’t just always release the files after closing cases is that there genuinely are going to be innocent people caught in the crossfire who have privacy rights.

This case is so important and such a clusterfuck that the files need to be opened anyway.

ozim

12 days ago

1 reply

Person asking above question explains he doesn’t understand so I guess he also doesn’t understand prosecutors, lawyers, law enforcement, judges make mistakes.

So yes this is best explanation. Revealing everything might bring great harm to innocent people just because they were somehow mentioned in the documents.

Just add all the experience we already have with “internet investigators” that ruin people lives for petty reasons.

MiguelX413

11 days ago

I'm unaware of the phenomenon as I live under a rock, please tell me about it.

supercheetah

12 days ago

1 reply

FWIW, a lot of the victims (possibly all) are saying they don't care about redactions if they end up being used to protect perpetrators. They want to make sure everyone is held accountable.

dragonwriter

12 days ago

https://abcnews.go.com/US/epsteins-alleged-victims-accuse-do...

Specifically, a number of Epstein victims have complained that the release was unacceptable because it was incomplete, illegally redacted material other than victim names which was not excepted from release under the law mandating release, and because it failed to redact victim identities required to be protected under the law mandating release.

dragonwriter

12 days ago

The law mandating release requires redaction of victim identities, information relating to investigations that are still active, child sexual abuse material, and information related to national security.

It generally prohibits other redactions, and expressly prohibits redactions for embarrassment, reputational harm, or political sensitivity.

Of course, there is considerable concern that the actual redactions do not appear to comply with the legal requirements.

alessandroliva

12 days ago

1 reply

This being on top of the news on Epstein files being badly redacted is pretty funny

IshKebab

12 days ago

Are you under the impression that they're unconnected?

e38383

12 days ago

1 reply

It’s either redacted or not. There is no "bad". The text is either there or it isn’t, sorry but this is a binary option and not on a spectrum from bad to good.

47282847

12 days ago

1 reply

Maybe “attempted” would be more accurate? I personally don’t mind the “bad”, I get what is meant by it.

But since we’re talking about accuracy: I don’t agree on redactions being binary. You can redact with a pen that under certain lighting still reveals the text; you can redact parts that are easy to reconstruct when you have additional information; you can redact with a pen color that over time loses its function; etc. The “perfect” redaction would perhaps leave no clues as to even how much text was redacted? It seems to depend on the goal and context of the redaction, whether it achieves its purpose or not.

e38383

11 days ago

"attempted" would be more accurate IMO.

I still think that the word redacted is meant to destroy the original text, it might not remove the metadata (e.g. length).

Redaction is done mostly in ways with a possibility to reveal the underlying text, but all this is not redacted in my understanding of the word. I always liked the English word for this – the German word "schwärzen" just means to "blacken" the text and this was never the same for me.

But after further research I must agree with you, it just means to obscure or remove, but not clearly just remove. I have been using it for years in a stronger meaning than it really has.

One more but: we hopefully can all agree that putting a black bar over some text which still is just copy/pasteable is not even obscuring.
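To make the "black bar over copy/pasteable text" case concrete, here is a minimal sketch of the geometric check involved: given the rectangles drawn on a page and the bounding boxes of the text that is still in the file, report any text that sits under a rectangle. This is not X-ray's actual implementation. The `Box` type, the `exposed_text` function, the `min_cover` threshold, and the sample coordinates are all hypothetical, and a real tool would first have to extract both lists from the PDF with a PDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # A box with negative width or height (no real overlap) has zero area.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def intersection(self, other: "Box") -> "Box":
        return Box(
            max(self.x0, other.x0),
            max(self.y0, other.y0),
            min(self.x1, other.x1),
            min(self.y1, other.y1),
        )


def exposed_text(bars, spans, min_cover=0.5):
    """Return the text of every span that some bar covers by at least
    `min_cover` of the span's own area. `min_cover` is an assumed threshold."""
    found = []
    for text, box in spans:
        if box.area() == 0:
            continue  # skip degenerate spans to avoid dividing by zero
        for bar in bars:
            if bar.intersection(box).area() / box.area() >= min_cover:
                found.append(text)
                break  # one covering bar is enough to flag the span
    return found


# Made-up page content: one black bar and two text spans.
bars = [Box(100, 700, 300, 715)]
spans = [
    ("Visible heading", Box(100, 740, 260, 752)),
    ("Jane Doe, 12 Example St", Box(105, 702, 290, 713)),
]
print(exposed_text(bars, spans))
```

Any span this reports is text that was covered visually but never removed from the file, which is exactly the case where selecting and copying still works.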

yoan9224

11 days ago

the timing of this with the epstein docs is pretty funny. honestly feels like someone did those redactions badly on purpose - anyone who works with pdfs knows you don't just draw black boxes over text. either massive incompetence or malicious compliance

bossyTeacher

11 days ago

This is not something to do with the redacted Epstein files, right?

dcollect

12 days ago

thanks bros lmao

text=about them to damage their credibility when they tried to go public with their stories of being text=Epstein also threatened harm to victims and helped release damaging stories =attorneys' fees and case costs in litigation related to this conduct. =Defendants also attempted to conceal their criminal sex trafficking and abuse

text=$327,497.48 and $6,487.04 in New York City text=trafficking and abuse conduct. text=destroy evidence relevant to ongoing court proceedings involving Defendants' criminal sex text=Epstein also instructed one or more Epstein Enterprise participant-witnesses to text=trafficked and sexually abused. text=conduct by paying large sums of money to participant-witnesses, including by paying for their

dcollect

12 days ago

lol thanks bros

=Defendants also attempted to conceal their criminal sex trafficking and abuse

dev1ycan

12 days ago

Yeah let's help the govt redact pdfs (literally).

5ak12agff

12 days ago

Given that no U.S. or Israeli citizen apart from Epstein and Maxwell has experienced severe repercussions and Andrew Windsor is the perfect fall guy, there is the possibility that nothing will be revealed from these uncovered redactions.

The releases haven't yielded anything so far. For all we know, Epstein used other methods of communications for the really sensitive stuff. This would not be a surprise, since the whole Maxwell family was deep into tech (Magellan, Chiliad) and Ehud Barak was the head of Israeli military intelligence in the 1980s.

The story is going to be closed in a bipartisan manner except that it might be used to remove some unwanted politicians. The New York Times has already released an article that "explains" Epstein's wealth which names all figures that appear in "conspiracy theories" in an innocent way. Basically, they claim that Epstein could just steal from billionaires like Wexner and the billionaires would roll over and do nothing.

That is the official confirmation that all intelligence angles will be squashed in a bipartisan manner. For all we know, the "incompetence" in the redactions may be a way of saying: "See, we have nothing to hide."

PunchyHamster

12 days ago

Great, now govt will have easy tool to check it

unstatusthequo

12 days ago

It’s a bit amusing seeing ediscovery principles go mainstream.

shevy-java

11 days ago

The issue is more that the current US government is not very bright. Nor very open. Kind of rogue-like.

I think governments should not be able to hide information from citizens in general. I don't trust those who hide stuff while being fed money from the taxpayers - that is a modern form of slavery.

tamimio

12 days ago

Tech people would be shocked and surprised to know how tech-illiterate non-tech people are. Reminds me of old days when the IT guy is AIO in some non-tech facility and is treated like god!!

eviks

12 days ago

Pity such an awful document format with so many basic fails at being digital, continues to reign in a lot of areas!


ID: 46369923 | Type: story | Last synced: 12/26/2025, 9:45:55 PM
