Show HN: Every single torrent is on this website
infohash.lol
I can't follow the logic here. How does this detect other announcers?
https://en.wikipedia.org/wiki/Honeytoken
> In the field of computer security, honeytokens are honeypots that are not computer systems. Their value lies not in their use, but in their abuse.
That's not detecting "announcers", but maybe more like detecting "indexers".
I think you’re correct. The secondary freebooting indexers add their own tracker(s) after the private torrent has been created, appending them to the original prefilled tracker list in the reuploaded, usually public, torrent. Sometimes they even remove the original private trackers so the torrent doesn't phone home and tell on them.
I’m happy to be corrected, but private trackers typically bind the downloading IP of the torrent to the announcing tracker to validate legitimate clients. Private trackers don’t consider any extra trackers (announcers in this context) as valid or authorized. I have heard that modded BitTorrent clients can intentionally misreport upload stats to game the quota, as many private trackers/torrent sites enforce a minimum ratio of 1.0 or higher.
I’ve heard of ways that folks with legitimate access to the private torrent tracker and torrents clone the IPs of other clients and then use a secondary torrent client to request blocks, bypassing the tracker entirely and not reporting any downloads (or uploads, for that matter), so the quota of the first legit client is not affected positively or negatively.
"I didn't share that! It was on infohash.lol first!"
More details here: https://allthemusic.info/faqs/
In theory this could be used to share torrent links by a different reference (ideally you could also add an anchor too). Somebody else could have a page that takes keywords and points you to pages hosted on the site.
DHT crawlers/indexers already exist to perform that function: they crawl and store infohashes (plus metadata when they receive it) and let users search that metadata to find relevant infohashes.
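As a rough illustration of what such an indexer stores, here is a minimal in-memory sketch. It assumes the crawler has already obtained infohashes and torrent names from the DHT; no crawling is implemented. The class name `InfohashIndex`, its methods, and the sample entries are hypothetical. The only real-world format used is the v1 magnet link prefix `magnet:?xt=urn:btih:`.

```python
from collections import defaultdict


class InfohashIndex:
    """Toy keyword index: maps words in a torrent name to infohashes."""

    def __init__(self):
        self._names = {}                   # infohash -> torrent name
        self._by_word = defaultdict(set)   # lowercase word -> set of infohashes

    def add(self, infohash: str, name: str) -> None:
        """Store one (infohash, name) pair as a crawler might after fetching metadata."""
        infohash = infohash.lower()
        self._names[infohash] = name
        for word in name.lower().split():
            self._by_word[word].add(infohash)

    def search(self, query: str) -> list[str]:
        """Return the infohashes whose name contains every word of the query."""
        words = query.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self._by_word.get(w, set()) for w in words))
        return sorted(hits)

    @staticmethod
    def magnet(infohash: str) -> str:
        """Build a BitTorrent v1 magnet link from a hex infohash."""
        return f"magnet:?xt=urn:btih:{infohash.lower()}"


# Made-up sample data: these infohashes do not refer to real torrents.
index = InfohashIndex()
index.add("AA" * 20, "Ubuntu 24.04 desktop iso")
index.add("bb" * 20, "Debian 12 netinst iso")

for infohash in index.search("ubuntu iso"):
    print(InfohashIndex.magnet(infohash))
```

A real indexer would persist this to disk and rank results, but the core idea is the same: the infohash is the key, and the searchable part is whatever metadata was received alongside it.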
This is a sample of the client-side code I found handling that: https://infohash.lol/_next/static/chunks/pages/p/%5Bpage%5D-...
There are also valid clients for completely unrelated protocols using the BitTorrent DHT to find each other.
* It has more wrong information than right information, with no way to tell the difference.
* If you had an oracle that could tell you how to get to the book you need, the navigation instructions to get to the book will be at least as long as the book, on average.
All computer files are sequences of bits. All sequences of bits are integers. All integers already exist in the infinite set of natural numbers. I can even calculate how big those numbers are given their bit count.
digits(bits) = ceil(bits * log10(2))
digits(32) = 10
digits(64) = 20
digits(128) = 39
digits(256) = 78
digits(512) = 155
digits(1024) = 309
digits(20 KiB) = 49,321
digits(2 GiB) = 5,171,655,946
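The figures above can be reproduced with a few lines of Python. This is a sketch: the helper names `digits` and `file_as_integer` are mine, and reading a file as a big-endian unsigned integer is just one possible convention (leading zero bytes are lost unless the length is kept as well).

```python
import math


def digits(bits: int) -> int:
    """Decimal digits needed for the largest integer that fits in `bits` bits."""
    return math.ceil(bits * math.log10(2))


def file_as_integer(data: bytes) -> int:
    """Interpret a byte string as one big-endian unsigned integer."""
    return int.from_bytes(data, "big")


# Cross-check the formula against exact integer arithmetic for the small sizes.
for bits in (32, 64, 128, 256, 512, 1024):
    assert digits(bits) == len(str(2**bits - 1))

KIB = 1024 * 8       # bits in one KiB
GIB = 1024**3 * 8    # bits in one GiB

print(digits(20 * KIB))
print(digits(2 * GIB))
print(file_as_integer(b"This comment is a number."))
```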
We are merely discovering numbers through convoluted mental and technological processes. All our mental exertions result in the discovery of a number. This comment is a number.
At our university lab we've been working on this for 25 years. Building a search engine is the easy part. Keeping a federated server with a billion users running is unsolved. Creating a fully serverless, decentralised search engine is possible, but you also need a self-funding economy. We seem to be one of the few labs worldwide still making actual operational prototypes of this stuff. More shameless self-promotion:
"SwarmSearch: Decentralized Search Engine with Self-Funding Economy" [0]
Really handy to have a search engine to search this webpage with 45,671,926,166,590,716,193,865,151,022,383,844,364,247,891,968 pages and the rest of the web (no spyware, no tracking).
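For what it's worth, the page count quoted above is exactly 2^155. One hedged reading is that the site enumerates all 2^160 possible 160-bit (SHA-1) infohashes at 32 per page. The 32-per-page figure is an inference from the arithmetic, not something stated in this thread.

```python
# Page count exactly as quoted in the comment above.
pages = int(
    "45,671,926,166,590,716,193,865,151,022,383,844,364,247,891,968".replace(",", "")
)

INFOHASH_BITS = 160                      # BitTorrent v1 infohashes are SHA-1 digests
total_infohashes = 2**INFOHASH_BITS

# Assumption being checked: the total divides evenly into the quoted page count.
per_page, remainder = divmod(total_infohashes, pages)

print(pages == 2**155)
print(per_page, remainder)
```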
Of course this is silly, but interesting nonetheless. And we routinely speak about such high-dimensional spaces in research and engineering. Or we can imagine optimization as traversing a pre-existing search space. It may be structured as a graph or perhaps a Euclidean space. And in that space we can imagine a loss surface, that sits there in peace all along, with its global minimum somewhere. And instead of "constructing" a solution, we are simply hiking in this space and trying to spot that valley. But this is a bit fictional. We never physically "instantiate" this surface. It's an imagined abstraction. In reality we just have a vector and some rules as to how we change that vector. But we can imagine those changes to be movements in an imagined space.
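The "vector plus update rule" picture can be made concrete with plain gradient descent. This is a minimal sketch on a made-up two-dimensional loss whose minimum is at (3, -1). The loss function, step size, and iteration count are arbitrary choices for illustration. Note that the surface is never stored anywhere; only the current point and the rule for moving it exist.

```python
def loss(v):
    """Toy loss surface: a bowl with its global minimum at (3, -1)."""
    x, y = v
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


def grad(v):
    """Gradient of the toy loss at point v."""
    x, y = v
    return (2.0 * (x - 3.0), 2.0 * (y + 1.0))


def descend(v, step=0.1, iterations=200):
    """Repeatedly move the vector a small step against the gradient."""
    for _ in range(iterations):
        g = grad(v)
        v = (v[0] - step * g[0], v[1] - step * g[1])
    return v


final = descend((0.0, 0.0))
print(final, loss(final))
```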
It's like the idea that the sculptor doesn't create the sculpture, the sculpture was there all along, he just had to remove the superfluous matter to reveal what was already there (i.e. the atoms belonging to the final sculpture).
The most interesting thing is kind of on the border, between these absurdly large spaces and the more manageable ones that are feasible to enumerate.
Another similarly mind-blowing thing was when I forgot the password to a file that I had encrypted. It's fascinating that the bit pattern on the disk is now functionally random, and cracking it would take longer than the age of the universe. But if only I knew the password, it would take just a second. There is a definite sequence of keystrokes I can execute to bring the universe into a state where the content will appear on my screen. It's so close, yet so, so far if you don't remember the password. Just a little difference in your brain state and it flips from trivial to hopeless.
PS, if you like thinking about such things, I recommend Meta-Math by Gregory Chaitin, it's very fun (providing an address VS constructing the thing is basically the gist of algorithmic information theory).
> It's like the idea that the sculptor doesn't create the sculpture, the sculpture was there all along, he just had to remove the superfluous matter to reveal what was already there (i.e. the atoms belonging to the final sculpture).
I understand this argument but I have far more trouble applying this logic to real things. I'm not sure the same logic applies once the information is instantiated in the real world as a physical object. I haven't thought very deeply about it. I think the true sculpture exists only in the ideal world and the real world object is merely an approximation of it.
> Of course this is silly
It's an existential issue for me. At some point it became a political issue. I became a copyright abolitionist because of this insight. Copyright is logically reducible to monopolistic ownership of numbers. The sheer absurdity of it led me to reject the very idea of intellectual property as delusional nonsense.
The legal system is rather the spiritual successor of the original "system" where a wise Solomon-like elder would adjudicate the issue based on their best judgment and intuition and customs, ideally seeking peace and social satisfaction and future harmony. Codified law channels this into some more pre-shaped form, but the fuel of the legal system is still the human judgment and common sense at the core. Often the law basically just prompts and nudges the judgment of the jurors or judge in a certain direction, but it can't account for all corner cases. The nerd mind asks ok ok but what if X, where do you draw the sharp line between X and Y? It doesn't matter. If it comes up, a court will decide it based on all available common sense and the implicit values of the culture.
In the cases where someone seemingly gets away with "rules-lawyering", it's not purely their genius logic-brain that wins; there is some kind of slanted playing field that's not really available to you. Of course, the line between "annoying rules-lawyering based on literal interpretation of technicalities that obviously nobody intended to be interpreted so" and something that was not anticipated initially but does fit within the rules is blurry. Deciding which is which is itself based on judgment and intuition. In life, sometimes coming up with a "technically works" thing is rewarded and lauded (math proofs, pathological counterexamples, cracking an encryption library via side-channel attacks); other times you get an eye-roll because it's obviously cheating and wasn't what was meant (e.g. courts of law and fun at parties).
Intellectual property is just rent seeking by established corporations as well as protection against competition so that others can't enter the market. The days where they protected individual creators and inventors are long gone.
It's basically creating value out of nowhere in lieu of resources that are truly valuable but inconvenient to trade directly. But then, like a metric that got corrupted (I forgot the name of the law for that), there are others who are trying to game the system (and succeeding) so that they can maximize their share.
See also https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achi...
The claim is that humans are not "creators" but generators, very much in the random number generator sense. We are interesting number generators.
This kind of imposing of order is an act of lowering the entropy of the sample in a very specific way: parties that know the 'key' to the sample will be able to experience it in a way that parties without the key cannot. To them the sample is still boring or random. Your reduction of the act of creation to picking a particular number belies the fact that absolutely nobody who creates something is picking that number: the number is a carrier, it is not the ideas embedded in it. You could translate that novel (or textbook, or sound, or video, or any other medium) into other media, descriptively or literally, or you could even completely transform it. And there would still be a relationship to the original creation, hence the concept of a 'derived work', which for your numbers example would utterly fail: you could not take that number, without knowing its meaning, and come up with any of these derivations unless you had the key to decode it.
This kind of reductive reasoning is not helpful, it merely attempts to flatten a whole pile of some of the most accomplished and positive contributions by humanity to the generation of interesting numbers. And it is so much more than that.
Besides all this, any attempt to digitize an actual work of art, rather than just a simple text, is going to be a lossy process. You are never going to be able to replicate the original to the point that you have created something that is equal. You may be able to get close, but it won't be the same thing. More so for sculptures than for two-dimensional art, and less so for audio, for instance, where the replication gear is getting really good. But generation loss is a thing, and if you re-create and re-digitize, then after a surprisingly low number of such generations you will end up with noise.
Authors, sculptors, painters, even programmers and other creative people are so much more than interesting number generators, even if their works can be encoded or approximated numerically. That's flipping the encoding analogy on its head, the map really isn't the territory.
And that’s without me asking you to define “real”, which would be another rabbit hole.
This isn't quite true. Natural language text compresses extremely well and you would only need length equivalent to the compressed form, not the original form. And if you wanted to go further, you could use a mapping where extremely short strings map to known popular books and only unknown works have longer encodings.
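Both ideas can be sketched in a few lines: an address is either a very short code for a known work or the compressed form of an unknown one. This is an illustration only. The two-entry `POPULAR` catalogue, the one-byte tags, and the sample texts are hypothetical stand-ins, and `zlib` is used simply because it ships with Python, not because it is the best compressor for prose.

```python
import zlib

# Hypothetical catalogue of "known popular books" (short stand-in texts).
POPULAR = [
    b"Stand-in text for popular book number zero.",
    b"Stand-in text for popular book number one.",
]


def encode(text: bytes) -> bytes:
    """Return an address: a 2-byte code for known works, else tag + compressed text."""
    if text in POPULAR:
        return b"\x00" + bytes([POPULAR.index(text)])
    return b"\x01" + zlib.compress(text, 9)


def decode(address: bytes) -> bytes:
    """Recover the full text from an address produced by encode()."""
    tag, payload = address[:1], address[1:]
    if tag == b"\x00":
        return POPULAR[payload[0]]
    return zlib.decompress(payload)


unknown = (
    b"The library contains every book that could ever be written, and so it "
    b"contains every book that ever will be written. Most of the books in the "
    b"library are noise, and a reader who picks a book at random will almost "
    b"certainly find noise. A catalogue of the library would itself be a book in "
    b"the library, and so would every false catalogue of the library. The only "
    b"way to find the book you want is to already know the book you want, letter "
    b"by letter, from the first page to the last."
)

print(len(POPULAR[0]), "->", len(encode(POPULAR[0])))
print(len(unknown), "->", len(encode(unknown)))
```

The catch is that the short codes only work because the catalogue is shared in advance; the information has moved into the catalogue rather than disappeared.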
But if I built one, it would totally work that way.
Only if the oracle has all books that could possibly exist. If you're trying to find a book that already exists, that set is infinitely smaller.
Filtered how? By some keywords I don't want to know? What about encrypted zips of CSAM? There's no way to filter that in reality.
If you want to learn more about why, and you can either speak German or handle YouTube's auto-translate, I recommend this documentary on the matter [0]. The pedo criminals are using scene methods to share their illegal content.
I guess you could filter all torrents that include just zips/rars/7zips. That would exclude a lot of harmless content. Probably too much harmless content to make it a default, but if you only care about Hollywood releases it would be a useful filter.
If there was a public list of hashes of (8/18KiB blocks of) CSAM content that would be useful for a filter, but I don't think such a thing exists.
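Mechanically, such a filter would be a set lookup over block hashes. The sketch below assumes a fixed 16 KiB block size and SHA-1, both chosen only for illustration, and a hypothetical blocklist built from dummy bytes. The function names are mine and no real hash list is involved.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """SHA-1 hex digest of each consecutive fixed-size block of data."""
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def is_blocked(data: bytes, blocklist: set[str], block_size: int = BLOCK_SIZE) -> bool:
    """True if any aligned block of data has a hash on the blocklist."""
    return any(h in blocklist for h in block_hashes(data, block_size))


# Dummy "flagged" content and a blocklist derived from it.
flagged = b"\x01" * BLOCK_SIZE
blocklist = set(block_hashes(flagged))

print(is_blocked(b"\x00" * BLOCK_SIZE + flagged, blocklist))  # aligned copy
print(is_blocked(b"\x00" + flagged, blocklist))               # shifted by one byte
```

The second call shows the weakness: matching only works on identical, identically aligned bytes, so repacking or encrypting the content (the encrypted zip case raised above) produces entirely different hashes.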
But wouldn't that just be a list of CSAM to look up?
I can generate a Google link with an infohash in the same fashion: https://www.google.com/search?q=1548262051907755713575797913...
Yandex is the only search engine that's even marginally useful for that now.
https://www.theregister.com/2001/09/11/worlds_first_decss_ex...
I'm under the impression that serving either the infohash or the torrent is considered a DMCA violation. The DMCA does not just forbid sharing copyrighted material, but also sharing links to it, or generally anything that can help people bypass copyright protections (including software that can decrypt even trivial DRM).
I made something similar a while ago, the Hdd of Babel [2], which contains all possible files(*), and wrote down some thoughts on it [3].
I really like how it makes us think about the nature of information.
[1] https://libraryofbabel.app/
[2] https://mkaandorp.github.io/hdd-of-babel/
[3] https://dev.to/mkaandorp/this-website-contains-pictures-of-y...