AI Documentation You Can Talk To, for Every Repo
Posted about 2 months ago | Active about 2 months ago
deepwiki.com | Tech | story | High profile
controversial | mixed
Debate: 80/100
Key topics
AI Documentation
Code Comprehension
LLM Limitations
Deepwiki.com is a platform that generates AI-powered documentation for GitHub repositories, sparking debate among HN users about its effectiveness and potential drawbacks.
Snapshot generated from the HN discussion
Discussion activity: very active
- Story posted: Nov 10, 2025 at 11:38 PM EST
- First comment: Nov 11, 2025 at 12:22 AM EST (44m after posting)
- Peak activity: 65 comments in the 0-6h window
- Average per period: 13.7 comments
- Latest activity: Nov 15, 2025 at 5:38 AM EST
- Based on 123 loaded comments
ID: 45884169 | Type: story | Last synced: 11/20/2025, 4:23:22 PM
231 points | 77 days ago | 53 comments
https://news.ycombinator.com/item?id=45002092
Worth mentioning this is a Cognition / Devin on-ramp and has been posted on HN a few times in just a couple months, feels a little sales-y to me.
But it's docs outside the dev's purview on a deepwiki url, used to shepherd people into Devin. Wow. Talk about slimy.
And WTF with these floating boxes popping up everywhere?!? They are tailor-made to trigger anxiety in people with OCD. They look like notifications that keep grabbing your attention as you scroll the text. Example: https://aws.amazon.com/blogs/aws/secure-eks-clusters-with-th...
Will need boxblock.
So looks like it's not actually any repository.
That explains why none of the projects which I tried worked.
I wonder why they’d use a decentralised protocol but then only support a single host.
https://github.com/cameyo42/newLISP-Code
https://deepwiki.com/cameyo42/newLISP-Code/3.1-newlisp-99-pr...
https://deepwiki.com/gdzig/gdzig/1-overview
I hope actual users never see this. I dread thinking about having to go around to various LLM generated sites to correct documentation I never approved of to stop confusing users that are tricked into reading it.
[0]: https://deepwiki.com/blopker/codebook
- Users are confused by autogenerated docs and don’t even want to try using a project because of it
- Real curated project documentation is no longer corrected by user feedback (because users never reach it)
- LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)
On this, I think, we should have some kind of AI-generated meta-tag, like this: https://github.com/whatwg/html/issues/9479
Also, no one is reading your resume anymore and big corps cannot be trusted with any rule as half of them think the next-word-machine is going to create God.
The point of the wiki is to help people learn the codebase so they can possibly contribute to the project, not for end users. It absolutely should explain implementation details. I do agree that it goes overboard with the diagrams. I’m curious, I’ve seen other moderately sized repo owners rave about how DeepWiki did very well in explaining implementation details. What specifically was it getting wrong about your code in your case? Is it just that it’s outdated?
There is a folder for a VS Code extension here[0]. It seems to have a README with installation instructions. There is also an extension.ts file, which seems to me to be at least the initial prototype for the extension. Did you forget that you started implementing this?
[0] https://github.com/blopker/codebook/blob/c141f349a10ba170424...
This is not an ad for LLMs. If you think this is good, you should probably not ever touch code that humans interact with.
From a fellow LLM-powered app builder, I wish you best of luck!
[0] https://github.com/blopker/codebook/blob/main/vscode-extensi...
Yes, there is a VS Code folder in that repo. However, it doesn't exist as an actual extension. It's an experiment that does not remotely work.
The LLM-generated docs have confidently decided that not only does it exist, but it is the primary installation method.
This is wrong.
Edit: I've now had to go into the Readme of this extension to add a note to LLMs explicitly to not recommend it to users. I hate this.
The value proposition here is that these LLM docs would be useful; however, in this case they were not.
But his own documentation did say that there was a VSCode extension, with installation instructions, a README, changelog, etc. From what he said, it doesn't even compile or remotely work. It would be extremely aggravating to attempt to build the project with the maintainer's own documentation, spend an hour trying to figure out what's wrong, and then contact the maintainer for him to say, "oh yeah, that documentation is not correct, that doesn't even compile even though I said it did 2 months ago lol." It is extremely ironic that he is so gung-ho about DeepWiki getting this wrong.
That seems about as annoying as a random wiki mis-explaining your system.
That being said, I am still biased towards empathizing with the library author since contributing to open source should be seen as being a great service already in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private.
This is true, and the only reason for this was more so his dismissive view of DeepWiki than a criticism of the project itself or of the author as a programmer. LLMs hallucinate all the time, but there is usually a method to the way they do so. Particularly, for it to just say a repo had a VSCode extension portion with nothing pointing to it would not be typical at all for an LLM like DeepWiki.
The WIP code was committed with the expectation that very few people would see it because it was not linked anywhere in the main readme. It's a calculated risk, so that the code wouldn't get out of date with main. The risk changed when their LLM (wrongly) decided to elevate it to users before it was ready.
It's clear DeepWiki is just a sales funnel for Devin, so all of this is being done in bad faith anyway. I don't expect them to care much.
Not talking about this tool, but in general, incorrect LLM-generated documentation can have some value: a developer knows they should write some docs, but is staring at a blank screen, not sure what to write, so they don’t. Then the developer runs an LLM, gets a screenful of LLM-generated docs, notices it is full of mistakes, and starts correcting them. Suddenly, a screenful of half-decent docs.
For this to actually work, you need to keep the quantity of generated docs a trickle rather than a flood; too many and the developer’s eyes glaze over and they miss stuff or just can’t be bothered. But a small trickle of errors to correct could actually be a decent motivator to build up better documentation over time.
Fundamentally this is an alignment problem.
There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore your prompt.
When they try to write a doc based off code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it is thoroughly validated.
Do we have any reason to believe alignment will be solved any time soon?
This isn't a matter of training data quality.
This is another one of those bizarre situations that keeps happening in AI coding related matters where people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently.
React vs other frameworks (or no framework). Object oriented vs functional. There's loads of examples of this that predate AI.
This feels to me more like the horses vs cars thing, computers vs... something (no computers?), crypto vs "dollar-pegged" money, etc. It's deeper. I'm not saying the AI people are the "car" people, just that...there will be one opinion that will exist in 5-20 years, and the other will be gone. Which one... we'll see.
React vs no framework is at least in the same ballpark as AI vs no AI. Some people are determined to prove to the world that React/AI/functional programming solves everything. Some people are determined to prove the opposite. Most people just quietly use them without feeling like they need to prove anything.
Bad documentation full of obvious errors and nonsense is very different to having an opinion on OO vs Functional programming.
Even that sentence sounds insane because who would ever compare the two?!
No need to guess.
Original: https://docs.tirreno.com/
Deepwiki: https://deepwiki.com/tirrenotechnologies/tirreno
Github: https://github.com/tirrenotechnologies/tirreno
But you’re not looking at the same thing — you’re looking at two completely different sets of output.
Perhaps their project uses a more obscure language, has a more complex architecture, or resembles another project that’s tripping up the interpretation of it. You can have excellent results without it being perfect for everything. Nothing is perfect and it’s important for people making these things to know how, right?
In my career I’ve never seen such aggressive dismissal of people’s negative experiences without even knowing if their use case is significantly different.
The code base has a lot of documentation in the form of many individual text files. Each describes some isolated aspect of the code in dense, info-rich and not entirely easily consumable (by humans) detail. As numerous as these docs are, the code has many more aspects that lack explicit documentation. And there is a general lack of high-level documentation that ties each isolated doc into some cohesive whole.
I formed a few conclusions about the deepwiki-generated content: First, it is really good where it regurgitates information from the code docs while being rather bad or simply missing for aspects not covered by the provided docs. Second, deepwiki is so-so for providing a high layer of documentation that sort of ties things together. Third, it is highly biased about the importance of various aspects by their code docs coverage.
The lessons I take from this are: deepwiki does better ingesting narrative than code. I can spend less effort on polishing individual documentation (not worrying about how easy it is for humans to absorb). I should instead spend that effort to fill in gaps, both details and to provide higher-level layers of narrative to unify the detailed documentation. I don't need to spend effort on making that unification explicit via sectioning, linking, ordering, etc as one may expect for a "manual" with a table of contents.
In short, I can interpret deepwiki's failings as identifying gaps that need filling by humans while leaning on deepwiki (or similar) to provide polish and some gap putty.
E.g. if you describe how the user service exists you won't necessarily capture where it is used.
If you document why the user service exists you will often mention who or what needs it to exist, the thing that gives it a purpose. Do this throughout and everything ends up tied together at a higher level.
Yeah this seems to be a recurring issue on each of the repos I've tried. Some occasionally useful tables or diagrams buried in pages of distracting irrelevant slop.
it's the first result on google for just about anything technical I search for
I have bad news for you, this website has been appearing near the top of the search results for some time now. I consciously avoid clicking on it every time.
what about the dependencies? you could just clone them as well (which is what I do occasionally), but deepwiki is faster (for indexed repos) and free
And if a human spent painstaking effort writing excellent docs, the least bit of respect i can give them is read it.
Are you sure? I just tried it on projects of mine that have almost zero documentation and it did a fairly good job.
There is a very clear point in codebase size where LLMs tend to falter without very clear written down overview descriptions of the system structure. I have a hard time seeing that this system would be immune to that.
i have encountered LLMs seemingly knowing more about a system than they should because there are many similar ones in their training set; but that just led me to be extra sceptical when they pull up functions that don't exist. (I've fought LLMs about JSON libraries quite a bit)
I see "AI summaries" on github all the time. It's like a wall of text and seems to be designed to be super-verbose but without seemingly being very informative.
Cool idea, bad timing
But it seems to be producing docs that are better than I tend to see with basic "summarize this repo for me"-style prompts, which is what I usually use on a first pass.
AI must RTFM. https://passo.uno/from-tech-writers-to-ai-context-curators/
Seems like a consistent pattern.
I don't think either of these is true of LLMs. You obviously can improve their results with the right prompt + context + model choice, to a pretty large degree. The probability...hard to quantify, so I won't try. Let's just say that you wouldn't say you are addicted to your car because you have a 1% chance of being stuck in the middle of nowhere if it breaks down and 99% chance of a reward. The threshold I'm not sure.
HN is super susceptible to propaganda in the AI age unfortunately; I think at this point a lot of the comments and posts on here are from bots as well
I'm working on RecallBricks (memory infrastructure for AI coding tools) and seeing similar problems: AI tools are great at answering questions about code right now, but they don't remember the conversation you had last week about why you chose this architecture over that one.
For documentation specifically, have you thought about combining the AI-generated docs with a memory layer that captures decision history? Like "this API endpoint exists because of issue #247 where users needed X functionality." That context makes docs way more useful than just describing what the code does.
Curious how you're handling the "outdated docs" problem mentioned above - do you have triggers to regenerate when code changes significantly?
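One way such a regeneration trigger could look, as a minimal sketch only: record a fingerprint of the source tree when the docs are generated, and flag the docs as stale once the tree no longer matches. The names `source_fingerprint` and `docs_are_stale` are made up for illustration, and this fires on any change at all rather than only on "significant" ones, which would need a smarter diff.

```python
import hashlib
from pathlib import Path


def source_fingerprint(root: Path, patterns=("*.py",)) -> str:
    """Hash the relative paths and contents of all matching files under root."""
    digest = hashlib.sha256()
    # A set avoids hashing a file twice when it matches several patterns.
    files = sorted({p for pattern in patterns for p in root.rglob(pattern) if p.is_file()})
    for path in files:
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def docs_are_stale(root: Path, recorded_fingerprint: str, patterns=("*.py",)) -> bool:
    """True if the tree changed since the fingerprint was recorded at doc-generation time."""
    return source_fingerprint(root, patterns) != recorded_fingerprint
```

The fingerprint would be stored next to the generated docs, and a CI job or commit hook could call `docs_are_stale` to decide whether to regenerate.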
I really don't like how AI summaries creep up in SEO rankings and make it harder for me to find the actual, official documentation.
My repo has a plugin structure (https://github.com/ytreister/gibr), and I love how it added a section about adding a new plugin: https://deepwiki.com/ytreister/gibr/7.4-adding-a-new-issue-t...
but then as i kept going along it just got tiring, it kept calling everything sophisticated even when it wasn't
it's the same as all the other AI slop, it's really impressive the first time you see it
and then you keep seeing it and get tired of its patterns of speech etc and oh it's just making up nonsense
and now the ai slop "documentation" is up on the public internet for all to see with no way for me to remove it :)
[0] https://www.ilograph.com/blog/posts/diagrams-ai-can-and-cann...
1. On (https://github.com/voodooEntity/gits) -> https://deepwiki.com/voodooEntity/gits
This is a longterm golang project i work on and it has a very very detailed documentation already.
While going through the AI docs of deepwiki, i could see how it profited from my existing documentation, most stuff is just different words same content. What i liked about it was the visualisations (even if some of them are well "special"), it shows some insights into workflows that i have in my mind but might give a benefit to others not being the author
While trying out the search/chat i have to admit it gave better answers than i expected.
Due to having a very deep knowledge of how to do stuff efficiently with the lib, i tested the chat on telling me what's the most efficient way to achieve XYZ. While it listed me all possibilities (all of them correct) it also correctly pointed out what's the most "efficient" way.
Also i gave it some question that, i know from experience when others first tried the lib, could be confusing. But it was resolved correctly.
All in all, a pleasant result
2. On (github.com/electronicarts/CnC_Renegade/) -> https://deepwiki.com/electronicarts/CnC_Renegade/
For those who don't know, CnC Renegade is a very old game (~2000) which was coded by the original Westwood. It's mainly in C++ (some C) and a through and through plain code. There is no real documentation in the repo other than some base info for dependencies etc.
First of all i saw that the resulting documentation well.... lacked documentation i guess? It just in multiple pages explained what's in the main Readme (which is not really a lot). So from the "docs generating" perspective, no gain here.
Then i tried to chat with it about it - and it seemed like it has a basic understanding of the code. For me it's harder to validate the results (tbh i only read over the code once when it was released - curiosity) but it seemed like it was no total loss.
Conclusion: To me it seems like, to get a very good basic documentation out of it, it already must have a good basic documentation. Apart from the graphics it added, i didn't really see a gain compared to the already existing documentation.
Based on the chat results i'd say, those might be decent and helpful if you dig into a new codebase, especially a more complex one, and you are searching for a specific thing in 1000s of loc in multiple files.
Would i use it in the future? I'll maybe try, but only the chat feature - for the generated docs as elaborated i don't see any use.
deepwiki.com's generated page on my project contains several glaring errors. I hate to think of the extra support burden I will have to bear because of deepwiki.com publishing wrong information.
I asked the authors of the site (Andrew Gao) to remove their page on my project, but they ignored my request.
It's a cute little Electron-based mini Spotify player that gets maybe like 200 users a day and has 1.3k stars on GitHub. Code quality is pretty high and it's more or less "feature-complete." There's a lot of simple/typical React stuff in there, but there's also some weird stuff I had to do. For example, native volume capture is weird. But even weirder is having to mess with the Electron internal window boundaries (so people can move their Lofi window wherever they want to).
We're essentially suppressing window rect constraints using some funky Objective-C black magic[1]. The code isn't complicated[1], but it's weird and probably very specific to this use case. When I ask what "constraints" does, DeepWiki totally breaks, telling me it doesn't even have access to those source files[2] (which it does).
Visualizations were also actually disabled on MacOS a few versions ago (because of the janky way you need to hook into the audio driver), but, again DeepWiki doesn't really notice[3]. There have been issues/patch notes about this, so I feel those should be getting crawled.
[1] https://github.com/dvx/lofi/blob/master/src/native/black-mag...
[2] https://deepwiki.com/search/what-is-constraints_cc5c0478-e45...
[3] https://deepwiki.com/search/how-do-macos-visualizations-wo_d...
The only issues I have with it are that the layout is not great on small screens, with a poor experience on my 13" laptop, and I really wish you could hide the "Ask Devin" dialog. The experience is pretty good on my tablet though. I would prefer to use the tablet for reading/annotating the code and have deepwiki on the laptop, but that's not that big of a deal.
Again, hard as it might be to believe, since the situation is rather insane: flagship AI assistants cannot get publicly available code. They can get bits and pieces from the README, but that might degrade response quality as it's often based on guesswork, etc.
Example via GPT-5 Thinking, my request:
```
Can you read code from https://github.com/killerstorm/auto-ml-runner/blob/master/ru... ?
If yes, show me some port of code and how you got it.
```
(That's a normal URL a user can access from the browser; you also get the same result if you post the top-level repo URL https://github.com/killerstorm/auto-ml-runner/).
Thinking: 2 minutes. (IT WAS THINKING FOR TWO MINUTES JUST TO ACCESS ONE FILE!)
```
Short answer: yes...
Why I’m not pasting a snippet right this second:
In this environment, GitHub’s code pages are loading their chrome but not returning the file body
```
So, actually, no, it cannot read it, but it believes it can. That's rather problematic.
Claude: "Unfortunately, I cannot directly read the code from that GitHub URL."
Gemini: "While the tool was unable to retrieve the full, clean code directly, this inferred portion ...". I.e. it just imagined the code. The snippet has nothing to do with the code in the repo.
This is a rather fucktacular situation as agents are not sure if they read the code, and they might hallucinate subtly wrong code trying to be helpful.
As I can fetch this via curl, it seems like GitHub is deliberately blocking AI agents, including their partner OpenAI.
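To make the "did it actually read the file?" check concrete, here is a rough sketch of what a fetching tool could do before claiming success: rewrite the `github.com/.../blob/...` page URL to the `raw.githubusercontent.com` form and refuse to proceed if the body is empty or looks like an HTML page rather than file contents. The function names are hypothetical, the HTML check is a crude heuristic (a repo file that really is HTML would trip it), and branch names containing `/` are not handled.

```python
import urllib.request
from urllib.parse import urlparse


def to_raw_url(blob_url: str) -> str:
    """Map a github.com blob page URL to its raw.githubusercontent.com equivalent."""
    parts = urlparse(blob_url).path.strip("/").split("/")
    # Expected path shape: owner / repo / "blob" / ref / path...
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    owner, repo, _, ref, *path = parts
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"


def looks_like_html_page(body: str) -> bool:
    """Heuristic: page chrome starts with a doctype or <html> tag, source files usually don't."""
    head = body.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def fetch_source(blob_url: str) -> str:
    """Fetch the file and fail loudly instead of guessing when no real content came back."""
    with urllib.request.urlopen(to_raw_url(blob_url), timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    if not body.strip() or looks_like_html_page(body):
        raise RuntimeError("got page chrome or an empty body, not the file contents")
    return body
```

The point is only that a hard failure here is easy to detect, so an agent has no need to "infer" code it never received.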
One is the extremely sprawling MarginaliaSearch repo[M1].
Here it did a decent job of capturing the architecture, though it is to be fair well documented in the repo itself. It successfully identifies the most important components, which is also good.
But when describing the components, it only really succeeds where the components themselves are very self-contained and easy to grok. It did a decent job with e.g. the buffer pool[M2], but even then fails to define some concepts that would have made it easier to follow, e.g. what is a pin count in buffer management? This is standard terminology and something the model should know.
I get the impression it lifts a lot of its facts from the comments and documentation that already exist, which may lead it to propagate outdated falsehoods about the code.
[M1] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch
[M2] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch/5.2-b...
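For readers who haven't met the term: in buffer management, a page's pin count is the number of callers currently using it, and a page may only be evicted while its pin count is zero. A toy sketch of the idea follows; it is a generic illustration with invented names, not how the MarginaliaSearch buffer pool is implemented.

```python
class BufferPool:
    """Toy page cache where pinned pages are protected from eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pages = {}       # page_id -> page data
        self.pin_counts = {}  # page_id -> number of callers currently using the page

    def pin(self, page_id, loader):
        """Return the page, loading it if needed, and mark it as in use."""
        if page_id not in self.pages:
            if len(self.pages) >= self.capacity:
                self._evict_one()
            self.pages[page_id] = loader(page_id)
            self.pin_counts[page_id] = 0
        self.pin_counts[page_id] += 1
        return self.pages[page_id]

    def unpin(self, page_id):
        """Signal that one caller is done with the page."""
        if self.pin_counts.get(page_id, 0) <= 0:
            raise ValueError(f"page {page_id!r} is not pinned")
        self.pin_counts[page_id] -= 1

    def _evict_one(self):
        """Drop the first page nobody is using; fail if every page is pinned."""
        victim = next((pid for pid, n in self.pin_counts.items() if n == 0), None)
        if victim is None:
            raise RuntimeError("all pages are pinned; nothing can be evicted")
        del self.pages[victim]
        del self.pin_counts[victim]
```

Every `pin` needs a matching `unpin`; a forgotten `unpin` leaves the page stuck in the pool forever.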
The other is the SlopData[S1] repo, which contains a small library for columnar data serialization.
This one I wasn't very impressed with. It produced more documentation than was necessary, mostly amending what was already there with incorrect statements it seems to have pulled out of its posterior[S2][S3].
The library is very low-abstraction, and there simply isn't a lot of architecture to diagram, but the model seems to insist that there must be a lot of architecture and then produces excessive diagrams as a result.
[S1] https://deepwiki.com/MarginaliaSearch/SlopData
[S2] https://deepwiki.com/MarginaliaSearch/SlopData#storage-types (performance numbers are completely invented, in practice reading compressed data is typically faster than plain data)
[S3] https://deepwiki.com/MarginaliaSearch/SlopData/6.3-zip-packa... (the overview section is false, all these tables are immutable).
So overall it gives me a bit of a broken clock vibe. When it's right, it's great. When it isn't, it's not very useful. Good at the stuff that is already easy, borderline useless for the stuff that isn't.