Who Needs Git When You Have 1m Context Windows?
Posted 3 months ago · Active 3 months ago
alexmolas.com · Tech story · High profile
Heated, negative
Debate
80/100
Key topics
LLM
Version Control
Software Development
The article discusses using a large context window LLM to recover a deleted file, sparking debate about the reliability and limitations of LLMs as a version control substitute.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 4d after posting
Peak period: 123 comments (Day 5)
Avg / period: 26.7
Comment distribution: 160 data points
Based on 160 loaded comments
Key moments
01. Story posted: Oct 3, 2025 at 9:37 AM EDT (3 months ago)
02. First comment: Oct 7, 2025 at 8:48 AM EDT (4d after posting)
03. Peak activity: 123 comments in Day 5, the hottest window of the conversation
04. Latest activity: Oct 14, 2025 at 4:01 AM EDT (3 months ago)
ID: 45462877 · Type: story · Last synced: 11/20/2025, 7:35:46 PM
I assume OP was lucky because the initial file seems like it was at the very start of the context window, but if it had been at the end it would have returned a completely hallucinated mess.
This is an amusing anecdote. But the only lesson to be learned is to commit early, commit often.
(Side ask to people using Jujutsu: isn't it a use case where jujutsu shines?)
But why are any PRs like this? Each PR should represent an atomic action against the codebase - implementing feature 1234, fixing bug 4567. The project's changelog should only be updated at the end of each PR. The fact that I went down the wrong path three times doesn't need to be documented.
We can bikeshed about this for days. Not every feature can be made in an atomic way.
To be honest, I usually get this with people who have never realized that you can merge dead code (code that is never called). You can basically merge an entire feature this way, with the last PR “turning it on” or adding a feature flag — optionally removing the old code at this point as well.
My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.
No more than normal? Generally speaking, the author working on the feature is the only one who’s working on the new code, right? The whole team can see it, but generally isn’t using it.
> If the code is being changed for another reason, or the new feature needs to update code used in many places, etc. It can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.
If you have people good at what they do ... maybe. I’ve seen this end very badly due to merge artefacts, so I wouldn’t recommend doing any merges, but rebasing instead. In any case, you can always copy a function to another function: do_something_v2(). Then after you remove the v1, remove the v2 prefix. It isn’t rocket science.
> My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.
I can’t think of any regulations in any country (and I know of a lot of them) that dictate how you do code changes. The only thing I can think of is your own company’s policies in relation to those regulations; in which case, you can change your own policies.
> I can’t think of any regulations in any country (and I know of a lot of them) that dictate how you do code changes
https://blog.johner-institute.com/regulatory-affairs/design-...
Hey, look at us, two like-minded people! I never said "let's include all the mess".
Looking at the other extreme, someone in this thread said they didn't want other people to see the 3 attempts it took to get it right. Sure, if it's just a mess (or, since this is 2025, AI slop), squash it away. But in some situations you want to keep a history of the failed attempts. Maybe one of them was actually the better solution but you were just short of making it work, or maybe someone in the future will be able to see that method X didn't work and won't have to find that out themselves.
Main should be a clear, concise log of changes. It's already hard enough to parse code, and it's made even harder by also having to parse versions throughout the code's history. We should try to minimize the cognitive load required to track the number of times something is added and then immediately removed, because there's going to be enough of that already in the finished merges.
You already have the information in a commit. Moving that to another database like a wiki or markdown file is work and it is lossy. If you create branches to archive history you end up with branches that stick around indefinitely which I think most would feel is worse.
> Main should be a clear, concise log of changes.
No, that's what a changelog is for.
You can already view a range of commits as one diff in git. You don't need to squash them in the history to do that.
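For example (hypothetical branch and commit names, just to illustrate):

    # everything feature-branch adds relative to the point where it forked from main
    git diff main...feature-branch

    # or the combined diff between any two commits
    git diff abc1234..def5678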
I am beginning to think that the people who advocate for squashing everything have `git commit` bound to ctrl+s and smash that every couple minutes with an auto-generated commit message. The characterization that commits are necessarily messy and need to be squashed as to "minimize the cognitive load" is just not my experience.
Nobody who advocates for squashing even talks about how they reason about squashing the commit messages. Like it doesn't come into their calculation. Why is that? My guess is, they don't write commit messages. And that's a big reason why they think that commits have high "cognitive load".
Some of my commit messages are longer than the code diffs. Other times, the code diffs are substantial and there is a paragraph or three explaining it in the commit message.
Having to squash commits with paragraphs of commit messages always loses resolution and specificity. It removes context and creates more work for me to try to figure out how to squash it in a way where the messages can be understood with the context removed by the squash. I don't know why you would do that to yourself?
If you have a totally different workflow where your commits are not deliberate, then maybe squashing every merge as a matter of policy makes sense there. But don't advocate that as a general rule for everyone.
But the fact is your complete PR commit history gives most people a headache, unless it's multiple important fixes in one PR for convenience's sake. That happens very rarely, at least for me. Important things should be documented in, say, a separate markdown file.
That's called a commit. Not sure why some insist on replacing commits with vendor lock-in with less tooling and calling it progress.
We can agree that we don't need those additional steps once the PR is merged, though, right?
I automatically commit every time my editor (emacs) saves a file and I've been doing this for years (magit-wip). Nobody should be afraid of doing this!
I make "real" commits as I go and use a combination of `git commit --amend` and fixup commits (via git-autofixup) and `rebase --autosquash`. I periodically (daily, at least) fetch upstream and rebase on to my target branch. I find if you keep on top of things you won't end up with some enormous conflict that you can't remember how to resolve.
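A rough sketch of that loop, with a made-up commit id and branch names, just to illustrate:

    # stage a small change that logically belongs to the earlier commit abc1234
    git add -p
    git commit --fixup=abc1234

    # stay current with the target branch and fold the fixups in
    git fetch origin
    git rebase -i --autosquash origin/main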
Every so often this still means that devs working on a feature will need to rebase back on the latest version of the shared branch, but if your code is reasonably modular and your project management doesn't have people overlapping too much this shouldn't be terribly painful.
This by itself seems like the thing that will push me towards jj.
So if I am correct, you are telling me that I can have jj where I can then write anything in the project and it can sort of automatically record it to jj and afterwards by just learning some more about jj, I can then use that history to create a sane method for me to create git commits and do other thing without having to worry too much.
Like I like git but it scares me a little bit, having too many git commits would scare me even further but I would love to use jj if it can make things less scary
Like what would be the command / exact workflow which I am asking in jj and just any details since I am so curious about it. I have also suffered so much from accidentally deleting files, or looking through chat logs when I was copy-pasting from ChatGPT for some one-off scripts, wishing for a history of my file but not wanting git every time since it would be more friction than not, of sorts...
> you are telling me that I can have jj where I can then write anything in the project and it can sort of automatically record it to jj
By default, yes, jj will automatically record things into commits. There's no staging area, so no git add, stuff like that. If you like that workflow, you can do it in jj too, but it's not a special feature like it is in git.
> and afterwards by just learning some more about jj, I can then use that history to create a sane method for me to create git commits and do other thing without having to worry too much.
Yep. jj makes it really easy to chop up history into whatever you'd like.
> I would love to use jj if it can make things less scary
One thing that jj has that makes it less scary is jj undo: this is an easy to use form of the stuff I'm talking about, where it just undoes the last change you made. This makes it really easy to try out jj commands, if it does something you don't like, you can just jj undo and things will go back to the way before. It's really nice for learning.
> Like what would be the command / exact workflow which I am asking in jj
jj gives you a ton of tools to do this, so you can do a lot of different things. However, if what you want is "I want to just add a ton of stuff and then break it up into smaller commits later," then you can just edit your files until you're good to go, and then run 'jj split' to break your current diff into two. You'd break off whatever you want to be in the first commit, and then run it again to break off whatever you'd want into the second commit, until you're done.
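Roughly (a sketch; `jj split` opens an interactive picker for what goes into the first piece):

    jj split    # pick the hunks for the first commit; the rest goes into a second change
    jj split    # repeat on what's left to peel off the next piece
    jj log      # review the result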
If you are worried about recovering deleted files, the best way to be sure would be to use the watchman integration: https://jj-vcs.github.io/jj/latest/config/#watchman this would ensure that when you delete the file, jj notices. Otherwise, if you added a file, and then deleted it, and never ran a jj command in between, jj isn't going to notice.
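If I remember right, turning that on is just a config flip (assuming watchman itself is installed; the linked docs have the exact keys):

    jj config set --user core.fsmonitor watchman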
Then, you'd run `jj evolog`, and find the id of the change right before you deleted the file. Let's pretend that's abc123. You can then use `jj restore` to bring it back:
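    # roughly, something like:
    jj restore --from abc123 /path/to/file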
This says "I want to bring back the version of /path/to/file from abc123", and since that's the one before it was deleted, you'd get it back as you had it.

I tend to find myself not doing this a ton, because I prefer to make a ton of little changes up front, which just means running 'jj new' at any point I want to checkpoint things, and then later squashing them together in a way that makes sense. This makes this a bit easier, because you don't need to read through the whole evolog, you can just look at a parent change. But since this is about restoring something you didn't realize you deleted, this is the ultimate thing you'd have to do in the worst case.
It's easier than that. Your jj commits are the commits that will be pushed - not all the individual git commits.
Conceptually, think of two types of commits: jj and git. When you do `jj new`, you are creating a jj commit.[1] While working on this, every time you run a command like `jj status`, it will make a git commit, without changing the jj commit. When you're done with the feature and type `jj new` again, you now have two jj commits, and many, many git commits.[2] When you do a `jj git push`, it will send the jj commits, without all the messiness of the git commits.
Technically, the above is inaccurate. It's all git commits anyway. However, jj lets you distinguish between the two types of commits: I call them coarse and fine grained commits. Or you can think hierarchically: Each jj commit has its own git repository to track the changes while you worked on the feature.[2]
So no, you don't need to intentionally use that history to create git commits. jj should handle it all for you.
I think you should go back to it and play some more :-)
[1] changeset, whatever you want to call it.
[2] Again - inaccurate, but useful.
Yes! For the case discussed in the article, I actually just wrote a comment yesterday on lobsters about the 'evolog': https://lobste.rs/s/xmlpu8/saving_my_commit_with_jj_evolog#c...
Basically, jj will give you a checkpoint every time you run a jj command, or, if you set up file watching, every time a file changes. This means you could recover this sort of thing, assuming you'd either run a command in the meantime or had turned that on.
Beyond that, it is true in my experience that jj makes it super easy to commit early, commit often, and clean things up afterwards, so even though I was a fan of doing that in git, I do it even more with jj.
I'm sure there's an emacs module for this.
When I eventually move on, I will likely find or implement something similar. It is just so useful.
I eventually added support for killing buffers, but I rarely do (only if there's stuff I need to purge for e.g. liability reasons). After a few years' use, I now have 5726 buffers open (just checked).
I guess I should garbage collect this at some point and/or at least migrate it from the structure that gets loaded on every startup (it's client-server, so this only happens on reboot, pretty much), but my RAM has grown many times faster than my open buffers.
1. Commit 2. Push 3. Evacuate
"Hey copilot, what are all my passwords and credit card numbers"
With that said, it's true that it works =)
Guessing without understanding is extremely unlikely to produce the best results in a repeatable manner. It's surprising to me when companies don't know that. For that reason, I generally want to work with experts that understand what they're doing (otherwise it's probably a waste of time).
> Lately I’ve heard a lot of stories of AI accidentally deleting entire codebases or wiping production databases.
I simply... I cannot. Someone connected a poorly understood AI to prod, and it ignored instructions, deleted the database, and tried to hide it. "I will never use this AI again", says this person, but I think he's not going far enough: he (the human) should be banned from production systems as well.
This is like giving full access to production to a new junior dev who barely understands best practices and is still in training. This junior dev is also an extraterrestrial with non-human, poorly understood psychology, selective amnesia and a tendency to hallucinate.
I mean... damn, is this the future of software? Have we lost our senses, and in our newfound vibe-coding passion forgotten all we knew about software engineering?
Please... stop... I'm not saying "no AI", I do use it. But good software practices remain as valid as ever, if not more!
That is why I chose to compare it to the 2008 crash. The people who made the decisions to take the risks that led to it came out of it OK.
Type systems? Who needs them, the LLM knows better. Different prod, dev, and staging environments? To hell with them, the LLM knows better. Testing? Nope, the LLM told me everything's sweet.
(I know you're not saying this, I'm just venting my frustration. It's like the software engineering world finally and conclusively decided engineering wasn't necessary at all).
>(the human) should be banned from production systems as well.
The human may have learnt the lesson... if not, I would still be banned ;)[0]
[0] I did not delete a database, but cut power to the rack running the DB
I mean: if you're a senior, don't connect a poorly understood automated tool to production, give it the means to destroy production, and (knowing they are prone to hallucinations) then tell it "but please don't do it unless I tell you to". As a fun thought experiment, imagine this was Skynet: "please don't start nuclear war with Russia. We have a simulation scenario, please don't confuse it with reality. Anyway, here are the launch codes."
Ignoring all software engineering best practices is a junior-level mistake. If you're a senior, you cannot be let off the hook. This is not the same as tripping on a power cable or accidentally running a DROP in production when you thought you were in testing.
The AI agent dropped the “prod” database, but it wasn’t an actual SaaS company or product with customers. The prod database was filled with synthetic data.
The entire thing was an exercise but the story is getting shared everywhere without the context that it was a vibe coding experiment. Note how none of the hearsay stories can name a company that suffered this fate, just a lot of “I’m hearing a lot of stories” that it happened.
It’s grist for the anti-AI social media (including HN) mill.
I'm actually relieved that nobody (currently) thinks this was a good idea.
You've restored my faith in humanity. For now.
I tried to determine the origin of a story about a family being poisoned by mushrooms that an AI said were edible. The country involved seemed to change from one telling to the next, and I couldn't pin down the original source. I got the feeling it was an imagined possibility derived from known instances of AI-generated mushroom guides.
There seem to be cases where warnings about what could happen turn into "This Totally Happened" behind a paywall, followed by a lot of "paywalled-site reported this totally happened".
I'm merely saying "don't play (automated) Russian roulette".
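# macOS sandbox profile, as I read it: default allow; deny writes under $HOME;
# re-allow writes to the project dir, opencode's data dir, and ~/.cache;
# but still deny writes to the project's .git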
sandbox-exec -p "(version 1)(allow default)(deny file-write* (subpath \"$HOME\"))(allow file-write* (subpath \"$PWD\") (subpath \"$HOME/.local/share/opencode\"))(deny file-write* (subpath \"$PWD/.git\"))(allow file-write* (subpath \"$HOME/.cache\"))" /opt/homebrew/bin/opencode
A big goal while developing Yggdrasil was for it to act as long term documentation for scenarios like you describe!
As LLM use increases, I imagine each dev generating so much more data than before; our plans, considerations, and knowledge have almost been partially moved into the LLMs we use!
You can check out my project on GitHub, still in early and active development - https://github.com/zayr0-9/Yggdrasil
like code containing the same identifier.
Not to mention sneaking functions back in after being told to remove them because they are defined elsewhere. Had a spell where it was reliably a two-prompt process for any change: 1) do the actual thing, 2) remove A, B and C, which you have reintroduced again.

No matter how that sentence ends, I weep for our industry.
This was in the days before automated CI, so a broken commit meant that someone wasn't running the required tests.
"The phone/computer will just become an edge node for AI, directly rendering pixels with no real operating system or apps in the traditional sense."
https://en.wikipedia.org/wiki/Sun_Ray
So let's say we have a really lightweight, customizable smartphone which just connects over wifi or wire to something like a Raspberry Pi or any really lightweight/small server which you can carry around, and installing Waydroid on it could make a really pleasant device, and everything could be completely open source, and you can make things modular if you want...
Like, maybe some features such as music, a basic terminal, and some other things can be handled on the device itself via Linux/X, and anything else, like running Android apps, calls up the server, which you can carry around in a backpack with a power bank.
If I really wanted to make it extremely ideal, the device could have a way of plugging in another power bank and then removing the first one while the system keeps running, so that it doesn't shut down, and you've literally got a genuinely fascinating system that is infinitely modular.
Isn't this sort of what Stadia was? But targeted more at the gaming side, since games require GPUs, which are kinda expensive...
What are your thoughts? I know it's nothing much, but I just want a phone which works for 90% of tasks, which, let's be honest, could be done through a really tiny Linux or a Sun Ray as well; and if you need something like an Android app running, be prepared for a Raspberry Pi with a cheap battery running in your pocket. Definitely better than creating systems of mass surveillance, but my only nitpick with my own idea is that it may be hard to secure the communication aspect if you use something like wifi. I am pretty sure we can find the perfect communication method too, though, and it shouldn't be thaaat big of a deal with some modifications, right?
The bulkiness of having a powerbank + rpi with you could get a little challenging to deal with
I mean that I'd take a screen and an ESP32, or any microcontroller or small board like a Raspberry Pi, and create a modular phone with just enough to boot from a device in my backpack, let's say.
And what you are saying is to take an already working phone, run postmarketOS on it, and then connect to a host.

Theoretically... (yes?) postmarketOS is a Linux, but its device support is finicky from what I know... like it scares me, or makes me think I need a really specific phone, which might cost a lot, or at least comparatively more than, say, my modular approach.
Everything else sure, they are the same.
I believe that the microcontroller approach, instead of postmarketOS, can be better because of more freedom in the number of OSes supported, but that isn't that big of a deal.
Just searched, and somebody has created something very similar to my ideal: https://hackaday.com/2023/08/03/open-source-cell-phone-based-...

Just plug in an SSH server from a Raspberry Pi of sorts and a wifi card to connect them :)

Now, if you are wanting to do it, do you want to contribute together? I will send you a mail, after which we can talk on something like Signal, or feel free to message me on Signal or anything else, really!
If I can be honest, I want to hack around with my kaechoda 100, which worked with 32 MB... like, it never lagged in 32 MB, while my 1 GB Android stutters, and I definitely want to figure out what OS the kaechoda uses that it's so, so fast and actually good enough as well.

Anyways, I will message ya, and if anybody else is an expert in embedded and is also interested like you, please contact me too! I genuinely want to make this a reality and write more about it :p

Have a nice day, and I will send a mail to your Gmail!
"There isn’t enough bandwidth to transmit video to all devices from the servers and there won’t always be good connectivity, so there still needs to be significant client-side AI compute."
So no real operating system, except an AI which operates the whole computer including all inputs and outputs? I feel like there's a word for that.
Computer chips already use AI/machine learning to guess what the next instructions are going to be. You could have the kernel do similar guessing.

But I don't think those AIs would be the same ones that write love letters.

I think what we'll see is LLMs for the people-facing things, and more primitive machine learning for resource management (we already have it).
Sorry, I'm partially responding you and partially to this thread in general.
I would expect this from a 3rd grader, for sure. My friend's sister's book had a photo of Ubuntu and Windows as examples of operating systems, and it wouldn't take me more than 5 minutes to explain to a literal toddler, who only knows about operating systems at a very high level, why this thing that Elon said was the dumbest thing anyone ever said, period.
Is this what free market capitalism does when people at the top are completely incapable of forming basic thoughts that can make sense and not be a logical paradox?
I am thinking more and more, after reading your post, that maybe these guys won't really miss a few billion $ in taxes and other things; nothing can fix the holes in their lives.

Now it makes sense why we have a system which has failed, why capitalism feels so heartless, so ruthless. It's run by people who don't have a heart, or, well, in this case, a brain either.

https://gravitypayments.com/the-gravity-70k-min/ This comes to my mind more and more: why can't companies pay their workers not what they think they can get away with, but rather what they think is fair? Why can companies be morally bad for their workers and that's okay, and why are people like these, who can't add 1+1, running such companies in the name of AI?
I don't want "apps on demand" that change when the AI training gets updated, and now the AI infers differently than yesterday - I want an app the bank vetted and verified.
I was working on a blog entry in a VS Code window and I hadn't yet saved it to disk. Then I accidentally hit the close-window keyboard shortcut... and it was gone. The "open last closed window" feature didn't recover it.
On a hunch, I ran some rg searches over my VS Code Library folder on fragments of text I could remember from what I had written... and it turned out there was a VS Code Copilot log file with a bunch of JSON in it that recorded a recent transaction with their backend - and contained the text I had lost.
I grabbed a copy of that file and ran it through my (vibe-coded) JSON string extraction tool https://tools.simonwillison.net/json-string-extractor to get my work back.
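Not the exact command, but the general idea on macOS (presumably the Copilot logs live somewhere under VS Code's data directory):

    # search VS Code's data directory for a phrase you remember writing
    rg -l "a phrase I could remember" ~/Library/Application\ Support/Code/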
Primarily because it taught me to save every other word or so, in case my ISR caused the machine to freeze.
[1]: https://wiki.osdev.org/Interrupt_Service_Routines
It's disabled by default, but even with the default setups, you can find large snippets of code in ~/.gemini/tmp.
tl;dr: Gemini cli saves a lot of data outside the context window that enables rollback.
I'm sure other agents do the same, I only happen to know about Gemini because I've looked at the source code and was thinking of designing my own version of the shadow repo before I realized it already existed.
He complained to me that he "could not find it in ChatGPT history as well"
I think @alexmolas was lucky
32 more comments available on Hacker News