Mergiraf: Syntax-Aware Merging for Git
Posted 2 months ago · Active about 2 months ago
lwn.net · Tech · story
Key topics
Git
Merge Conflicts
Syntax-Aware Merging
Mergiraf is a syntax-aware merging tool for Git that can simplify developers' lives by automatically resolving merge conflicts, and the community is generally enthusiastic about its potential benefits.
Snapshot generated from the HN discussion
Discussion activity (based on 45 loaded comments)
- Story posted: Nov 3, 2025 at 9:54 AM EST
- First comment: Nov 3, 2025 at 10:10 AM EST (15m after posting)
- Peak activity: 30 comments in Day 10
- Latest activity: Nov 17, 2025 at 4:55 AM EST
ID: 45799664 · Type: story · Last synced: 11/20/2025, 8:47:02 PM
The example shown reminds me of Zed's CRDTs [1] and their journey to build a fine-grained version control system for agentic development [2]. I imagine this work could prove useful to the Zed/Cursor team, and it likely shares a lot of functionality with DeltaDB [2].
- [1]: https://zed.dev/blog/crdts
- [2]: https://zed.dev/blog/sequoia-backs-zed
It’s really cool to see tree-sitter unlock so many of these use cases. I love using [difftastic] as my diffing tool to get context-aware diffs. In the example from the article, the diff would highlight the `void` and `int` changes with a heavier background of red and green respectively.
[difftastic]: https://github.com/Wilfred/difftastic
But curiously Zed hasn't been very interested in Tree-sitter. They don't seem to see it as having much strategic value to their company, which is odd because lots of other people do see it as a valuable platform. You have Tweag building code formatting on it, you had GitHub building stack graphs on it, you have Mergiraf. You even have some really "out there" stuff like the Software Evolution Library!
Zed doesn't want to build a semantic IDE. They've said it a million times, they want to build a text editor, so they just aren't going to put the tree representation at the center of the experience. A text editor's UX is built around the text buffer so that it emulates the experience of coding while sitting at a typewriter filling out punch cards. We can do better than the typewriter as the anchoring metaphor for all UX!
I think those projects I listed that build on top of Tree-sitter (all ignored by Zed) all see the potential of semantic changes and of Tree-sitter as a platform for making them.
I don't mean a standalone syntax highlighter, I mean it's a whole environment in which you can write software and in which things integrate. An Integrated Development Environment.
But Zed doesn't want that product. That product, if they cared that they owned it, would compete with Zed
Well done.
Actually I've done this a hundred times now and it has yet to make a single mistake. I don't give a crap how much GPU it uses, grandpa.
First, let me pull up the diff and git status
......
....
...
.
Hmm, that didn't quite work, let me try that again!
I've been using 1-arg-1-line to avoid most conflicts
Instead of
do
It's pretty hideous in this example, but for bigger queries maintained over a long period of time it can be beneficial. I assume; it's been nearly 20 years since I did anything more serious with SQL.
> Heatmap color-codes every diff line/token by how much human attention it probably needs. Unlike PR-review bots, we try to flag not just by “is it a bug?” but by “is it worth a second look?” (examples: hard-coded secret, weird crypto mode, gnarly logic).
https://0github.com/
That's to be expected. The philosophy behind git merges is that it will merge only if it is absolutely and unambiguously sure that the resolution is correct, which is the case when there is only one possible solution for the merge. It will just throw its hands up and leave it to the developer if there is any ambiguity, meaning more than one way to do the merge.
Every single chunk of merge is a potential conflict. But have you ever contested the regular merge algorithm (ort by default) when it did work? Like when the merge was fully successful, or the successfully merged chunks within a conflicted merge? You can expect the same experience with any merge algorithm that sticks to the git philosophy of being a git [1]. Problems will happen only if they start using some complex heuristics or LLM or something unpredictable like that for the merge.
> It's automatically solved about 70% of my conflicts
At the risk of explaining the obvious, I'm going to try to explain this. (So please don't get angry at me if you already know this.) Imagine that you're trying to manually merge 2 branches without any sort of merge algorithm. For the first case, just assume that you don't know the programming language (imagine that it's in some foreign script). All you have to go by is the record of when each line was added in each branch. The best 'dumb' strategy you have to go with, is the 3-way merge [2]. The referenced page illustrates this. It clearly shows you the advantage of the 3-way merge algorithm over the traditional 2-way merge that we all are familiar with.
But this method still has a disadvantage. You are looking at the source files simply as a bunch of lines, without knowledge of their more granular structures like the syntax. (Note: That assumption itself may be wrong. That's why merges, and git in general, don't work well on binary files.) At best, all you can hope for is that the two branches don't contain any edits on the same or adjacent lines. If they do, you won't even know the order in which the lines should be arranged. Now you have a conflict: a merge that you're leaving for someone else to solve.
Now assume a second case. You know the programming language this time. But you have no idea what the program does - it's not your project. Even with that limitation, you'll still be able to do a better job than just comparing the lines blindly. The Mergiraf docs have a page full of these examples [3]. You can see how obvious the merges look - there is no way you can go wrong. See if you can resolve them just by looking at the lines. That's why mergiraf resolves more merges than the line-based approach without introducing errors.
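To make the difference between the two cases concrete, here is a minimal sketch in Python. It is not how git or Mergiraf are implemented: `merge3` is a hypothetical helper built on `difflib`, the C declaration is an invented example, and whitespace-separated tokens stand in for the syntax-tree nodes a real syntax-aware tool would use. The same naive 3-way merge runs twice, once on whole lines and once on tokens.

```python
from difflib import SequenceMatcher


def merge3(base, ours, theirs):
    """Naive 3-way merge of three lists (of lines, tokens, ...).

    Returns (merged, conflict_count). A chunk is merged automatically only
    when at most one side changed it, or both sides made the same change.
    """

    def kept(side):
        # Map index in base -> index in side for elements the side left alone.
        mapping = {}
        matcher = SequenceMatcher(None, base, side, autojunk=False)
        for b_start, s_start, size in matcher.get_matching_blocks():
            for k in range(size):
                mapping[b_start + k] = s_start + k
        return mapping

    in_ours, in_theirs = kept(ours), kept(theirs)
    merged, conflicts = [], 0
    b = o = t = 0  # start of the current chunk in base / ours / theirs

    while True:
        # Advance to the next base element that BOTH sides kept (a sync point).
        sync = b
        while sync < len(base) and not (sync in in_ours and sync in in_theirs):
            sync += 1
        at_end = sync == len(base)
        o_end = len(ours) if at_end else in_ours[sync]
        t_end = len(theirs) if at_end else in_theirs[sync]

        chunk_b, chunk_o, chunk_t = base[b:sync], ours[o:o_end], theirs[t:t_end]
        if chunk_o == chunk_t:      # both sides agree
            merged += chunk_o
        elif chunk_o == chunk_b:    # only theirs changed this chunk
            merged += chunk_t
        elif chunk_t == chunk_b:    # only ours changed this chunk
            merged += chunk_o
        else:                       # both changed it differently: give up
            conflicts += 1
            merged += ["<<<<<<<", *chunk_o, "=======", *chunk_t, ">>>>>>>"]

        if at_end:
            return merged, conflicts
        merged.append(base[sync])
        b, o, t = sync + 1, o_end + 1, t_end + 1


# Invented example: one side changes the return type, the other adds a parameter.
BASE = "void notify(int code);"
OURS = "int notify(int code);"
THEIRS = "void notify(int code, char *msg);"

# Case 1, lines only: the single line was edited on both sides.
print(merge3([BASE], [OURS], [THEIRS])[1])   # prints 1 (one conflict)

# Case 2, finer granularity: the two edits touch different tokens.
tokens, n = merge3(BASE.split(), OURS.split(), THEIRS.split())
print(n, " ".join(tokens))   # prints: 0 int notify(int code, char *msg);
```

The merge rule is identical in both runs; only the unit of comparison changes. That is the whole trick: the finer the structure you can see, the fewer edits look like they collide.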
There is of course a deeper level of knowledge - the semantic level. The knowledge of what the program does. You need that knowledge to resolve 100% of the merges. And that ultimate merge algorithm is ... you.
> Pretty pleased.
Understandable. But I see a potential problem here. As you are aware, the files to submit to mergiraf are specified in the gitattributes file. There are two ways this can go wrong. First, someone else with your repo may not have or even know about mergiraf. The second, even bigger problem is that some people have global gitattributes files [4] where you place your default attributes. It's possible to set up mergiraf there. But if you do so, your colleagues may not even get a clue as to why certain merges succeed for you, but fail for all of them.
The above problem becomes a bigger issue because merge and rebase conflicts sometimes reappear in later merges or rebases. If that's something mergiraf can solve and you have it, then everything's fine. But if the conflict reappears for someone without mergiraf, they will have to repeat the manual resolution again and again. This happens because git simply won't commit a merge or rebase until we resolve the conflict manually. Therefore, git has no idea what we did in between to resolve it - that is not recorded anywhere. (Well, git-rerere [5] records it if we ask it to. But that's a local-only solution. Everyone will have to do it once on their system.)
There is actually a known solution to the problem. It's called 'first class conflicts' [6]. The idea is to record the conflicts and their resolutions in the repo itself (the same info that rerere stores, but in the shared repo). This means that a conflict once resolved will not come back again, because the structured information to resolve it is available in the repo. This means not everyone needs mergiraf and nobody needs to repeat a completed manual resolution. It has other advantages too. You can just continue working after a conflicted merge and leave the resolution for later. Or you could send the conflicts to someone else more specialized in that area of the code.
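For anyone who hasn't seen the setup, here is a sketch of the two pieces involved, assuming a Java repo. The `merge=` attribute and the `[merge "<name>"]` section are standard Git. The driver command line is what I recall from Mergiraf's usage docs, so treat it as an assumption and check the current docs before copying it.

```
# .gitattributes (committed with the repo) or your global attributes file:
# send Java files to a merge driver named "mergiraf"
*.java merge=mergiraf

# ~/.gitconfig (per user, never shared through the repo): define that driver
[merge "mergiraf"]
    name = mergiraf
    driver = mergiraf merge --git %O %A %B -s %S -x %X -y %Y -p %P -l %L
```

The split is the source of the problem described above: the attribute can travel with the repo, but the driver definition lives in each user's own config. As far as I know, Git quietly falls back to its built-in text merge when the named driver isn't defined, so a colleague without the second half gets no warning, just more conflicts.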
I have seen this feature in Jujutsu [6] and Pijul [7]. Git doesn't have it probably because this wasn't around when it was developed. But Jujutsu uses git repository format and they somehow managed to implement first-class conflicts on it. Meanwhile, the concept is already there in git as rerere. So perhaps first-class conflicts are possible in Git too. It would be awesome if we had that in Git too. So if anybody who sees this knows how to do it, please please take it up as a wish!
[1] https://github.com/git/git/blob/e83c5163316f89bfbde7d9ab23ca...
[2] https://blog.git-init.com/the-magic-of-3-way-merge/
[3] https://mergiraf.org/conflicts.html
[4] https://git-scm.com/docs/git-config#Documentation/git-config...
[5] https://git-scm.com/docs/git-rerere
[6] https://jj-vcs.github.io/jj/latest/conflicts/
[7] https://pijul.com/manual/why_pijul.html#modeling-conflicts
Depends on what you mean by 'contested', but yes. You can have "merge conflicts", that are even correct as far as the syntax is concerned, but are garbage on a semantic level.
https://codeberg.org/mergiraf/mergiraf/issues/612#issuecomme...
So I think in a more sensible language you might get much better results than this.
That means that if I, the programmer, care about the order, I must now review lines where no merge conflict is indicated. I am not sure I would like that.
Import order would have been a better example (they're always supposed to be sorted).
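As a rough sketch of why imports are the friendlier case: if the order carries no meaning and the block is kept sorted anyway, a 3-way merge reduces to set arithmetic and nothing is left to review. This is only an illustration under that assumption; `merge_unordered` is a hypothetical helper, not Mergiraf's actual implementation.

```python
def merge_unordered(base, ours, theirs):
    """3-way merge for items whose order carries no meaning (e.g. imports).

    Anything either side removed stays removed, anything either side added
    is kept, and the result is sorted so the output is deterministic.
    """
    base, ours, theirs = set(base), set(ours), set(theirs)
    removed = (base - ours) | (base - theirs)
    added = (ours - base) | (theirs - base)
    return sorted((base - removed) | added)


base = ["import os", "import sys"]
ours = ["import json", "import os", "import sys"]   # added json
theirs = ["import os", "import re"]                 # added re, dropped sys

print(merge_unordered(base, ours, theirs))
# prints ['import json', 'import os', 'import re']
```

The same function applied to something order-sensitive (statements in a function body, say) would silently reorder code, which is exactly the concern raised above.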
It could be so much better.