Code Formatting Comes to uv, Experimentally
Posted 4 months ago · Active 4 months ago
Source: pydevtools.com · Tech story · High profile · Heated, mixed debate (80/100)
Key topics: Python, uv, Code Formatting, Tooling
The introduction of experimental code formatting to the uv Python package manager has sparked a heated discussion on HN, with some users welcoming the feature and others criticizing it as feature creep.
Snapshot generated from the HN discussion
Discussion activity (very active)
- First comment: 29m after posting
- Peak period: 43 comments in the 0-2h window
- Average per period: 10.7
- Based on 160 loaded comments
Key moments
1. Story posted: Aug 21, 2025 at 4:26 PM EDT
2. First comment: Aug 21, 2025 at 4:56 PM EDT (29m after posting)
3. Peak activity: 43 comments in the 0-2h window
4. Latest activity: Aug 22, 2025 at 10:35 PM EDT
ID: 44977645 · Type: story · Last synced: 11/20/2025, 8:28:07 PM
Have developers really been waiting for this? What's wrong with ruff format?
So maybe nobody has been waiting for this and the feedback will be: we don’t need this.
Also it uses ruff under the hood. If it’s integrated with uv, an advantage is one less tool to juggle.
Being able to just run `go fmt`, without needing any additional tools, is a really great thing about the language.
The only danger that I can see (and it is a big one) is that Python doesn't have the same kind of consistency that go does. It might end up failing just because so many people use Python in such different ways.
uv is trying to deliver the same experience as cargo does for rust. Just install a single CLI and it's the only tool you have to worry about when working on your Python project.
There's nothing wrong with `ruff format`, but it's nice to have `uv` as a universal entrypoint.
I don't know much about go.
https://docs.astral.sh/uv/concepts/build-backend/
"uv build" makes .wheel files, so it is analogous to "cargo publish" (which makes .crate files) as opposed to "cargo build"
I would call this a packaging tool as opposed to a build system.
This isn't exactly right: `uv build` executes a PEP 517[1] build backend interface, which turns a Python source tree into installable distributions. Those distributions can be sdists or wheels; in this sense, it's closer to `cargo build` than `cargo publish`.
The closer analogy for `cargo publish` would be `uv publish`, which also already exists[2]: that command takes an installable distribution and uploads it to an index.
TL;DR: `uv build` is a proxy for a build system, because that's how distribution construction is standardized in Python. I would not say it's analogous to `cargo publish`, since it's responsible for builds, not publishes.
[1]: https://peps.python.org/pep-0517/
[2]: https://docs.astral.sh/uv/guides/package/#publishing-your-pa...
I guess it gets more complicated with .whl since those can contain build artifacts as well and not just build instructions.
It's true that `cargo publish` can also upload your .crate files to a remote repository, while uv breaks that functionality out into a separate command, `uv publish`, but I think that's neither here nor there on the difference between bundling the source and building the source.
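For concreteness, a wheel is just a zip archive with a standardized layout: the package code plus a `*.dist-info/` metadata directory. A minimal sketch, building a hypothetical `demo` wheel in memory with only the standard library:

```python
import io
import zipfile

# A .whl file is a zip archive: built artifacts (the package itself) alongside
# a dist-info directory holding METADATA and WHEEL files. "demo" is a made-up
# package name for illustration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as whl:
    whl.writestr("demo/__init__.py", "VERSION = '0.1'\n")   # the built artifact
    whl.writestr("demo-0.1.dist-info/METADATA",
                 "Metadata-Version: 2.1\nName: demo\nVersion: 0.1\n")
    whl.writestr("demo-0.1.dist-info/WHEEL",
                 "Wheel-Version: 1.0\nRoot-Is-Purelib: true\n")

with zipfile.ZipFile(buf) as whl:
    names = sorted(whl.namelist())
print(names)
```

Because a wheel carries built artifacts (possibly compiled extensions) rather than build instructions, installing one never runs the project's build backend.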
I am not sure I like this either. I think linter and formatter are more like developer dependencies, especially because both formatter and linters are generally things that you want to lock to a specific version to avoid e.g. CI failures or mass changes when a version is updated. But I can understand that having a formatter always available may be nice.
You cite good examples where other languages have chosen to standardise tooling. We can discuss the pros and cons of that choice. But it is a choice, and Python already made a different choice.
> There should be one-- and preferably only one --obvious way to do it.
- PEP20 - The Zen of Python
But it can't in general build torch-tensorrt or flash-attn, because it has no way of knowing whether Mercury was in retrograde when you ran pip. They are trying to thread a delicate and economically pivotal needle: the Python community prizes privatizing the profits and socializing the costs of "works on my box".
The cost of making the software deployable, secure, repeatable, reliable didn't go away! It just became someone else's problem at a later time in a different place with far fewer degrees of freedom.
Doing this in a way that satisfies serious operations people without alienating the "works on my box...sometimes" crowd is The Lord's Work.
This is a self-inflicted wound, since flash-attn insists on building a native C++ extension, which is completely unnecessary in this case.
What you can do is the following:
1) Compile your CUDA kernels offline.
2) Include those compiled kernels in a package you push to PyPI.
3) Call into the kernels with pure Python, without going through a C++ extension.
I do this for the CUDA kernels I maintain and it works great.
Flash attention currently publishes 48 (!) different packages[1], for different combinations of pytorch and C++ ABI. With this approach it would have to only publish one, and it would work for every combination of Python and pytorch.
[1] - https://github.com/Dao-AILab/flash-attention/releases/tag/v2...
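The third step — calling prebuilt compiled code from pure Python — can be illustrated with any shared library via ctypes; here libm stands in for a compiled kernel library (a toy analogy, not flash-attn's or the commenter's actual code):

```python
import ctypes
import ctypes.util

# Load a prebuilt shared library and call into it from pure Python.
# No C++ extension means nothing to compile on the user's machine:
# the same import works across Python and (in the CUDA case) pytorch versions.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

def call_compiled(x: float) -> float:
    """Dispatch straight into the compiled library, no build step required."""
    return libm.cos(x)

print(call_compiled(0.0))  # → 1.0
```

For real GPU kernels the loaded library would be the precompiled kernel package shipped to PyPI, but the dispatch mechanism is the same.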
flash-attn in particular has its build so badly misconfigured, and is so heavy, that it will lock up a modern Zen 5 machine with 128GB of DDR5 if you don't re-nice ninja (assuming, of course, that you remembered it just won't work without a pip-visible ninja). It can't build a wheel (at least not obviously) that will work correctly on both Ampere and Hopper, and it incorrectly declares its dependencies, so it will demand torch even if torch is in your pyproject.toml and you end up breaking build isolation.
So now you've got your gigabytes of fragile wheel that won't run on half your cards, let's make a wheel registry. Oh, and machine learning everything needs it: half of diffusers crashes at runtime without it. Runtime.
The dirty little secret of these 50MM offers at AI companies is that way more people understand the math (which is actually pretty light compared to say graduate physics) than can build and run NVIDIA wheels at scale. The wizards who Zuckerberg will fellate are people who know some math and can run Torch on a mixed Hopper/Blackwell fleet.
And this (I think) is Astral's endgame. I think pyx is going to fix this at scale and they're going to abruptly become more troublesome to NVIDIA than George Hotz or GamersNexus.
I think it's a combination of historical factors and contemporary misaligned incentives both in the small and the large. There are also some technical reasons why Python is sort of an "attractive nuisance" for really problematic builds.
The easy one that shouldn't be too controversial is that it has a massive C/C++ (and increasingly Rust) native code library ecosystem. That's hard to do under the best of circumstances, but it's especially tough in Python (paradoxically because Python is so good at this: wrapping a proven fast library is so easy that you do it all the time). In the absence of really organized central planning and real "SAT Solver Class" package managers (like `uv`, not like `pip`), a mess is more or less just nature taking its course. That's kinda how we got here (or how we got to 2016, maybe).
But lots of language ecosystems predate serious solvers and other modern versioning, why is Python such a conspicuous mess on this in 2025? How can friggin Javascript have it together about 100x better?
That's where the bad incentives kick in. In the small, there is a lingering prestige attached to "AI Researcher" that makes about zero sense in a world where we're tweaking the same dozen architectures and the whole game is scaling it, but that's the way the history went. So people who need it to work once to write a paper and then move on? `pip freeze` baby, works on my box. Docker amplifies this "socialize the costs" thing because now you can `pip freeze` your clanky shit, spin it up on 10k Hopper cards, and move on. So the highest paid, most regarded, most clout-having people don't directly experience the pain, it's an abstraction to them.
In the large? If this shit worked then (hopefully useful oversimplification alert) FLOPs would be FLOPs. The LAPACK primitives and even more modern GEMM instructions can be spelled some fast way on pretty much any vendor's stuff. NVIDIA is usually ahead a word-shrink or two, but ROCm in principle supports training at FP8, and on CDNA (expensive) cards it does; on RDNA (cheap) cards it says it does on the label but crashes under load, so you can't use it if your time is worth anything.
The big labs and FAANGs are the kind of dark horse here. In principle you'd assume Meta would want all their Torch code to run on AMD, but their incentives are complicated, they do a lot of really dumb shit that's presumably good for influential executives because it's bad for shareholders. It's also possible that they've just lost the ability to do that level of engineering, it's hard and can't be solved by numbers or money alone.
2. If you are publishing research, any time spent beyond what's necessary for the paper is a net loss, because you could be working on the next advancement instead of doing all the work of making the code more portable and reusable.
3. If you're trying to use that research in an internal effort, you'll take the next step to "make it work on my cloud", and any effort beyond that is also a net loss for your team.
4. If the amount of work necessary to prototype something that you can write a paper with is 1x, the amount of work necessary to scale that on one specific machine configuration is something like >= 10x, and the amount of work to make that a reusable library that can build and run anywhere is more like 100x.
So it really comes down to - who is going to do the majority of this work? How is it funded? How is it organized?
This can be compared to other human endeavours, too. Take the nuclear industry and its development as a parallel. The actual physics of nuclear fission is understood at this point and can be taught to lots of people. But to get from there to building a functional prototype reactor is a larger project (10x scale), and then scaling that to an entire powerplant complex that can be built in a variety of physical locations and run safely for decades is again orders of magnitude larger.
There's a reason why even among the most diehard Linux users very few run Gentoo and compile their whole system from scratch.
In this instance the build is way nastier than building the NVIDIA toolchain (which Nix can do with a single line of configuration in most cases), and the binary artifacts are almost as broken as the source artifact because of NVIDIA tensor core generation shenanigans.
The real answer here is to fucking fork flash-attn and fix it. And it's on my list, but I'm working my way down the major C++ packages that all that stuff links to first. `libmodern-cpp` should be ready for GitHub in two or three months. `hypermodern-ai` is still mostly a domain name and some scripts, but they're the scripts I use in production, so it's coming.
If I'm going to invest time here then I'd rather just write my own attention kernels and also do other things which Flash Attention currently doesn't do (8-bit and 4-bit attention variants similar to Sage Attention, and focus on supporting/optimizing primarily for GeForce and RTX Pro GPUs instead of datacenter GPUs which are unobtanium for normal people).
uv is already a complex tool, but it has excellent documentation. Adding a straightforward, self-explanatory subcommand like this doesn’t seem out of place.
It's simple:
- Everything I use and I see: good
- Everything I don't use and I see: bloat
- Everything I use and I don't see: good
- Everything I think I don't use and I don't see: Java
This smacks of feature-creep and I won't be incorporating it into any pipelines for the foreseeable future.
On the other hand, both `ruff` and `ty` are about code style. They both edit the code, either to format or fix typing / lint issues. They are good candidates to be merged.
Cargo cargo cult?
I prefer the plugin system. I don't like god programs like what the npm monstrosity became.
I remember David Beazley mentioning that code with Makefiles was relatively easier to analyze, based on ~1 terabyte of C++ code and no internet connection (PyCon 2014): https://youtube.com/watch?v=RZ4Sn-Y7AP8
The inverse would be no one is allowed to create any projects that you don’t personally agree with.
You can use uv without ruff. You can use ruff without uv. You can invoke ruff yourself if that’s what you want. Or use any other formatter.
I don’t think I understand what your complaint is.
However, to answer the question generally: people want this for the same reason that most people call `cargo fmt` instead of running rustfmt[1] directly: it's a better developer experience, particularly if you don't already think of code formatting as an XY-type problem ("I want to format my code, and now I have to discover a formatter" versus "I want to format my code, and my tool already has that").
[1]: https://github.com/rust-lang/rustfmt
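The "one entrypoint, many subcommands" design uv and cargo follow can be sketched as a namespaced CLI; `mytool` and its subcommands below are invented for illustration:

```python
import argparse

# Toy sketch of a namespaced-subcommand CLI: one memorable entrypoint name,
# plain verb-like subcommand names underneath it (cf. `uv format`, `cargo fmt`).
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mytool")
    sub = parser.add_subparsers(dest="command", required=True)
    fmt = sub.add_parser("format", help="reformat source files")
    fmt.add_argument("--check", action="store_true",
                     help="report what would change without writing files")
    sub.add_parser("run", help="run the project")
    return parser

args = build_parser().parse_args(["format", "--check"])
print(args.command, args.check)  # → format True
```

The discoverability win is that `mytool --help` enumerates every capability; with separate tools, each one must be found and learned independently.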
All 3 scenarios benefit from removing the choice of build tool, package manager, venv manager, formatter, linter, etc., and saying, "here, use this and get on with your life".
And why is any of this relevant to first-time Python learners? (It's already a lot to ask that they have to understand version control at the same time that they're learning specific language syntax along with the general concept of a programming language....)
It requires less knowledge at the front end, which is when people are being bombarded with a ton of new things to learn.
Learners don’t have to even know what ruff is immediately. They just know that when they add “format” to the command they already know, uv, their code is formatted. At some later date when they know Python better and have more opinions, they can look into how and why that’s accomplished, but until then they can focus on learning Python.
uv isn’t a package manager only, its best thought of as a project manager, just like go or cargo. Its “one thing” is managing your Python project.
Maybe I'm wrong about that. But I don't know that it can actually be A/B tested fairly, given network effects (people teaching each other or proselytizing to each other about the new way).
Let's imagine you're learning a new language whose tools have names I just made up. Which has a clearer pattern to you?
Clearly, the first is easier to remember. The benefit of namespacing is that you only need to remember one weird/unique name, and everything under it can have normal names. Without namespacing, you need a weird/unique name for every tool. But most importantly, apart from breaking away from "UNIX-philosophy tools", what do you lose in practical terms?
uv is like cargo for python.
If you only need a fast type checker you can just use ty, if you just need a fast formatter and linter you can just use ruff.
Combining ruff and ty doesn't make sense if you think about it like this.
My understanding was that uv is for installing dependencies (e.g. like pip) with the added benefit of also installing/managing python interpreters (which can be reasonably thought of as a dependency). This makes sense. Adding more stuff doesn't make sense.
Think npm / go / cargo, not apt/yum/pip.
Given uv is openly strongly inspired by cargo and astral also has tooling for code formatting, the integration was never a question of “if”.
I am not against auto formatters in general, but they need to be flexible and semantically aware. A log call is not the same as other calls in significance. If the auto formatter is too silly to do that, then I prefer no auto formatter at all and keep my code well formatted myself, which I do anyway while I am writing the code. I do it for my own sake and for anyone who comes along later. My formatting is already pretty much standard.
Translating this to uv, this will streamline having multiple python packages in the same directory/git repo, and leave e.g. vendored dependencies alone.
Also, since their goal really is "making cargo for Python", it will likely support package-scoped ruff config files, instead of being file- or directory-based.
The analogy would be to Cargo: `cargo fmt` just runs `rustfmt`, but you can also run `rustfmt` separately if you want.
If you're writing microservices, the Rust ecosystem sells itself at this point.
I'd like to hear more about that. I'm also curious what makes Rust particularly suited to "API-first" backends. My understanding of the language is that its concurrency primitives are excellent but difficult to learn, and that its runtime is GC-less.
They're actually incredibly easy to learn if your software paradigm is the request-response flow.
The borrow checker might kill your productivity if you're writing large, connected, multi-threaded data structures, but that simply isn't the nature of 90% of services.
If you want to keep around global state (db connectors, in-memory caches, etc.) Arc<Mutex<T>> is a simple recipe that works for most shared objects. It's dead simple.
You can think of Rust with Axum/Actix as a drop-in replacement for Go or Python/Flask. With the added benefits of (1) no GC / bare metal performance, (2) much lower defect rate as a consequence of the language ergonomics, (3) run it and forget it - no GC tuning, very simple scaling.
Rust has effectively made writing with the performance of C++ feel like writing Ruby, but with unparalleled low defect rates and safety on account of the type system.
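The `Arc<Mutex<T>>` recipe mentioned above has a direct analogue in other languages; a minimal Python sketch of the same lock-guarded shared-state pattern:

```python
import threading

# Lock-guarded shared state: one object guarded by a mutex, handed to every
# worker. The object reference plays Arc's role; the Lock plays Mutex's.
class SharedCounter:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.value = 0

    def increment(self) -> None:
        with self._lock:  # critical section: one thread mutates at a time
            self.value += 1

counter = SharedCounter()
threads = [
    threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # → 8000
```

The difference the comment is pointing at is that in Rust the compiler rejects unguarded access at compile time, whereas in Python nothing stops a caller from mutating `value` without taking the lock.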
This is a little overblown. Speaking VERY HAND-WAVILY, SeaORM carries something like 10x the mental overhead of ActiveRecord, but it's at least that much more performant...
but yea, vibe-coding rust micro services is pretty amazing lately, almost no interactions with borrow checker, and I'm even using cucumber specs...
I currently wouldn't recommend any Rust ORM, Diesel included. They're just not quite ready for prime time.
If you're not one to shy away from raw SQL, then SQLx is rock-solid. I actually prefer it over ORMs in general. It's type-checked at runtime or compile time against your schema, no matter how complex the query gets. It manages to do this with an incredibly simple design.
It's like an even nicer version of Java's popular jOOQ framework, which I always felt was incredibly ugly.
SQLx might be my very favorite SQL library of any language.
The whole point is you just install `uv` and stop thinking about the pantheon of tools.
The goal here is to see if users like a more streamlined experience with an opinionated default, like you have in Rust or Go: install uv, use `uv init` to create a project, use `uv run` to run your code, `uv format` to format it, etc. Maybe they won't like it! TBD.
(Ruff is installed when you invoke `uv format`, rather than bundled with the uv binary, so if you never use `uv format`, there aren't any material downsides to the experiment.)
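The "install on first use" idea can be sketched as follows (the helper names are hypothetical illustrations, not uv's actual internals): resolve the tool if it's already on PATH, otherwise fetch a pinned binary on demand.

```python
import shutil

# Hypothetical sketch of lazy tool bootstrapping: users who never invoke the
# subcommand never pay for the tool it delegates to.
def ensure_tool(name: str) -> str:
    path = shutil.which(name)
    if path is not None:
        return path          # already present: no download, no bloat
    return fetch_tool(name)  # download a pinned binary on first use

def fetch_tool(name: str) -> str:
    # Placeholder: a real implementation would download a versioned release
    # into a cache directory and return its path.
    raise NotImplementedError(f"would download {name} here")

print(ensure_tool("ls"))
```

This keeps the entrypoint binary small while still making the subcommand feel built in.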
That’s thoughtful design and could be worth mentioning in the blog post.
Perhaps even better… cargo-like commands such as uv check, uv doc, and uv test could subsume ruff, ty, and other tools that we haven’t seen yet ;)
A pyup command that installs python-build-standalone, uv, python docs, etc. would be totally clutch, as would standalone installers [0] that bundle it all together.
[0] https://forge.rust-lang.org/infra/other-installation-methods...
To your analogy, it’d be like `cargo clippy`
Stupidly I ran `uv format` without `--check` (no harm done and I can `git diff` it) so I didn't see the changes it made but `ruff check` does still show things that can be fixed with `ruff check --fix`. If I'm guessing correctly the difference is coming down to the fact that I have (in my submodule where all changes were made) a pyproject.toml file with ruff rules (there's also a .flake8 file. Repo is being converted). Either way, I find this a bit confusing userside. Not sure what to expect.
I think one thing I would like is that by default `uv format` spits out what files were changed like `uv format --check` does (s/Would reformat/Reformatted/g). Fine for the actual changes not to be displayed but I think this could help with error reduction. Running it again I can see it knows 68 files were changed. Where is that information being stored? It's pretty hard to grep out a number like that (`grep -R \<68\>`) and there's a lot of candidates (honestly there's nothing that looks like a good candidate).
Also, there's a `--quiet` flag, but the output is already pretty quiet. As far as I can tell the only difference is that quiet suppresses the warning (does `--quiet` also suppress errors?)
I like the result for `--quiet`, but I have a strong preference that `uv format` match the verbosity of `uv format --check`. I can always throw information away, but I can't recover it. I have a strong bias that it's better to fail by displaying too much information than too little; the latter failure mode is more harmful, since the former is easily addressed by existing tools. If you're taking votes, that's mine. Anyway, looking forward to seeing how this matures. Loved everything so far!
I'd imagine not, since `ruff format` and `ruff check` are separate things too.
But I'll also add another suggestion/ask. I think this could be improved
I think just a little more can go a long way. When `--help` is the docs instead of a man page, there needs to be a bit more description; just something like this tells users a lot more. Man/help pages are underappreciated. I know I'm not the only one who discovers new capabilities by reading them, or by double-tabbing because I can't remember a flag name and noticing something I hadn't before (or had noticed but skipped, because the tool was new and I focused on main features first). Having enough information to figure out what these things do, then and there, really speeds up usage. When the help lines don't say much, I often never explore them (unless there's some gut feeling). I know the browser exists, but when I'm using the tool I'm not in the browser.

ruff at least seems to be compiled into uv, as the format worked here without a local ruff. That's significantly more than just an interface. Whether they are managed and developed as separate tools doesn't matter.
> This is more about providing a simpler experience for users that don't want to think about their formatter as a separate tool.
Then build a separate interface, some script/binary acting as a unified interface, maybe with its separate distribution of all tools. Pushing it into uv is just adding a burden to those who don't want this.
uv and ruff are poor names anyway; this could be used to at least introduce a good name for the everything-Python tool they seem to be aiming for.
[1]: https://github.com/astral-sh/uv/pull/15017
The reality is, ecosystems evolve. First, we had mypy. Then more type checkers came out: pyre, pyright, etc. Then basedpyright. The era of rust arrived and now we have `ty` and `pyrefly` being worked on heavily.
On the linter side we saw flake8, black, and then ruff.
Decoupling makes adapting to evolution much easier. As long as both continue to offer LSP integrations it allows engineers to pick and chose what's best.
`uv format` is similar (you don't need to know about `ruff format`/black/yapf).
But formatting code is a completely new area that does not fit uv.
I would love to see a world where there is a single or a set of standard commands that would prepare your python project (format, lint, test, publish). Maybe that’s the vision here?
That's probably the vision, given all from astral.sh, but `ty` isn't ready yet.
85 more comments available on Hacker News