Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)
My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off: if any significant percentage of the files being compiled might never be used, the overall cost is lower when you defer to run time.
Sure, but you pay that hit either way. Real-world performance is always usage based: the assumption that uv makes is that people run (i.e. import) packages more often than they install them, so amortizing at the point of the import machinery is better for the mean user.
(This assumption is not universal, naturally!)
(The key part being that 'less common' doesn't mean a non-trivial amount of time.)
I just read the thread and use Python; I can't comment on the % of uv's speedup that comes from this optimization.
It seems like tons of people are creating container images with an installer tool and having it do a bunch of installations, rather than creating the image with the relevant Python packages already in place. Hard to understand why.
For that matter, a pre-baked Python install could do much more interesting things to improve import times than just leaving a forest of `.pyc` files in `__pycache__` folders all over the place.
If it were an optional toggle, it would probably become best practice to activate compilation in Dockerfiles.
I would bet on a subset for pretty much any non-trivial package (i.e. larger than one or two user-facing modules). And for those trivial packages? Well, they are usually small, so the cost is small as well. I'm sure there are exceptions: maybe a single gargantuan module that consists of autogenerated FFI bindings for some C library or such, but that is likely the minority.
My Docker build generating the byte code saves it to the image, sharing the cost at build time across all image deployments — whereas, building at first execution means that each deployed image instance has to generate its own bytecode!
That’s a massive amplification, on the order of 10-100x.
“Well just tell it to generate bytecode!”
Sure — but when is the default supposed to be better?
Because this sounds like a massive footgun for a system where requests >> deploys >> builds. That is, every service I’ve written in Python for the last decade.
Unfortunately, it typically doesn't work out as well as you might expect, especially given the expectation of putting `import` statements at the top of the file.
They do.
> Are we losing out on performance of the actual installed thing, then?
When you consciously precompile Python source files, you can parallelize that process. When you `import` from a `.py` file, you only get that benefit if you somehow coincidentally were already set up for `multiprocessing` and happened to have your workers trying to `import` different files at the same time.
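For what it's worth, uv already exposes a toggle for this (the `--compile-bytecode` flag / `UV_COMPILE_BYTECODE` env var, if memory serves), and the stdlib makes the parallel, build-time version short. A minimal sketch, assuming a hypothetical site-packages path:

    import compileall

    # Precompile everything under site-packages at image build time.
    # workers=0 spawns one worker per CPU core (os.cpu_count()).
    compileall.compile_dir(
        "/app/.venv/lib/python3.12/site-packages",  # hypothetical path
        quiet=1,
        workers=0,
    )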
How/why did the package maintainers start using all these improvements? Some of them sound like a bunch of work, and getting a package ecosystem to move is hard. Was there motivation to speed up installs across the ecosystem? If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
It wasn't working okay for many people, and many others haven't started using pyproject.toml.
For what I consider the most egregious example: Requests is one of the most popular libraries, under the PSF's official umbrella, which uses only Python code and thus doesn't even need to be "built" in a meaningful sense. It has a pyproject.toml file as of the last release. But that file isn't specifying the build setup following PEP 517/518/621 standards. That's supposed to appear in the next minor release, but they've only done patch releases this year and the relevant code is not at the head of the repo, even though it already caused problems for them this year. It's been more than a year and a half since the last minor release.
... Ah, I got confused for a bit. When I first noticed the `pyproject.toml` deficiency, it was because Requests was affected by the major Setuptools 72 backwards incompatibility. Then this year they were hit again by the major Setuptools 78 backwards incompatibility (which the Setuptools team consciously ignored in testing because Requests already publishes their own wheel, so this only affected the build-from-source purists like distro maintainers). See also my writeup: https://lwn.net/Articles/1020576/
> uv parses TOML and wheel metadata natively, only spawning Python when it hits a setup.py-only package that has no other option
Not as far as I can tell, except perhaps in extended-support legacy environments (for example, ActiveState is still maintaining a Python 2.x distribution).
I think Rust itself has this benefit
Tapping the Rust community is a decent reason to do a project in Rust.
Sometimes I thought our teams would be a terrible fit for more cookie-cutter applications where rapid development and deployment was the primary objective. We got into the weeds all the time (sometimes because of Rust itself), but it happened to be important to do so.
Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.
I genuinely can't understand why you suppose that has to do with the implementation language at all.
Languages that attract novice programmers (JS is an obvious one; PHP was one 20 years ago) have a higher noise to signal ratio than one that attracts intermediate and above programmers.
If you grabbed an average Assembly programmer today, and an average JavaScript programmer today, who do you think is more careful about programming? The one who needs to learn arcane shit to do basic things and then has to compile it in order to test it out, or the one who can open up Chrome's console and console.log("i love boobies")
How many embedded systems programmers suck vs full stack devs? I'm not saying full stack devs are inferior. I'm saying that more inferior coders are attracted to the latter because the barriers to entry are SO much easier to bypass.
Go: Enforces global, append-only integrity via a checksum database and version immutability; once a module version exists, its contents cannot be silently altered without detection, shifting attacks away from artifact substitution toward “publish a malicious new version” or bypass the proxy/sumdb.
Maven: Requires structured namespace ownership and signed artifacts, making identity more explicit at publish time; this raises the bar for casual impersonation but still fundamentally trusts that the key holder and build pipeline were not compromised.
Your average Go project likely has 10x fewer deps than a JS project. Those deps will not get auto-updated to their latest versions either. Much lower attack surface area.
The Node supply chain attacks are also not unique to the Node community; you see them happening on crates.io and many other places. In fact, the build-time scripts that cause issues in node modules are probably worse off with the flexibility of crate build scripts, and they're going to be harder to work around than in npm.
uv doesn't exactly stop python package supply chain attacks...
Edit: Okay, for your sake, I did. It ends with "Screening off does not just apply to probability, it also applies to causality. If A causes B and B causes C, once you know the state of B, A provides no further information." which is such a laughably incorrect statement because it mistakenly treats a cause as having only one effect.
LessWrong is a bunch of people who think they understand Bayes better than they do.
If A causes other things besides B, then knowing about those other caused things tells us nothing about whether C happened, because we already know it did. "no further information" is elided to things that are relevant to the statement being made. Please apply basic charity in interpreting ideas expressed in prose; LWers who want to express something precisely in logical or mathematical notation are certainly not afraid to do so.
> LessWrong is a bunch of people who think they understand Bayes better than they do.
The objection you point out is not relevant to demonstrating an understanding of Bayes' Law. It's just a semantic quibble.
If you take a group of people who are squarely in the enterprise Java school of thought and have them write Rust, the language won't make much of a difference. They will eventually be influenced by the broader Rust community and the Rust philosophy towards programming, but, unless they're already interested in changed approaches, this will be a small, gradual difference. So you'll end up with Enterprise Java™ code, just in Rust.
But if you hire from the Rust community, you will get people who have a fundamentally different set of practices and expectations around programming. They will not only have a stronger grasp of Rust and Rust idioms but will also have explicit knowledge based on Rust (eg Rust-flavored design patterns and programming techniques) and, crucially, tacit knowledge based on Rust (Rust-flavored ways of programming that don't break down into easy-to-explain rules). And, roughly speaking, the same is going to be true for whatever other language you substitute for "Rust".
(I say roughly because there doesn't have to be a 1:1 relationship between programming languages, schools of thought and communities of practice. A single language can have totally different communities—just compare web Python vs data scientist Python—and some communities/schools can span multiple languages. But, as an over-simplified model, seeing a language as a community is not the worst starting point.)
But that’s precisely why it is good for developer tools. And it turns out people who write systems code are really damn good at writing tools code.
As someone who cut my teeth on C and low-level systems stuff, I really ought to learn Rust one of these days, but Python is just so damn nice for high-level stuff, and all my embedded projects still seem to require C, so here I am, rustless.
Sure, it's a little more verbose than bash one-liners, but if you need any kind of error handling and recovery, it's way more effective than bash and doesn't break when you switch platforms (i.e. mac/bsd utility incompatibilities with gnu utilities).
My only complaint would be that dealing with OsString is more difficult than necessary. Way too much of the stdlib encourages programmers to just pretend "non-utf8 paths don't exist" and panic/ignore when encountering one. (Not a malady exclusive to Rust, but I wish they'd gotten it right.)
For instance the popular `fd` utility can't actually see files containing malformed utf-8, so you can hide files from system administrators naively using those tools by just adding invalid utf-8.
    touch $'example\xff.txt'
    fd 'example.*txt'          # not found
    fd -F $'example\xff.txt'   # fails: non-utf8
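For contrast, Python's bytes-path APIs can still see such files. A quick sketch (POSIX-only; it recreates the same file as the `touch` above):

    import os
    import re

    # Create the file using a raw bytes path (same name as the touch above).
    os.close(os.open(b"example\xff.txt", os.O_CREAT | os.O_WRONLY))

    # Passing bytes to os.listdir returns raw bytes filenames, so names
    # that aren't valid UTF-8 stay visible and matchable.
    print([n for n in os.listdir(b".") if re.match(rb"example.*txt", n)])
    # [b'example\xff.txt']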
The existing rust libraries for manipulating OsString push people towards ignorance or rejection of non-utf8 filenames and paths.

What I like about Rust is ADTs, pattern matching, execution speed. The things that really give me confidence are error handling (the right balance between the "you can't accidentally ignore errors" of checked exceptions and easy escape hatches for when you want to YOLO) and the rarity of the "looks right, but is subtly wrong in dangerous ways" code that I ran into a lot in dynamic languages and more footgun-prone languages.
Compile times suck.
It just doesn't come up in the web and devtools development worlds. Either you're dealing with user input, which is completely untrusted and has to be validated anyways, or you're passing around known validated data.
The closest is maybe ETL pipelines, but type checking can't help there either since your entire goal is to wrestle with horrors.
Type validation can help with some of that but at some point it becomes way easier to just use imperative validation for something like this. It turns out that validating things that are easy is easy no matter what you do, and validating complex rules that were written by people who think imperatively is almost impossible to do declaratively in a maintainable way.
The fastest-iterating engineers I've worked with often have a deep user focus rather than a language affiliation.
I think the cultural context has changed.
In "python paradox", 'knows python' is an indication that the developer is interested in something technically interesting but otherwise impractical. Hence, it's a 'paradox' that you end up practically better off by selecting for something impractical.
These days, Python is surely a practical choice, so doesn't really resemble the "interested in something technically interesting but impractical".
With that said, Rust was a good language for this in my experience. Like any "interesting" thing, there was a moderate bit of language-nerd side quest thrown in, but overall, a good selection metric. I do think it's one of the best Rewrite it in X languages available today due to the availability of good developers with Rewrite in Rust project experience.
The Haskell commentary is curious to me. I've used Haskell professionally but never tried to hire for it. With that said, the other FP-heavy languages that were popular ~2010-2015 were absolutely horrible for this in my experience. I generally subscribe to a vague notion that "skill in a more esoteric programming language will usually indicate a combination of ability to learn/plasticity and interest in the trade," however, using this concept, I had really bad experiences hiring both Scala and Clojure engineers; there was _way_ too much academic interest in language concepts and way too little practical interest in doing work. YMMV :)
> there was way too much academic interest in language concepts and way too little practical interest in doing work.
They are communicating something real, but perhaps misattributing the root cause.
The purely abstract ‘ideal’ form of software development is unconstrained by business requirements. In this abstraction, perfect software would be created to purely express an idea. Academia allows for this, and to a lesser extent some open source projects.
In the real world, the creation of software must always be subordinate to the goals of the business. The goals are the purpose, and the software is the means.
Languages that are academically interesting, unsurprisingly, attract a greater preponderance of academically minded individuals. Of these, only a percentage have the desire or ability to let go of the pure abstract, and instead focus on the business domain. So it inevitably creates a management challenge; not an insurmountable one, but a challenge.
Hence the simplified ‘these people won’t do the work!’.
Alternately, if you have the sort of work or culture that taps into people's intrinsic motivation, why would that work worse with Haskell or Clojure programmers than anybody else?
People are interested in different things along different dimensions. The way somebody is motivated by what they're doing and the way somebody is motivated by how they're doing it really don't seem all that correlated to me.
I will bring popcorn on the Python 4 release date.
This (zero-copy deserialization) is not a rust-specific technique, so I'm not entirely sure why the author describes it as one. Any good low level language (C/C++ included) can do this from my experience.
(But also, I think Rust can fairly claim that it's made zero-copy deserialization a lot easier and safer.)
Also, aside from memory bandwidth, there’s a latency cost inherent in traversing object graphs - 0 copy techniques ensure you traverse that graph minimally, just what’s needed to actually be accessed which is huge when you scale up. There’s a difference between one network request and fetching 1 MB vs making 100 requests to fetch 10kib and this difference also appears in memory access patterns unless they’re absorbed by your cache (not guaranteed for object graph traversal that a package manager would be doing).
(I'm agnostic on whether zero-copy "matters" in every single context. If there's no complexity cost, which is what Rust's abstractions often provide, then it doesn't really hurt.)
    import io
    import zipfile

    def read_member(archive_name, file_name):  # hypothetical wrapper; original snippet implied a function
        with zipfile.ZipFile(archive_name) as a:
            with a.open(file_name) as f, io.BytesIO() as b:
                b.write(f.read())
                return b.getvalue()
(That does, of course, copy data around within memory, but.)

This is not what zero-copy means. Here's a working definition[1].
Specifically, it's not just about keeping things in memory; copying in memory is normal. The goal is to not make copies (or more precisely, what Rust would call "clones"), but to instead convey the original representation/views of that representation through the program's lifecycle where feasible.
> a deserialized version of the data necessarily cannot be the same object as the original data
rust-asn1 would be an example of a Rust library that doesn't make any copies of data unless you explicitly ask it to. When you load e.g. a Utf8String[2] in rust-asn1, you get a view into the original input buffer, not an intermediate owning object created from that buffer.
> (That does, of course, copy data around within memory, but.)
Yes, that's what makes it not zero-copy.
[1]: https://rkyv.org/zero-copy-deserialization.html
[2]: https://docs.rs/asn1/latest/asn1/struct.Utf8String.html
Yeah, so you'd have to pass around the `BytesIO` instead.
I know that zero-copy doesn't ordinarily mean what I described, but that seemed to be how TFA was using it, based on the logic in the rest of the sentence.
That wouldn’t be zero-copy either: BytesIO is an I/O abstraction over a buffer, so it intentionally masks the “lifetime” of the original buffer. In effect, reading from the BytesIO creates new copies of the underlying data by design, in new `bytes` objects.
(This is actually a great capsule example of why zero-copy design is difficult in Python: the Pythonic thing to do is to make lots of bytes/string/rich objects as you parse, each of which owns its data, which in turn means copies everywhere.)
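To make the distinction concrete, here's a minimal sketch in Python itself, using memoryview (one of the few zero-copy tools the language offers):

    data = bytearray(b'{"user":"nugget"}')

    copy = bytes(data[9:15])       # allocates a new object (a copy)
    view = memoryview(data)[9:15]  # a view into the same underlying buffer

    data[9:15] = b"NUGGET"         # mutate the original in place
    print(copy)                    # b'nugget' (the copy is unaffected)
    print(view.tobytes())          # b'NUGGET' (the view sees the change)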
I'm not convinced this is going to bottleneck things, though.
(On the flip side, I guess the OS is likely to cache any disk write in memory anyway.)
It’s ~impossible in Python (because you don’t control memory) and hard in C/similar (because of use-after-free).
Rust’s borrow checker makes it easier, but it’s still tricky (for non-trivial applications). You have to do all your transformations and data movements while only referencing the original data.
json = '{"user":"nugget"}' // from somewhere
A simple way to extract json["user"] to a new variable would be to copy the bytes. In pythony/c pseudo code let user = allocate_string(6 characters)
for i in range(0, 6)
user[i] = json["user"][i]
// user is now the string "nugget"
instead, a zero copy strategy would be to create a string pointer to the address of json offset by 9, and with a length of 6. {"user":"nugget"}
^ ]end
The reason this can be tricky in C is that when you call free(json), since user is a pointer into the same string that was json, you have effectively done free(user) as well. So if you use user after calling free(json), you have written a classic _memory safety_ bug called a "use after free" (UAF). Search around a bit for the insane number of use-after-free bugs there have been in popular software and the havoc they have wreaked.
In Rust, when you create a variable referencing the memory of another (user pointing into json), the compiler keeps track of that (as a "borrow"; that's what the borrow checker checks, if you have read about it) and won't compile if json is freed while you still have access to user. That's the main memory safety issue involved with zero-copy deserialization techniques.
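Funnily enough, Python's buffer protocol enforces a small slice of the same rule at runtime. A sketch, reusing the example above:

    data = bytearray(b'{"user":"nugget"}')
    user = memoryview(data)[9:15]  # a "borrow" of data's buffer

    # While the view is alive, the buffer can't be resized out from under
    # it. Instead of a use-after-free, you get an immediate error:
    data.clear()  # BufferError: Existing exports of data: object cannot be re-sized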
For example "No interpreter startup" is not specific to Rust either.
Unless I've been seeing very different submissions than you, "pet peeve" seems like the exact opposite of what is actually the case?
Poetry and uv avoid this issue.
I feel that sometimes there's a desire on the part of those who use tool X that everyone should use tool X. For some types of technology (car seat belts, antibiotics...) that might be reasonable but otherwise it seems more like a desire for validation of the advocate's own choice.
Isn't apportioning what all made things fast presumptuous without benchmarks? Yes, I imagine a lot is gained by the work of those PEPs. I'm more questioning how much weight is put on dropping compatibility compared to the other items. There is also no coverage of decisions influenced by language choice, which likely influences "Optimizations that don't need Rust".
This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
We also have the benchmark of "pip now vs. pip years ago". That has to be controlled for pip version and Python version, but the former hasn't seen a lot of changes that are relevant for most cases, as far as I can tell.
> This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML. A package installer really only needs to look at it at all when building from source, and part of what these standards have done for us is improve wheel coverage. (Other relevant PEPs here include 600 and its predecessors.) Although that has also largely been driven by education within the community, things like e.g. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... and https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-... .
I don't know the details of Python's resolution algorithm, but for Cargo (which is where epage is coming from) a lockfile (which is encoded in TOML) can be somewhat large-ish, maybe pushing 100 kilobytes (to the point where I'm curious if epage has benchmarked to see if lockfile parsing is noticeable in the flamegraph).
(not sure how uv does it, just guessing what can be done)
- Tables can be in any order, independent of hierarchy
- Keys can be dotted, creating subtables in any order

    [fruit.apple]
    [animal]
    [fruit.orange]

So the only way to know that you have all the keys in a given table is to literally read the entire file. This is one of those unfortunate things in TOML that I would honestly ignore if I were writing my own TOML parser, even if it meant I wasn't "compliant".
On top of that, most use cases for the format don't benefit from streaming.
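A quick illustration of the ordering point with the stdlib's tomllib (Python 3.11+): the two `fruit` tables merge even though `[animal]` sits between them, so `fruit` isn't complete until the whole document has been read.

    import tomllib

    doc = """
    [fruit.apple]
    color = "red"

    [animal]
    legs = 4

    [fruit.orange]
    color = "orange"
    """

    data = tomllib.loads(doc)
    print(data["fruit"])
    # {'apple': {'color': 'red'}, 'orange': {'color': 'orange'}}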
- synchronization operations are implicit, so we need to re-resolve to confirm the lockfile is still valid. We could take some shortcuts, but it would require re-implementing some logic
- dependency resolution only uses `Cargo.toml` for local and git dependencies. Registry dependencies have a json summary of what content is relevant for dependency resolution. Cargo parses nearly every locked package's `Cargo.toml` to know how to build it.
I mean, of course it wasn't specifically Rust that made it fast; it's really a banal statement. You need only moderately serious programming experience to know that rewriting a legacy system from scratch can make it faster even if you rewrite it in a "slower" language. There have been C++ systems that became faster when rewritten in Python, for god's sake. That's what makes a system a "legacy" system: it does a ton of things and nobody really knows what it does anymore.
But when listing things that made uv faster, it really mentions some silly things, among others. Like, it doesn't parse pip.conf. Right, sure, the secret of uv's speed lies in not parsing other package managers' config files. Great.
So all in all, yes, no doubt that hundreds of little things contributed to making uv faster, but listing a few dozen of them (surely a non-exhaustive list) doesn't really enable you to make any conclusions whatsoever. I suppose the mentioned talk[0] (even though it's more than a year old now) would serve as a better technical report.
Just commenting to preempt any comments telling you that the article doesn’t say this.
I blame fixed AI system prompts: they forcibly collapse all inputs into the same output space. Truly disappointing that OpenAI et al. have no desire to change this before everything on the internet sounds the same forever.
As you said, reading this stuff is taxing. What's more, this is a daily occurrence by now. If there's a silver lining, it's that the LLM smells are so obvious at the moment; I can close the tab as soon as I notice one.
It’s really useful for taking my first drafts and cleaning them up ready for a final polish.
The problem isn’t the surface tics—em dashes, short exclamatory sentences, lists of three, “Not X: Y!”.
Those are symptoms of the deep, statistically-built tissue of LLM “understanding” of “how to write a technical blog post”.
If you randomize the surface choices you’re effectively running into the same problem Data did on Star Trek: The Next Generation when he tried to get the computer to give him a novel Sherlock Holmes mystery on the holodeck. The computer created a nonsense mishmash of characters, scenes, and plot points from stories in its data bank.
Good writing uses a common box of metaphorical & rhetorical tools in novel ways to communicate novel ideas. By design, LLMs are trying to avoid true (unpredictable) novelty! Thus they’ll inevitably use these tools to do the reverse of what an author should be attempting.
Fairly easy, in my wife's experience. She repeatedly got accused of using ChatGPT in her original writing (she's not a native English speaker, and was taught to use many of the same idioms that LLMs use) until she started actually using ChatGPT with about two pages of instructions for tone to "humanize" her writing. The irony is staggering.
> What uv drops: Virtual environments required. pip lets you install into system Python by default. uv inverts this, refusing to touch system Python without explicit flags. This removes a whole category of permission checks and safety code.
pip also refuses to touch system Python without explicit flags?
For uv, there are flags that allow it, so it doesn't really "remove a whole category of permission checks and safety code"?
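For reference, the venv detection that gates both tools boils down to something like this sketch (helper names here are made up; the EXTERNALLY-MANAGED marker is the PEP 668 mechanism that recent pip honors):

    import sys
    import sysconfig
    from pathlib import Path

    def in_virtualenv() -> bool:
        # Inside a venv, sys.prefix points at the environment while
        # sys.base_prefix still points at the base interpreter.
        return sys.prefix != sys.base_prefix

    def externally_managed() -> bool:
        # PEP 668: distros drop this marker file next to the stdlib to
        # tell installers not to touch the system environment.
        return (Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED").exists()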
> Optimizations that don’t need Rust: Python-free resolution. pip needs Python running to do anything.
This seems to me to be implying that Python is inherently slow, so yes, this optimization requires a faster language?
Same. I'm actually more tired of this AI witch hunt
Or, use the "deep research" mode for writing your prose instead. It's far less sloppy in how it writes.
These people are amateurs at humanizing their writing.
I heard high school and college students are doing this routinely so their papers don't get flagged as AI.
This is whether they used an LLM for the whole assignment or wrote it themselves; it has to get passed through a "re-humanizing" LLM either way, just to avoid drama.
> This is concurrency, not language magic.
> This is filesystem ops, not language-dependent.
Duh.
About Rust, though:
Some say a nicer language helps with finding the right architecture (I heard that about a C++ veteran dropping it for OCaml: any attempted idea would take weeks in C++ but only a few days in OCaml, so they could explore more).
Also, the parallelism might be a benefit of the language's orientation.
Enough semi-fanboyism.