How I Solved PyTorch's Cross-Platform Nightmare
Posted 4 months ago · Last active 4 months ago · svana.name · Tech story
Key topics: PyTorch, Cross-Platform Compatibility, Package Management
The author shares their solution to managing PyTorch's cross-platform compatibility issues, sparking a discussion on package management strategies and alternative solutions.
Snapshot generated from the HN discussion
Discussion activity: first comment after 3 days; peak of 17 comments in the 84-90h window; average of 9.7 comments per period; 29 comments loaded.
Key moments
- Story posted: Sep 8, 2025 at 12:59 AM EDT
- First comment: Sep 11, 2025 at 9:51 AM EDT (3 days after posting)
- Peak activity: 17 comments in the 84-90h window (hottest window of the conversation)
- Latest activity: Sep 12, 2025 at 3:55 AM EDT
> Public index servers SHOULD NOT allow the use of direct references in uploaded distributions. Direct references are intended as a tool for software integrators rather than publishers.
This means that PyPI will not accept your project metadata as you currently have it configured. See https://github.com/pypi/warehouse/issues/7136 for more details.
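For concreteness, here is a minimal sketch of what such a "direct reference" dependency looks like when parsed with the `packaging` library; the wheel URL below is a hypothetical example of the shape PyPI rejects in uploaded metadata, not an actual pin from the project:

```python
# Sketch only: the URL is hypothetical and just illustrates the shape of a
# PEP 440/508 "direct reference" dependency that PyPI refuses in uploads.
from packaging.requirements import Requirement

req = Requirement(
    "torch @ https://download.pytorch.org/whl/cu121/"
    "torch-2.4.0%2Bcu121-cp311-cp311-linux_x86_64.whl"
)
print(req.name)  # torch
print(req.url)   # the presence of a URL is what makes it a direct reference
```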
In all seriousness, I get the impression that PyTorch is such a monster PITA to manage because it cares so much about the target hardware. It'd be like a blog post saying "I solved the assembly language nightmare".
If you do not care about performance and would rather have portability, use an alternative like tinygrad that does not optimize for every accelerator under the sun.
This need for hardware-specific optimization is also why the assembly language analogy is a little imprecise. Nobody expects one binary to run on every CPU or GPU with peak efficiency, unless you are talking about something like Redbean, which gets surprisingly far (the creator actually worked on the TensorFlow team and addressed similar cross-platform problems).
So maybe the blog post you're looking for is https://justine.lol/redbean2/.
Or, looked at a different way, Torch has to work this way because Python packaging has too narrow an understanding of platforms: it treats many things that are materially different platforms as the same platform.
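As a rough illustration of how narrow that understanding is: wheel tags encode the interpreter, ABI, and OS/CPU architecture, but nothing about the accelerator stack (CUDA version, ROCm, CPU-only), which is exactly the axis PyTorch varies on. A minimal sketch using the `packaging` library:

```python
# Sketch: list a few of the wheel tags this interpreter would accept.
# Note that no field in a tag can express "needs CUDA 12.1" vs "CPU-only".
from packaging.tags import sys_tags

for tag in list(sys_tags())[:3]:
    print(tag.interpreter, tag.abi, tag.platform)
# e.g. cp311 cp311 manylinux_2_17_x86_64 -- "Linux on x86_64", nothing finer
```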
So, you're doubling down on OP's misnomer of "cross platform means whatever platforms I use", eh?
You should be specific about which distributions you have in mind.
That is: there's nothing stopping the author from building on the approach he shares to also include Windows/FreeBSD/NetBSD/whatever.
It's his project (FileChat), and I would guess he uses Linux. It's natural that he'd solve this problem for the platforms he uses, and for which wheels are readily available.
I would like to add some anecdata to this.
When I was a PhD student, I already had 12 years of using and administrating Linuxes as my personal OS, and I'd already had my share of package manager and dependency woes.
But managing Python, PyTorch, and CUDA dependencies was relatively new to me. Sometimes I'd lose an evening here or there to something silly. But I had one week especially dominated by these woes, to the point where I'd have dreams about package management problems at the terminal.
They were mundane dreams but I'd chalk them up as nightmares. The worst was having the pleasant dream where those problems went away forever, only to wake up to realize that was not the case.
But the point is more that, for me, this is a somewhat rare instance where I think using the term "nightmare" in the title is justified.
More than once I had installed PyTorch into a new environment and subsequently spent hours trying to figure out why things suddenly weren't working. Turns out, PyTorch had just uploaded a bad wheel.
Weirdly, I feel like CUDA has become easier yet Python has become worse. It's all package management. Honestly, I find myself wanting to use package managers less and less because of Python. Of course `pip install` doesn't work, and that is probably a good thing. But the result of this is that any time you install a package it adds the module as a system module, which I thought was the whole thing we were trying to avoid. So what? Do I edit every package build now so that it runs a uv venv? If I do that, then this seems to just get more complicated as I have to keep better track of my environments. I'd rather be dealing with environment modules than that. I'd rather things be wrapped up in a systemd service or nspawn than that!
I mean I just did an update and upgrade and I had 13 Python packages and 193 Haskell modules, out of 351 packages! This shit is getting insane.
People keep telling me to keep things simple, but I don't think any of this is simple. It really looks like a lot of complexity created by a lot of things being simplified. I mean, isn't every big problem created out of a bunch of little problems? That's how we solve big problems -- break them down into small problems -- right? Did we forget that the little things matter? If you don't think they do, did you question whether this comment was written by an LLM because I used a fucking em dash? Seems like you latched onto something small. I think it is hard to know when the little things matter or don't matter; often we just don't realize the little things are part of the big things.
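On the uv question above (a hedged aside, assuming uv's standard workflow rather than anything from this thread): the per-project pattern is usually just `uv venv` to create a project-local .venv and then `uv pip install torch` (or `uv add torch` in a uv-managed project) to install into it, rather than editing individual package builds; whether that counts as simple is exactly what the comment is disputing.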
In the comparative table, they claim that conda doesn't support:
* lock file: false; you can freeze your environment (see the example commands below)
* task runner: I don't need my package manager to be a task runner
* project management: you can do one env per project? I don't see the problem here...
So no, please, just use conda/mamba and conda-forge.
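For reference (these are just standard conda commands, not anything from the table in question): `conda env export > environment.yml` captures the exact packages and versions in an environment, and `conda list --explicit > spec-file.txt` writes a spec that `conda create --name newenv --file spec-file.txt` can rebuild, which is the kind of freezing meant above; conda-lock is the separate tool to reach for if you need true cross-platform lockfiles with hashes.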
That's why people will go to stupid lengths to convert models from PyTorch / TensorFlow with onnxtools / coremltools, to avoid touching the model / weights themselves.
The only one that escaped this is llama.cpp; weirdly, despite the difficulty of model conversion with ggml, people seem to do it anyway.
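For context, the PyTorch-to-ONNX conversion path being referenced typically looks like this minimal sketch; the model and input shape are hypothetical placeholders, not something from the thread:

```python
# Minimal sketch: export a (hypothetical) PyTorch model to ONNX so it can be
# served by ONNX Runtime / converted onward without shipping PyTorch itself.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

example_input = torch.randn(1, 16)  # dummy input that defines the traced shape
torch.onnx.export(model, example_input, "model.onnx")
```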
This ends up wasting space and slowing down installation :(
Speaking of PyTorch and CUDA, I wish the Vulkan backend would become stable, but that seems like a far-off dream...
https://docs.pytorch.org/executorch/stable/backends-vulkan.h...
`pip install torchruntime`
`torchruntime install torch`
It figures out the correct torch to install on the user's PC, factoring in the OS (Win, Linux, Mac), the GPU vendor (NVIDIA, AMD, Intel) and the GPU model (especially for ROCm, whose configuration varies per generation and ROCm version).
And it tries to support quite a number of older GPUs as well, which are pinned to older versions of torch.
It's used by a few cross-platform torch-based consumer apps, running on quite a number of consumer installations.
It doesn't solve how you package your wheels specifically; that problem is still pushed onto your downstream users because of boneheaded packaging decisions by PyTorch themselves, but as the consumer, Pixi softens the blow. The conda-forge builds of PyTorch are also a bit more sane.
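For anyone unfamiliar with the consumer-side workflow being hinted at: roughly `pixi init` followed by `pixi add pytorch` pulls the conda-forge build into a per-project environment; the exact package name and channel setup here are assumptions, so check the pixi and conda-forge docs for your platform and accelerator.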