PyTorch Monarch
Key topics
PyTorch Monarch is a new distributed computing framework that lets users program distributed systems as if they were a single machine, with a Rust-based backend and a Python frontend. The discussion covers its design choices and comparisons to other technologies.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 2h
Peak period: 23 comments (2-4h)
Avg / period: 4.2
Based on 42 loaded comments
Key moments
- 01 Story posted
Oct 23, 2025 at 6:15 AM EDT
3 months ago
- 02 First comment
Oct 23, 2025 at 7:55 AM EDT
2h after posting
- 03 Peak activity
23 comments in 2-4h
Hottest window of the conversation
- 04 Latest activity
Oct 24, 2025 at 3:08 AM EDT
3 months ago
> Monarch is split into a Python-based frontend, and a backend implemented in Rust.
Other than that, it looks like quite an interesting project.
It's a pity they don't do a complete rewrite with a functional language as the driver.
It's open source, so seeing such an extension would be quite cool. There's much that could be done with native Rust actors and code that might get at what you want, and nothing precludes mixing PyTorch and other backends.
For example, you could wrap a C++ inference engine as part of one of the actors generating data for other actors doing distributed training.
https://github.com/elixir-nx/axon
Also, it has RDMA. Last I checked, Ray did not support RDMA.
There are probably other differences as well, but the lack of RDMA immediately splits the world into things you can do with Ray and things you cannot.
https://pytorch.org/blog/pytorch-foundation-welcomes-ray-to-...
Monarch:
Ray:

As far as potential performance losses here, one thing I'm wondering is whether custom kernels are supported. I'm also wondering how much granularity of control there is over communication between different actors calling a function. Overall, I really like this project and hope to see it used over multi-controller setups.
[1] https://github.com/alyxya/mycelya-torch
Yeah, you might end up needing some changes to remote worker initialization, but you can generally bake in whatever kernels and other system code you need.
Grammarians are going to be big angry here. Ain’t an adverb in sight.
Found a few typos. The em dash makes me suspect an LLM was involved in proofreading.
> ...Note that this does not support tensor engine, which is tied to CUDA and RDMA (via ibverbs).
I.e., yet another CUDA-married approach: the issue is not ibverbs, but the code shows they use GPUDirect RDMA, and from there this can only get worse, with more CUDA dependencies. OpenUCX would have been an alternative.
In case someone who can fix this is reading here.
- Is this similar to Open MPI?
- How is a mesh established? Do they need to be on the same host?
There is some infamous tech based on the "hiding" paradigm. PHP comes to mind: by hiding how the HTTP request/response cycle actually works, it fostered a generation of web developers who didn't know what a session cookie was, resulting in login systems that leaked like a sieve.

Distributed computing is complicated. There are many parameters you need to tweak and many design decisions you need to make for distributed model training to run smoothly. I think explicit and transparent architectures are way better. Distributed model training shouldn't "feel" like running on a single device, because it isn't.
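To illustrate the "explicit over hidden" argument, here is a toy all-reduce over worker threads where every synchronization point is visible in the code (an illustrative sketch in our own names, not Monarch's or any framework's API):

```python
# Toy explicit all-reduce: three "ranks" each contribute a local value,
# synchronize at a visible barrier, then all read the reduced sum.
import threading

contributions = [1, 2, 3]
results = [None] * 3
acc = {"sum": 0}
lock = threading.Lock()
barrier = threading.Barrier(3)


def worker(rank):
    # Phase 1: each rank explicitly contributes its local value.
    with lock:
        acc["sum"] += contributions[rank]
    # Phase 2: explicit synchronization point -- nothing is hidden.
    barrier.wait()
    # Phase 3: every rank reads the same reduced value.
    results[rank] = acc["sum"]


threads = [threading.Thread(target=worker, args=(r,)) for r in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # every rank sees the reduced sum: [6, 6, 6]
```

Frameworks that "feel" like a single machine bury the contribution and barrier steps; making them explicit, as above, is exactly the transparency the comment argues for.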