Marble by World Labs: Multimodal world model to create and edit 3D worlds
Mood: excited
Sentiment: positive
Category: tech
Key topics: AI, 3D modeling, world models
World Labs has introduced Marble, a multimodal world model for creating and editing 3D worlds, sparking excitement and discussion about its potential applications and limitations.
Snapshot generated from the HN discussion
Discussion Activity
Active discussion
First comment: 3h after posting
Peak period: 17 comments in Day 1
Avg / period: 9
Based on 18 loaded comments
Key moments
- Story posted: 11/12/2025, 5:13:30 PM
- First comment: 11/12/2025, 7:51:01 PM (3h after posting)
- Peak activity: 17 comments in Day 1 (hottest window of the conversation)
- Latest activity: 11/13/2025, 9:23:21 PM
The interior scenes look great and walk well, but any scenes with or in exteriors seem kind of bad.
Nobody believed us when we said AI would create 3D virtual worlds that were indistinguishable from the real thing, and we'd be able to transport people to different places.
I particularly like the artistic effect of the drawing that brings the person into this world. Like a point-cloud that then gets "filled in".
I have little doubt this was a design decision and I think it is very well executed.
Digital Twins were a thing, and we had developed a high-resolution 3D world, outside of cities.
At the time, we thought that NeRFs were going to allow us to increase resolution and fill in the gaps of what we didn't know about the world. Then Gaussian Splats came in and just took over.
There is definitely still room for improvements and new techniques.
However, people occasionally still reach out to me to ask how to build a replica of Ayvri, and I tell them you wouldn't build it today like we did back then.
Today, you wouldn't go through the process of setting up tile servers. I think you can get current AI to build a scene frame by frame and transition between frames, rather than tile by tile.
But others in the gaming world may have different opinions as to where the industry is heading.
There is another thing called world models that involves predicting the state of something after some action, but this is a very limited area of research. My understanding is that there just isn't much action -> reaction data.
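To make the "state after some action" idea concrete, here is a minimal toy sketch, not how Marble or any research system works. It assumes a tiny made-up environment (an agent on a 1-D track) and a hypothetical `TabularWorldModel` class that simply counts observed (state, action, next state) triples and predicts the most frequent outcome.

```python
from collections import Counter, defaultdict


class TabularWorldModel:
    """Toy world model: predicts the next state from a (state, action) pair."""

    def __init__(self):
        # counts[(state, action)] maps each observed next state to its frequency
        self.counts = defaultdict(Counter)

    def observe(self, state, action, next_state):
        """Record one action -> reaction sample."""
        self.counts[(state, action)][next_state] += 1

    def predict(self, state, action):
        """Return the most frequently observed next state, or None if unseen."""
        seen = self.counts.get((state, action))
        if not seen:
            return None  # no data for this (state, action) pair
        return seen.most_common(1)[0][0]


# Hypothetical experience: an agent moving on a 1-D track with positions 0..3
experience = [
    (0, "right", 1),
    (1, "right", 2),
    (2, "right", 3),
    (2, "left", 1),
    (1, "left", 0),
]

model = TabularWorldModel()
for state, action, next_state in experience:
    model.observe(state, action, next_state)

print(model.predict(1, "right"))
print(model.predict(3, "right"))
```

The second call asks about a (state, action) pair that never appeared in the data, so the model has nothing to predict from. That is the data problem described above in miniature: without enough action -> reaction samples, the model cannot say what happens next.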
Same issue with Gaussian splatting/NeRF really: very little data (relative to text/images/videos) of text -> 3D splats. I'd guess what World Labs is doing is text -> image -> splats, but of course that is just speculation.
Folks interested in this can look up Yann LeCun's work on world models and JEPA, which his team at Meta created. This lecture is a nice summary of his thinking on this space and also why he isn't a fan of autoregressive LLMs: https://www.youtube.com/watch?v=yUmDRxV0krg
OK, so I've talked about this phenomenon with ChatGPT, and I think that the issue here is that to a lot of people, a song needs to be more than just a "song". There's some sort of requirement for it to be the un-faked result of having certain experiences. It has to relate to something happening in reality, and to be derived from it, and cannot exist in a vacuum separated from the rest of reality. Otherwise to them, the music isn't "real".
Not to mention being able to explore worlds from already existing works. Care to go for a ride on a broomstick? How about simply walking into Mordor? It's exciting.