Wasm 3.0 Completed
Key topics
The release of WASM 3.0 brings significant new features such as 64-bit address space, garbage collection, and exception handling, sparking excitement and discussion among developers about its potential applications and limitations.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 13 minutes after posting
- Peak period: 75 comments in the 0-6h window
- Average per period: 13.3 comments
- Based on 160 loaded comments
Key moments
1. Story posted: Sep 17, 2025 at 2:16 PM EDT
2. First comment: Sep 17, 2025 at 2:29 PM EDT (13 minutes after posting)
3. Peak activity: 75 comments in the 0-6h window, the hottest stretch of the conversation
4. Latest activity: Sep 20, 2025 at 11:47 AM EDT
> This is not simply due to a lack of optimization. Instead, the performance of Memory64 is restricted by hardware, operating systems, and the design of WebAssembly itself.
https://spidermonkey.dev/blog/2025/01/15/is-memory64-actuall...
Wow!
- https://github.com/WebAssembly/design/issues/1397
- https://github.com/WebAssembly/memory-control/issues/6
This is a crucial issue, as the released memory is still allocated by the browser.
Shrinking the memory object shouldn't require any special support from GC, just an appropriate API hook. It would, as always, be up to the application code running inside the module to ensure that, if a shrink is done, the program doesn't refer to memory addresses past the new endpoint.
If this hasn't been implemented yet, it's not because it's been waiting on GC, but more that it's not been prioritized.
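As a rough native analogy (an illustration of my own, not anything from the Wasm spec or the memory-control proposal), the kind of hook described above would boil down to the embedder releasing the tail of the heap mapping back to the OS, something like this on POSIX systems:

```c
/* Hypothetical sketch: what "shrinking" a linear memory could map to natively.
 * Assumes the heap is one mmap'd region, new_end is page-aligned, and the
 * application never touches addresses past new_end afterwards.
 * Not a real Wasm API. */
#include <sys/mman.h>
#include <stddef.h>
#include <stdint.h>

int shrink_heap(uint8_t *heap_base, size_t old_end, size_t new_end) {
    if (new_end >= old_end) return 0;   /* nothing to release */
    /* Return the physical pages behind [new_end, old_end) to the OS while
     * keeping the address range reserved, which is roughly what a browser
     * could do for a shrunk memory object. */
    return madvise(heap_base + new_end, old_end - new_end, MADV_DONTNEED);
}
```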
WASM approach: start very low-level so C is definitely supported. Thus everything is supported, although every language has to roll its own high-level constructs. But over time more patterns can be standardised so languages can be interoperable within a polyglot WASM app.
1. Different languages have totally different allocation requirements, and only the compiler knows what type of allocator works best (e.g. a generational bump allocator for functional languages, a classic malloc-style allocator for C-style languages; see the sketch below).
2. This perhaps makes wasm less suitable for usage on embedded targets.
The best argument I can make for this is that they're trying to emulate the way that libc is usually available and provides a default malloc() impl, but honestly that feels quite weak.
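To make point 1 concrete, here is a minimal sketch (my own illustration, not something any particular toolchain ships) of the sort of bump allocator a functional-language runtime might carve out of linear memory, as opposed to relying on a general-purpose malloc:

```c
/* Minimal bump-allocator sketch: each language runtime compiled to Wasm can
 * implement its own allocation scheme inside linear memory. Names and sizes
 * are illustrative only. */
#include <stdint.h>
#include <stddef.h>

#define HEAP_SIZE (1u << 20)

static uint8_t heap[HEAP_SIZE];   /* region inside the module's linear memory */
static size_t  next = 0;          /* bump pointer */

void *bump_alloc(size_t n) {
    n = (n + 7) & ~(size_t)7;                 /* 8-byte alignment */
    if (next + n > HEAP_SIZE) return NULL;    /* a real GC would collect here */
    void *p = &heap[next];
    next += n;
    return p;
}

/* A generational collector frees a whole nursery by resetting the pointer;
 * a C-style program would instead want a malloc/free-style allocator. */
void reset_nursery(void) { next = 0; }
```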
> Work has been done for Java and Kotlin
I'm unaware of this development. What did they do? Did they create an interface to the GC specification in the draft proposal?
For Kotlin it's similar but the compiler backend is from Jetbrains themselves, targets Wasm and adapts the Kotlin runtime to use WasmGC: https://kotlinlang.org/docs/wasm-overview.html. https://seb.deleuze.fr/introducing-kotlin-wasm/ has some low level detail on how Kotlin works with WasmGC.
A bit more on Kotlin/Wasm here, seems like also Dart/Flutter uses WasmGC: https://developer.chrome.com/blog/wasmgc#kotlin_wasm
https://github.com/dotnet/runtime/issues/94420 has some notes on why C# can't use WasmGC (yet?).
By the way, you can now generate WASM via the D compiler LDC [1].
[1] Generating WebAssembly with LDC:
https://wiki.dlang.org/Generating_WebAssembly_with_LDC
This was not necessary... what a mistake, especially EH...
https://github.com/emscripten-core/emscripten/blob/main/syst...
So most GC languages being ported to WebAssembly already have a GC; what is the benefit of using a provided GC then?
On the other hand, I see GC as a feature that could become part of any modern CPU. Then the benefit would be large, as any language could use it and wouldn't have to implement its own at all anymore.
I'd think porting an existing GC to WASM is more effort than using WASM's GC for a GC'd language?
Garbage collection is a small part of the Go runtime, but it's not insignificant.
Skimming this issue, it seems like they weren't expecting to be able to use this GC. I know C# couldn't either, at least based on an earlier state of the proposal.
Whereas with a manual GC, if you had a JS object holding a reference to an object on your custom heap, and your heap holds a reference to that JS object (with indirections sprinkled in to taste) but nothing else references it, that'd result in a permanent memory leak, as both heaps would have to consider everything held by the other as GC roots; so you'd still be forced to manually avoid cycles despite only ever using GC'd languages. Wasm GC entirely avoids this problem.
The whole magic about CL's condition system is to keep on executing code in the context of a given condition instead of immediately unwinding the stack, and this can be done if you control code generation.
Everything else necessary, including dynamic variables, can be implemented on top of a sane enough language with dynamic memory management - see https://github.com/phoe/cafe-latte for a whole condition system implemented in Java. You could probably reimplement a lot of this in WASM, which now has a unwind-to-this-location primitive.
Also see https://raw.githubusercontent.com/phoe-trash/meetings/master... for an earlier presentation of mine on the topic. "We need means of unwinding and «finally» blocks" is the key here.
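A rough sketch of that key property in C, with setjmp/longjmp standing in for an unwind-to-this-location primitive (names and structure are mine, purely illustrative): the handler runs while the stack is still intact and only then decides whether to unwind:

```c
/* Sketch of the core condition-system idea: handlers execute in the context
 * where the condition was signalled, before any unwinding, and unwinding only
 * happens if the handler asks for it. */
#include <setjmp.h>
#include <stdio.h>

static jmp_buf restart_point;                       /* the unwind target */

typedef int (*handler_fn)(const char *condition);   /* returns 1 to unwind */
static handler_fn current_handler;

static void signal_condition(const char *condition) {
    /* Run the handler here, with the full stack still intact... */
    if (current_handler && current_handler(condition))
        longjmp(restart_point, 1);                  /* ...and unwind only if asked to. */
}

static int log_and_unwind(const char *c) { printf("handling %s\n", c); return 1; }

int main(void) {
    current_handler = log_and_unwind;
    if (setjmp(restart_point) == 0)
        signal_condition("example-error");          /* handler runs, then unwinds here */
    else
        puts("unwound to restart point");
    return 0;
}
```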
[0] https://spritely.institute/hoot/
But if you really need more than 4GB of memory, then sure, go ahead and use it.
[1] https://devblogs.microsoft.com/oldnewthing/20070801-00/?p=25...
https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
Unfortunately the obvious `__attribute__((mode(...)))` errors out if anything but the standard pointer-size mode (usually SI or DI) is passed.
Or you may be able to do it based on x32, since your far pointers are likely rare enough that you can do them manually. Especially in C++. I'm pretty sure you can just call "foreign" syscalls if you do it carefully.
Especially how you could increase the segment value by one or the offset by 16 and you would address the same memory location. Think of the possibilities!
And if you wanted more than 1MB you could just switch memory banks[1] to get access to a different part of memory. Later there was a newfangled alternative[2] where you called some interrupt to swap things around but it wasn't as cool. Though it did allow access to more memory so there was that.
Then virtual mode came along and it's all been downhill from there.
[1]: https://en.wikipedia.org/wiki/Expanded_memory
[2]: https://hackaday.com/2025/05/15/remembering-more-memory-xms-...
Schulman’s Unauthorized Windows 95 describes a particularly unhinged one: in the hypervisor of Windows/386 (and subsequently 386 Enhanced Mode in Windows 3.0 and 3.1, as well as the only available mode in 3.11, 95, 98, and Me), a driver could dynamically register upcalls for real-mode guests (within reason), all without either exerting control over the guest’s memory map or forcing the guest to do anything except a simple CALL to access it. The secret was that all the far addresses returned by the registration API referred to the exact same byte in memory, a protected-mode-only instruction whose attempted execution would trap into the hypervisor, and the trap handler would determine which upcall was meant by which of the redundant encodings was used.
And if that’s not unhinged enough for you: the boot code tried to locate the chosen instruction inside the firmware ROM, because that will have to be mapped into the guest memory map anyway. It did have a fallback if that did not work out, but it usually succeeded. This time, the secret (the knowledge of which will not make you happier, this is your final warning) is that the instruction chosen was ARPL, and the encoding of ARPL r/m16, AX starts with 63 hex, also known as the ASCII code of the lowercase letter C. The absolute madmen put the upcall entry point inside the BIOS copyright string.
(Incidentally, the ARPL instruction, “adjust requested privilege level”, is very specific to the 286’s weird don’t-call-it-capability-based segmented architecture... But it has a certain cunning to it, like CPU-enforced __user tagging of unprivileged addresses at runtime.)
Isn’t that an arbitrary string, though? Presumably AMI and Insyde have different copyright messages, so then what?
If the search doesn’t succeed or if you’ve set SystemROMBreakPoint=off in the [386Enh] section of SYSTEM.INI[1] or run WIN /D:S, then the trap instruction will instead be placed in a hypervisor-provided area of RAM that’s shared among all guests, accepting the risk that a misbehaving guest will stomp over it and break everything (don’t know where it fits in the memory map).
As to the chances of failing, well, I suspect the original target was the c in “(c)”, but for example Schulman shows his system having the trap address point at “chnologies Ltd.”, presumably preceded by “Phoenix Te”. AMI and Award were both “Inc.”, so that would also work. Insyde wasn’t a thing yet; don’t know what happened on Compaq or IBM machines. One way or another, looks like a c could be found somewhere often enough that the Microsoft programmers were satisfied with the approach.
[1] https://jeffpar.github.io/kbarchive/kb/071/Q71264/
At least most people design non-overlapping segments. And I'm not sure wasm would gain anything from it, being a virtual machine instead of a real one.
This multi-memory setup reminds me of my array juggling I had to do back then. While intellectually challenging it was not fun at all.
With 64-bit pointers, you can't really reserve all the possible space a pointer might refer to. So you end up doing manual bounds checks.
Can't bounds checks be avoided in the vast majority of cases?
See my reply to nagisa above (https://news.ycombinator.com/item?id=45283102). It feels like by using trailing unmapped barrier/guard regions, one should be able to elide almost all bounds checks that occur in the program with a bit of compiler cleverness, and convert them into trap handlers instead.
Yeah, certainly compiler smarts can remove many bounds checks (in particular for small deltas, as you mention), hoist them, and so forth. Maybe even most of them in theory?
Still, there are common patterns like pointer-chasing in linked list traversal where you just keep getting an unknown i64 pointer, that you just need to bounds check...
With 64-bit addresses, and the requirements for how invalid memory accesses should work, this is no longer possible. AND-masking does not really allow for producing the necessary traps for invalid accesses. So every one now needs some conditional before to validate that this access is in-bounds. The addresses cannot be trivially offset either as they can wrap-around (and/or accidentally hit some other mapping.)
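Roughly, and purely as an illustration in C (real engines emit machine code, and the details vary), the per-access difference looks like this:

```c
/* Illustrative only: the per-access cost difference between 32-bit and
 * 64-bit wasm memories. */
#include <stdint.h>
#include <string.h>

/* memory32: the engine reserved a full 4 GiB plus guard pages up front, so a
 * load is just base + zero-extended index; out-of-bounds indices fault in the
 * guard area and become traps, with no check on the hot path. */
uint32_t load_u32_mem32(uint8_t *base, uint32_t idx) {
    uint32_t v;
    memcpy(&v, base + idx, sizeof v);
    return v;
}

/* memory64: the full index range cannot be reserved, so each access needs an
 * explicit bounds check that traps on failure. */
uint32_t load_u32_mem64(uint8_t *base, uint64_t mem_len, uint64_t idx) {
    if (mem_len < sizeof(uint32_t) || idx > mem_len - sizeof(uint32_t))
        __builtin_trap();
    uint32_t v;
    memcpy(&v, base + idx, sizeof v);
    return v;
}
```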
Why does it need to trap? Can't they just make it UB?
Specifying that invalid accesses always trap is going to degrade performance, that's not a 64-bit problem, that's a spec problem. Even if you define it in WASM, it's still UB in the compiler so you aren't saving anyone from UB they didn't already have. Just make the trapping guarantee a debug option only.
[1]: http://catb.org/jargon/html/N/nasal-demons.html
Seems like they got overly attached to the guaranteed trapping they got on 32-bit and wanted to keep it even though it's totally not worth the cost of bounds checking every pointer access. Save the trapping for debug mode only.
Maybe. Bugs that come from spooky behavior at a distance are notoriously hard to debug, especially in production, and it's worthwhile to pay something to avoid that.
The biggest contributor to pointer arithmetic is offset reads into pointers: what gets generated for struct field accesses.
The other class of cases are when you're actually doing more general pointer arithmetic - usually scanning across a buffer. These are cases that typically get loop unrolled to some degree by the compiler to improve pipeline efficiency on the CPU.
In the first case, you can avoid the masking entirely by using an unmapped barrier region after the mapped region. So you can guarantee that if pointer `P` is valid, then `P + d` for small d is either valid, or falls into the barrier region.
In the second case, the barrier region approach lets you lift the mask check to the top of the unrolled segment. There's still a cost, but it's spread out over multiple iterations of a loop.
As a last step: if you can prove that you're stepping monotonically through some address space using small increments, then you can guarantee that even if theoretically the "end" of the iteration might step into invalid space, that the incremental stepping is guaranteed to hit the unmapped barrier region before that occurs.
It's a bit more engineering effort on the compiler side.. and you will see some small delta of perf loss, but it would really be only in the extreme cases of hot paths where it should come into play in a meaningful way.
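A small sketch of the trailing barrier-region idea, with an assumed barrier size and made-up names, just to show where the checks go:

```c
/* Sketch of the "trailing barrier region" approach described above: if an
 * unmapped region of BARRIER bytes sits right after the accessible heap, then
 * once a base pointer is known to be in bounds, field accesses at small
 * offsets need no further checks; they either hit valid memory or fault in
 * the barrier region. Illustration only. */
#include <stdint.h>
#include <stddef.h>

#define BARRIER (64 * 1024)        /* assumed size of the unmapped region */

struct node { uint64_t a, b, c; }; /* offsetof(struct node, b) << BARRIER */

uint64_t read_field_b(uint8_t *heap, uint64_t heap_len, uint64_t ptr) {
    if (ptr >= heap_len)           /* one check on the base pointer... */
        __builtin_trap();
    /* ...covers the field access too: heap + ptr + offsetof(struct node, b)
     * is either inside the heap or inside the barrier region, where it
     * faults. (ptr is assumed 8-byte aligned, as a compiler would ensure.) */
    return ((struct node *)(heap + ptr))->b;
}
```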
To operate on any other size, you need to insert extra instructions to mask addresses to the desired size before they are used.
On x86-64, the start of the linear memory is typically put into one of the two remaining segment registers: GS or FS. Then the code can simply use an address mode such as "GS:[RAX + RCX]" without any additional instructions for addition or bounds-checking.
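GCC even exposes this addressing mode directly through its x86 named address spaces, so the idea can be written in plain C; the sketch below assumes the GS base has already been pointed at the start of linear memory (e.g. via arch_prctl on Linux), and requires GCC 6+ targeting x86-64:

```c
/* Sketch of GS-relative addressing using GCC's x86 named address spaces.
 * Assumes the GS base register was set to the start of the module's linear
 * memory beforehand (e.g. arch_prctl(ARCH_SET_GS, base) on Linux). */
#include <stdint.h>

uint32_t wasm_load_u32(uint32_t addr) {
    const uint32_t __seg_gs *p = (const uint32_t __seg_gs *)(uintptr_t)addr;
    return *p;   /* compiles to a single mov from gs:[addr], no add or check */
}
```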
Sounds about right. Guess 512 GiB memory is the minimum to read email nowadays.
For video editing, 4GiB of completely uncompressed 1080p video in memory is only 86 frames, or about 3-4 seconds of video. You can certainly optimize this, and it's rare to handle fully uncompressed video, but there are situations where you do need to buffer this into memory. It's why most modern video editing machines are sold with 64-128GB of memory.
In the case of Figma, we have files with over a million layers. If each layer takes 4kb of memory, we're suddenly at the limit even if the webapp is infinitely optimal.
How is that data stored?
Because (2^32)÷(1920×1080×4) = 518 which is still low but not 86 so I'm curious what I'm missing?
(2^32)÷(1920×1080×4×3×2) = 86
So glad you asked. It's stored poorly because I'm bad at maths and I'm mixing up bits and bytes.
That's what I get for posting on HN while in a meeting.
If we think we need a more thoroughly virtualized machine than traditional operating system processes give us (which I think is obvious), then we should be honest and build a virtualization abstraction that is actually what we want, rather than converting a document reader into a video editor…
I'm going to assume you're being sincere. But even the crustiest among us can recognize that the modern purpose for web browsers is not (merely) documents. Chances are, many folks on HN in the last month have booked tickets for a flight or bought a home or a car or watched a cat video using the "document browser".
> If we think we need a more thoroughly virtualized machine than traditional operating system processes give us (which I think is obvious)...
Like ... the WASM virtual machine? What if the WASM virtual machine were the culmination of learning from previous not-quite-good-enough VMs?
WASM -- despite its name -- is not truly bound to the "document" browser.
Would you install a native app to book a flight? One for each company? Download updates for them every now and then, uninstall them when you run out of disk space, etc.?
I can ask the same question about every other activity we do in these non-native apps.
Unfortunately several of them are glorified webviews.
I am old enough to have lived through the days Internet meant a set of networking protocols, not ChromeOS Platform.
And on those days hard disks were still bloody expensive, by the way.
Isn't your phone providing a sandbox, a distribution system, a set of common runtime services, etc to get these native apps functional?
You don't have to squint to realize that these things we call "document browsers" are doing a lot of the same work that Apple/Google are doing with their mobile OSes.
All the OS frameworks that are available across most operating systems that don't fragment themselves into endless distributions?
My dear Lord! What world are you living in?
Take a look at all of the "mobile apps" you installed on your phone and tell me which of those would ever devote any resources to make an apt/rpm repository for their desktop applications.
Even the ones that want to have a desktop application cannot figure out how to reliably distribute their software. The Linux crowd itself is still in the middle of the flatpak vs AppImage holy war. Mark Shuttleworth is still beating the snap horse.
The Web as a platform is far from ideal, but if it weren't for it I would never have been able to switch to Linux as my primary base OS, and I would have to accept the Apple/Microsoft/Google oligopoly, just like we are forced to do in the mobile space.
As my old IT teacher said: you can use the browser on any OS. She also implied it requires no special skills, which is true if you are limited to the browser for the majority of the time.
So... are you saying that you are able to use Linux because all you are using is the browser?
But for some reason this takes 20M lines of code, which creates a moat that prevents browser competition.
(* including every web browser)
I am still shocked Google has not rubbed two brain cells together and built a serious Google ChromeOS version for developers with a real desktop environment and real access to Linux, and keeping the browser as sandboxed as they have. I would spend top dollar on such a laptop. Heck it could come with an easy way to install Android Studio, and native apps for things like Hangouts or whatever they call it now.
I could write those exact same bindings for my language, compile it to wasm, and then use the current WASI interface, but even that is pointless because at that point I have written a native app. What good reason would I have to run it through an emulator, especially when a modern OS is already sandboxing things?
If I am targeting the browser, my above point stands: unless the DOM is already a reasonably decent API for what needs to be built (it probably isn't; it's the best example of horrible API design you can get, in my opinion), I will need to build something on top of it, the prime example being React.
So I need to run my code in an interpreter, having built data structures to generate a document in an eDSL, which creates other data structures, which then calls into another API, which then calls into the OS/hardware layer.
When what I want to do, is call into the OS/hardware layer directly…
One good reason would be that “Which OS specifically?” is a question with at least 4-5 mainstream answers (and even more not-so-mainstream answers). That's what motivated the push from WWW-as-a-bunch-of-interconnected-documents to WWW-as-a-bunch-of-remote-hosted-apps in the first place: to be able to develop and deliver applications without having to worry quite so much about the client platform.
WASM is in this regard analogous to the olden days of Java/Silverlight/Flash, with many of the same tradeoffs (and some of the rough edges filed down due to lessons learned around browser integration and toolchains and such).
When using the web I have my own issues with that platform, or more specifically with using wasm in it, since it is literally useless for building web-based applications over JS/TS. It is being targeted more as a library platform that gets called into from JS than as an actual "assembly" language.
I don't understand any such thing.
Modern OS sandboxing doesn't give you memory safety from Wasm's memory model or the capability-based security of WASI.
>> When what I want to do, is call into the OS/hardware layer directly…
If that's what you want to do, then I'm not sure what we're even discussing in this thread. The only safe way to run such code is with a hypervisor.
If you care about memory safety, then use a "memory safe" language that guarantees the exact same thing you think WASM guarantees, except without all the pointless overhead of running a sandboxed interpreter inside a sandboxed operating system process; it is actually just pointless complexity at that point.
For native development WASM gives no benefits; the only useful parts that WASM might bring literally haven't been standardised, because it's a really hard problem to solve and has no use in the browser.
So wasm is designed for the browser and unless you only intend to embed a library in your existing JS application it is pointless because you are still restricted to the DOM.
Mind you, I think WASM is the best thing that has happened to the browser, but that's because I think the HTML/DOM is completely unsuitable for apps, and I hate developing with it so much that I won't do it, even if I have to switch careers.
I think WASM is a reasonable start to a proper virtual machine. But I think what we need is a "browser" that presents a virtual machine with a virtual monitor(s), virtual chunk of contiguous memory, virtual CPU(s), virtual networking, virtual filesystem, and basic drawing primitives, that executes WASM. The "browser" part would be that you can use a URL to point to a WASM program. The program would see itself as being the only thing on the machine (see, basically generalization of the OS process concept into a machine). The user would be able to put limits on what the virtual network can access, what parts of the OS filesystem the virtual filesystem could access, how many real CPUs (and a cpulimit) the virtual CPUs get, etc. So, sort of like a containerized Plan9. I would be happy to do this myself, but I just don't have the spare time to do it, so unless someone is willing to fund me, I'll hope someone sees this and catches the vision.
Using WASM in the web browser is a workaround.
https://donhopkins.medium.com/alan-kay-on-should-web-browser...
>Alan Kay answered: “Actually quite the opposite, if “document” means an imitation of old static text media (and later including pictures, and audio and video recordings).”
"virtual machine" is clearly not
that said, i love WASM in the browser, high time wrapping media with code to become "new media" wasn't stuck solely with a choice between JS and "plugins" like Java, Flash, or Silverlight
it's interesting to look back at a few "what might have been" alternate timelines, when the iPhone was intended to launch as an HTML app platform, or Palm Pre (under a former Apple exec, the "pod-father") intended the same with WebOS. if a VM running a web OS shows a PDF or HTML viewer in a frame, versus if a HTML viewer shows a VM running a web OS in a frame...
we're still working on figuring out whether new media and software distribution are the same.
today, writing Swift, or Nim, or whatever other LLVM language, and getting WASM -- I agree with you, it feels like a collective convergence on the next step of a common denominator
* note: those are all documents and document workflows with skeuomorphic analogs in the same headspace, and newspaper with "live pictures" has been a sci-fi trope for long enough TV news still can't bring themselves to say "video" (reminding us "movie" is to "moving" as "talkie" was to "talking") so extending document to include "media" is reasonable. but extending that further to be "arbitrary software" is no longer strictly document nor media
I do agree that we tend to run a lot in a web-browser or browser environment though. It seems like a pattern that started as a hack but grew into its own thing through convenience.
It would be interesting to sit down with a small group and figure out exactly what is good/bad about it and design a new thing around the desired pattern that doesn't involve a browser-in-the-loop.
Heh, reminds me of those boxes Sun used to make that only ran Java. (I don’t know how far down Java actually went; perhaps it was Solaris for the lower layers now that I think about it…)
I do miss the Solaris 10/OpenSolaris tech though. I don’t know anything that comes close to it today.
dtrace/zones/smf/zfs/iscsi/... and the integration between them all was top notch. One could create a zone, spin up a clone, do some computation, trash the filesystem and then just throw the clone away... in very short time. Also, that whole loop happened without interacting with zfs directly; I know that some of these things have been ported but the ports miss the integration.
eg: zfs on Linux is just a filesystem. zfs on Solaris was the base of a bunch of technology. smf tied much of it together.
eg: dtrace gave you access all the way down to individual read/write operations per disk in a raid-z and all the way up to the top of your application running inside of a zone. One tool with massive reach and very little overhead.
Not much compels me to go back to the ecosystem; I've been burned once already.
"What if we made a new WASM-based platform for apps, separate from the browser?"
And this comes from someone who started with Flash, built actual video editing apps with it, and for the last 25 years has built applications with the "it's not a web app, it's a desktop app that lives in a browser" attitude [1].
Even with Flash we often used a hybrid approach where you had two builds from the same codebase: a lite version running in the browser and an optional desktop app (AIR) with full functionality. SharedObjects and LocalConnection made this approach extremely feasible, as both instances were aware of each other and you could move data and objects between them in real time.
The premise is great, but it was never fully realized - sure you have few outliers like Figma, but building a real "desktop app" in a browser comes with a lot of quirks, and the resulting UX is just terrible in most cases.
[1] just to be clear, there's a huge difference between web page and web app ;D
[1] https://news.ycombinator.com/item?id=45200414
334 more comments available on Hacker News