Not Hacker News! (Beta) — AI companion for Hacker News

Nov 20, 2025 at 1:40 AM EST

Implementation of a Java Processor on a FPGA (2016)

mghackerlady
57 points
37 comments

Mood: thoughtful
Sentiment: neutral
Category: tech
Key topics: FPGA, Java Processor, Embedded Systems

A student project implementing a Java processor on an FPGA, exploring the intersection of software and hardware.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment: 53m after posting
Peak period: 28 comments (Day 1)
Avg / period: 28

Comment distribution: 28 data points (based on 28 loaded comments)

Key moments

  1. Story posted: Nov 20, 2025 at 1:40 AM EST (4d ago)
  2. First comment: Nov 20, 2025 at 2:33 AM EST (53m after posting)
  3. Peak activity: 28 comments in Day 1 (the hottest window of the conversation)
  4. Latest activity: Nov 20, 2025 at 11:37 AM EST (3d ago)


Discussion (37 comments)
Showing 28 of 37 comments
yodon
4d ago
1 reply
(2016)
kleiba
3d ago
Related: JOP: A Java Optimized Processor for Embedded Real-Time Systems (2005) [0]

It's an implementation of the Java virtual machine in hardware, also FPGA-based, see chapter 7.1 Hardware Platforms.

[0] https://backend.orbit.dtu.dk/ws/files/4127855/thesis.pdf

larsbrinkhoff
4d ago
4 replies
What happened to

1. Sun's JavaStation, 2. ARM's Jazelle, ??? 3. Profit!

dehrmann
4d ago
1 reply
It's more like JITs got good.
ck45
3d ago
5 replies
I never understood why AOT never took off for Java. "Write once, run anywhere" quickly faded as an argument, and the number of platforms a software package needs to support is rather small.
gf000
3d ago
1 reply
Well, one aspect is how dynamic the platform is.

It simply defaults to an open world where you could just load a class from any source at any time to subclass something, or straight up apply some transformation to classes as they load via instrumentation. And defaults matter, so AOT compilation is not completely trivial (though it's not too bad either with GraalVM's native image, given that the framework you use (if any) supports it).

Meanwhile most "AOT-first" languages assume a closed-world where everything "that could ever exist" is already known fully.
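The open-world default described above fits in a few lines. A minimal sketch (the class name here is just an illustrative runtime string; any source would do):

```java
// Minimal sketch of Java's "open world": the class to instantiate is only
// known at runtime, so a closed-world AOT compiler cannot enumerate all
// reachable code ahead of time without extra configuration.
public class OpenWorldDemo {
    public static void main(String[] args) throws Exception {
        // In real systems this string might come from a config file,
        // a plugin directory, or the network.
        String className = args.length > 0 ? args[0] : "java.util.ArrayList";
        Class<?> cls = Class.forName(className);              // resolved at runtime
        Object instance = cls.getDeclaredConstructor().newInstance();
        System.out.println("Loaded: " + instance.getClass().getName());
    }
}
```

This is exactly the pattern that tools like GraalVM's native image can only handle with explicit configuration, since the reachable classes are not statically visible.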

pjmlp
3d ago
Except when they support dynamic linking they pay the indirect call cost that JITs can remove.
pjmlp
3d ago
3 replies
Because developers don't like to pay for tools.

https://en.wikipedia.org/wiki/Excelsior_JET

https://www.ptc.com/en/products/developer-tools/perc

https://www.aicas.com/products-services/jamaicavm/

It is now getting adopted because GraalVM and OpenJ9 are available for free.

Also while not being proper Java, Android does AOT since version 5, mixed JIT/AOT since version 7.

EDIT: Fixed the sentence regarding Android versions.

pjc50
3d ago
You don't have to pay for dotnet AOT.
nikanj
3d ago
Developers pay for tools gladly when the pricing model isn’t based on how much money you’re making.

I’m happy to drop a fixed €200/mo on Claude, but I’d never sign paperwork that required us to track user installs and deliver $0.02 per install to someone.

rjsw
3d ago
You could do AOT Java using gcj, it didn't need commercial tools.
gunnarmorling
3d ago
> I never understood why AOT never took off for Java.

GraalVM native images certainly are being adopted, the creation of native binaries via GraalVM is seamlessly integrated into stacks like Quarkus or Spring Boot. One small example would be kcctl, a CLI client for Kafka Connect (https://github.com/kcctl/kcctl/). I guess it boils down to the question of what constitutes "taking off" for you?

But it's also not that native images are unambiguously superior to running on the JVM. Build times definitely leave something to be desired, not all third-party libraries can easily be used, not all GCs are supported, the closed-world assumption is not always practical, and peak performance may be better with a JIT. So the way I see it, the Java community currently treats AOT-compiled apps as a tactical tool, used when their advantages (e.g. fast start-up) matter.

That said, interesting work is happening in OpenJDK's Project Leyden, which aims to move more work to AOT while being less disruptive to the development experience than GraalVM native binaries. Arguably, if you're using CDS, you are using AOT.
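For context on the closed-world registration the comments above allude to: GraalVM native image requires reflective targets to be declared ahead of time in reachability metadata. A minimal `reflect-config.json` fragment in GraalVM's documented format (the class name is purely illustrative):

```json
[
  {
    "name": "com.example.OrderDto",
    "allDeclaredConstructors": true,
    "allDeclaredMethods": true
  }
]
```

Frameworks like Quarkus and Spring Boot generate much of this metadata automatically, which is a large part of why they make native-image adoption "seamless".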

dehrmann
3d ago
I'm not sure how much Hotspot can do this, but JIT means you can target different CPUs, taking advantage of specific extensions or CPU quirks. It can also mean better cache performance because you don't need branches to handle different chips, so the branch is gone and the code is smaller.
xxs
3d ago
dynamic class loading is a major issue, and it's an integral feature. Realistically, there are very few cases that AOT and Java make sense.
IshKebab
3d ago
1 reply
People want to run things other than Java.

We did see a recent attempt to do hardware-based memory management again with Vypercore, but they ran out of money.

I think part of the problem with any performance-related microarchitectural innovation is that unless you are one of the big players (e.g. Qualcomm, Apple, Intel, AMD, Nvidia), you already have a significant performance disadvantage just due to access to process nodes and design manpower. So unless you have an absolutely insane performance trick, it's still not going to make sense to buy your chip.

noir_lord
3d ago
They have the volume as well, if you do carve out a niche they’ll just add it and roll over you.

That’s held for decades, though I think it only really worked while computers were doubling in speed every 12-18 months. For a while they scaled horizontally (more cores) rather than through radical IPC improvements, so we might see the rise of proper co-processors again (though nothing stops the successful ones getting put on die, a direction Strix Point is already heading in).

phire
3d ago
1 reply
Jazelle worked for its target market (or at least, I've never seen anyone claim otherwise).

But its target market wasn't "faster java". Instead Jazelle promised better performance than an interpreter, with lower power draw than an interpreter, but without the memory footprint and complexity of a JIT. It was never meant to be faster than a JIT.

Jazelle made a lot of sense in the early 2000s, when dumb phones were running J2ME applets on devices with only 1-4MB of memory, but we quickly moved on to smartphones with 64MB+ of memory, and it just made more sense to use a proper JIT.

---------

JavaStation might as well have been vaporware. Sure, the product line existed, but the promised "Super JavaStation" with a "Java coprocessor" never arrived, so you were really just paying Sun for a standard computer with Java pre-installed.

markb139
3d ago
I briefly worked in a team that implemented a JVM on a mobile OS (before the iPhone), and one of the senior devs said Jazelle was in effect very inefficient because of all the context switching between ARM mode and Jazelle mode. It turned out a carefully tuned ARM JVM was in practice the best.
mghackerlady
3d ago
The JavaStation is what led me to this. They sucked, Java OS sucked, and the whole idea was DOA precisely because they didn't do something like this and instead decided to make a shitty SPARC machine for the 5 people that wanted a Java-branded thin client.
codeflo
3d ago
4 replies
It's easily imaginable that there are new CPU features that would help with building an efficient Java VM, if that's the CPU's primary purpose. Just off the top of my head, one might want a form of finer-grained memory virtualization that could enable very cheap concurrent garbage collection.

But having Java bytecode as the actual instruction set architecture doesn't sound too useful. It's true that any modern processor has a "compilation step" into microcode anyway, so in an abstract sense, that might as well be some kind of bytecode. But given the high-level nature of Java's bytecode instructions in particular, there are certainly some optimizations that are easy to do in a software JIT, and that just aren't practical to do in hardware during instruction decode.

What I can imagine is a purpose-built CPU that would make the JIT's job a lot easier and faster than compiling for x86 or ARM. Such a machine wouldn't execute raw Java bytecode, rather, something a tiny bit more low-level.

pron
3d ago
2 replies
Running Java workloads is very important for most CPUs these days, and both ARM and Intel consult with the Java team on new features (although Java's needs aren't much different from those of C++). But while you're right that with modern JITs, executing Java bytecode directly isn't too helpful, our concurrent collectors are already very efficient (they could, perhaps, take advantage of new address masking features).

I think there's some disconnect between how people imagine GCs work and how the JVM's newest garbage collectors actually work. Rather than exacting a performance cost, they're more often a performance boost compared to more manual or eager memory management techniques, especially for the workloads of large, concurrent servers. The only real cost is in memory footprint, but even that is often misunderstood, as covered beautifully in this recent ISMM talk (which I would recommend to anyone interested in memory management of any kind): https://youtu.be/mLNFVNXbw7I. The key is that moving-tracing collectors can turn available RAM into CPU cycles, and some memory management techniques under-utilise available RAM.

xmcqdpt2
3d ago
1 reply
> The only real cost is in memory footprint

There are also load and store barriers, which add work when accessing objects on the heap. In many cases, adding work on the parallel path is good if it allows you to avoid single-threaded sections, but not in all cases. Single-threaded programs with a lot of reads can be pretty significantly impacted by barriers:

https://rodrigo-bruno.github.io/mentoring/77998-Carlos-Gonca...

The Parallel GC is still useful sometimes!
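As a small aside, you can check which collectors a given run actually selected, e.g. when experimenting with the real HotSpot flag `-XX:+UseParallelGC`, via the standard management API. A minimal sketch:

```java
// Lists the garbage collectors active in the current JVM, using the
// standard java.lang.management API. Run with e.g. -XX:+UseParallelGC
// or -XX:+UseZGC to see the selection change.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcInfo {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " (collections so far: " + gc.getCollectionCount() + ")");
        }
    }
}
```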

pron
3d ago
Sure, but other forms of memory management are costly, too. Even if you allocate everything from the OS upfront and then pool stuff, you still need to spend some computational work on the pool [1]. Working with bounded memory necessarily requires spending at least some CPU on memory management. It's not that the alternative to barriers is zero CPU spent on memory management.

> The Parallel GC is still useful sometimes!

Certainly for batch-processing programs.

BTW, the paper you linked is already at least somewhat out of date, as it's from 2021. The implementation of the GCs in the JDK changes very quickly. The newest GC in the JDK (and one that may be appropriate for a very large portion of programs) didn't even exist back then, and even G1 has changed a lot since. (Many performance evaluations of HotSpot implementation details may be out of date after two years.)

[1]: The cheapest, which is similar in some ways to moving-tracing collectors, especially in how it can convert RAM to CPU, is arenas, but they can have other kinds of costs.
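A tiny illustration of the point that pooling is not free (the `Pool` type and its names are illustrative, not from any real API): every acquire and release pays for deque bookkeeping, work that a bump-pointer allocator in a moving collector largely avoids.

```java
// Illustrative object pool: even this trivial version spends CPU on a
// deque operation per acquire() and per release(), so pooling trades GC
// work for other memory-management work rather than eliminating it.
import java.util.ArrayDeque;
import java.util.function.Supplier;

final class Pool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    Pool(Supplier<T> factory) {
        this.factory = factory;
    }

    T acquire() {
        T obj = free.poll();        // bookkeeping on every acquire
        return obj != null ? obj : factory.get();
    }

    void release(T obj) {
        free.push(obj);             // and on every release
    }
}
```

A more realistic pool also needs reset logic, bounds, and (for concurrent use) synchronization, all of which add further CPU cost on the hot path.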

drob518
3d ago
So, the guys at Azul actually had this sort of business plan back in 2005, but they found that it was unsustainable and turned their attention to the software side, where they have done great work. I remember having a discussion with someone about Java processors, and my comment was just “Lisp machines.” It’s very difficult to outperform code running on commodity processor architectures. That train is so big and moving so fast, you really have to pick your niche (e.g. GPUs) to deliver something that outperforms it. Too much investment ($$$ and brainpower) flowing that direction. Even if you’re successful for one generation, you need to grow sales and have multiple designs in the pipeline at once. It’s nearly impossible.

That said, I do see opportunities to add “assistance hardware” to commodity architectures. Given the massive shift to managed runtimes, all of which use GC, over the last couple decades, it’s shocking to me that nobody has added a “store barrier” instruction or something like that. You don’t need to process Java in hardware or even do full GC in hardware, but there are little helps you could give that would make a big difference, similar to what was done with “multimedia” and crypto instructions in x86 originally.

hayley-patton
3d ago
1 reply
> What I can imagine is a purpose-built CPU that would make the JIT's job a lot easier and faster than compiling for x86 or ARM. Such a machine wouldn't execute raw Java bytecode, rather, something a tiny bit more low-level.

This is approximately exactly what Azul Systems did, doing a bog-standard RISC with hardware GC barriers and transactional memory. Cliff Click gave an excellent talk on it [0] and makes your argument around 20:14.

[0] https://www.youtube.com/watch?v=5uljtqyBLxI

MangoToupe
3d ago
I imagine that's where the request for finer grained virtualization comes from
maxdamantus
3d ago
> It's true that any modern processor has a "compilation step" into microcode anyway, so in an abstract sense, that might as well be some kind of bytecode.

This.

> What I can imagine is a purpose-built CPU that would make the JIT's job a lot easier and faster than compiling for x86 or ARM. Such a machine wouldn't execute raw Java bytecode, rather, something a tiny bit more low-level.

My prediction is that eventually a lot of software will be written in such a way that it runs in "kernel mode" using a memory-safe VM to avoid context switches, so reading/writing pipes and accessing pages corresponding to files reduce to function calls, which can easily happen billions of times per second, as opposed to "system calls" or page faults, which only happen 10 or 20 million times per second due to context switching.

This is basically what eBPF is used for today. I don't know if it will expand to be the VM that I'm predicting, or if kernel WASM [1] or something else will take over.

From there, it seems logical that CPU manufacturers would provide compilers ("CPU drivers"?) that turn bytecode into "microcode" or whatever the CPU circuitry expects to be in the CPU during execution, skipping the ISA. This compilation could be done in the form of JIT, though it could also be done AOT, either during installation (I believe ART in Android already does something similar [0], though it currently emits standard ISA code such as aarch64) or at the start of execution when it finds that there's no compilation cache entry for the bytecode blob (the cache could be in memory or on disk, managed by the OS).

Doing some of the compilation to "microcode" in regular software before execution rather than using special CPU code during execution should allow for more advanced optimisations. If there are patterns where this is not the case (eg, where branch prediction depends on runtime feedback), the compilation output can still emit something analogous to what the ISAs represent today. The other advantage is of course that CPU manufacturers are more free to perform hardware-specific optimisations, because the compiler isn't targeting a common ISA.

Anyway, these are my crazy predictions.

[0] https://source.android.com/docs/core/runtime/jit-compiler

[1] https://github.com/wasmerio/kernel-wasm (outdated)

mghackerlady
3d ago
I know it prolly isn't the most practical, but neither were lisp machines and we still love them

9 more comments available on Hacker News

View full discussion on Hacker News
ID: 45989650 · Type: story · Last synced: 11/20/2025, 9:01:17 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

Read Article · View on HN


© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.