Apex GPU | Not Hacker News!

Showing 6 comments

ArchitectAI

30 days ago

1 reply

I built a lightweight (93KB) CUDA→AMD translation layer using LD_PRELOAD.

It intercepts CUDA API calls at runtime and translates them to HIP/rocBLAS/MIOpen.

No source code needed. No recompilation. Just:

  LD_PRELOAD=./libapex_hip_bridge.so ./your_cuda_app

Currently supports:

- 38 CUDA Runtime functions

- 15+ cuBLAS operations (matrix multiply, etc.)

- 8+ cuDNN operations (convolutions, pooling, batch norm)

- PyTorch training and inference

Built in ~10 hours using dlopen/dlsym for dynamic loading. 100% test pass rate.

The goal: break NVIDIA's CUDA vendor lock-in and make AMD GPUs viable for existing CUDA workloads without months of porting effort.
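
For readers unfamiliar with the LD_PRELOAD approach described in this post, here is a minimal sketch of how one intercepted call could be forwarded. It is not the project's actual code. It assumes the HIP runtime is installed as libamdhip64.so, uses plain int stand-ins for the CUDA and HIP status enums (0 means success in both), and the helper resolve_hip and the code SHIM_ERR_NO_BACKEND are hypothetical names made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the real status enums: both runtimes use 0 for success. */
typedef int cudaError_t;
typedef int hipError_t;

/* Hypothetical code returned when the HIP runtime cannot be loaded. */
enum { SHIM_ERR_NO_BACKEND = 999 };

/* hipMalloc has the same shape as cudaMalloc: (void **ptr, size_t size). */
typedef hipError_t (*hip_malloc_fn)(void **ptr, size_t size);

/* Hypothetical helper: open the HIP runtime once, then look up a symbol.
   Not thread-safe; a real shim would guard this with pthread_once. */
static void *resolve_hip(const char *name)
{
    static void *handle;
    assert(name != NULL);
    if (!handle)
        handle = dlopen("libamdhip64.so", RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        fprintf(stderr, "shim: cannot load HIP runtime: %s\n", dlerror());
        return NULL;
    }
    return dlsym(handle, name);
}

/* Exported under the CUDA name. When this library is preloaded, the
   dynamic linker binds the application's cudaMalloc calls to this
   definition instead of the one in the CUDA runtime. */
cudaError_t cudaMalloc(void **devPtr, size_t size)
{
    static hip_malloc_fn real;
    if (!real)
        real = (hip_malloc_fn)resolve_hip("hipMalloc");
    if (!real)
        return SHIM_ERR_NO_BACKEND;
    return real(devPtr, size);
}
```

Built with something like `gcc -shared -fPIC shim.c -o libshim.so -ldl`, this would be loaded the same way as the command above. Interposition of this kind only reaches calls that go through a shared library, so an application that links the CUDA runtime statically would not be affected.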

bigyabai

30 days ago

> ## First Comment (Expand on technical details)

> Post this as your first comment after submitting:

lmfao

throwaway2027

30 days ago

1 reply

[flagged]

tomhow

30 days ago

Please don't give more oxygen to trolls. We detached and banned the account. Any time you see this kind of thing, flag the comment, and if you want to be extra-helpful, email us – hn@ycombinator.com.

throwaway2027

30 days ago

Holy AI Slop

AMDAnon

30 days ago

Despite being vibecoded, swapping out CUDA for another shared library is technically sound.

Probably violates EULAs, though, which is why AMD has HIP.
