Verifiable ML Without Determinism: Tolerance-Aware Optimistic Verification
Posted 2 months ago · arxiv.org · Research story
Key topics: Machine Learning, Verification, Determinism
Researchers propose a new method for verifiable machine learning without requiring determinism; the story drew interest on HN, though discussion was limited.
Snapshot generated from the HN discussion
Key moments
- Story posted: Oct 21, 2025 at 9:27 AM EDT (2 months ago)
- First comment: Oct 21, 2025 at 9:27 AM EDT (0s after posting)
- Peak activity: 1 comment in the opening window
- Latest activity: Oct 21, 2025 at 9:27 AM EDT
ID: 45655524 · Type: story · Last synced: 11/17/2025, 9:09:06 AM
We built a system that makes ML results verifiable without requiring determinism. Instead of demanding exact equality, we verify outputs up to principled, per-operator error bounds and resolve disagreements with an optimistic, Merkle-anchored dispute game.
What's inside (precise + concise):
Semantics: "Tolerance-aware correctness" for tensor programs and operator-specific acceptance regions induced by IEEE-754 rounding.
Two error models:
Theoretical IEEE-754 bounds (sound, per-operator, element-wise; conservative but cheap to verify).
Empirical percentile thresholds calibrated across GPUs (tight, model-specific).
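As a rough illustration of how the two error models translate into acceptance checks, here is a minimal sketch. The function names, the `n_ops` rounding-count parameter, and the calibrated `p99_abs_err` value are hypothetical, not the paper's API:

```python
U = 2.0 ** -53  # unit roundoff for IEEE-754 double precision

def within_theoretical_bound(claimed, reference, n_ops=1):
    # Sound, elementwise acceptance region: each of the n_ops roundings
    # perturbs the exact value by a relative factor of at most U, so two
    # valid floating-point evaluations can differ by roughly 2*n_ops*U.
    tol = 2 * n_ops * U * abs(reference)
    return abs(claimed - reference) <= tol

def within_empirical_threshold(claimed, reference, p99_abs_err):
    # Tight, model-specific check: p99_abs_err would be calibrated
    # offline across GPUs for this operator (value supplied by caller).
    return abs(claimed - reference) <= p99_abs_err
```

The theoretical bound never falsely rejects a correct result, at the cost of being loose; the empirical threshold is tight but only as trustworthy as its calibration.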
Dispute protocol: If a result is challenged, we recursively partition the traced graph (with Merkle proofs over subgraph commitments) until the dispute narrows to a single operator. At the leaf, we either (i) certify the output with the theoretical bound or (ii) run a small committee vote against the empirical threshold. Provers caught outside the bounds are slashed.
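The recursive partitioning can be sketched as a textbook bisection over a linearized trace, assuming each per-operator hash chains in all prior state so any divergence persists to the end of the trace. Names and structure here are illustrative, not the actual protocol:

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def chain(per_op_outputs):
    # Chained commitments: each hash covers all preceding operators,
    # so once two traces diverge they stay divergent.
    acc, out = b"", []
    for v in per_op_outputs:
        acc = h(acc + v)
        out.append(acc)
    return out

def first_divergence(prover, challenger):
    # Invariant: both parties agree on every hash before `lo` and
    # disagree somewhere in [lo, hi). O(log n) comparisons narrow the
    # dispute to the single operator that must be re-checked at the leaf.
    lo, hi = 0, len(prover)
    while lo < hi:
        mid = (lo + hi) // 2
        if prover[mid] == challenger[mid]:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

In the real system the partition is over a graph rather than a list, but the logarithmic narrowing idea is the same.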
Runtime + coordinator: PyTorch-compatible runtime that traces graphs, computes bounds on the fly, records subgraph I/O, and emits/validates commitments; a lightweight coordination layer (we prototyped on Ethereum for authenticated logs/bonds, but the protocol doesn't depend on a blockchain). Overhead ~0.3% latency on Qwen3-8B; no extra memory beyond native subgraph execution.
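A minimal sketch of the kind of per-subgraph commitment the runtime might emit, binding an operator identity to its recorded I/O. The field names and schema are assumptions for illustration, not the actual commitment format:

```python
import hashlib
import json

def commit_subgraph(op_name, input_hashes, output_hashes):
    # Deterministic serialization (sorted keys) so prover and verifier
    # derive identical commitments from identical recorded I/O.
    payload = json.dumps(
        {
            "op": op_name,
            "in": [x.hex() for x in input_hashes],
            "out": [y.hex() for y in output_hashes],
        },
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()
```

Commitments like these are what the coordination layer logs, so a blockchain is only one possible authenticated log, as the post notes.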
Scope: Designed for the open-model setting (public weights/graphs enable subgraph extraction and thresholding). Closed-model APIs can still expose commitments to permissioned verifiers. No TEEs required; no deterministic kernels needed.
We’d love feedback on blind spots in the threat model, better partition policies for the dispute game, and where empirical thresholds might be gamed or leak information.