Open Source Speech Foundation Model That Runs Locally on CPU in Real-Time
Posted 3 months ago · Active 3 months ago
huggingface.co · Tech · story
Sentiment: excited, positive
Debate: 20/100
Key topics
Artificial Intelligence
Speech Synthesis
Open Source
The post shares an open-source speech foundation model that runs in real time on CPU, sparking discussion of its potential applications and comparisons with other models.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
First comment: N/A
Peak period: 7 comments (0-12h)
Avg / period: 3.3
Key moments
- Story posted: Oct 2, 2025 at 10:47 AM EDT (3 months ago)
- First comment: Oct 2, 2025 at 10:47 AM EDT (0s after posting)
- Peak activity: 7 comments in 0-12h (hottest window of the conversation)
- Latest activity: Oct 8, 2025 at 5:27 AM EDT (3 months ago)
ID: 45450363 · Type: story · Last synced: 11/20/2025, 4:02:13 PM
The main idea: frontier-quality text-to-speech, but small enough to run in realtime on CPU. No GPUs, no cloud APIs, no rate limits.
Why we built this:
- Most speech models today live behind paid APIs → privacy tradeoffs, recurring costs, and external dependencies.
- With Air, you get full control, privacy, and zero marginal cost.
- It enables new use cases where running speech models on-device matters (edge compute, accessibility tools, offline apps).
Repo: https://github.com/neuphonic/neutts-air
Would love feedback from HN on performance, applications, and contributions.
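For readers who want a feel for what "run it locally" might look like in practice, here is a minimal sketch. The import path, class name, constructor arguments, and `infer` call are assumptions for illustration, not the project's documented API; the repo's README is the authoritative reference.
```python
# Illustrative sketch only -- names and signatures are assumptions,
# not the actual neutts-air API; see the repo README for real usage.
import soundfile as sf

from neuttsair.neutts import NeuTTSAir  # hypothetical import path

# Load everything on CPU -- the point of the post: no GPU, no cloud API.
tts = NeuTTSAir(
    backbone_repo="neuphonic/neutts-air",  # model weights on Hugging Face
    backbone_device="cpu",
    codec_repo="neuphonic/neucodec",       # companion neural audio codec
    codec_device="cpu",
)

# Synthesize speech locally -- no network call, no rate limit.
wav = tts.infer("Hello from a speech model running on your own CPU.")
sf.write("hello.wav", wav, 24_000)         # sample rate is an assumption
```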
https://github.com/neuphonic/neutts-air/issues/15#issuecomme...
So "no rate limits", while true, is kind of setting different expectations.
Appears to use a proprietary codec as well.
You can listen to the model in this video: https://www.youtube.com/watch?v=YAB3hCtu5wE
The codec is open source: https://huggingface.co/neuphonic/neucodec
> Audio Codec: NeuCodec - our proprietary neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook
( https://huggingface.co/neuphonic/neutts-air#model-details )
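For anyone unsure what "a single codebook" buys you: the encoder's continuous features are snapped to the nearest entry in one learned table of vectors, and only the entry indices need to be stored or transmitted, which is what keeps the bitrate low. A generic PyTorch sketch of that lookup step (not NeuCodec's actual implementation):
```python
import torch

def quantize_single_codebook(features, codebook):
    """Map each frame's feature vector to its nearest codebook entry.

    features: (num_frames, dim) encoder outputs
    codebook: (codebook_size, dim) learned vectors
    Returns (indices, reconstructed); only the indices get stored.
    """
    # Pairwise distances between every frame and every codebook entry.
    dists = torch.cdist(features, codebook)   # (num_frames, codebook_size)
    indices = dists.argmin(dim=1)             # one small integer per frame
    reconstructed = codebook[indices]         # what the decoder works from
    return indices, reconstructed

# Toy example: 100 frames of 256-dim features, a 1024-entry codebook.
feats = torch.randn(100, 256)
book = torch.randn(1024, 256)
idx, recon = quantize_single_codebook(feats, book)
print(idx.shape)  # torch.Size([100]) -- ~10 bits per frame before entropy coding
```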
This says it was trained on proprietary data.
Sorry if I'm missing the point.
Demo sounds great.
Can it run on a GPU? Would it be faster?
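For what it's worth, if the backbone is a standard PyTorch module, moving it to a GPU usually looks like the generic pattern below (not the project's documented flow, and whether it is actually faster depends on the model size and how much you batch):
```python
import torch

# Pick the GPU when one is available, otherwise stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(512, 512)  # stand-in for the TTS backbone
model = model.to(device).eval()

with torch.inference_mode():
    x = torch.randn(1, 512, device=device)
    y = model(x)  # same call either way; only the device changes

print(device, y.shape)
```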