Ask HN: What do you use for speaker diarization?
I am looking for a fire-and-forget solution akin to Whisper, where I can give it a WAV file with around 12 speakers and get back a diarization in the format (speaker_1, speaker_2, etc.).
whisper.cpp gives labels like speaker_turn, which is not what I am looking for; I need to know who said what.
NVIDIA NeMo only works with 4 speakers and unfortunately is not good enough for me.
Do you have an open-source solution that you can suggest? Or a potential pipeline?
Much appreciated!
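To make the desired output concrete, here is a minimal sketch of the "who said what" step in such a pipeline. It assumes two inputs that are not specified in the post: speaker turns from some diarization tool and timestamped text segments from a transcriber. The names `Turn`, `Segment`, `overlap`, and `assign_speakers` are hypothetical, and the sample data is made up for illustration. Each transcript segment is given the label of the speaker whose turns overlap it the most.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarization turn: one speaker talking from start to end (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """A transcribed chunk of speech with its time span (seconds)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker that overlaps it the most."""
    labeled = []
    for seg in segments:
        totals = {}  # speaker label -> total overlapping seconds
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments that overlap no turn at all are marked "unknown".
        speaker = max(totals, key=totals.get) if totals else "unknown"
        labeled.append((speaker, seg.text))
    return labeled


# Illustrative data only; real turns and segments would come from the tools.
turns = [Turn(0.0, 4.0, "speaker_1"), Turn(4.0, 9.0, "speaker_2")]
segments = [
    Segment(0.5, 3.5, "Hello everyone."),
    Segment(3.8, 8.0, "Thanks, let's start."),
]

for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

This only covers the merge step. Producing accurate turns for around 12 speakers is the hard part and is left to whatever diarization tool is used.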
The author is seeking a 'fire and forget' open-source solution for speaker diarization that can handle audio files with around 12 speakers and provide a clear identification of who said what.
Posted: 11/18/2025, 6:27:16 AM
First comment: 11/18/2025, 2:11:18 PM (8h after posting)