Speech and Language Processing (3rd Ed. Draft)
web.stanford.edu · story · Posted about 1 month ago · Active 26 days ago
Key topics
Conversational UI
Audio_recognition
Computational Linguistics
Discussion Activity
Moderate engagement: 13 comments. First comment 7 days after posting; peak of 10 comments in the 168-180h window; average of 6.5 comments per period.
Key moments
- Story posted: Dec 8, 2025 at 1:50 AM EST
- First comment: Dec 15, 2025 at 11:15 AM EST (7 days after posting)
- Peak activity: 10 comments in the 168-180h window
- Latest activity: Dec 15, 2025 at 2:55 PM EST (26 days ago)
ID: 46189205 · Type: story · Last synced: 12/15/2025, 7:35:33 PM
But there's a benefit to the fact that deep learning is now the "lingua franca" across machine learning fields. In 2008, I would have struggled to usefully share ideas with, say, a researcher working on computer vision.
Now neural networks act as a shared language across ML, and ideas can much more easily flow across speech recognition, computer vision, AI in medicine, robotics, and so on. People can flow too, e.g., Dario Amodei got his start working on Baidu's DeepSpeech model and now runs Anthropic.
Makes it a very interesting time to work in applied AI.
In what fields did neural networks replace Gaussian mixtures?
In speech recognition, for one: the classic pipeline stacked acoustic pre-processing, Gaussian mixture models (GMMs) for acoustic modeling, and hidden Markov models (HMMs) for sequence modeling. Now those layers are neural nets, so acoustic pre-processing, GMM, and HMM are all subsumed by the neural network and trained end-to-end.
One early piece of work here was DeepSpeech2 (2015): https://arxiv.org/pdf/1512.02595
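To make "subsumed by the neural network and trained end-to-end" concrete, here is a minimal sketch of a CTC-trained acoustic model in PyTorch. It is my own illustration, not the DeepSpeech2 architecture; the layer sizes, feature dimensions, and character inventory are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# Toy end-to-end acoustic model: log-mel features in, per-frame character
# log-probabilities out. A single CTC loss replaces the separate GMM/HMM
# training stages of the classic pipeline. (Illustrative sizes only.)

class TinySpeechModel(nn.Module):
    def __init__(self, n_mels=80, hidden=256, n_chars=29):  # 26 letters + space + apostrophe + CTC blank
        super().__init__()
        self.conv = nn.Conv1d(n_mels, hidden, kernel_size=11, stride=2, padding=5)
        self.rnn = nn.GRU(hidden, hidden, num_layers=2, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_chars)

    def forward(self, feats):                  # feats: (batch, time, n_mels)
        x = self.conv(feats.transpose(1, 2))   # (batch, hidden, time/2)
        x, _ = self.rnn(x.transpose(1, 2))     # (batch, time/2, 2*hidden)
        return self.out(x).log_softmax(-1)     # per-frame character log-probs

model = TinySpeechModel()
feats = torch.randn(4, 200, 80)                # fake log-mel features
log_probs = model(feats)                       # (4, 100, 29)
targets = torch.randint(1, 29, (4, 20))        # fake character transcripts
loss = nn.CTCLoss(blank=0)(log_probs.transpose(0, 1),   # CTC expects (time, batch, chars)
                           targets,
                           torch.full((4,), 100),
                           torch.full((4,), 20))
loss.backward()                                # one loss trains the whole stack
```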
When you work closely with transformers for a while, you do start to see things reminiscent of old-school NLP pop up: decoder-only LLMs are really just fancy Markov chains with a very powerful/sophisticated state representation, "attention" looks a lot like learning kernels for various tweaks on kernel smoothing, and so on.
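The attention/kernel-smoothing resemblance is easy to see in code. The snippet below is a minimal sketch of my own (not from the comment): single-query softmax attention written as Nadaraya-Watson kernel smoothing, where the exponentiated scaled dot product plays the role of the kernel and the output is a kernel-weighted average of the values.

```python
import numpy as np

# Single-query softmax attention as kernel smoothing: weight each value v_i
# by a normalised kernel k(q, k_i) = exp(q . k_i / sqrt(d)).

def softmax_attention(q, K, V):
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)           # similarity of the query to each key
    w = np.exp(scores - scores.max())     # unnormalised kernel weights
    w /= w.sum()                          # normalise -> smoothing weights
    return w @ V                          # kernel-weighted average of values

rng = np.random.default_rng(0)
q = rng.normal(size=4)                    # one query vector
K = rng.normal(size=(6, 4))               # six keys
V = rng.normal(size=(6, 4))               # six values
print(softmax_attention(q, K, V))
```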
Oddly, I almost think another AI winter (or, hopefully, just an AI cool-down) would give researchers and practitioners alike a chance to start exploring these models more closely. I'm a bit surprised how few people really spend their time messing with the internals of these things, and every time they do, something interesting seems to come out of it. But currently nobody I know in this space, from researchers to product folks, seems to have time to catch their breath, let alone really reflect on the state of the field.
The field of Explainable AI (which also goes by interpretable AI, transparent AI, and other equivalent names) is looking for talent, both in academia and industry.
Among screen reader users, for example, formant-based TTS is still wildly popular, and I don't think that's going to change anytime soon. The speed, predictability, and responsiveness are unmatched by any newer technology.
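For readers who have never met formant synthesis, the sketch below shows why it is so cheap and predictable: a steady vowel is just a pulse train run through a handful of fixed second-order resonators. This is a toy, Klatt-style cascade of my own, using rough textbook formant values for /a/; it is not taken from any particular screen-reader engine.

```python
import numpy as np
from scipy.signal import lfilter
from scipy.io import wavfile

fs, f0, dur = 16000, 110, 0.5                        # sample rate, pitch, seconds
formants = [(730, 90), (1090, 110), (2440, 140)]     # (centre freq, bandwidth) in Hz, roughly /a/

# Glottal source: an impulse train at the pitch period.
source = np.zeros(int(fs * dur))
source[::fs // f0] = 1.0

# Cascade of two-pole resonators, one per formant.
signal = source
for freq, bw in formants:
    r = np.exp(-np.pi * bw / fs)
    a = [1.0, -2.0 * r * np.cos(2.0 * np.pi * freq / fs), r * r]
    signal = lfilter([1.0 - r], a, signal)

signal /= np.abs(signal).max()
wavfile.write("vowel_a.wav", fs, (signal * 32767).astype(np.int16))
```

Everything is a few multiplies and adds per sample with no model to load, which is where the speed and responsiveness come from.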
Newcomers to the field should be glad to read through this... there is gold in there. <3
I got my start in NLP back in '08, and later in '12, with an older version of this book. Recommended!
This one and Manning and Schütze's "Dice Book" (Foundations of Statistical Natural Language Processing) were what got me into computational linguistics, and eventually web development.
Controversial opinion (certainly the publisher would disagree with me): I would not take out older material, but arrange it by properties like explanatory power/transparency/interpretability, generative capacity, robustness, computational efficiency, and memory footprint. For each machine learning method, an example NLP model/application could be shown to demonstrate it.
Naive Bayes is way too useful to downgrade it to an appendix position.
It may also make sense to divide the book into timeless material (Part I: what's a morpheme? what's a word sense?) and (Part II) methods and datasets that change every decade.
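To illustrate the point about Naive Bayes above: a complete multinomial Naive Bayes text classifier, with add-one smoothing, fits in a few dozen lines of dependency-free Python. The toy data here is mine, not an example from the book.

```python
from collections import Counter, defaultdict
import math

def train(docs):                                    # docs: list of (tokens, label)
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)              # per-class word frequencies
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(tokens, class_counts, word_counts, vocab):
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)       # log prior
        for w in tokens:                            # log likelihood, add-one smoothing
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [("great fun film".split(), "pos"),
        ("boring slow film".split(), "neg"),
        ("fun and great".split(), "pos"),
        ("slow and boring".split(), "neg")]
model = train(docs)
print(predict("great film".split(), *model))        # -> pos
```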
This is the broadest introductory book for beginners and a must-read; like the ACL family of conferences, it is (nowadays) more of an NLP book (i.e., on engineering applications) than a computational linguistics book (i.e., on modeling/explaining how language-based communication works).