The Smol Training Playbook: The Secrets to Building World-Class LLMs
Posted 2 months ago · Active 2 months ago
huggingface.co · Tech · story
Key topics
LLM Training
AI Learning Resources
Hugging Face
The Smol Training Playbook is a hands-on resource for building world-class LLMs, sparking discussion on its content and the Hugging Face platform.
Snapshot generated from the HN discussion
Key moments
- 01 Story posted: Oct 30, 2025 at 12:52 PM EDT (2 months ago)
- 02 First comment: Nov 1, 2025 at 9:55 AM EDT (2d after posting)
- 03 Peak activity: 9 comments in 60-72h (hottest window of the conversation)
- 04 Latest activity: Nov 7, 2025 at 12:27 AM EST (2 months ago)
ID: 45762160 · Type: story · Last synced: 11/20/2025, 1:51:04 PM
One of the reasons people build one, though, is to learn. Most smart folks are well aware that pre-training a real LLM is going to involve some banging of heads against the wall (i.e., things don't go as smoothly as in a "building an LLM from scratch" book), and they want to go through the process.
Tumblr speak has a bunch of wacky things, notably "chimkin nuggers."
> Modify one thing at a time
> Change only one variable per ablation while keeping everything else constant. If you change multiple things and performance improves, you won’t know what caused it. Test modifications individually, then combine successful ones and reassess.
This is an unintentional microcosm of what is flawed with the document.
Or, more modern Bayesian methods if you're more interested in getting the best results for a given hyperparameter sweep.
However, that is not to detract from the excellent effort made here and the great science being investigated. Write-ups like this offer so much gold to the community.
And even then. If you’re an IC and your boss is saying, “incrementalism at the level of planning experiments,” and the goal is research, quit, because you will fail.
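For readers following the ablation debate, the quoted advice (change one variable per ablation, then combine the successful changes and reassess) can be sketched roughly as below. This is only an illustration under stated assumptions: `toy_loss` is a hypothetical stand-in for a real training run, the config keys and values are made up, and the interaction term is invented to show why the combined run has to be re-evaluated instead of summing the individual gains.

```python
from typing import Callable, Dict

Config = Dict[str, float]


def toy_loss(cfg: Config) -> float:
    """Hypothetical stand-in for a training run; lower is better."""
    loss = 3.0
    if cfg["lr"] == 3e-4:
        loss -= 0.20
    if cfg["warmup_steps"] == 2000:
        loss -= 0.10
    if cfg["weight_decay"] == 0.1:
        loss += 0.05
    # Invented interaction: warmup helps less once the learning rate is lowered.
    if cfg["lr"] == 3e-4 and cfg["warmup_steps"] == 2000:
        loss += 0.08
    return loss


def run_ablations(
    baseline: Config,
    candidates: Config,
    evaluate: Callable[[Config], float],
) -> Dict[str, float]:
    """Change exactly one key per run; return each change's loss delta vs. baseline."""
    base_score = evaluate(baseline)
    deltas = {}
    for name, value in candidates.items():
        trial = {**baseline, name: value}  # everything else held constant
        deltas[name] = evaluate(trial) - base_score
    return deltas


def combine_winners(baseline: Config, candidates: Config, deltas: Dict[str, float]) -> Config:
    """Apply every change that lowered the loss on its own."""
    combined = dict(baseline)
    for name, delta in deltas.items():
        if delta < 0:
            combined[name] = candidates[name]
    return combined


baseline = {"lr": 1e-3, "warmup_steps": 0, "weight_decay": 0.0}
candidates = {"lr": 3e-4, "warmup_steps": 2000, "weight_decay": 0.1}

deltas = run_ablations(baseline, candidates, toy_loss)
combined = combine_winners(baseline, candidates, deltas)
# Reassess: the combined gain need not equal the sum of the individual gains.
combined_delta = toy_loss(combined) - toy_loss(baseline)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
print(f"combined {combined}: {combined_delta:+.2f}")
```

In this toy setup the two winning changes interact, so the combined improvement is smaller than the individual deltas would suggest. One-at-a-time runs cannot see that kind of interaction, which is why the quoted text ends with "reassess."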