Ask HN: Advice for getting into post-training / fine-tuning of LLMs?
Those who follow LLM fine-tunes may know that a company called Nous Research has been releasing a series of fine-tuned models called Hermes, which seem to perform very well.
Since post-training is relatively cheap compared to pre-training, I also want to get into post-training and fine-tuning. I'm GPU poor, with only an M4 MBP and some Tinker credits, so I was wondering if you have any advice or recommendations for getting started. For instance, do you think this book https://www.manning.com/books/the-rlhf-book is a good place to start? If not, what else would you recommend?
I’m also currently reading “Hands-on LLM” and “Build a LLM from scratch” if that helps.
Many thanks for your time!