Writing an LLM from scratch, part 20 – starting training, and cross entropy loss | Not Hacker News!