1 comment

  • SiliconGen 3 hours ago

    I'm Harry, a former CV engineer who switched to LLMs. After months of wading through scattered blog posts, dense papers, and tutorials that skip the hard parts, I decided to build the course I wished existed.

      CookLLM is a hands-on LLM engineering course where you build everything from
      scratch — tokenizer (BPE in Rust), model architecture, GPU kernels
      (CUDA/Triton), Flash Attention, pretraining pipeline, and eventually SFT/RLHF.

      Currently ~40% complete. Topics already shipped: tokenization, RoPE, attention,
      Flash Attention (6 chapters), GPU programming, BentoLM architecture, and the
      full pretrain pipeline. Coming next: training parallelism, modern architectures
      (RMSNorm, SwiGLU, Muon optimizer), and post-training (SFT, DPO, GRPO).
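
    To give a flavor of the from-scratch approach: one BPE training step amounts to counting adjacent token-id pairs and replacing the most frequent pair with a new id. A minimal Rust sketch of that idea (my own illustration here, with a made-up `merge_step` name — not code from the course):

    ```rust
    use std::collections::HashMap;

    // One BPE merge step: count every adjacent pair of token ids,
    // then replace all occurrences of the most frequent pair with `new_id`.
    fn merge_step(tokens: &[u32], new_id: u32) -> Vec<u32> {
        let mut counts: HashMap<(u32, u32), usize> = HashMap::new();
        for pair in tokens.windows(2) {
            *counts.entry((pair[0], pair[1])).or_insert(0) += 1;
        }
        // Nothing to merge for sequences shorter than two tokens.
        let Some((&best, _)) = counts.iter().max_by_key(|(_, &c)| c) else {
            return tokens.to_vec();
        };
        let mut out = Vec::with_capacity(tokens.len());
        let mut i = 0;
        while i < tokens.len() {
            if i + 1 < tokens.len() && (tokens[i], tokens[i + 1]) == best {
                out.push(new_id); // collapse the pair into the new token
                i += 2;
            } else {
                out.push(tokens[i]);
                i += 1;
            }
        }
        out
    }

    fn main() {
        // Bytes of "aabab": the most frequent pair is (97, 98),
        // so both occurrences collapse into the new id 256.
        let tokens = vec![97, 97, 98, 97, 98];
        println!("{:?}", merge_step(&tokens, 256)); // [97, 256, 256]
    }
    ```

    A real tokenizer repeats this until the vocabulary reaches its target size and records the merge order for encoding; the course version presumably also handles byte-level pre-tokenization and tie-breaking.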