Blog
Thoughts on software development, AI, and technology. Sharing insights from building real-world applications.

GPT-2 from scratch
The process of creating GPT-2 from scratch and pre-training with DDP.
PyTorch, Attention, Transformers, DDP
Feb 19, 2026

Instruction Fine-tuning
Fine-tuning the pre-trained model to follow instructions and emotion.
SFT, Fine-tuning, Instruction Fine-tuning, Llama
Feb 5, 2026

RoPE: Rotary Position Embedding
From first principles, the small mathematical change that unlocked long-context reasoning
RoPE, Transformers, LLM
Jan 22, 2026