GPT-4 from Scratch
Andrej Karpathy
OpenAI, USA
Abstract
I will cover the full training pipeline of GPT-4 in depth, drawing on the nearest-neighbor publicly available materials. This includes datasets, tokenization, the Transformer architecture, pretraining, supervised finetuning on conversational data, RLHF (reward modeling and proximal policy optimization), and related topics. I will also cover the multimodal extensions that allow GPT-4 to perceive images.