In this post, we’ll implement a GPT from scratch in just 60 lines of numpy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.
A searchable collection of notes
In this post, we’ll implement a GPT from scratch in just 60 lines of numpy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.