2. The Transformer

Building on the previous lab, today's lab is about seeing distributed training in action. We'll run it on a GPU and train a model to tell tiny stories. Please follow the Jupyter lab.