2. Distributed Training

Building on the previous example, we will now train a model on multiple GPUs to tell tiny stories. Please follow the notebook in JupyterLab.

The task is to work through the notebook and then, once you reach the end, adjust the different parameters. At the end of the notebook, write a short report on how these changes affect the performance of the model.
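
One possible way to keep the report organized is to record each run's parameter settings next to the results you observed and print them as a small table. The sketch below is only an illustration: `RunRecord`, `format_report`, and any parameter or metric names you pass in (for example `num_gpus` or `final_loss`) are hypothetical and do not come from the notebook, so use whatever names the notebook actually exposes.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One training run: the parameters you changed and the results you observed."""
    params: dict   # hypothetical keys, e.g. {"num_gpus": 2, "batch_size": 32}
    metrics: dict  # hypothetical keys, e.g. {"final_loss": ..., "train_time_s": ...}


def _ordered_keys(dicts):
    """Collect keys from several dicts, keeping first-seen order and no duplicates."""
    keys = []
    for d in dicts:
        for k in d:
            if k not in keys:
                keys.append(k)
    return keys


def format_report(runs):
    """Render the recorded runs as a plain-text table, one row per run."""
    if not runs:
        return "(no runs recorded)"
    # Parameter columns come first, then metric columns.
    columns = _ordered_keys(r.params for r in runs) + _ordered_keys(r.metrics for r in runs)
    # A value missing from a run is shown as "-".
    rows = [[str({**r.params, **r.metrics}.get(c, "-")) for c in columns] for r in runs]
    # Each column is as wide as its longest cell or its header.
    widths = [max(len(c), *(len(row[i]) for row in rows)) for i, c in enumerate(columns)]
    header = "  ".join(c.ljust(w) for c, w in zip(columns, widths))
    separator = "  ".join("-" * w for w in widths)
    body = ["  ".join(v.ljust(w) for v, w in zip(row, widths)) for row in rows]
    return "\n".join([header, separator, *body])
```

After each run, append a `RunRecord` with the values you actually measured to a list and call `print(format_report(runs))` in the final cell. Changing one parameter at a time between runs makes it easier to attribute a difference in performance to that parameter.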