Frugality meets Accuracy: Cost-efficient training of GPT NeoX and Pythia models with AWS Trainium
AWS Machine Learning Blog
DECEMBER 12, 2023
Next, we also evaluate the loss trajectory of the model training on AWS Trainium and compare it with the corresponding run on a P4d (Nvidia A100 GPU cores) cluster. NeoX 20B is trained on 4 nodes with small wiki dataset on GPU and Trainium with same training hyper-parameters (global batch size=256). tokens/$ spent. tokens/$ spent.
Let's personalize your content