JMLR, 2023 (Journal Article)
Scaling Up Models and Data with t5x and seqio
- Adam Roberts
- Hyung Won Chung
- Gaurav Mishra
- Anselm Levskaya
- James Bradbury
- Daniel Andor
- Sharan Narang
- Brian Lester
Scaling up training datasets and model parameters has benefited neural network-based language models, but it also presents challenges such as distributed computation, input data bottlenecks, and reproducibility of results. We introduce two scalable software libraries that address these issues: t5x enables training large language models at scale, while seqio enables reproducible input and evaluation pipelines. These open-source libraries have been used to train models with hundreds of billions of parameters on multi-terabyte datasets. Configurations and instructions for T5-like and GPT-like models are also provided. The libraries can be found at https://github.com/google-research/t5x and https://github.com/google/seqio.
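To give a feel for the task-based pipelines the abstract refers to, the sketch below registers a small seqio Task and loads a deterministic, tokenized dataset from it. The task name, vocabulary path, TFDS dataset string, and sequence lengths are illustrative placeholders, not configurations from the paper; consult the seqio repository for the definitive API.

```python
import functools

import seqio

# Placeholder SentencePiece vocabulary; replace with a real model file.
VOCAB = seqio.SentencePieceVocabulary("gs://my-bucket/spm/sentencepiece.model")

# Register a hypothetical translation task backed by a TFDS dataset.
seqio.TaskRegistry.add(
    "example_translation_task",
    source=seqio.TfdsDataSource(tfds_name="wmt_t2t_translate/de-en:1.0.0"),
    preprocessors=[
        # Map raw TFDS fields onto the "inputs"/"targets" convention.
        functools.partial(
            seqio.preprocessors.rekey,
            key_map={"inputs": "en", "targets": "de"},
        ),
        seqio.preprocessors.tokenize,
        seqio.preprocessors.append_eos,
    ],
    output_features={
        "inputs": seqio.Feature(vocabulary=VOCAB),
        "targets": seqio.Feature(vocabulary=VOCAB),
    },
)

# Build a reproducible training dataset from the registered task.
dataset = seqio.get_mixture_or_task("example_translation_task").get_dataset(
    sequence_length={"inputs": 256, "targets": 256},
    split="train",
    shuffle=True,
    seed=42,
)
```

In a typical setup, a task registered this way is referenced by name from a t5x training configuration, which handles model definition, partitioning, and checkpointing separately.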