The Pile: An 800GB Dataset of Diverse Text for Language Modeling - https://news.ycombinator.com/item?id=36272365 - June 2023 (5 comments)
The Pile: An 800GB Dataset of Diverse Text for Language Modeling - https://news.ycombinator.com/item?id=25607809 - Jan 2021 (60 comments)
The Pile: An 800GB Dataset of Diverse Text for Language Modeling - https://news.ycombinator.com/item?id=36272365 - June 2023 (5 comments)
The Pile: An 800GB Dataset of Diverse Text for Language Modeling - https://news.ycombinator.com/item?id=25607809 - Jan 2021 (60 comments)