Earlier reports suggested they trained it on books from Bibliotik.
What changed?
Probably just both, honestly.
In for a penny, in for a pound.
The llama-1 paper acknowledged the use of the books dataset. Libgen isn’t mentioned in any of the papers, so this is new info.