Introducing the KL3M Data Project: a comprehensive collection of more than 132 million documents that serve as legally sound training resources for large language models.