The KL3M Dataset, previously known as the Kelvin Legal Datapack, is a collection of individual datasets that have been legally and ethically sourced, cleaned, and enriched for use in AI research and development.
The datasets are available on Hugging Face.
Raw and tokenized data is also available on S3 under a requester-pays model. Please contact us to learn more about this option.
We have released a tool that allows users to easily explore and understand the contents of KL3M datasets at gallery.kl3m.ai.
Read more about this tool in our blog post here.