Open Data

Collecting, enriching, and open-sourcing data to support the legal and ethical development and use of AI systems.

Our Data

We believe that truly open data is essential for the development of AI systems that create positive social, economic, and environmental impacts.

That's why we are committed to collecting, enriching, and open-sourcing data that supports the legal and ethical development and use of AI systems.

In addition to open-sourcing these datasets, we provide educational resources and tools to help individuals and organizations understand how to collect and use their own data in ways that promote legal, ethical, and sustainable AI outcomes.

Our Datasets

KL3M Datasets

The KL3M Dataset, previously known as the Kelvin Legal Datapack, is a collection of individual datasets that have been legally and ethically sourced, cleaned, and enriched for use in AI research and development.

Available on Hugging Face

Raw and tokenized data are also available on Amazon S3 under a requester-pays model, in which the requester's AWS account is billed for request and data-transfer costs. Please contact us to learn more about this option.
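For readers unfamiliar with requester-pays buckets, the following is a minimal sketch of how such a download might look in Python with the third-party boto3 library. The helper names are illustrative, and no bucket name or object key is given here because the actual S3 locations are only shared on request; any values you pass in are your own. The one detail that matters is the RequestPayer="requester" argument, which tells S3 that the caller accepts the charges.

```python
def requester_pays_args(bucket: str, key: str) -> dict:
    """Build GetObject arguments that acknowledge requester-pays charges.

    Illustrative helper, not part of any KL3M tooling.
    """
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def fetch_object(bucket: str, key: str) -> bytes:
    """Download one object; the caller's AWS account pays for the transfer."""
    # boto3 is a third-party package and needs AWS credentials configured.
    # It is imported here so the helper above works without it installed.
    import boto3

    s3 = boto3.client("s3")
    response = s3.get_object(**requester_pays_args(bucket, key))
    return response["Body"].read()
```

Calling fetch_object with a bucket and key supplied by us would return the object's bytes. Requests to a requester-pays bucket that omit the RequestPayer argument are rejected by S3.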

Tools and Educational Resources

KL3M Data Gallery


We have released a tool that allows users to easily explore and understand the contents of KL3M datasets at gallery.kl3m.ai.

Read more about this tool in our blog post.