Welcome to the Material Contracts Corpus

This website provides free access to the Material Contracts Corpus, a dataset compiled by Peter Adelson and Prof. Julian Nyarko. The corpus includes all exhibits filed as “material contracts” with the U.S. Securities and Exchange Commission (SEC) by registrants from 2000 to 2023. Contracts were obtained through the publicly available SEC EDGAR database.

What the Corpus Contains

  • Contracts: All material agreements filed as SEC exhibits from 2000 to 2023. In total, the corpus contains 1,038,766 agreements.
  • Metadata: Accompanying each contract is metadata such as party names, contract types, and filing dates. This metadata is largely generated using machine-learning classifiers and may contain inaccuracies.

How to Use the Dataset

  • Search and Read Online: Use our search tools to explore contracts and their metadata directly on this website.
  • Download in Bulk: The entire dataset can be downloaded for offline use. Downloads of selected portions are coming soon.

Citation & Documentation

The following preprint describes the corpus in more detail and should be cited if the corpus is used.

Adelson, Peter, and Nyarko, Julian. The Material Contracts Corpus. arXiv preprint arXiv:2504.02864, 2025.
https://arxiv.org/abs/2504.02864

Additional Information

This resource is provided for research purposes and is freely accessible to the public. More details about the dataset, including search functionality and limitations of the metadata, are available on the Search page.
