Data Downloads

Download the Full Database

Users who are comfortable with SQL (Structured Query Language) can export the full relational database that powers this website. A data dictionary and a diagram of our relational schema are available on GitHub.

Export Event Data

All 52,000+ event records are available to download and analyze in several file formats. You can open these files in a text editor, such as Notepad or TextEdit, or use more specialized software to analyze and visualize the results that interest you.

Detailed event records

In creating these files, we tried to capture the most commonly requested information about LSDB events, including performance titles, cast lists, and connections to related dramatic works. These files represent the minimum reduction in complexity needed to present event records in two widely used, human-readable formats: JSON (JavaScript Object Notation), a data-interchange format that stores data objects as text, and XML (eXtensible Markup Language), a hierarchical tagging language similar to HTML.
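As a starting point for exploring the JSON export, the following Python sketch counts how often each top-level field appears across the records. It does not assume any particular field names: the `id` and `title` keys in the stand-in data are hypothetical, as is the assumption that the file holds either a list of records or a single object. Check the data dictionary for the real structure.

```python
import json
from collections import Counter


def summarize_keys(records):
    """Count how often each top-level key appears across a list of records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


def load_records(path):
    """Load a JSON file and return its contents as a list of records."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: the file is either a list of records or a single object.
    return data if isinstance(data, list) else [data]


# Stand-in data with hypothetical field names, used only to show the helper.
sample = json.loads('[{"id": 1, "title": "A"}, {"id": 2}]')
print(summarize_keys(sample))
```

To run this against a downloaded file, pass its path to `load_records` and hand the result to `summarize_keys`. Fields that appear in only some records are a useful hint about which information is optional.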

Simplified event records

CSV (Comma-Separated Values) files are easy to work with in spreadsheet software like Google Sheets or Microsoft Excel. Note the trade-off: this format further simplifies the data to present complex, relational information in a two-dimensional table.
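If you would rather read the CSV file in code than in a spreadsheet, a minimal Python sketch using only the standard library might look like this. The column names `event_id` and `title` below are hypothetical placeholders, not the actual headers of the export.

```python
import csv
import io


def read_table(handle):
    """Return (column_names, rows) from an open CSV file handle."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    # fieldnames is None for an empty file, so fall back to an empty list.
    return reader.fieldnames or [], rows


# Stand-in text with hypothetical columns, used only to show the helper.
sample = io.StringIO("event_id,title\n1,Example A\n2,Example B\n")
columns, rows = read_table(sample)
print(columns, len(rows))
```

For a downloaded file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `read_table`. Each row comes back as a dictionary keyed by column name, which reflects the flat, two-dimensional shape of this format.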

Drama Corpus

The Drama Corpus is a subset of the TCP corpus produced by the Text Creation Partnership. Each of the 935 items in our Drama Corpus is associated with one or more performances in the LSDB dataset, meaning it is listed as a "Print Witness" for one or more "Related Works."

Read more about Related Works and Print Witnesses on our blog.

We have not modified the XML files; they are exact copies of those distributed through the TCP's GitHub. We have simply made the files easier to work with as a coherent dataset in the following ways:

Download the Drama Corpus
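Because the corpus files are unmodified XML, any standard XML parser can read them. The Python sketch below tallies the element names in a document as a first look at its structure. The element names in the stand-in sample (`play`, `act`, `speech`) are hypothetical and are not taken from the TCP files. The helper strips a namespace prefix if one is present, since whether a given file declares one is an assumption to verify.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def count_tags(root):
    """Count how often each element name occurs at or beneath root."""
    return Counter(local_name(element.tag) for element in root.iter())


# Stand-in document with hypothetical element names, used only for illustration.
sample = ET.fromstring("<play><act><speech/><speech/></act></play>")
print(count_tags(sample).most_common())
```

For a file from the corpus, replace the stand-in with `ET.parse(path).getroot()` and pass the result to `count_tags`.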