Harvard LibraryCloud and PRESTO APIs

Harvard LibraryCloud is a service that provides a rich set of Harvard Library metadata, including:

  • Complete bibliographic information for the almost thirteen million items in the Harvard Library Bibliographic Dataset
  • Metadata for approximately four million items in Harvard’s image collection
  • Metadata for approximately two million items in the Harvard Archives, including links to several hundred thousand that have been scanned and are available online.
  • Stackscore, a metric that gauges the “community relevance” of items based on usage and other factors.




LibraryCloud is a metadata hub that makes Harvard Library bibliographic metadata openly available for re-use through an open API. With LibraryCloud, anyone at Harvard and beyond can use Harvard Library’s data to fuel an app or integrate with a website.

The metadata that Harvard LibraryCloud makes available is offered under open licenses, as is the source code for LibraryCloud itself.

LibraryCloud was developed with the generous support of the Harvard Library Lab, the Harvard Library Office for Scholarly Communication, the Berkman Center for Internet & Society, and the Arcadia Fund. The initial version of LibraryCloud (LilCloud) was created by the Harvard Library Innovation Lab as an experiment in combining metadata from multiple libraries. The production version was built and is maintained by Harvard Library Technology Services.


Inside LibraryCloud

LibraryCloud consists of four sets of services connected by automated workflows.

  1. Ingest: Connect to data sources and pull them in. This can be triggered by a particular event (e.g., a source update) or on a fixed schedule.

  2. Normalize: Once the data is ingested, it is converted into one of the standard formats LibraryCloud uses. For example, bibliographic data is represented internally using the MODS standard, with extensions. Work is in progress to enable the original source record (e.g., MARC XML) to be carried through the workflow as well.

  3. Enrich: LibraryCloud enhances its data. LibraryCloud currently enriches records from the main catalog with holdings information and stackscore, a measure of how often the item is used by Harvard library patrons.

  4. Discover: LibraryCloud data is accessed through an Item API and a Collections API. Metadata is available in XML or JSON format, using the MODS or Dublin Core schemas. The original source record schema is also expected to become available, projected for winter 2015.
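
The four stages above can be pictured as a small pipeline. The following is only a minimal sketch, not LibraryCloud's actual implementation: the function names, record fields, and sample data are all hypothetical, and a plain dictionary stands in for the MODS representation that LibraryCloud uses internally.

```python
import json


def ingest(source):
    """Pull raw records from a data source (here, an in-memory list)."""
    return list(source)


def normalize(raw):
    """Map a source-specific record onto one common internal shape.

    Hypothetical stand-in for conversion to MODS.
    """
    return {"id": raw["record_id"], "title": raw["title"].strip()}


def enrich(record, holdings, stackscores):
    """Attach holdings and a stackscore to a normalized record."""
    enriched = dict(record)
    enriched["holdings"] = holdings.get(record["id"], [])
    enriched["stackscore"] = stackscores.get(record["id"])
    return enriched


def discover(records, title_contains):
    """Return matching records serialized as JSON, as an API might."""
    needle = title_contains.lower()
    hits = [r for r in records if needle in r["title"].lower()]
    return json.dumps(hits, sort_keys=True)


def run_pipeline(source, holdings, stackscores):
    """Run ingest, normalize, and enrich in order."""
    return [enrich(normalize(raw), holdings, stackscores)
            for raw in ingest(source)]


# Made-up sample data, for illustration only.
SOURCE = [
    {"record_id": "a1", "title": " Moby Dick "},
    {"record_id": "b2", "title": "Walden"},
]
HOLDINGS = {"a1": ["Library A"]}
STACKSCORES = {"a1": 12}

records = run_pipeline(SOURCE, HOLDINGS, STACKSCORES)
print(discover(records, "moby"))
```

Keeping each stage as a separate function mirrors the description above: a new data source only needs its own ingest and normalize steps, while enrichment and discovery operate on the common record shape.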

LibraryCommons Logical Architecture



More Information



  • Production Systems

 LibraryCloud is being used to support metadata migration for Harvard's Digital Repository Service (DRS). The Item API is used to retrieve a MODS bibliographic record for each migrated DRS object in order to populate descriptive metadata in the DRS. LibraryCloud is also being integrated with the CURIOSity digital collection service to provide metadata for online exhibits and digital collections.

  • Innovation examples
    • Marc Duby's native Android 4.2 app lets you search by title, scroll through results, and display details. This is a work in progress, expected to be ready in very early 2015. Available on GitHub.
    • David Weinberger is a hobbyist programmer. He promises you that his code is embarrassing. Nevertheless: the BoogyWoogy Browser is a colorful browser of the Harvard collection, intended to encourage serendipity.
    • David's similarly hobbyist wordWalker displays the text from an item's record, lets the user select up to ten words, and then searches the Library for the works that have those words the most often in their records.
    • mediaCount (also from David) draws pie charts showing the types of media found in up to ten subject searches.
    • Hank Sway's which-harvard-library is another example of a small app in which you can enter a subject and find the library that has the most holdings for that subject.



PRESTO APIs query the Harvard Library catalogs, HOLLIS and VIA, directly and return original source metadata records (in MARC, MODS, DC, or VIA format), HOLLIS+ relevance ranking, and business information such as holdings and availability.

The PRESTO Data Lookup service provides a RESTful web API that can return specific types of library data in response to a URL request. Data Lookup options currently include: