Lifecycle Management Tools

Revision as of 20:08, 25 January 2012 by Mgschultz


This is a placeholder workspace for draft and final documents related to this project deliverable.


About the Lifecycle Management Tools

The project will develop and disseminate a set of software tools to address specific needs in managing ETDs throughout their lifecycle. These tools will be created as completely modular micro-services, i.e., single-function standalone services that can be used alone or incorporated into larger repository systems. Micro-services for digital curation functions are a new approach to system integration pioneered by the California Digital Library and the Library of Congress, and subsequently adopted by the University of North Texas, Chronopolis, MetaArchive, and other digital preservation repositories.

The micro-services listed below will be relatively simple to construct, because each one primarily wraps calls to existing open source software tools.

  • ETD Format Recognition
  • PREMIS Metadata Event Record-keeping
  • Virus Checking
  • Digital Drop Box with Metadata Submission Functionality

Functional Requirements for ETD Micro-Services


Each of the Lifecycle Management Tools will be designed as a standalone micro-service that can be called via command line or script interfaces, so that it can be integrated into existing environments in a modular way. Each micro-service will have clear documentation that will enable implementers to deploy the tool in their own setting. The intent of creating these four micro-services is to enhance the existing repository systems used for ETDs, which often lack simple mechanisms for these functions.

The micro-service packages produced in the course of this project will include the following tools:

ETD Format Recognition Service

Accurate identification of ETD component format types is an important step in the ingestion process, especially as ETDs become more complex. This micro-service will:

  1. enable batch identification of ETD files through integration of function calls from the JHOVE2 and DROID format identification toolkits; and
  2. structure micro-service output in ad hoc tabular formats for importation into repository systems used for ETDs, such as DSpace and the ETD-db software, as well as preservation repository software such as iRODS and DAITSS and preservation network software such as LOCKSS.
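
The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the hypothetical `identify` function uses Python's standard `mimetypes` module as a stand-in for calls to JHOVE2 or DROID (whose invocation is not specified here), and the CSV column names are likewise assumptions.

```python
import csv
import io
import mimetypes
from pathlib import Path


def identify(path):
    """Stand-in identifier: guess a MIME type from the file extension.

    Assumption: a real deployment would call JHOVE2 or DROID here instead.
    """
    mime, _encoding = mimetypes.guess_type(str(path))
    return mime or "application/octet-stream"


def identify_batch(root):
    """Return one (relative path, size in bytes, format) row per file under root."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            rows.append((relative, path.stat().st_size, identify(path)))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["path", "size_bytes", "format"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping identification (`identify`) separate from traversal and serialization means the stand-in can be swapped for a real toolkit call without changing the tabular output.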

Components & Basic Requirements:

  • JHOVE2
  • XML output schema
  • Utility scripts (run commands, output parsers, etc.) & code libraries
  • API function calls
  • System requirements
  • Documentation & instructions

PREMIS Metadata Event Record-Keeping Service

One gap highlighted in the needs analysis was the lack of simple PREMIS metadata and event record keeping tools for ETDs. This micro-service needs to:

  1. generate PREMIS Event semantic units to track a set of transitions in the lifecycle of particular ETDs using parameter calls to the micro-service; and
  2. provide profile conformance options and documentation on how to use the metadata in different ETD repository systems.
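
As an illustration of the first requirement, the sketch below builds a single PREMIS Event from a handful of parameters. It is a hedged example rather than the service's actual design: it assumes the PREMIS 2.x XML namespace, and the function name `build_event`, the identifier types (`UUID`, `local`), and the sample values are hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumption: events are serialized against the PREMIS 2.x namespace.
PREMIS_NS = "info:lc/xmlschema/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def _child(parent, name, text=None):
    """Append a namespaced PREMIS child element, optionally with text."""
    element = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
    if text is not None:
        element.text = text
    return element


def build_event(event_type, object_id, outcome, when=None, event_id=None):
    """Build a PREMIS event element describing one lifecycle transition."""
    when = when or datetime.now(timezone.utc)
    event = ET.Element(f"{{{PREMIS_NS}}}event")

    identifier = _child(event, "eventIdentifier")
    _child(identifier, "eventIdentifierType", "UUID")
    _child(identifier, "eventIdentifierValue", event_id or str(uuid.uuid4()))

    _child(event, "eventType", event_type)
    _child(event, "eventDateTime", when.isoformat())

    outcome_info = _child(event, "eventOutcomeInformation")
    _child(outcome_info, "eventOutcome", outcome)

    # Link the event to the ETD object it describes.
    linked = _child(event, "linkingObjectIdentifier")
    _child(linked, "linkingObjectIdentifierType", "local")
    _child(linked, "linkingObjectIdentifierValue", object_id)
    return event


if __name__ == "__main__":
    # Hypothetical values: a passed virus check on object "etd-0001".
    sample = build_event("virus check", "etd-0001", "pass")
    print(ET.tostring(sample, encoding="unicode"))
```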

Components & Basic Requirements:

  • PREMIS Event profiles (example records) for ETDs
  • Event-type identifier schemes and authority control
  • AtomPub service document & feed elements
  • Utility scripts (modules) & code libraries
  • API function calls
  • Simple database schema & config
  • System requirements
  • Documentation

Virus Checking Service

Virus checking is an obvious service needed in ETD programs, as students’ work is often infected unintentionally with computer viruses. This micro-service will:

  1. provide the capability to check ETD component files using ClamAV, the open source virus checking software widely used on email gateways;
  2. record results of scans using the PREMIS metadata event tracking service; and
  3. be designed such that other anti-virus tools can be called through it in place of ClamAV.
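
A minimal sketch of such a wrapper is shown below, assuming ClamAV's `clamscan` command-line scanner is installed and on the `PATH`. The function name `scan_file` and the result fields are hypothetical; the exit-code mapping follows `clamscan`'s documented return codes (0 for no virus found, 1 for virus found, 2 for an error). Passing a different `command` and `outcomes` mapping is one possible way to plug in another anti-virus tool.

```python
import subprocess

# clamscan return codes: 0 = no virus found, 1 = virus(es) found, 2 = error.
CLAMSCAN_OUTCOMES = {0: "clean", 1: "infected"}


def scan_file(path, command=("clamscan", "--no-summary"), outcomes=CLAMSCAN_OUTCOMES):
    """Run a command-line scanner on one file and summarize the result.

    `command` is the scanner invocation without the file argument, and
    `outcomes` maps its exit codes to labels; any other code is an "error".
    """
    try:
        completed = subprocess.run(
            [*command, str(path)],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit code is a result here, not a failure
        )
    except FileNotFoundError as exc:
        # The scanner executable itself is missing.
        return {"file": str(path), "scanner": command[0],
                "outcome": "error", "detail": str(exc)}

    return {
        "file": str(path),
        "scanner": command[0],
        "outcome": outcomes.get(completed.returncode, "error"),
        "detail": completed.stdout.strip() or completed.stderr.strip(),
    }
```

The returned dictionary holds the information that a PREMIS event for the scan would need: the file checked, the tool used, and the outcome.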

Components & Basic Requirements:

  • ClamAV
  • Utility scripts (run commands, output parser, etc.) & code libraries
  • API function calls
  • System requirements
  • Documentation & instructions

Digital Drop Box (with metadata submission functionality)

This micro-service addresses a frequently sought function: a simple capability for users to deposit ETDs into a remote location via a web form that gathers the submission information required by the ETD program. Using the submission information, the micro-service will:

  1. generate PREMIS metadata for the ETD files deposited;
  2. have the capacity to replicate the deposited content securely upon ingest into additional locations by calling other Unix tools such as rsync; and
  3. record this replication in the PREMIS metadata.
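
The replication and record-keeping steps could look roughly like the sketch below. It is an illustration under stated assumptions: replication runs `rsync` in archive mode over SSH, the function names and record fields are hypothetical, and the record is a plain dictionary standing in for a full PREMIS event.

```python
import shlex
import subprocess
from datetime import datetime, timezone


def build_rsync_command(source_dir, destination):
    """Build an rsync invocation that copies a deposit directory over SSH.

    -a recurses and preserves permissions and timestamps; -e ssh selects SSH
    as the transport. The trailing slash copies the directory's contents.
    """
    return ["rsync", "-a", "-e", "ssh", f"{source_dir.rstrip('/')}/", destination]


def replicate(source_dir, destination, runner=subprocess.run):
    """Replicate a deposit and return a record of the replication event.

    `runner` defaults to subprocess.run; it is a parameter so the transfer
    can be replaced in tests or dry runs.
    """
    command = build_rsync_command(source_dir, destination)
    completed = runner(command, capture_output=True, text=True, check=False)
    return {
        "eventType": "replication",
        "eventDateTime": datetime.now(timezone.utc).isoformat(),
        "eventDetail": shlex.join(command),
        "eventOutcome": "success" if completed.returncode == 0 else "failure",
    }
```

For example, `replicate("/srv/dropbox/etd-0001", "backup.example.org:/archive/etd-0001/")` (both locations hypothetical) would attempt the copy and return a record that can then be written into the PREMIS metadata for the deposit.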

Components & Basic Requirements:

  • Metadata submission profile(s)
  • Client/server architecture
  • Graphical user interface (GUI)
  • SSL, authentication support
  • Versioning support
  • Various executables, scripts & code libraries
  • Database schema & config
  • System requirements
  • Documentation