Difference between revisions of "Lifecycle Management Tools"

(Digital Drop Box (with metadata submission functionality))
 
(6 intermediate revisions by one user not shown)
Line 1: Line 1:
This is a placeholder workspace for draft and final documents related to this project deliverable.
+
=About the Lifecycle Management Tools=
 +
The Lifecycle Management of ETDs project, funded by the Institute of Museum and Library Services (IMLS), is pleased to make available to ETD programs a series of Lifecycle Management Tools. The tools cover four major areas of lifecycle curation for ETDs:
  
=About the Lifecycle Management Tools (Updated 04/15/13)=
+
* '''Virus Checking:''' Submitted ETDs may contain viruses that could damage the entire collection if not screened in advance. We provide instructions for using ClamAV.
The project has aimed to develop and disseminate a set of software tools to address specific needs in managing ETDs throughout their lifecycle. These tools are intended to be completely modular micro-services, i.e., single-function standalone services that can be used alone or incorporated into larger repository systems. Micro-services for digital curation functions are a new approach to system integration pioneered by the California Digital Library and the Library of Congress, and subsequently adopted by the University of North Texas, Chronopolis, MetaArchive, and other digital preservation repositories.
+
  
The micro-services listed below are expected to be relatively simple to construct, because each one primarily calls existing open source software tools.
+
* '''File Format Identification:''' Knowledge of the formats used in an ETD can help a program determine whether an ETD submission (particularly supplemental files) adheres to program requirements and what software will be necessary to access the data in the future. We provide instructions for using DROID, FITS, JHOVE2 & the Unix file command.
  
*ETD Format Recognition
+
* '''Preservation Metadata:''' Actions taken during curation should be recorded in order to track their success and failure. We provide instructions for using the new PREMIS Event Service.
*PREMIS Metadata Event Record-keeping
+
*Virus Checking
+
*Digital Drop Box with Metadata Submission Functionality
+
  
Over the course of the research for this project it was determined that two of the above micro-services (ETD Format Recognition & Virus Checking) needed no additional scripting to integrate with other repository systems in use, and would work best as standalone utilities coupled with project-produced documentation on using them in ETD programs. See below for further details.
+
* '''ETD Submission:''' Submission systems can range from simple to complex, but at their most basic should allow for simple upload, collect standardized metadata, and facilitate ongoing ETD workflows into an institutional repository. We provide instructions for using the new ETD Drop application.
  
The PREMIS Metadata Event Record-keeping micro-service is undergoing improvements to its API at UNT and being packaged and documented for deployment as a standalone micro-service.
+
=Get the Tools=
 +
The Lifecycle Management Tools are open source and freely available. Use the link below to request our usage manual, which includes well-described use cases and instructions for downloading and using the Tools. Thank you!
  
The Digital Drop Box with Metadata Submission Functionality is being researched in the context of two existing open-source applications (Archivematica & DataStage).
+
[[Request_Tools | Request to receive the tools]]
 
+
=Functional Requirements for ETD Micro-Services=
+
 
+
==Overview==
+
Each of the Lifecycle Management Tools aims to serve as a standalone micro-service that can be called via command line or script interfaces in order to ensure that the systems can be easily integrated in existing environments in a modular way. Each micro-service will have clear documentation that will enable implementers to deploy the tool in their own setting. The intent of researching, developing and documenting these four micro-services is that they will catalytically enhance existing repository systems being used for ETDs, which often lack simple mechanisms for these functions.
+
 
+
The micro-service packages produced in the course of this project will include the following tools:
+
 
+
===ETD Format Recognition Service===
+
Accurate identification of ETD component format types is an important step in the ingestion process, especially as ETDs become more complex. This micro-service should:
+
#enable batch identification of ETD files through integration of function calls from tools like JHOVE2, DROID, and other format identification toolkits; and
+
#structure micro-service output in ad hoc tabular formats for importation into repository systems used for ETDs, such as DSpace and the ETD-db software, as well as preservation repository software such as iRODS and DAITSS and preservation network software such as LOCKSS.
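The two requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's tooling: it uses the Unix file command as the identifier (DROID, FITS, or JHOVE2 could be swapped in through the `identify` parameter), and the function names and the CSV column names are hypothetical.

```python
import csv
import io
import subprocess


def identify_with_file(path):
    """Return the MIME type reported by the Unix `file` command."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def identify_batch(paths, identify=identify_with_file):
    """Identify every file and return (filename, format) rows.

    `identify` is any callable that maps a path to a format string,
    so another identification tool can replace the `file` command.
    """
    return [(str(path), identify(path)) for path in paths]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", "mime_type"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(identify_batch(paths))` on the component files of a submission would yield one tabular record per file, which a repository import script could then map onto its own fields.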
+
 
+
The project has successfully researched all of the primary format recognition utilities in current use, including JHOVE/2, DROID, FITS and even the UNIX file command. Based on our research of these tools, a thorough analysis of integrations with existing repository systems, and interviews with numerous and diverse ETD programs, project work has shifted to documenting the proper usage of these tools in an ETD Program context.
+
 
+
See [[Format Recognition Tools Documentation for ETDs]].
+
 
+
===PREMIS Metadata Event Record-Keeping Service===
+
One gap highlighted in the needs analysis was the lack of simple PREMIS metadata and event record keeping tools for ETDs. This micro-service needs to:
+
#generate PREMIS Event semantic units to track a set of transitions in the lifecycle of particular ETDs using parameter calls to the micro-service; and
+
#provide profile conformance options and documentation on how to use the metadata in different ETD repository systems.
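As a rough illustration of the first requirement, the sketch below builds a single PREMIS Event element with Python's standard library. It is not the API of the PREMIS Event Service itself; it assumes the PREMIS 2.x XML namespace, and the function name, the default identifier types, and the example values are hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumption: PREMIS 2.x namespace; a different profile may require another.
PREMIS_NS = "info:lc/xmlns/premis-v2"


def build_premis_event(event_type, outcome, object_id,
                       object_id_type="filename", when=None, event_id=None):
    """Build a PREMIS <event> element for one lifecycle transition."""
    ET.register_namespace("premis", PREMIS_NS)

    def q(tag):
        # Qualify a tag name with the PREMIS namespace.
        return f"{{{PREMIS_NS}}}{tag}"

    event = ET.Element(q("event"))
    identifier = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(identifier, q("eventIdentifierType")).text = "UUID"
    ET.SubElement(identifier, q("eventIdentifierValue")).text = (
        event_id or str(uuid.uuid4())
    )
    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = (
        when or datetime.now(timezone.utc)
    ).isoformat()
    outcome_info = ET.SubElement(event, q("eventOutcomeInformation"))
    ET.SubElement(outcome_info, q("eventOutcome")).text = outcome
    link = ET.SubElement(event, q("linkingObjectIdentifier"))
    ET.SubElement(link, q("linkingObjectIdentifierType")).text = object_id_type
    ET.SubElement(link, q("linkingObjectIdentifierValue")).text = object_id
    return event
```

For example, `ET.tostring(build_premis_event("ingestion", "pass", "thesis.pdf"), encoding="unicode")` returns the event as an XML string that could be stored alongside the ETD or passed to an event-tracking service.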
+
 
+
See [[PREMIS Event Service Documentation]]
+
 
+
===Virus Checking Service===
+
Virus checking is an obvious service needed in ETD programs, as students’ work is often infected unintentionally with computer viruses. This micro-service should:
+
#provide the capability to check ETD component files using the ClamAV open source email gateway virus checking software;
+
#record results of scans using the PREMIS metadata event tracking service; and
+
#be designed such that other anti-virus tools can be called with it.
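The three requirements above might be sketched as follows. This is an assumption-laden illustration rather than project code: it relies on the documented `clamscan` exit codes (0 for no virus found, 1 for a virus found, 2 for an error), and the function names and the shape of the result records are hypothetical.

```python
import subprocess

# clamscan exit codes: 0 = no virus found, 1 = virus found, anything else = error.
CLAMSCAN_OUTCOMES = {0: "pass", 1: "fail"}


def outcome_from_clamscan_exit(code):
    """Map a clamscan exit code to a pass/fail/error outcome string."""
    return CLAMSCAN_OUTCOMES.get(code, "error")


def scan_with_clamscan(path):
    """Scan one file with ClamAV's clamscan and return its outcome."""
    completed = subprocess.run(
        ["clamscan", "--no-summary", str(path)],
        capture_output=True,
        text=True,
    )
    return outcome_from_clamscan_exit(completed.returncode)


def check_etd_files(paths, scanner=scan_with_clamscan):
    """Scan each file and return one result record per file.

    `scanner` is any callable returning "pass", "fail", or "error",
    so a different anti-virus tool can be substituted for ClamAV.
    """
    return [
        {"file": str(path), "event_type": "virus check", "outcome": scanner(path)}
        for path in paths
    ]
```

Each returned record carries the pass/fail outcome, so it could be handed to whatever mechanism records PREMIS events for the ETD.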
+
 
+
ClamAV was closely researched for any needed scripting and improvements, along with investigations into its potential for systematic integration with various repository systems. No special scripting is needed and this utility is best documented as a standalone service in conjunction with various ETD workflows prior to depositing in an IR. The pass/fail of this service can be recorded and used by the PREMIS Event Service. No other open-source anti-virus service was found to be as robust as, or preferable to, ClamAV.
+
 
+
See [[Virus Checking Documentation for ETDs]]
+
 
+
===Digital Drop Box (with metadata submission functionality)===
+
This micro-service addresses a frequently sought function: a simple way for users to deposit ETDs into a remote location via a web form that gathers the submission information required by the ETD program. The micro-service should:
+
#generate PREMIS metadata for the ETD files deposited;
+
#have the capacity to replicate the deposited content securely upon ingest into additional locations by calling other Unix tools such as rsync; and
+
#record this replication in the PREMIS metadata.
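The replication and record-keeping steps above could look roughly like the sketch below. It is an illustration only: the function names, the record fields, and the choice of rsync options are assumptions, and secure transfer depends on using a remote destination of the form `host:path` so that rsync runs over its remote shell (typically SSH).

```python
import subprocess


def rsync_command(source, destination):
    """Build an rsync command that copies the contents of `source`."""
    # The trailing slash tells rsync to copy the directory's contents
    # rather than the directory itself.
    return ["rsync", "--archive", "--checksum",
            f"{source.rstrip('/')}/", destination]


def replicate(source, destinations, run=subprocess.run):
    """Copy `source` to each destination and return one record per copy.

    `run` defaults to subprocess.run and can be replaced for testing.
    """
    events = []
    for destination in destinations:
        completed = run(rsync_command(source, destination))
        events.append({
            "event_type": "replication",
            "object": source,
            "destination": destination,
            "outcome": "pass" if completed.returncode == 0 else "fail",
        })
    return events
```

The returned records hold what the third requirement asks to be captured; turning them into PREMIS metadata is left to whichever event-recording tool the ETD program uses.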
+
 
+
Use cases for a simple submission drop box for ETDs have proven difficult to research in the project due to the large number of deposits that originate with ProQuest's UMI. In the course of our research we have been directed by institutions to spend time with two existing technologies that would fulfill the needs for this service and which have potential for adoption by various ETD programs. These two technologies are:
+
 
+
* Archivematica: https://www.archivematica.org/wiki/Main_Page
+
 
+
* DataStage: http://www.dataflow.ox.ac.uk/index.php/about/about-datastage
+
 
+
The project team will have more to report on our experiments with modeling ETD workflows using these two applications during the Spring/Summer of 2013.
+

Latest revision as of 15:46, 15 April 2014

About the Lifecycle Management Tools

The Lifecycle Management of ETDs project, funded by the Institute of Museum and Library Services (IMLS), is pleased to make available to ETD programs a series of Lifecycle Management Tools. The tools cover four major areas of lifecycle curation for ETDs:

  • Virus Checking: Submitted ETDs may contain viruses that could damage the entire collection if not screened in advance. We provide instructions for using ClamAV.
  • File Format Identification: Knowledge of the formats used in an ETD can help a program determine whether an ETD submission (particularly supplemental files) adheres to program requirements and what software will be necessary to access the data in the future. We provide instructions for using DROID, FITS, JHOVE2 & the Unix file command.
  • Preservation Metadata: Actions taken during curation should be recorded in order to track their success and failure. We provide instructions for using the new PREMIS Event Service.
  • ETD Submission: Submission systems can range from simple to complex, but at their most basic should allow for simple upload, collect standardized metadata, and facilitate ongoing ETD workflows into an institutional repository. We provide instructions for using the new ETD Drop application.

Get the Tools

The Lifecycle Management Tools are open source and freely available. Use the link below to request our usage manual, which includes well-described use cases and instructions for downloading and using the Tools. Thank you!

Request to receive the tools
