Metadata for ETDs v1


This is Version 1 of this Guidance Document. Version 2 is here.


About this Document (Summary)

Another issue revealed in the needs assessment process was that most institutions do not have workflows and systems in place to capture the levels of metadata needed to manage ETDs over their entire lifecycle, often because they are unaware of the issues and of the consequences of not maintaining such information. A concise overview of this topic will be prepared to inform stakeholders and decision makers about the critical issues in gathering and maintaining preservation metadata for ETDs, not just at the point of ingestion but also afterwards, since ETDs often undergo transitional events in their lifecycle (embargo releases, redactions, etc.). This overview will be developed in concert with the related metadata tools.

Metadata for ETDs


Digital collection development has moved from being an additional activity to a core service in many libraries. Since the late 1990s, ETDs have played a significant role, not just as new forms of scholarly communication, but as drivers for the development of institutional repositories and digital libraries in general. Digital libraries and their supporting technologies have now matured to the point where their contents include complex and dynamic resources and services. The emphasis has shifted to the challenges inherent in ensuring long-term access to digital collections. Most academic institutions have identified digital preservation as a priority that they wish to address collectively. As many digital preservation researchers have stated, preservation activities require the development of standards, best practices, and sustainable funding models to support a long-term commitment to digital resources. This document will provide an overview of PREMIS Metadata and Lifecycle Event Recordkeeping for ETDs. Because so many institutions are still seeking feasible strategies and action plans that lay the foundations for ensuring long-term access, we think it is important to take a broader view of digital resource management as well. In this regard, ETDs can and will serve once again as drivers for the development of digital preservation strategies.

Metadata Schemes for ETDs

ETD-Specific Metadata Elements



Recognizing the critical role of metadata in any successful digital preservation strategy, PREMIS (Preservation Metadata: Implementation Strategies) has been influential in providing a "core" set of preservation metadata elements that support the digital preservation process. In light of PREMIS requirements, this document will specify the metadata needed to ensure long-term access to and preservation of ETDs.


The PREMIS data model consists of five interrelated entities: Intellectual Entity, Object, Event, Agent, and Rights, with each semantic unit mapped to one of these entities. Although all five entities are interrelated, they can be used and implemented independently of each other. Accordingly, different institutions adopt one or more of the PREMIS entities. For example, the University of North Texas (UNT) Libraries have implemented the Object, Event, and Agent entities at this point. The UNT implementation experience will be described further in the use case sections.
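
To make the relationships among the entities more concrete, the following is a minimal sketch in Python of how Object, Event, and Agent records might link to one another by identifier. The class and field names loosely echo PREMIS semantic unit names, but the structure is heavily simplified and hypothetical; it is not the PREMIS schema itself, and it does not represent the UNT implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    agent_id: str      # hypothetical local identifier
    name: str
    agent_type: str    # e.g. "person", "organization", "software"


@dataclass
class PreservationObject:
    object_id: str     # hypothetical local identifier
    original_name: str


@dataclass
class Event:
    event_id: str
    event_type: str        # e.g. "ingestion", "embargo release"
    event_date_time: str   # ISO 8601 string, all in the same format and zone
    outcome: str
    linking_object_ids: List[str] = field(default_factory=list)
    linking_agent_ids: List[str] = field(default_factory=list)


def events_for_object(events, object_id):
    """Return the events linked to one object, oldest first."""
    linked = [e for e in events if object_id in e.linking_object_ids]
    # Uniformly formatted ISO 8601 strings sort chronologically as text.
    return sorted(linked, key=lambda e: e.event_date_time)


# Illustrative records only; identifiers and dates are made up.
thesis = PreservationObject("obj-001", "thesis.pdf")
curator = Agent("agent-001", "Repository curator", "person")
events = [
    Event("evt-002", "embargo release", "2012-06-01T00:00:00Z", "success",
          ["obj-001"], ["agent-001"]),
    Event("evt-001", "ingestion", "2011-06-01T09:30:00Z", "success",
          ["obj-001"], ["agent-001"]),
]
history = [e.event_type for e in events_for_object(events, thesis.object_id)]
print(history)
```

Because events refer to objects and agents only by identifier, a repository could in principle adopt the Event records without modeling the other entities in full, which mirrors the point that the entities can be implemented independently.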

Intellectual Entity

Object Entity

Event Entity

Agent Entity

Rights Entity

The Rights entity is of particular interest to the ETD community. For the purpose of the PREMIS Data Dictionary, statements of rights and permissions are taken to be constructs that can be described as the Rights entity.

  • Rights are entitlements allowed to agents by copyright or other intellectual property law.
  • Permissions are powers or privileges granted by agreement between a rights holder and another party or parties.

The minimum core rights information that a preservation repository must know, however, is what rights or permissions it has to carry out actions related to objects within the repository. These may be granted by copyright law, by statute, or by a license agreement with the rights holder.

A number of institutions (including the University of North Texas, Harvard University, and other early PREMIS adopters) found that the Rights entity lacked the robustness required by various types of digital objects such as ETDs. To address these limitations, the PREMIS Editorial Committee has been working on changes and enhancements to the Rights entity semantic units, for example to allow for terms of restriction (e.g., for embargo periods). Currently there is just "term of grant"; "term of restriction" is being proposed under the same container.
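
As an illustration of why a term of restriction matters for ETDs, here is a minimal sketch of an embargo check in Python. It assumes a simplified, hypothetical rights record in which each granted act may carry an optional restriction period; the names `GrantedRight`, `restriction_start`, `restriction_end`, and `is_permitted` are illustrative only and are not taken from the PREMIS Data Dictionary.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class GrantedRight:
    object_id: str
    act: str                                  # e.g. "disseminate", "replicate"
    restriction_start: Optional[date] = None  # None means no restriction at all
    restriction_end: Optional[date] = None    # None means open-ended restriction


def is_restricted(right: GrantedRight, on: date) -> bool:
    """True if the given date falls inside the right's restriction period."""
    if right.restriction_start is None:
        return False
    if on < right.restriction_start:
        return False
    return right.restriction_end is None or on <= right.restriction_end


def is_permitted(rights: List[GrantedRight], object_id: str,
                 act: str, on: date) -> bool:
    """True if the act is granted for the object and not restricted on that date."""
    return any(
        r.object_id == object_id and r.act == act and not is_restricted(r, on)
        for r in rights
    )


# Illustrative data: a one-year embargo on dissemination, none on replication.
rights = [
    GrantedRight("obj-001", "replicate"),
    GrantedRight("obj-001", "disseminate", date(2011, 6, 1), date(2012, 5, 31)),
]
during_embargo = is_permitted(rights, "obj-001", "disseminate", date(2011, 12, 1))
after_embargo = is_permitted(rights, "obj-001", "disseminate", date(2012, 6, 1))
print(during_embargo, after_embargo)
```

In this sketch the repository may still replicate the embargoed thesis for preservation purposes while dissemination is blocked until the restriction period ends, which is the kind of distinction a term of grant alone cannot express.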

The PREMIS Editorial Committee is also working on a major revision, version 3.0. This will include changes to the data model: Intellectual Entity will become another level of Object instead of a separate entity, and "Environment" will become an entity in its own right rather than being tied to an Object. This new version will not be available until at least April 2012. The PREMIS Editorial Committee has had several requests to implement the rights changes earlier, so it is proposing to prepare a version 2.2 early in 2012 to include them. One other possible change that is not compatible with the existing data dictionary is renaming licenseIdentifier and its components to licenseDocumentationIdentifier to better convey its use. A revision of the Rights entity is available on the PIGPEN wiki page. We hope it will also be included in the next major revision, version 3.0.

PREMIS Implementation

Issues and Considerations

One thing to keep in mind is that PREMIS is designed to be a preservation metadata format, which means that ETDs are no different from other digital objects at the archiving level. Different repositories and systems implement different workflow and submission models. It should be noted that some of the identified metadata are already being collected by repositories, while others are not. Regardless, decisions on how to implement the recommended additions or enhancements will be entirely up to the repositories themselves, depending on various internal and external factors.

UNTL's approach

Other Institutions' Experiences

Metadata Approach to ETDs' Lifecycle Management

Metadata Quality

Why Metadata Quality?

Factors affecting metadata quality

Various (local, collaborators, system, ...) requirements

Resource (human, financial, time, ...) availability

Other issues

Metadata Quality Assurance Mechanisms



Metadata Analysis tools

Related Bibliography

Standards and Metadata



Dublin Core: The Road from Metadata to Linked Data. NISO/DCMI Webinar, August 25, 2010


Authorities and Vocabularies


Standards and Best Practices Resources

Preservation Related Web Sites

Tools and Software

  • PREMIS in METS Toolbox:
  • DAITSS (Dark Archive in the Sunshine State):
    • DAITSS is a digital preservation repository application developed by the Florida Center for Library Automation (FCLA) with some support from the IMLS.


Metadata Elements for ETDs

ETD Metadata Crosswalks (UNT, NDLTD, TDL, ...)

Controlled Vocabularies for ETDs


Timeline of Major PREMIS Development Activities

The Portal to Texas History in PREMIS Implementation Registry: Sample Document

There are currently 45 projects in the PREMIS Implementation Registry

Example: PREMIS-Event

Other Examples
