Preservation metadata for digital collections
1 Introduction
There have been a number of efforts to develop metadata specifications and sets to support preservation of a variety of digital resources. Because of its pressing business needs to manage both ‘born digital’ and ‘digital surrogate’ collections, the National Library of Australia has tried to find, or if necessary develop, metadata models to accommodate both.
The National Library of Australia, through its PANDORA Project (Preserving and Accessing Networked Documentary Resources of Australia), has been working at two levels in its efforts to ensure long-term access to Australian online publications. At a conceptual level, the Library has defined its business processes in a Business Process Model, and identified the data that will need to be collected for current and future management of each title in a Logical Data Model. In addition, in December 1998, the Library published its Digital Services Information Paper, which sets out requirements for a technical infrastructure to collect, store, provide access to, and manage its PANDORA Archive of Australian online publications, as well as to support the management of other digital and paper-based collections.
Concurrently, the Library has been working at a practical level, implementing the business principles by developing selection guidelines, liaising with publishers, and building a small archive of titles, which by February 2000, numbered over 400 and occupied approximately fifteen gigabytes of storage space.
Our purpose is to make Australia's cultural heritage available to future generations, as well as to today's scholars and researchers. Because to date there is little commercial publishing on the Internet in Australia, we have not yet had to deal with the complications of archiving subscription-only publications. We have, however, developed principles for managing commercial publications and have begun discussion with publishers on implementation.
2 Management of Metadata
www.nla.gov.au Creative Commons Attribution-NonCommercial-ShareAlike 2.1 Australia
31 March 2009
Future preservation strategies for online publications will require detailed information about the nature of the item and how it has been treated over time. Future researchers may also want historical information about the items they are using: what format it was originally in, and whether anything has been lost in the capture and preservation process. Day-to-day management of titles for the Archive also requires administrative information such as whether the publisher has given permission to archive.
To date we have no facility for recording the full complement of metadata required for each title as outlined in the Logical Data Model. We await the implementation of a full archive management system. In the meantime, to enable us to document the administrative history of the titles being archived, our IT Section created the PANDORA Archive Management System (PAMS) database. PAMS is rather a grandiose title for what is only a small metadata repository. Yet while it does not provide for all of the data elements that are required for long-term preservation, it does provide us with sufficient information about a title to manage archiving. Once an archive management system is available, the data from PAMS can be migrated to it.
At present PAMS does not record all the preservation metadata encompassed by the logical data model. In the absence of a satisfactory preservation metadata model that seems to achieve this objective, the NLA has invested in drafting its own model: a statement of the information it believes will be needed to manage the preservation of its digital collections. (1)
The draft Preservation Metadata Set draws on our corporate experience in a range of relevant fields:
• preservation, and preservation documentation, of library collections
• management of archives of online digital publications, physical format digital publications, and analogue and digital audio collections
• management of digitisation projects for text-based and image-based collections
• development of logical data models for a specific digital archiving implementation
• website database design.
This means that the draft Preservation Metadata Set is built on considerable relevant experience and thinking about the issues involved. However, we are very keen to subject the draft to critical scrutiny from specialists in all of these fields and others with an interest in managing digital collections over time, especially in a library context.
This proposed preservation metadata framework has been informed by many models. Some are of broad relevance (eg the Reference Model for an Open Archival Information System (OAIS) Draft Recommendation for Space Data System Standards(2)), while some came to us as results of data modelling exercises for particular projects (the NEDLIB project(3) and the NLA’s own PANDORA project(4)). Some were more refined metadata specifications developed for particular programs or projects (the Library of Congress-CNRI Experiment Project(5); The Making of America II Project(6); the CEDARS project(7); the National Archives of Australia’s Recordkeeping Metadata Standard(8)). One particular starting point for our exercise was the metadata set proposed by the
Research Libraries Group (RLG) PRESERV Working Group on Preservation Uses of Metadata(9), which mainly addressed digitisation projects. RLG invited us to adapt this set to describe a wider range of materials.
While we have learned a great deal from all these models, we accept responsibility for the metadata set we are proposing.
3 What the Preservation Metadata Set is
It is most important to realise that our proposed Preservation Metadata Set is intended to be a statement of the information we believe is needed to manage preservation of digital collections. It is meant to be a data output model, not a data input model. It indicates the information we want out of a metadata system, not necessarily what data should be entered, how it should be entered, by whom and at what time; nor does it concern itself with how the metadata should be associated with what it is describing. We believe this model should be applicable to many implementations that may decide to record this information in a variety of ways. This model simply says: ‘however you do it, this is what you have to deliver so we can manage preservation.’
It is also important to note that we are focusing solely on preservation requirements. The proposed metadata set does not attempt to deal with anything else. We recognise that in any implementation system there is likely to be an overlap between metadata recorded for different purposes. By focusing on the information we need out of the system to manage preservation, we put aside the question of whether particular elements may already be included in, say, other administrative or resource discovery metadata.
Different types of digital materials, and different archiving systems, will need different metadata support. There may be types of material and processes that are not adequately accommodated by our proposal despite our intentions, and we would welcome feedback.
4 Granularity
The metadata set is based on the need to manage and describe collections, objects, and sub-objects (which we have called "files"). We have tried to show where we expect the elements in the metadata set to be relevant to these different levels. We expect to make pragmatic decisions about the level at which records are needed, based on the level at which collections, objects and files are managed separately. This model assumes that the digital object is the primary focus of management and description. File and collection descriptions are created when appropriate.
5 Change history
Maintaining a history of what is being described is one of the essential objectives of any preservation documentation system. We looked at two options:
• maintaining a single record over time, which records all changes and processes applied to the item being described; or
• creating a new record each time the item changes to something different, maintaining a history by maintaining a sequence of linked records.
We chose the latter approach. Managing digital objects and collections over time will mean creating and managing considerable amounts of information about them. We believe that the creation of a new record for each new manifestation will organise this information more clearly and conveniently.
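As an illustration only, the linked-record approach could be sketched as follows. The class name, field names and identifiers are hypothetical and are not part of the draft metadata set; the sketch simply shows one record per manifestation, each pointing back to the manifestation it was migrated from.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ManifestationRecord:
    """One metadata record per manifestation (hypothetical field names)."""
    identifier: str                      # persistent identifier of this manifestation
    description: str                     # free-text note on what this manifestation is
    migrated_from: Optional[str] = None  # identifier of the previous manifestation, if any


def history(records: Dict[str, ManifestationRecord], identifier: str) -> List[str]:
    """Follow the 'migrated_from' links back to the source manifestation."""
    chain = []
    current: Optional[str] = identifier
    while current is not None:
        chain.append(current)
        current = records[current].migrated_from
    return chain


# Invented example: an original file migrated twice.
records = {
    r.identifier: r
    for r in [
        ManifestationRecord("id-1", "original manifestation"),
        ManifestationRecord("id-2", "first migration", migrated_from="id-1"),
        ManifestationRecord("id-3", "second migration", migrated_from="id-2"),
    ]
}

print(history(records, "id-3"))
```

Because each record is immutable and describes exactly one manifestation, the history of an item is recovered by walking the links rather than by reading a change log inside a single record.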
6 Supporting alternative preservation strategies
It is impossible to determine unequivocally what we will need to know in order to manage digital preservation in the future, so our set of metadata elements necessarily reflects assumptions about our future requirements. Our aim with this proposed metadata set is to support both migration and emulation approaches. Just what is needed for these approaches will become clearer as we gain more collective experience with them.
7 Some key terms
To minimise confusion, we need to explain some of the terms we have used in the draft proposed Preservation Metadata Set:
• ‘work’, ‘manifestation’ – we have distinguished between a work, as a concept, and the physical or virtual manifestations that instance it. Most preservation processes involve managing manifestations. However, we found it useful to recognise that archiving decisions could be made for the work (eg ‘we will maintain this work in perpetuity’), with different archiving decisions applying to particular manifestations of it (eg ‘we do not need to keep this copy of it’).
• repeatability – because of the approach we have taken (a 1:1 relationship between each manifestation and its metadata record), our comments about the repeatability of information in any element do not refer to a sequence of changes, but to the possibility of multiple bits of information that may be true at the same time; for example, two agencies may collaborate in an archiving decision.
• obligation – we have avoided terms like ‘mandatory’, ‘conditional’, and ‘optional’, because they are so closely associated with data input models. Instead, we use the terms ‘essential’, ‘essential if applicable’, and ‘desirable’, in their common usages. Essential information we believe will definitely be required. Some elements are more relevant to some materials or processes than others, so they may be essential if applicable. Desirable information will not be critical, but is expected to be helpful.
• examples – we have provided examples wherever they are applicable. In some cases we have found it more useful to give generic examples, which appear in square brackets.
8 Comments
We invite comments on the draft Preservation Metadata Set. These may apply to the overall approaches we have taken, the details of any elements, the presentation, and any other issues.
Comments may be directed to:
Colin Webb
Director of Preservation
National Library of Australia
Canberra, ACT 2600
AUSTRALIA
Telephone +61 2 6262 1662
Facsimile +61 2 6257 1703
cwebb@nla.gov.au
9 References
(1) The NLA Preservation Metadata Working Group consists of: Margaret Phillips, Deborah Woodyard, Kevin Bradley, Colin Webb.
(2) Consultative Committee for Space Data Systems (CCSDS), CCSDS 650.0-R-1, May 1999. Reference Model for an Open Archival Information System (OAIS) Draft Recommendation for Space Data System Standards. Online. Available: http://www.ccsds.org/RP9905/RP9905.html. 7 October 1999.
(3) Koninklijke Bibliotheek. NEDLIB Networked European Deposit Library (home page). Online. Available: http://www.konbib.nl/nedlib/. 7 October 1999. Also see: van der Werf-Davelaar, Titia. "Long-term preservation of electronic publications: The NEDLIB Project", D-Lib Magazine. Volume 5 Number 9 (1999). Online. Available: http://www.dlib.org/dlib/september99/vanderwerf/09vanderwerf.html. 8 October 1999.
(4) National Library of Australia. PANDORA Project: Preserving and Accessing Networked DOcumentary Resources of Australia (home page). Online. Available: http://www.nla.gov.au/pandora/. 7 October 1999.
See also: National Library of Australia. Digital Services Project. Online. Available: http://www.nla.gov.au/dsp/. 8 October 1999.
(5) Carl Fleischhauer. Library of Congress-CNRI Experiment Project Proposed Metadata Set. 12 March 1999. Online. Available: http://lcweb2.loc.gov/ammem/award/docs/nisometa/NISOintr.html. 8 October 1999.
(6) The Making of America II Testbed Project White Paper. Version 2.0 (September 15, 1998). Online. Available: http://sunsite.berkeley.edu/MOA2/wp-v2.html. 8 October 1999.
(7) Day, Michael. Metadata for Preservation Cedars Project Document AIW01. CEDARS, 3 August 1998. Online. Available: http://www.ukoln.ac.uk/metadata/cedars/AIW01.html. 8 October 1999, and later papers.
(8) National Archives of Australia. Recordkeeping Metadata Standard for Commonwealth Agencies. Version 1.0. May 1999. Online. Available: http://www.naa.gov.au/govserv/techpub/rkms/intro.htm. 8 October 1999.
(9) RLG Working Group on the Preservation Uses of Metadata. Final Report. May 1998. Online. Available: http://www.rlg.org./preserv/presmeta.html. 8 October 1999.
10 Recommended Elements
Element Name: 1. Persistent Identifier - type and identifier
Definition An identifier or 'permanent name' for an object that identifies it uniquely and persistently, and enables links to different manifestations of it, to metadata about it, and to other objects related to it.
Rationale Each object described must have a persistent identifier to identify it uniquely, to discriminate between different manifestations of it and to link it with its metadata record.
Scope
COLLECTION: Unique Identifier can be used to define collection if description exists at that level.
OBJECT: Unique Identifier must be used to define object.
FILE: Unique Identifier may be used to define file if this is different from the object. It is not necessary for an object with only one file.
Examples
1. Handle: loc.ndlp.amrlp/3a161162
2. URN:NBN:fi-fe19981122
Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes
Obligation COLLECTION: Essential; OBJECT: Essential; FILE: Essential
Remarks This metadata set permits any scheme for Unique Identifier in use by the agency.
Element Name: 2. Date of Creation
Definition Date expressed in a standardised form that the manifestation came into being.
Rationale The date, in combination with other metadata elements, provides evidence of an object's authenticity and provenance.
Scope
COLLECTION: If applicable, date that this instance of a collection came into being. May be a start date or a range of dates.
OBJECT: Date that this manifestation of the object came into being.
FILE: Date that this manifestation of the file came into being.
Repeatable COLLECTION: No; OBJECT: No; FILE: No
Obligation COLLECTION: Essential; OBJECT: Essential; FILE: Essential
Remarks Other dates will be recorded under appropriate elements.
Element Name: 3. Structural Type
Definition The type of object or collection being described using one of the following categories: Image, Sound, Video, Text, Database, Software, or, where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.
Rationale Choice of appropriate preservation strategy depends on knowing structural type.
Scope
COLLECTION: Collection Structural Type describes the collection using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the documents in the collection comprise more than one form, Web Document, or Multi-media. This list is extensible to accommodate new formats.
OBJECT: Object Structural Type describes the object using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.
FILE: Not described at this level.
Examples
COLLECTION: Example 1: 52 images. Example 2: various (where the collection contains documents in a number of different formats.)
OBJECT: Example 1: Video. Example 2: Web Document.
Repeatable COLLECTION: No; OBJECT: No
Obligation COLLECTION: Essential; OBJECT: Essential
Remarks Many complex documents will require multiple descriptions in 5. File Description.
Element Name: 4. Technical Infrastructure of Complex Object
Definition The over-arching technical infrastructure of a complex object.
Rationale Managing preservation will require managing the structure of complex objects as well as their components.
Scope
COLLECTION: Not relevant at collection level.
OBJECT: Describe the technical aspects of a complex object. This may include format of a Web page, or a CD-ROM. It will also include the total number of files and total of each type of file in the complex object. If the object comprises a single
FILE: Not relevant at file level.
Examples
Example 1: CD-ROM containing 22 files - 14 .gif image files, 3 .wav audio files, 3 .txt files and 2 .exe executables assembled in accordance with ISO 9660.
Example 2: Access database containing 1 .mdb file.
Repeatable OBJECT: No
Obligation OBJECT: Essential
Remarks
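The file totals called for in the Scope of this element could, as a rough sketch, be derived from a list of file names. The function name and the invented file list below are assumptions for illustration (the list mirrors the CD-ROM in Example 1, assuming the executables carry an .exe extension); nothing here is prescribed by the metadata set.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarise_files(paths):
    """Return (total number of files, number of files per extension)."""
    per_type = Counter(PurePosixPath(p).suffix.lower() or "(none)" for p in paths)
    return sum(per_type.values()), dict(per_type)


# Invented file list with the same make-up as the CD-ROM example.
cd_rom = (
    [f"images/img{i:02d}.gif" for i in range(14)]
    + [f"audio/clip{i}.wav" for i in range(3)]
    + [f"docs/note{i}.txt" for i in range(3)]
    + ["setup.exe", "viewer.exe"]
)

total, per_type = summarise_files(cd_rom)
print(total, per_type)
```

Counting by extension is only as reliable as the extensions themselves, so a summary produced this way would still need checking against the actual content of the files.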
Element Name: 5. File Description
Remarks We have not yet ascertained whether these headings will accommodate all components, and will continue work to test them. MIME types could be used to automatically populate the fields. We anticipate, however, that some files would be wrongly labelled by this approach. We welcome ideas both on the completeness of our descriptive fields and the processes by which they could be populated.
Obligation Essential if applicable
Sub-elements 5.1 Image; 5.2 Audio; 5.3 Video; 5.4 Text; 5.5 Database; 5.6 Executables
The table below provides a comparison of sub-elements between file types.
5.1 Image: 5.1.1 Image Format and Version; 5.1.2 Image Resolution; 5.1.3 Image Dimensions; 5.1.5 Image Tonal Resolution; 5.1.6 Image Colour Space; 5.1.7 Image Colour Management; 5.1.8 Image Colour Lookup Table; 5.1.9 Image Orientation; 5.1.10 Compression
5.2 Audio: 5.2.1 Audio Format and Version; 5.2.2 Audio Resolution; 5.2.3 Duration; 5.2.4 Audio Bit Rate; 5.2.5 Compression; 5.2.6 Encapsulation; 5.2.7 Track Number and Type
5.3 Video: 5.3.1 Video File Format and Version; 5.3.2 Frame Dimensions; 5.3.3 Duration; 5.3.4 Frame Rate; 5.3.5 Compression; 5.3.6 Video Encoding Structure; 5.3.7 Video Sound
5.4 Text: 5.4.1 Text Format and Version; 5.4.2 Compression; 5.4.3 Text Character Set; 5.4.4 Text Associated DTD; 5.4.5 Text Structural Divisions
5.5 Database: 5.5.1 Database Format and Version; 5.5.2 Compression; 5.5.3 Datatype and Representation category; 5.5.4 Representation Form and Layout; 5.5.5 Maximum size of data element values; 5.5.6 Minimum size of data element values
5.6 Executable: 5.6.1 Code Type and Version
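The Remarks for this element suggest that MIME types could be used to populate the fields automatically. A minimal sketch of that idea, using Python's standard `mimetypes` module, is given below. The mapping table and the function name are assumptions made for illustration; database and executable files are deliberately left unmapped because their MIME types vary too widely to guess a group from them.

```python
import mimetypes
from typing import Optional

# Hypothetical mapping from MIME major type to a File Description sub-element group.
MAJOR_TYPE_TO_GROUP = {
    "image": "5.1 Image",
    "audio": "5.2 Audio",
    "video": "5.3 Video",
    "text": "5.4 Text",
}


def guess_group(filename: str) -> Optional[str]:
    """Guess the sub-element group from the file name alone; None if unknown."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return None
    major = mime_type.split("/", 1)[0]
    return MAJOR_TYPE_TO_GROUP.get(major)


for name in ["logo.gif", "talk.wav", "clip.mp4", "notes.txt", "mystery.zzqq9"]:
    print(name, "->", guess_group(name))
```

The guess is made from the file extension only, so an image saved under a `.txt` name would be reported as text. This is the kind of wrong labelling the Remarks anticipate, and any automatic population along these lines would need a manual check.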
5.1 Image

5.1.1 Image Format and Version
Definition: The file type and version.
Examples: TIFF v 4.0

5.1.2 Image Resolution
Definition: The spatial resolution of the image, expressed as pixels per inch or cm (ppi, p/cm) or dots per inch or cm (dpi, d/cm).
Examples: 600 dpi; 320 dpi, 1500 d/cm

5.1.3 Image Dimensions
Definition: The number of pixels along the vertical and horizontal dimensions
Examples: 4096 x 6144 pixels

5.1.4 Image Tonal Resolution
Definition: Bit depth of each pixel, and whether multiple bits convey grey tones or colour
Examples: 1-bit; 8-bit greyscale; 24-bit colour

5.1.6 Image Colour Space
Definition: The colour space used for the image.
Examples: CMYK; RGB

5.1.7 Image Colour Management
Definition: Any system used to improve consistency of colour across capture, display and output of image.
Examples: PhotoCD; OptiCal; Profile/80; Softproof (Photoshop plug-in)

5.1.8 Image Colour Lookup Table
Definition: Location and encoding for any CLUT used to map from low to high colour depth.
Examples: FResident (if CLUT inside image file), Base64 (if CLUT binary encoded)

5.1.9 Image Orientation
Definition: How scanned image is stored relative to the correct "top of the image".
Examples: 000 (ie top of image is correctly oriented); 090 (ie top of image is 90 degrees clockwise from where it should be)

5.1.10 Compression
Definition: The type and level of compression.
Examples: CCITT 4
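Image Resolution and Image Dimensions together determine the physical size at which an image was captured or is intended to be reproduced. The short sketch below shows the arithmetic; the function name is hypothetical, and pairing the 4096 x 6144 pixel example with the 600 dpi example is an assumption made purely for illustration.

```python
def physical_size_inches(width_px: int, height_px: int, pixels_per_inch: float):
    """Convert pixel dimensions to physical dimensions at a given resolution."""
    return width_px / pixels_per_inch, height_px / pixels_per_inch


width_in, height_in = physical_size_inches(4096, 6144, 600)
print(f"{width_in:.2f} x {height_in:.2f} inches")
```

Recording both sub-elements therefore lets a later migration preserve the intended physical scale, not just the pixel count.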
5.2 Audio

5.2.1 Audio Format and Version
Definition: The file type and version.
Examples: AIFF interleaved

5.2.2 Audio Resolution
Definition: The sampling frequency in kHz
Examples: 44.1kHz; 96kHz

5.2.3 Duration
Definition: The length of the audio recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20

5.2.4 Audio Bit Rate
Definition: Word length used to encode the audio. Consequently an indication of dynamic range.
Examples: 16 bit, 24 bit.

5.2.5 Compression
Definition: The type and level of compression (note audio compression, or bit rate reduction, is a non-reversible, "lossy" process)
Examples: MPEG 3

5.2.6 Encapsulation
Definition: The delivery format and version.
Examples: Real Audio II

5.2.7 Track Number and Type
Definition: The number of tracks and how they are related to each other.
Examples:
1. 2 track Stereo
2. Single Track
3. 5 channel surround
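Taken together, the audio sub-elements allow the uncompressed size of a recording to be estimated, which may help when recording element 8, Storage Information. The sketch below is illustrative only: the function name is hypothetical, and combining 44.1 kHz, 16 bit, 2 track stereo and 67 minutes 12 seconds from the separate examples above is an assumption.

```python
def uncompressed_audio_bytes(sample_rate_hz: int, word_length_bits: int,
                             tracks: int, duration_seconds: int) -> int:
    """Size of uncompressed audio: samples per second x bits x tracks x seconds."""
    total_bits = sample_rate_hz * word_length_bits * tracks * duration_seconds
    return total_bits // 8


duration = 67 * 60 + 12  # 67 minutes 12 seconds
size = uncompressed_audio_bytes(44_100, 16, 2, duration)
print(size)  # 711244800 bytes
```

A compressed or encapsulated file will normally be smaller than this figure, which is one reason the Remarks for Storage Information ask that the record indicate whether the size is compressed or uncompressed.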
5.3 Video

5.3.1 Video File Format and Version
Definition: The file type and version
Examples: Quicktime version 1.1

5.3.2 Frame Dimensions
Definition: The resolution in pixels of a single still frame
Examples: 640 pixels x 480 pixels

5.3.3 Duration
Definition: The length of the video recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20

5.3.4 Frame Rate
Definition: The standard frame rate per second of the video material
Examples: 25 fps

5.3.5 Compression
Definition: The type and level of compression (note video compression, or bit rate reduction, is a non-reversible, "lossy" process)
Examples: MPEG 3

5.3.6 Video Encoding Structure
Definition: The type of encoding structure and version
Examples: Mpeg 3
Remark: It is possible for MPEG to be both encapsulation or delivery format and file type.

5.3.7 Video Sound
Definition: The sound parameters where they are incorporated into a single video file structure. May include all fields specified in audio.
5.4 Text

5.4.1 Text Format and Version
Definition: The file type and version.
Examples: MS Word 97

5.4.2 Compression
Definition: The type and level of compression.
Examples: .zip file

5.4.3 Text Character Set
Definition: The character set used in the document
Examples: ASCII; Unicode; EBCDIC

5.4.4 Text Associated DTD
Definition: Name of the Document Type Definition applied to the structured text
Examples: EAD

5.4.5 Text Structural Divisions
Definition: The logical divisions in a structured text file
Examples: TEI element DIVn used
5.5 Database

5.5.1 Database Format and Version
Definition: The file type and version.
Examples: MS Access 3.1

5.5.2 Compression
Definition: The type and level of compression.
Examples: .zip file

5.5.3 Datatype and Representation category
Definition: Type of symbol, character or other designation used to represent a data element found in a database and the type of values used to represent it. May be general description of symbols or characters found in the database, or be specific to database elements.
Examples:
1. Alphanumeric characters and graphical image.
2. The database element known as "xxx1" contains alphanumeric characters, the database element known as "xxx2" contains Graphical images.

5.5.4 Representation Form and Layout
Definition: Name or description of the form of representation for the data element and the layout of the characters that represent it (as appropriate). May be general description of form of representation found in the database, or be specific to database elements.
Examples:
1. Text:Alphabetic, code:numeric, quantitative value:currency$$,$$$.99, date:yyyy:mm:dd.
2. The database element known as "xxx1" contains date:yyyy:mm:dd, the database element known as "xxx2" contains a quantitative value:numericNNNN.NN, the database element known as "xxx3" contains a quantitative value:currency$$,$$$.99

5.5.5 Maximum size of data element values
Definition: The maximum number of data units (eg characters) of the corresponding datatype.
Examples: The database element known as "xxx3" (money) has a

5.5.6 Minimum size of data element values
Definition: The minimum number of data units (eg characters) of the corresponding datatype that would be present if data has been entered.
Examples: The database element known as "xxx1" (date) has a minimum character count of 8.
5.6 Executables

Remark: These are the executable components of a complex object, such as a CD-ROM or Web document. These executables perform certain operations within the digital object. They are not the software stated in system requirements, though they may be supported by it.

5.6.1 Code Type and Version
Definition: The code type used to compile the executable and version.
Examples:
1. Compiled using Intel code executable for Windows 95 environment
2. Compiled using Perl script
3. Java version 1.2
Element Name: 6. Known System Requirements
Definition The system or software necessary to access the information in the object or to use it. May describe the range of systems on which the object will operate, or the earliest version if the object continues to be compatible with newer versions. May also describe system requirements or plug-ins for operation, or memory requirements for an uncompressed file. Should state whether the requirements are preferred or mandatory.
Rationale Needed to manage requirements for accessing and operating digital objects.
Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Describes the system or software necessary to access the information in the object or to use it.
FILE: If appropriate, may be described at this level.
Examples
COLLECTION: As for Object
OBJECT:
1. Mac G3, OS 8.0 or
2. PC, Windows 3.1 to Windows 98
3. Pentium 200 or better mandatory, Netscape Navigator v 4.0 with preferred.
4. Windows 95 Netscape Navigator v 4.0 with WinZip and 'x' plug-ins.
5. Java Virtual Machine. Real Audio G2 or better
FILE: As for Object
Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes
Obligation COLLECTION: Desirable if useful; OBJECT: Essential; FILE: Essential if applicable
Remarks
Element Name: 7. Installation Requirements
Definition Any specialised procedures needed to install an object.
Rationale To enable access to objects with special installation requirements.
Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Record any additional specific instructions on passwords, how to start the program, etc.
FILE: May be described at this level, eg an executable file in an object
Examples
COLLECTION: Use password [xxxxxxxxx]
OBJECT: 1. Copy files to A drive 2. Copy to C drive and click on icon
FILE: This file needs to be copied into a separate directory
Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes
Obligation COLLECTION: Desirable; OBJECT: Essential if applicable; FILE: Essential if applicable
Remarks This information will be particularly useful when undertaking future migrations.
Element Name: 8. Storage Information
Definition Storage capacity for objects and details of the storage system, including physical format.
Rationale May help in planning preservation action relevant to particular carriers and storage systems.
Scope
COLLECTION: Storage size and system/carrier for collection
OBJECT: Storage size and system/carrier for object
FILE: Storage size and system/carrier for file
Examples
3.8 Gb on IBM digital library
500kb on exabyte tape
1.3 Mb on CD
Repeatable COLLECTION: No; OBJECT: No; FILE: No
Obligation COLLECTION: Desirable; OBJECT: Desirable; FILE: Desirable
Remarks May record compressed or uncompressed size, as applicable, and should indicate which.
Element Name: 9. Access Inhibitors
Definition Any method used to inhibit access, which would impact on preservation procedures, such as encryption or watermarking.
Rationale Without this information, the object may not be able to be accessed, copied or migrated.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be
If useful may be
Describes access
summarised at this
summarised at this
Examples
1. Watermark by Digimarc Professional
2. Watermark by Invisible Ink for Images, embedded before acquisition.
Use password [xxxxxxxx]
Associated dongle required.
Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes
Obligation COLLECTION: Essential if applicable; OBJECT: Essential if applicable; FILE: Essential if applicable
Remarks Dongle may be more appropriately described under 6. Known System Requirements
Element Name: 10. Finding and Searching Aids, and Access Facilitators
Definition Any system or method used to enhance access to information within the digital object, which needs to be maintained in successive generations.
Rationale To enable the aids and facilitators to be taken into account in any preservation process.
Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Describes systems or methods at object level.
FILE: Not described at this level.
Examples
1. CD type ID points linked to file
2. Video and text time code linked.
Repeatable OBJECT: Yes
Obligation OBJECT: Essential if applicable
Remarks
Element Name: 11. Preservation Action Permission
Definition A statement of whether or not permission is held to create copies of the object for preservation purposes.
Rationale To record information about whether permission to copy for preservation is held by the agency, to facilitate management of preservation action.
Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Describes whether permission is held. Where permission is held, records date and who gave permission. Where permission is not held, records detail of negotiation status and date.
FILE: Describe at this level if different from object.
Examples
Redhead Publications granted permission
Permission to copy URN:NBN:au:nla:nph-arch/1999/Q1999-Feb-1//http://www.lib.latrobe.edu.au/AHR/archive/Issue-December-1998/smith.html withheld by the author
Repeatable COLLECTION: No; OBJECT: No; FILE: No
Obligation COLLECTION: Desirable; OBJECT: Desirable; FILE: Desirable
Remarks The need for this information may be influenced by provisions in relevant legal deposit legislation.
Element Name: 12. Validation
Definition Information about a validation mechanism either within the document before it was taken into the archive, or a validation mechanism applied by the archive manager.
Rationale To verify authenticity and to provide information for decision making on preservation pathways.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Describe at appropriate level.
OBJECT: Describe at appropriate level.
FILE: Describes validation mechanism.
Examples
1. Standard Internet checksum applied by publisher
2. Roland checksum applied by NLA 19991912
Repeatable Yes Yes Yes
Obligation Desirable Desirable Desirable
Remarks We are not sure whether this should be recorded in a separate element or whether it should be recorded under 23. Process. This is a mechanism, usually consisting of a number, that allows one to verify that an electronically transmitted file is what it purports to be, ie, the file is what is described in the metadata. At the simplest level, such a key might consist of the number of lines in a file (similar to the way that one indicates the number of pages that are transmitted via fax). Or it might consist of a checksum, which is a number produced by an algorithm that manipulates the sum of the bits that make up a file and that serves as a unique identifier for that file.
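The two kinds of key mentioned in the remarks, a line count and a checksum, can be sketched as follows. This is only an illustration: the "Standard Internet" and "Roland" checksums named in the examples are not specified here, so SHA-256 from Python's standard hashlib module is used as a stand-in, and the function names and record layout are hypothetical.

```python
import hashlib
from pathlib import Path


def describe_file(path: Path, algorithm: str = "sha256") -> dict:
    """Compute the validation values to store in the metadata.

    The record layout is an assumption made for this sketch.
    """
    data = path.read_bytes()
    return {
        "line_count": len(data.splitlines()),  # the simplest key
        "algorithm": algorithm,                # stand-in for the checksum type
        "checksum": hashlib.new(algorithm, data).hexdigest(),
    }


def verify_file(path: Path, recorded: dict) -> bool:
    """Return True when the file still matches the recorded values."""
    return describe_file(path, recorded["algorithm"]) == recorded
```

A record produced by describe_file when the file enters the archive can later be passed to verify_file, for example after a copy or transfer, to confirm that the file is still what the metadata describes.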
Element Name 13. Relationships
Definition Relationships between this manifestation and other objects necessary for preservation management.
Rationale To enable an object to be linked to its metadata, to earlier or later manifestations of it, other forms of it, and other objects, including finding aids. It is essential to maintaining a history of the change of an object by linking to the metadata of earlier manifestations, including that of the source object.
LEVEL Scope and Examples
COLLECTION Describes links relevant to a collection.
1. Linked to previous manifestation in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]
2. Linked to following manifestation in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]
3. Contains the lower component (must be repeatable), eg contains [Unique Identifier and unique identifier type]
4. Relation to the primary instance of the collection, eg this is the 5th generation copy of [Unique Identifier and unique identifier type]
5. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]
6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]
7. Link to finding aid, eg Linked to [Unique Identifier and unique identifier type]
Repeatable Yes
Obligation Essential if applicable
OBJECT Describes links relevant to an object.
1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]
2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]
3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]
4. Contains the lower component (must be repeatable) eg contains [Unique Identifier and unique identifier type]
5. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].
6. Related to accompanying material, eg accompanied by book [call number]
7. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]
8. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]
9. Linked to a previous object in a sequence in a periodic capture process eg, sequential copies of a web page.
10. Linked to a previous object in a sequence related to content, eg page in a book
11. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.
12. Linked to a following object in a sequence related to content, eg page in a book
13. Number in sequence and total number in the sequence, eg 3 of 54.
14. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].
15. Linked to a database specification in accordance with ISO 11179.
Repeatable Yes
Obligation Essential if applicable
FILE Describes links relevant to a file.
1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]
2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]
3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]
4. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].
5. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]
6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]
7. Linked to a previous object in a sequence in a periodic capture process eg, sequential copies of a web page.
8. Linked to a previous object in a sequence related to content, eg page in a book
9. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.
10. Linked to a following object in a sequence related to content, eg page in a book
11. Number in sequence and total number in the sequence, eg 3 of 54.
12. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].
13. Linked to a database specification in accordance with ISO 11179.
Repeatable Yes
Obligation Essential if applicable
Remarks These examples are not definitive: there will be others.
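As an illustration of how such links might be held and used, the sketch below stores each relationship as a type plus a target identifier, and follows "was migrated from" links back to the source manifestation to recover the change history. The class names, the relationship label constant and the identifier types are assumptions made for this example, not part of the element set.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MIGRATED_FROM = "was migrated from"  # hypothetical label for this sketch


@dataclass(frozen=True)
class Identifier:
    value: str
    id_type: str  # unique identifier type; the vocabulary is an assumption


@dataclass
class Relationship:
    kind: str          # eg "was migrated from", "part of", "contains"
    target: Identifier


@dataclass
class Manifestation:
    identifier: Identifier
    relationships: List[Relationship] = field(default_factory=list)


def migration_history(start: Manifestation,
                      records: Dict[Identifier, Manifestation]) -> List[Identifier]:
    """Follow 'was migrated from' links from `start` back towards the source."""
    chain = [start.identifier]
    seen = {start.identifier}
    current = start
    while True:
        previous = next(
            (r.target for r in current.relationships if r.kind == MIGRATED_FROM),
            None,
        )
        if previous is None or previous in seen:
            return chain  # reached the source, or a loop in the links
        chain.append(previous)
        seen.add(previous)
        if previous not in records:
            return chain  # metadata for the earlier manifestation is not held
        current = records[previous]
```

Because the element is repeatable, a manifestation can carry a migration link alongside links to a collection, a finding aid or a preservation master; the walk above simply ignores the other link types.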
Element Name 14. Quirks
Definition Any characteristic that may appear as a loss in functionality or change in the look and feel of a collection, object or file. May describe quirks or provide links to quirks. Includes only descriptions of quirks that are relevant to the use of the current instance. Should include any relevant dates.
Rationale To assist preservation managers in assessing the success or otherwise of preservation strategies, and to prevent time being spent on trying to solve problems that were inherent in the object at the time the strategy was applied. This element documents changes that occur as a result of digitisation, duplication or migration, as well as those that might be inherent in the source document.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: If useful, quirks at the object or file levels may be summarised at collection level.
OBJECT: Describes quirks at the object level.
FILE: Describes quirks at the file level.
Examples
COLLECTION: 1. For all Web documents in the collection produced prior to HTML 4, the text format tag is no longer supported.
OBJECT: 1. The Shockwave files could not be captured from the source document.
FILE: 1. The text format tag is no longer supported by many browsers due to changes in HTML 4. 2. In the transfer from the previous format, the functionality of the mpeg video was impaired. 3. The original printed item contains high levels of bleed through, which degrades the image quality.
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Element Name 15. Archiving Decision (work)
Definition The decision whether this work should be archived and the date of that decision. This field may also include a retention period or review date.
Rationale This information contributes to the preservation history of the work and facilitates future decision making.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
COLLECTION: Hansen Collection of digitised images to be archived. Date of Decision: 19990321 [yyyymmdd]
OBJECT: Australian Humanities Review to be archived. Date of Decision: 19991013 [yyyymmdd], Date of Review: 20011013 [yyyymmdd]
Repeatable No No
Obligation Essential Essential
Remarks
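The decision and review dates in the examples are written as yyyymmdd. A minimal sketch of how an implementation might check such dates and flag a decision that is due for review is given below; the class and field names are hypothetical, and the strict eight-digit check is an assumption about how the format would be enforced.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(text: str) -> date:
    """Parse a date written as yyyymmdd; raise ValueError if it is not valid."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected eight digits (yyyymmdd), got {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()


@dataclass
class ArchivingDecision:
    decision: str                       # eg "To be archived"
    date_of_decision: date
    date_of_review: Optional[date] = None

    def review_due(self, today: date) -> bool:
        """True once the review date, if one was recorded, has been reached."""
        return self.date_of_review is not None and today >= self.date_of_review
```

For the object-level example above, the decision would be built from parse_yyyymmdd("19991013") and parse_yyyymmdd("20011013"), and review_due would begin returning True on 13 October 2001.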
Element Name 16. Decision Reason (work)
Definition Why the decision to archive the work (or not) was made.
Rationale This information contributes to the preservation history of the object and facilitates future decision making.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
COLLECTION: 1. Source images on glass negative are very fragile and are not available for research purposes.
OBJECT: 1. Conforms to PANDORA selection guidelines [version and date yyyymmdd] 2. National Archives of Australia Disposal Authority reference number
Repeatable No No
Obligation Essential Essential
Remarks
Element Name 17. Institution Responsible for Archiving Decision (work)
Definition The name of the agency responsible for the decision that this work should be archived.
Rationale In a distributed archiving model, the agency making the archiving decision may be different from the one actually archiving the object.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
COLLECTION: State Library of Victoria
OBJECT: State Library of Victoria
Repeatable Yes Yes
Obligation Essential Essential
Remarks Responsibility at the manifestation level is described separately at Elements 18-20.
Element Name 18. Archiving Decision (manifestation)
Definition The decision whether this manifestation should be archived/retained and date of that decision. This field may also include a retention period or review date.
Rationale This information facilitates decision-making about the particular manifestation, recognising that while some manifestations of a work may be retained indefinitely, other manifestations may not.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
1. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Do not retain. Date of Review: 19991013 [yyyymmdd]
2. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Retain indefinitely. Date of Review: 19991013 [yyyymmdd]
Repeatable Yes
Obligation Essential
Remarks
Element Name 19. Decision Reason (manifestation)
Definition Why the decision to archive/retain the manifestation (or not) was made.
Rationale This information contributes to the preservation history of the work and facilitates future decision-making about the manifestation. Although the work itself may be required for permanent retention, a particular manifestation may be redundant in the archive.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
1. Manifestation has hit a migration dead end. Future migrations will be done from an earlier manifestation.
2. Source manifestation - retain indefinitely
Repeatable Yes
Obligation Essential
Remarks
Element Name 20. Institution Responsible for Archiving Decision (manifestation)
Definition The name of the agency responsible for the decision that this manifestation should be archived/retained.
Rationale In a distributed archiving model, the agency making the decision about archiving or retention may be different from the one actually archiving the object.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.
Examples
COLLECTION: State Library of Victoria
OBJECT: State Library of Victoria
Repeatable No No
Obligation Essential Essential
Remarks
Element Name 21. Intention Type
Definition The intended use of a particular manifestation.
Rationale Provides information necessary to manage various copies of an object.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: Not described at this level.
OBJECT: Describes the intended use of the manifestation.
FILE: Not described at this level.
Examples
1. Preservation master
2. Access copy
Repeatable No
Obligation Essential if applicable
Remarks
Element Name 22. Institution with Preservation Responsibility
Definition The name of the agency that has accepted responsibility for preservation. Should include date of commencement of acceptance of responsibility, or range of dates of responsibility.
Rationale Attributes responsibility and provides information for allocation of resources and prevention of unwanted duplication. May be different from the agency selecting and the agency actively carrying out processes.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: If useful, may be described or summarised at this level. Records the name of the agency responsible for preservation of this collection and relevant dates.
OBJECT: Records the name of the agency responsible for the preservation of this object and the relevant dates.
FILE: Not described at this level.
Examples
COLLECTION: National Library of Australia, 1 July 2000
OBJECT: National Library of Australia, 1 July 2000 -
Repeatable Yes Yes
Obligation Essential Essential
Remarks Primary level of description is the object. If useful, may be described or summarised at collection level. Information about responsibility should be available at all levels, even if input only at object level.
Element Name 23. Process
Definition All relevant details of any process applied to a digital object or file, including software, specific settings or actions that were required to produce the current manifestation, details of all equipment and
Rationale This element documents what has happened to a particular manifestation of an object. The series of linked records pertaining to manifestations of an object builds up a change history over time. This information is essential to document what preservation methods have been applied to the object and how the various manifestations might differ from each other.
LEVEL COLLECTION OBJECT FILE
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks The entire element, including sub-elements, must be repeatable.
Sub-elements 23.1 Description of Process
23.2 Name of the Agency Responsible for the Process
23.3 Critical Hardware Used in the Process
23.4 Critical Software Used in the Process
23.5 How Process was Carried Out
23.6 Guidelines Specified to Implement Process
23.7 Date and time
23.8 Result
23.9 Process Rationale
23.10 Changes
23.11 Other
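One way to picture the repeatable Process element is as a list of records attached to a manifestation, each carrying the sub-elements listed above. The sketch below is an assumption about how this might be modelled: the class names, field names and free-text field types are hypothetical, and only the sub-element numbering is taken from the list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessRecord:
    description: str                                             # 23.1
    agency: str                                                  # 23.2
    critical_hardware: List[str] = field(default_factory=list)   # 23.3
    critical_software: List[str] = field(default_factory=list)   # 23.4
    how_carried_out: str = ""                                    # 23.5
    guidelines: str = ""                                         # 23.6
    date_and_time: str = ""                                      # 23.7
    result: str = ""                                             # 23.8
    rationale: str = ""                                          # 23.9
    changes: str = ""                                            # 23.10
    other: str = ""                                              # 23.11


@dataclass
class ManifestationMetadata:
    identifier: str
    processes: List[ProcessRecord] = field(default_factory=list)

    def record_process(self, process: ProcessRecord) -> None:
        """Append a process; the whole element is repeatable."""
        self.processes.append(process)

    def change_history(self) -> List[str]:
        """One summary line per process, in the order the processes were recorded."""
        return [f"{p.date_and_time} {p.description} ({p.agency})".strip()
                for p in self.processes]
```

Keeping every process as its own record, rather than overwriting a single set of fields, is what allows the linked records for successive manifestations to build up the change history described in the rationale.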
Sub-element Name 23.1 Name of the Process
Definition Name of the process applied.
Rationale To record what process was applied.
LEVEL COLLECTION OBJECT FILE
Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Describes the process.
FILE: Describes the process.
Examples
COLLECTION: 1. Move from UNIX to Solaris platform
OBJECT: 1. Copy from floppy disk to CD-R 2. Copy from publishers' Web site to archive
FILE: 1. Conversion of .wav to .aiff
Repeatable No No No
Obligation Essential Essential Essential
Remarks
Sub-element Name 23.2 Agency
Definition The name of the agency responsible for the process.
Rationale To track responsibility for changes to the collection, object or file.