Preservation metadata for digital collections

1 Introduction

There have been a number of efforts to develop metadata specifications and sets to support preservation of a variety of digital resources. Because of its pressing business needs to manage both ‘born digital’ and ‘digital surrogate’ collections, the National Library of Australia has tried to find, or if necessary develop, metadata models to accommodate both.

The National Library of Australia, through its PANDORA Project (Preserving and Accessing Networked Documentary Resources of Australia), has been working at two levels in its efforts to ensure long-term access to Australian online publications. At a conceptual level, the Library has defined its business processes in a Business Process Model, and identified the data that will need to be collected for current and future management of each title in a Logical Data Model. In addition, in December 1998, the Library published its Digital Services Information Paper, which sets out requirements for a technical infrastructure to collect, store, provide access to, and manage its PANDORA Archive of Australian online publications, as well as to support the management of other digital and paper-based collections.

Concurrently, the Library has been working at a practical level, implementing the business principles by developing selection guidelines, liaising with publishers, and building a small archive of titles, which by February 2000 numbered over 400 and occupied approximately fifteen gigabytes of storage space.

Our purpose is to make Australia's cultural heritage available to future generations, as well as to today's scholars and researchers. Because to date there is little commercial publishing on the Internet in Australia, we have not yet had to deal with the complications of archiving subscription-only publications. We have, however, developed principles for managing commercial publications and have begun discussion with publishers on implementation.

2 Management of Metadata

www.nla.gov.au Creative Commons Attribution-NonCommercial-ShareAlike 2.1 Australia

31 March 2009

Future preservation strategies for online publications will require detailed information about the nature of each item and how it has been treated over time. Future researchers may also want historical information about the items they are using: what format each item was originally in, and whether anything has been lost in the capture and preservation process. Day-to-day management of titles for the Archive also requires administrative information such as whether the publisher has given permission to archive.

To date we have no facility for recording the full complement of metadata required for each title as outlined in the Logical Data Model. We await the implementation of a full archive management system. In the meantime, to enable us to document the administrative history of the titles being archived, our IT Section created the PANDORA Archive Management System (PAMS) database. PAMS is rather a grandiose title for what is only a small metadata repository. Yet while it does not provide for all of the data elements that are required for long-term preservation, it does provide us with sufficient information about a title to manage archiving. Once an archive management system is available, the data from PAMS can be migrated to it.

At present PAMS does not record all the preservation metadata encompassed by the logical data model. In the absence of a satisfactory preservation metadata model that seems to achieve this objective, the NLA has invested in drafting its own model: a statement of the information it believes will be needed to manage the preservation of its digital collections. (1)

The draft Preservation Metadata Set draws on our corporate experience in a range of relevant fields:

• preservation, and preservation documentation, of library collections

• management of archives of online digital publications, physical format digital publications, and analogue and digital audio collections

• management of digitisation projects for text-based and image-based collections

• development of logical data models for a specific digital archiving implementation

• website database design.

This means that the draft Preservation Metadata Set is built on considerable relevant experience and thinking about the issues involved. However, we are very keen to subject the draft to critical scrutiny from specialists in all of these fields and others with an interest in managing digital collections over time, especially in a library context.

This proposed preservation metadata framework has been informed by many models. Some are of broad relevance (eg the Reference Model for an Open Archival Information System (OAIS) Draft Recommendation for Space Data System Standards(2)), while some came to us as results of data modelling exercises for particular projects (the NEDLIB project(3) and the NLA’s own PANDORA project(4)). Some were more refined metadata specifications developed for particular programs or projects (the Library of Congress-CNRI Experiment Project(5); the Making of America II Project(6); the CEDARS project(7); the National Archives of Australia’s Recordkeeping Metadata Standard(8)). One particular starting point for our exercise was the metadata set proposed by the Research Libraries Group (RLG) PRESERV Working Group on Preservation Uses of Metadata(9), which mainly addressed digitisation projects. RLG invited us to adapt this set to describe a wider range of materials.

While we have learned a great deal from all these models, we accept responsibility for the metadata set we are proposing.

3 What the Preservation Metadata Set is

It is most important to realise that our proposed Preservation Metadata Set is intended to be a statement of the information we believe is needed to manage preservation of digital collections. It is meant to be a data output model, not a data input model. It indicates the information we want out of a metadata system, not necessarily what data should be entered, how it should be entered, by whom and at what time; nor does it concern itself with how the metadata should be associated with what it is describing. We believe this model should be applicable to many implementations that may decide to record this information in a variety of ways. This model simply says: ‘however you do it, this is what you have to deliver so we can manage preservation.’

It is also important to note that we are focusing solely on preservation requirements. The proposed metadata set does not attempt to deal with anything else. We recognise that in any implementation system there is likely to be an overlap between metadata recorded for different purposes. By focusing on the information we need out of the system to manage preservation, we put aside the question of whether particular elements may already be included in, say, other administrative or resource discovery metadata.

Different types of digital materials, and different archiving systems, will need different metadata support. There may be types of material and processes that are not adequately accommodated by our proposal despite our intentions, and we would welcome feedback.

4 Granularity

The metadata set is based on the need to manage and describe collections, objects, and sub-objects (which we have called "files"). We have tried to show where we expect the elements in the metadata set to be relevant to these different levels. We expect to make pragmatic decisions about the level at which records are needed, based on the level at which collections, objects and files are managed separately. This model assumes that the digital object is the primary focus of management and description. File and collection descriptions are created when appropriate.

5 Change history

Maintaining a history of what is being described is one of the essential objectives of any preservation documentation system. We looked at two options:

• maintaining a single record over time, which records all changes and processes applied to the item being described; or

• creating a new record each time the item changes to something different, maintaining a history by maintaining a sequence of linked records.


We chose the latter approach. Managing digital objects and collections over time will mean creating and managing considerable amounts of information about them. We believe that the creation of a new record for each new manifestation will organise this information more clearly and conveniently.
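The linked-record approach can be pictured with a minimal sketch. The class name, field names and identifiers below are hypothetical and are not part of the draft set; the sketch only assumes that each manifestation has its own record and that each record points back to the record of the manifestation it was derived from.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ManifestationRecord:
    """One metadata record per manifestation (hypothetical field names)."""
    persistent_id: str
    date_of_creation: str
    note: str
    # Link to the record of the earlier manifestation, if there is one.
    previous: Optional["ManifestationRecord"] = None


def history(record: ManifestationRecord) -> Iterator[ManifestationRecord]:
    """Walk back from the given manifestation to the source object."""
    current: Optional[ManifestationRecord] = record
    while current is not None:
        yield current
        current = current.previous


# Illustrative values only.
source = ManifestationRecord("id-001", "1999-02-01", "as captured")
migrated = ManifestationRecord(
    "id-002", "2004-06-15", "migrated to a newer format", previous=source
)

for rec in history(migrated):
    print(rec.persistent_id, rec.date_of_creation, rec.note)
```

Under this arrangement no record is ever overwritten: a change to the item produces a new record, and the history is read by following the links.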

6 Supporting alternative preservation strategies

It is impossible to determine unequivocally what we will need to know in order to manage digital preservation in the future, so our set of metadata elements necessarily reflects assumptions about our future requirements. Our aim with this proposed metadata set is to support both migration and emulation approaches. Just what is needed for these approaches will become clearer as we gain more collective experience with them.

7 Some key terms

To minimise confusion, we need to explain some of the terms we have used in the draft proposed Preservation Metadata Set:

• ‘work’, ‘manifestation’ – we have distinguished between a work, as a concept, and the physical or virtual manifestations that instance it. Most preservation processes involve managing manifestations. However, we found it useful to recognise that archiving decisions could be made for the work (eg ‘we will maintain this work in perpetuity’), with different archiving decisions applying to particular manifestations of it (eg ‘we do not need to keep this copy of it’).

• repeatability – because of the approach we have taken (a 1:1 relationship between each manifestation and its metadata record), our comments about the repeatability of information in any element do not refer to a sequence of changes, but to the possibility of multiple bits of information that may be true at the same time; for example, two agencies may collaborate in an archiving decision.

• obligation – we have avoided terms like ‘mandatory’, ‘conditional’, and ‘optional’, because they are so closely associated with data input models. Instead, we use the terms ‘essential’, ‘essential if appropriate’, and ‘desirable’, in their common usages. Essential information we believe will definitely be required. Some elements are more relevant to some materials or processes than others, so they may be essential if applicable. Desirable information will not be critical, but is expected to be helpful.

• examples – we have provided examples wherever they are applicable. In some cases we have found it more useful to give generic examples, which appear in square brackets.
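As an illustration of how the obligation and repeatability terms might be applied to a record, here is a minimal sketch. The class and function names are hypothetical, and a real implementation could hold this information quite differently; the two element entries use the values given for elements 1 and 8 under Recommended Elements.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Obligation(Enum):
    ESSENTIAL = "essential"
    ESSENTIAL_IF_APPLICABLE = "essential if applicable"
    DESIRABLE = "desirable"


@dataclass(frozen=True)
class ElementSpec:
    name: str
    repeatable: bool
    # Obligation at each level at which the element is described.
    obligation: Dict[str, Obligation]


LEVELS = ("collection", "object", "file")

# Values taken from elements 1 and 8 of the draft set.
SPECS = [
    ElementSpec("Persistent Identifier", True,
                {level: Obligation.ESSENTIAL for level in LEVELS}),
    ElementSpec("Storage Information", False,
                {level: Obligation.DESIRABLE for level in LEVELS}),
]


def missing_essentials(record: Dict[str, List[str]], level: str) -> List[str]:
    """Names of essential elements that have no value in the record."""
    return [spec.name for spec in SPECS
            if spec.obligation.get(level) is Obligation.ESSENTIAL
            and not record.get(spec.name)]


def repeat_violations(record: Dict[str, List[str]]) -> List[str]:
    """Names of non-repeatable elements that carry more than one value."""
    return [spec.name for spec in SPECS
            if not spec.repeatable and len(record.get(spec.name, [])) > 1]


record = {"Storage Information": ["1.3 Mb on CD"]}
print(missing_essentials(record, "file"))
print(repeat_violations(record))
```

Because the set is a data output model, a check of this kind would apply to what a system delivers, not to how or when the data was entered.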

8 Comments

We invite comments on the draft Preservation Metadata Set. These may apply to the overall approaches we have taken, the details of any elements, the presentation, and any other issues.

Comments may be directed to:


Colin Webb
Director of Preservation
National Library of Australia
Canberra, ACT 2600
AUSTRALIA
Telephone +61 2 6262 1662
Facsimile +61 2 6257 1703
cwebb@nla.gov.au

9 References

(1) The NLA Preservation Metadata Working Group consists of: Margaret Phillips, Deborah Woodyard, Kevin Bradley, Colin Webb.

(2) Consultative Committee for Space Data Systems (CCSDS), CCSDS 650.0-R-1, May 1999. Reference Model for an Open Archival Information System (OAIS) Draft Recommendation for Space Data System Standards. Online. Available: http://www.ccsds.org/RP9905/RP9905.html. 7 October 1999.

(3) Koninklijke Bibliotheek. NEDLIB Networked European Deposit Library (home page). Online. Available: http://www.konbib.nl/nedlib/. 7 October 1999. Also see: van der Werf-Davelaar, Titia. "Long-term preservation of electronic publications: The NEDLIB Project", D-Lib Magazine. Volume 5 Number 9 (1999). Online. Available: http://www.dlib.org/dlib/september99/vanderwerf/09vanderwerf.html. 8 October 1999.

(4) National Library of Australia. PANDORA Project: Preserving and Accessing Networked DOcumentary Resources of Australia (home page). Online. Available: http://www.nla.gov.au/pandora/. 7 October 1999.

See also: National Library of Australia. Digital Services Project. Online. Available: http://www.nla.gov.au/dsp/. 8 October 1999.

(5) Carl Fleischhauer. Library of Congress-CNRI Experiment Project Proposed Metadata Set. 12 March 1999. Online. Available: http://lcweb2.loc.gov/ammem/award/docs/nisometa/NISOintr.html. 8 October 1999.

(6) The Making of America II Testbed Project White Paper. Version 2.0 (September 15, 1998). Online. Available: http://sunsite.berkeley.edu/MOA2/wp-v2.html. 8 October 1999.

(7) Day, Michael. Metadata for Preservation Cedars Project Document AIW01. CEDARS, 3 August 1998. Online. Available: http://www.ukoln.ac.uk/metadata/cedars/AIW01.html. 8 October 1999, and later papers.

(8) National Archives of Australia. Recordkeeping Metadata Standard for Commonwealth Agencies. Version 1.0. May 1999. Online. Available: http://www.naa.gov.au/govserv/techpub/rkms/intro.htm. 8 October 1999.

(9) RLG Working Group on the Preservation Uses of Metadata. Final Report. May 1998. Online. Available: http://www.rlg.org./preserv/presmeta.html. 8 October 1999.

10 Recommended Elements

Element Name: 1. Persistent Identifier - type and identifier

Definition: An identifier or 'permanent name' for an object that identifies it uniquely and persistently, and enables links to different manifestations of it, to metadata about it, and to other objects related to it.

Rationale: Each object described must have a persistent identifier to identify it uniquely, to discriminate between different manifestations of it and to link it with its metadata record.

Scope
Collection: Unique Identifier may be used to define collection if description exists at that level.
Object: Unique Identifier must be used to define object.
File: Unique Identifier can be used to define file if this is different from the object. It is not necessary for an object with only one file.

Examples: 1. Handle: loc.ndlp.amrlp/3a161162; 2. URN:NBN:fi-fe19981122

Repeatable: Collection: Yes; Object: Yes; File: Yes

Obligation: Collection: Essential; Object: Essential; File: Essential

Remarks: This metadata set permits any scheme for Unique Identifier in use by the agency.

Element Name: 2. Date of Creation

Definition: Date, expressed in a standardised form, on which the manifestation came into being.

Rationale: The date, in combination with other metadata elements, provides evidence of an object's authenticity and provenance.

Scope
Collection: If applicable, date that this instance of a collection came into being. May be a start date or a range of dates.
Object: Date that this manifestation of the object came into being.
File: Date that this manifestation of the file came into being.

Repeatable: Collection: No; Object: No; File: No

Obligation: Collection: Essential; Object: Essential; File: Essential

Remarks: Other dates will be recorded under appropriate elements.

Element Name: 3. Structural Type

Definition: The type of object or collection being described using one of the following categories: Image, Sound, Video, Text, Database, Software, or, where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.

Rationale: Choice of appropriate preservation strategy depends on knowing structural type.

Scope
Collection: Collection Structural Type describes the collection using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the documents in the collection comprise more than one form, Web Document, or Multi-media. This list is extensible to accommodate new formats.
Object: Object Structural Type describes the object using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.
File: Not described at this level.

Examples
Collection: Example 1: 52 images. Example 2: various (where the collection contains documents in a number of different formats).
Object: Example 1: Video. Example 2: Web Document.

Repeatable: Collection: No; Object: No

Obligation: Collection: Essential; Object: Essential

Remarks: Many complex documents will require multiple descriptions in 5. File Description.

Element Name: 4. Technical Infrastructure of Complex Object

Definition: The over-arching technical infrastructure of a complex object.

Rationale: Managing preservation will require managing the structure of complex objects as well as their components.

Scope
Collection: Not relevant at collection level.
Object: Describe the technical aspects of a complex object. This may include format of a Web page, or a CD-ROM. It will also include the total number of files and total of each type of file in the complex object. If the object comprises a single
File: Not relevant at file level.

Examples
Example 1: CD-ROM containing 22 files - 14 .gif image files, 3 .wav audio files, 3 .txt files and 2 .exe executables assembled in accordance with ISO 9660.
Example 2: Access database containing 1 .mdb file.

Repeatable: No

Obligation: Essential

Remarks:

Element Name: 5. File Description

Remarks: We have not yet ascertained whether these headings will accommodate all components, and will continue work to test them. MIME types could be used to automatically populate the fields. We anticipate, however, that some files would be wrongly labelled by this approach. We welcome ideas both on the completeness of our descriptive fields and the processes by which they could be populated.

Obligation: Essential if applicable

Sub-elements: 5.1 Image; 5.2 Audio; 5.3 Video; 5.4 Text; 5.5 Database; 5.6 Executables
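The Remarks for element 5 suggest that MIME types could be used to populate the fields automatically. A minimal sketch of that idea follows, using the standard Python mimetypes module. The mapping table and function name are assumptions made for illustration and are not part of the draft set. Only four sub-elements are mapped, because Database and Executable files do not correspond to a single top-level MIME type.

```python
import mimetypes
from typing import Optional

# Hypothetical mapping from top-level MIME type to a File Description sub-element.
SUB_ELEMENT_BY_MIME_MAJOR = {
    "image": "5.1 Image",
    "audio": "5.2 Audio",
    "video": "5.3 Video",
    "text": "5.4 Text",
}


def guess_sub_element(filename: str) -> Optional[str]:
    """Guess which sub-element describes a file, from its name alone."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return None  # extension not recognised
    major = mime_type.split("/", 1)[0]
    return SUB_ELEMENT_BY_MIME_MAJOR.get(major)


for name in ("cover.gif", "interview.wav", "readme.txt"):
    print(name, "->", guess_sub_element(name))
```

The guess is made from the file extension only, so a wrongly named file would be wrongly labelled, which is the risk the Remarks anticipate. Any automatic value would therefore need checking before it is relied on.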

The list below sets out the sub-elements for each file type, for comparison.

5.1 Image: 5.1.1 Image Format and Version; 5.1.2 Image Resolution; 5.1.3 Image Dimensions; 5.1.4 Image Tonal Resolution; 5.1.6 Image Colour Space; 5.1.7 Image Colour Management; 5.1.8 Image Colour Lookup Table; 5.1.9 Image Orientation; 5.1.10 Compression

5.2 Audio: 5.2.1 Audio Format and Version; 5.2.2 Audio Resolution; 5.2.3 Duration; 5.2.4 Audio Bit Rate; 5.2.5 Compression; 5.2.6 Encapsulation; 5.2.7 Track Number and Type

5.3 Video: 5.3.1 Video File Format and Version; 5.3.2 Frame Dimensions; 5.3.3 Duration; 5.3.4 Frame Rate; 5.3.5 Compression; 5.3.6 Video Encoding Structure; 5.3.7 Video Sound

5.4 Text: 5.4.1 Text Format and Version; 5.4.2 Compression; 5.4.3 Text Character Set; 5.4.4 Text Associated DTD; 5.4.5 Text Structural Divisions

5.5 Database: 5.5.1 Database Format and Version; 5.5.2 Compression; 5.5.3 Datatype and Representation category; 5.5.4 Representation Form and Layout; 5.5.5 Maximum size of data element values; 5.5.6 Minimum size of data element values

5.6 Executable: 5.6.1 Code Type and Version

5.1 Image

5.1.1 Image Format and Version
Definition: The file type and version.
Examples: TIFF v 4.0

5.1.2 Image Resolution
Definition: The spatial resolution of the image, expressed as pixels per inch or cm (ppi, p/cm) or dots per inch or cm (dpi, d/cm).
Examples: 600 dpi; 320 dpi; 1500 d/cm

5.1.3 Image Dimensions
Definition: The number of pixels along the vertical and horizontal dimensions.
Examples: 4096 x 6144 pixels

5.1.4 Image Tonal Resolution
Definition: Bit depth of each pixel, and whether multiple bits convey grey tones or colour.
Examples: 1-bit; 8-bit greyscale; 24-bit colour

5.1.6 Image Colour Space
Definition: The colour space used for the image.
Examples: CMYK; RGB

5.1.7 Image Colour Management
Definition: Any system used to improve consistency of colour across capture, display and output of image.
Examples: PhotoCD; OptiCal; Profile/80; Softproof (Photoshop plug-in)

5.1.8 Image Colour Lookup Table
Definition: Location and encoding for any CLUT used to map from low to high colour depth.
Examples: FResident (if CLUT inside image file), Base64 (if CLUT binary encoded)

5.1.9 Image Orientation
Definition: How scanned image is stored relative to the correct "top of the image".
Examples: 000 (ie top of image is correctly oriented); 090 (ie top of image is 90 degrees clockwise from where it should be)

5.1.10 Compression
Definition: The type and level of compression.
Examples: CCITT Group 4
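Image Dimensions (5.1.3) and Image Tonal Resolution (5.1.4) together determine the uncompressed size of an image, which is relevant wherever an uncompressed size is recorded. The sketch below works this through for the example values above. The function name is hypothetical, and the calculation assumes raw pixel data with no file header or row padding.

```python
def uncompressed_image_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data in bytes, rounded up to a whole byte."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


# Example values from 5.1.3 and 5.1.4: 4096 x 6144 pixels, 24-bit colour.
colour_size = uncompressed_image_bytes(4096, 6144, 24)
# The same dimensions stored as a 1-bit image.
bitonal_size = uncompressed_image_bytes(4096, 6144, 1)

print(colour_size, "bytes")    # 75497472 bytes
print(bitonal_size, "bytes")   # 3145728 bytes
```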

5.2 Audio

5.2.1 Audio Format and Version
Definition: The file type and version.
Examples: AIFF interleaved

5.2.2 Audio Resolution
Definition: The sampling frequency in kHz.
Examples: 44.1kHz; 96kHz

5.2.3 Duration
Definition: The length of the audio recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20

5.2.4 Audio Bit Rate
Definition: Word length used to encode the audio. Consequently an indication of dynamic range.
Examples: 16 bit; 24 bit

5.2.5 Compression
Definition: The type and level of compression. (Note: audio compression, or bit rate reduction, is a non-reversible, "lossy" process.)
Examples: MPEG 3

5.2.6 Encapsulation
Definition: The delivery format and version.
Examples: Real Audio II

5.2.7 Track Number and Type
Definition: The number of tracks and how they are related to each other.
Examples:
1. 2 track Stereo
2. Single Track
3. 5 channel surround
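Taken together, the sampling frequency (5.2.2), duration (5.2.3), word length (5.2.4) and number of tracks (5.2.7) give the size of the uncompressed audio data. The following sketch uses the example values above. The function names are hypothetical, only the 'minutes and seconds' form of duration is handled, and the calculation assumes uncompressed audio with no file header.

```python
def parse_duration_seconds(text: str) -> int:
    """Parse a duration written as 'M minutes S seconds'."""
    parts = text.split()
    # Expected shape: ['67', 'minutes', '12', 'seconds']
    if len(parts) != 4 or parts[1] != "minutes" or parts[3] != "seconds":
        raise ValueError(f"unsupported duration format: {text!r}")
    return int(parts[0]) * 60 + int(parts[2])


def uncompressed_audio_bytes(sample_rate_hz: int, word_length_bits: int,
                             tracks: int, duration_seconds: int) -> int:
    """Size of the uncompressed audio data in bytes."""
    return sample_rate_hz * word_length_bits * tracks * duration_seconds // 8


seconds = parse_duration_seconds("67 minutes 12 seconds")
# 44.1 kHz sampling, 16 bit word length, 2 track stereo.
size = uncompressed_audio_bytes(44_100, 16, 2, seconds)

print(seconds, "seconds")   # 4032 seconds
print(size, "bytes")        # 711244800 bytes
```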

5.3 Video

5.3.1 Video File Format and Version
Definition: The file type and version.
Examples: Quicktime version 1.1

5.3.2 Frame Dimensions
Definition: The resolution in pixels of a single still frame.
Examples: 640 pixels x 480 pixels

5.3.3 Duration
Definition: The length of the video recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20

5.3.4 Frame Rate
Definition: The standard frame rate per second of the video material.
Examples: 25 fps

5.3.5 Compression
Definition: The type and level of compression. (Note: video compression, or bit rate reduction, is a non-reversible, "lossy" process.)
Examples: MPEG 3

5.3.6 Video Encoding Structure
Definition: The type of encoding structure and version.
Examples: Mpeg 3
Remark: It is possible for MPEG to be both encapsulation or delivery format and file type.

5.3.7 Video Sound
Definition: The sound parameters where they are incorporated into a single video file structure. May include all fields specified in audio.

5.4 Text

5.4.1 Text Format and Version
Definition: The file type and version.
Examples: MS Word 97

5.4.2 Compression
Definition: The type and level of compression.
Examples: .zip file

5.4.3 Text Character Set
Definition: The character set used in the document.
Examples: ASCII; Unicode; EBCDIC

5.4.4 Text Associated DTD
Definition: Name of the Document Type Definition applied to the structured text.
Examples: EAD

5.4.5 Text Structural Divisions
Definition: The logical divisions in a structured text file.
Examples: TEI element DIVn used

5.5 Database

5.5.1 Database Format and Version
Definition: The file type and version.
Examples: MS Access 3.1

5.5.2 Compression
Definition: The type and level of compression.
Examples: .zip file

5.5.3 Datatype and Representation category
Definition: Type of symbol, character or other designation used to represent a data element found in a database and the type of values used to represent it. May be general description of symbols or characters found in the database, or be specific to database elements.
Examples:
1. Alphanumeric characters and graphical image.
2. The database element known as "xxx1" contains alphanumeric characters, the database element known as "xxx2" contains graphical images.

5.5.4 Representation Form and Layout
Definition: Name or description of the form of representation for the data element and the layout of the characters that represent it (as appropriate). May be general description of form of representation found in the database, or be specific to database elements.
Examples:
1. Text:Alphabetic, code:numeric, quantitative value:currency$$,$$$.99, date:yyyy:mm:dd.
2. The database element known as "xxx1" contains date:yyyy:mm:dd, the database element known as "xxx2" contains a quantitative value:numericNNNN.NN, the database element known as "xxx3" contains a quantitative value:currency$$,$$$.99

5.5.5 Maximum size of data element values
Definition: The maximum number of data units (eg characters) of the corresponding datatype.
Examples: The database element known as "xxx3" (money) has a

5.5.6 Minimum size of data element values
Definition: The minimum number of data units (eg characters) of the corresponding datatype that would be present if data has been entered.
Examples: The database element known as "xxx1" (date) has a minimum character count of 8.

5.6 Executables

Remark: These are the executable components of a complex object, such as a CD-ROM or Web document. These executables perform certain operations within the digital object. They are not the software stated in system requirements, though they may be supported by it.

5.6.1 Code Type and Version
Definition: The code type used to compile the executable and version.
Examples:
1. Compiled using Intel code executable for Windows 95 environment
2. Compiled using Perl script
3. Java version 1.2

Element Name: 6. Known System Requirements

Definition: The system or software necessary to access the information in the object or to use it. May describe the range of systems on which the object will operate, or the earliest version if the object continues to be compatible with newer versions. May also describe system requirements or plug-ins for operation, or memory requirements for an uncompressed file. Should state whether the requirements are preferred or mandatory.

Rationale: Needed to manage requirements for accessing and operating digital objects.

Scope
Collection: If useful, may be summarised at this level.
Object: Describes the system or software necessary to access the information in the object or to use it.
File: If appropriate, may be described at this level.

Examples
Collection: As for Object
Object:
1. Mac G3, OS 8.0 or
2. PC, Windows 3.1 to Windows 98
3. Pentium 200 or better mandatory, Netscape Navigator v 4.0 with preferred.
4. Windows 95 Netscape Navigator v 4.0 with WinZip and 'x' plug-ins.
5. Java Virtual Machine. Real Audio G2 or better
File: As for Object

Repeatable: Collection: Yes; Object: Yes; File: Yes

Obligation: Collection: Desirable if useful; Object: Essential; File: Essential if applicable

Remarks:

Element Name: 7. Installation Requirements

Definition: Any specialised procedures needed to install an object.

Rationale: To enable access to objects with special installation requirements.

Scope
Collection: If useful, may be summarised at this level.
Object: Record any additional specific instructions on passwords, how to start the program, etc.
File: May be described at this level, eg an executable file in an object.

Examples
Collection: Use password [xxxxxxxxx]
Object: 1. Copy files to A drive. 2. Copy to C drive and click on icon.
File: This file needs to be copied into a separate directory.

Repeatable: Collection: Yes; Object: Yes; File: Yes

Obligation: Collection: Desirable; Object: Essential if applicable; File: Essential if applicable

Remarks: This information will be particularly useful when undertaking future migrations.

Element Name: 8. Storage Information

Definition: Storage capacity for objects and details of the storage system, including physical format.

Rationale: May help in planning preservation action relevant to particular carriers and storage systems.

Scope
Collection: Storage size and system/carrier for collection
Object: Storage size and system/carrier for object
File: Storage size and system/carrier for file

Examples
Collection: 1. 3.8 Gb on IBM digital library
Object: 1. 500kb on exabyte tape
File: 1. 1.3 Mb on CD

Repeatable: Collection: No; Object: No; File: No

Obligation: Collection: Desirable; Object: Desirable; File: Desirable

Remarks: May record compressed or uncompressed size, as applicable, and should indicate which.

Element Name: 9. Access Inhibitors

Definition: Any method used to inhibit access, which would impact on preservation procedures, such as encryption or watermarking.

Rationale: Without this information, the object may not be able to be accessed, copied or migrated.

Scope
Collection: If useful, may be summarised at this level.
Object: If useful, may be summarised at this level.
File: Describes access

Examples
Collection: Use password [xxxxxxxx]
Object: Associated dongle required.
File: 1. Watermark by Digimarc Professional. 2. Watermark by Invisible Ink for Images, embedded before acquisition.

Repeatable: Collection: Yes; Object: Yes; File: Yes

Obligation: Collection: Essential if applicable; Object: Essential if applicable; File: Essential if applicable

Remarks: Dongle may be more appropriately described under 6. Known System Requirements.

Element Name: 10. Finding and Searching Aids, and Access Facilitators

Definition: Any system or method used to enhance access to information within the digital object, which needs to be maintained in successive generations.

Rationale: To enable the aids and facilitators to be taken into account in any preservation process.

Scope
Collection: If useful, may be summarised at this level.
Object: Describes systems or methods at object level.
File: Not described at this level.

Examples
1. CD type ID points linked to file
2. Video and text time code linked.

Repeatable: Yes

Obligation: Essential if applicable

Remarks:


Element Name: 11. Preservation Action Permission

Definition: A statement of whether or not permission is held to create copies of the object for preservation purposes.

Rationale: To record information about whether permission to copy for preservation is held by the agency, to facilitate management of preservation action.

Scope
Collection: If useful, may be summarised at this level.
Object: Describes whether permission is held. Where permission is held, records date and who gave permission. Where permission is not held, records detail of negotiation status and date.
File: Describe at this level if different from object.

Examples
Object: Redhead Publications granted permission
File: Permission to copy URN:NBN:au:nla:nph-arch/1999/Q1999-Feb-1//http://www.lib.latrobe.edu.au/AHR/archive/Issue-December-1998/smith.html withheld by the author

Repeatable: Collection: No; Object: No; File: No

Obligation: Collection: Desirable; Object: Desirable; File: Desirable

Remarks: The need for this information may be influenced by provisions in relevant legal deposit legislation.

Element Name 12. Validation

Definition Information about a validation mechanism either within the document before it was taken into the archive, or a validation mechanism applied by the archive manager.

Rationale To verify authenticity and to provide information for decision making on preservation pathways.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Describe at appropriate level.
OBJECT: Describe at appropriate level.
FILE: Describes validation mechanism

Examples
1. Standard Internet checksum applied by publisher
2. Roland checksum applied by NLA 19991912

Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes

Obligation COLLECTION: Desirable; OBJECT: Desirable; FILE: Desirable

Remarks We are not sure whether this should be recorded in a separate element or whether it should be recorded under 23. Process.

This is a mechanism, usually consisting of a number, that allows one to verify that an electronically transmitted file is what it purports to be, ie, the file is what is described in the metadata. At the simplest level, such a key might consist of the number of lines in a file (similar to the way that one indicates the number of pages that are transmitted via fax). Or it might consist of a checksum, a number computed by an algorithm from the bits that make up a file, which serves as an effectively unique identifier for that file.
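The two kinds of validation key mentioned in these remarks, a line count and a checksum, can be sketched as follows. This is an illustration only: the function names are hypothetical, and SHA-256 is an assumed choice of algorithm for the sketch, not one named in this element set.

```python
import hashlib


def line_count(data: bytes) -> int:
    """Simplest key: the number of lines in the file content."""
    return len(data.splitlines())


def checksum(data: bytes) -> str:
    """Digest of the file content (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def validate(data: bytes, recorded_lines: int, recorded_checksum: str) -> bool:
    """Compare a received file against the keys recorded in its metadata."""
    return line_count(data) == recorded_lines and checksum(data) == recorded_checksum


original = b"first line\nsecond line\n"
recorded = (line_count(original), checksum(original))

# An unchanged copy validates; a copy with one altered character does not.
unchanged_ok = validate(original, *recorded)
altered_ok = validate(b"first line\nsecond lime\n", *recorded)
```

The line count alone would not detect the altered copy, because the number of lines is unchanged; the checksum is what catches it.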

Element Name 13. Relationships

Definition Relationships between this manifestation and other objects necessary for preservation management.

Rationale To enable an object to be linked to its metadata, to earlier or later manifestations of it, other forms of it, and other objects, including finding aids. It is essential to maintaining a history of the change of an object by linking to the metadata of earlier manifestations, including that of the source object.

LEVEL Scope and Examples

COLLECTION Describes links relevant to a collection.

1. Linked to previous manifestation in a migration sequence, eg, was migrated from [Unique Identifier and unique identifier type]

2. Linked to following manifestation in a migration sequence, eg, was migrated to [Unique Identifier and Unique Identifier type]

3. Contains the lower component (must be repeatable) eg contains [Unique Identifier and Unique Identifier type]

4. Relation to the primary instance of the collection, eg. This is the 5th generation copy of [Unique Identifier and Unique Identifier type]

5. Link to Preservation Master (if it exists), eg. Linked to [Unique Identifier and unique identifier type of preservation master]

6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and Unique Identifier type of duplication master]

7. Link to finding aid, eg. Linked to [Unique Identifier and Unique Identifier type]

Repeatable Yes

Obligation Essential if applicable

OBJECT Describes links relevant to an object.

1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]

2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]

3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]

4. Contains the lower component (must be repeatable) eg contains [Unique Identifier and unique identifier type]

5. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].

6. Related to accompanying material, eg accompanied by book [call number]

7. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]

8. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]

9. Linked to a previous object in a sequence in a periodic capture process eg, sequential copies of a web page.

10. Linked to a previous object in a sequence related to content, eg page in a book

11. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.

12. Linked to a following object in a sequence related to content, eg page in a book

13. Number in sequence and number of total in the sequence eg, 3 of 54.

14. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].

15. Linked to a database specification in accordance with ISO 11179.

Repeatable Yes

Obligation Essential if applicable

FILE Describes links relevant to a file.

1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]

2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]

3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]

4. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].

5. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]

6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]

7. Linked to a previous object in a sequence in a periodic capture process eg, sequential copies of a web page.

8. Linked to a previous object in a sequence related to content, eg page in a book

9. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.

10. Linked to a following object in a sequence related to content, eg page in a book

11. Number in sequence and number of total in the sequence eg, 3 of

12. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].

13. Linked to a database specification in accordance with ISO 11179.

Repeatable Yes

Obligation Essential if applicable

Remarks These examples are not definitive: there will be others.
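One way to picture the links listed for this element is as typed relationships between unique identifiers. The sketch below is hypothetical: the class name, the relationship labels and the identifiers are invented for illustration and are not part of this element set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Relationship:
    """A typed link from one identified item to another (hypothetical model)."""
    source_id: str
    relation: str   # e.g. "was migrated from", "part of", "contains"
    target_id: str
    id_type: str    # type of the unique identifier


def migration_history(item_id: str, links: List[Relationship]) -> List[str]:
    """Follow 'was migrated from' links back to the source manifestation."""
    previous: Dict[str, str] = {
        link.source_id: link.target_id
        for link in links
        if link.relation == "was migrated from"
    }
    history = [item_id]
    seen = {item_id}
    current: Optional[str] = previous.get(item_id)
    while current is not None and current not in seen:  # guard against cycles
        history.append(current)
        seen.add(current)
        current = previous.get(current)
    return history


links = [
    Relationship("obj-3", "was migrated from", "obj-2", "local"),
    Relationship("obj-2", "was migrated from", "obj-1", "local"),
    Relationship("obj-3", "part of", "coll-A", "local"),
]
chain = migration_history("obj-3", links)
```

Walking the chain in this way is what allows the history of change of an object to be rebuilt from the metadata of its earlier manifestations, as the Rationale describes.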

Element Name 14. Quirks

Definition Any characteristic that may appear as a loss in functionality or change in the look and feel of a collection, object or file. May describe quirks or provide links to quirks. Includes only descriptions of quirks that are relevant to the use of the current instance. Should include any relevant dates.

Rationale To assist preservation managers in assessing the success or otherwise of preservation strategies, and to prevent time being spent on trying to solve problems that were inherent in the object at the time the strategy was applied. This element documents changes that occur as a result of digitisation, duplication or migration, as well as those that might be inherent in the source document.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: If useful, quirks at the object or file levels may be summarised at collection level.
OBJECT: Describes quirks at the object level.
FILE: Describes quirks at the file level.

Examples
COLLECTION: 1. For all Web documents in the collection produced prior to HTML 4, the text format tag is no longer supported.
OBJECT: 1. The Shockwave files could not be captured from the source document.
FILE: 1. The text format tag is no longer supported by many browsers due to changes in HTML 4. 2. In the transfer from the previous format, the functionality of the mpeg video was impaired. 3. The original printed item contains high levels of bleed through, which degrades the image quality.

Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes

Obligation COLLECTION: Essential if applicable; OBJECT: Essential if applicable; FILE: Essential if applicable

Remarks

Element Name 15. Archiving Decision (work)

Definition The decision whether this work should be archived and the date of that decision. This field may also include a retention period or review date.

Rationale This information contributes to the preservation history of the work and facilitates future decision making.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
COLLECTION: Hansen Collection of digitised images to be archived. Date of Decision: 19990321 [yyyymmdd]
OBJECT: Australian Humanities Review to be archived. Date of Decision: 19991013 [yyyymmdd], Date of Review: 20011013 [yyyymmdd]

Repeatable COLLECTION: No; OBJECT: No

Obligation COLLECTION: Essential; OBJECT: Essential

Remarks

Element Name 16. Decision Reason (work)

Definition Why the decision to archive the work (or not) was made.

Rationale This information contributes to the preservation history of the object and facilitates future decision making.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
COLLECTION: 1. Source images on glass negative are very fragile and are not available for research purposes.
OBJECT: 1. Conforms to PANDORA selection guidelines [version and date yyyymmdd] 2. National Archives of Australia Disposal Authority reference number

Repeatable COLLECTION: No; OBJECT: No

Obligation COLLECTION: Essential; OBJECT: Essential

Remarks

Element Name 17. Institution Responsible for Archiving Decision (work)

Definition The name of the agency responsible for the decision that this work should be archived.

Rationale In a distributed archiving model, the agency making the archiving decision may be different from the one actually archiving the object.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
COLLECTION: State Library of Victoria
OBJECT: State Library of Victoria

Repeatable COLLECTION: Yes; OBJECT: Yes

Obligation COLLECTION: Essential; OBJECT: Essential

Remarks Responsibility at the manifestation level is described separately at Elements 18-20.

Element Name 18. Archiving Decision (manifestation)

Definition The decision whether this manifestation should be archived/retained and date of that decision. This field may also include a retention period or review date.

Rationale This information facilitates decision-making about the particular manifestation, recognising that while some manifestations of a work may be retained indefinitely, other manifestations may not.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
1. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Do not retain. Date of Review: 19991013 [yyyymmdd]
2. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Retain indefinitely. Date of Review: 19991013 [yyyymmdd]

Repeatable Yes

Obligation Essential

Remarks
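The dates in the examples for this element follow a yyyymmdd pattern. A minimal sketch of holding such a decision record, assuming a hypothetical ArchivingDecision structure and field names that are not defined in this element set:

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(text: str) -> date:
    """Parse a date written as yyyymmdd, e.g. 19971013."""
    return datetime.strptime(text, "%Y%m%d").date()


@dataclass
class ArchivingDecision:
    """Hypothetical record of a decision and an optional later review."""
    decision: str
    decision_date: date
    review_decision: Optional[str] = None
    review_date: Optional[date] = None

    def is_consistent(self) -> bool:
        """A review, if recorded, should not precede the original decision."""
        return self.review_date is None or self.review_date >= self.decision_date


record = ArchivingDecision(
    decision="To be archived",
    decision_date=parse_yyyymmdd("19971013"),
    review_decision="Retain indefinitely",
    review_date=parse_yyyymmdd("19991013"),
)
```

Storing the dates as parsed values rather than free text makes it possible to check them and to sort repeated decisions in date order.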

Element Name 19. Decision Reason (manifestation)

Definition Why the decision to archive/retain the manifestation (or not) was made.

Rationale This information contributes to the preservation history of the work and facilitates future decision-making about the manifestation. Although the work itself may be required for permanent retention, a particular manifestation may be redundant in the archive.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
1. Manifestation has hit a migration dead end. Future migrations will be done from an earlier manifestation.
2. Source manifestation - retain indefinitely

Repeatable Yes

Obligation Essential

Remarks

Element Name 20. Institution Responsible for Archiving Decision (manifestation)

Definition The name of the agency responsible for the decision that this manifestation should be archived/retained.

Rationale In a distributed archiving model, the agency making the decision about archiving or retention may be different from the one actually archiving the object.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Decision may be taken and described at this level, or may be summarised at this level.
OBJECT: Decision may be taken and described at this level, or may be summarised at this level.
FILE: Not described at this level.

Examples
COLLECTION: State Library of Victoria
OBJECT: State Library of Victoria

Repeatable COLLECTION: No; OBJECT: No

Obligation COLLECTION: Essential; OBJECT: Essential

Remarks

Element Name 21. Intention Type

Definition The intended use of a particular manifestation.

Rationale Provides information necessary to manage various copies of an object.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: Not described at this level.
OBJECT: Describes the intended use of the manifestation.
FILE: Not described at this level.

Examples
1. Preservation master
2. Access copy

Repeatable No

Obligation Essential if applicable

Remarks

Element Name 22. Institution with preservation responsibility

Definition The name of the agency that has accepted responsibility for preservation. Should include date of commencement of acceptance of responsibility, or range of dates of responsibility.

Rationale Attributes responsibility and provides information for allocation of resources and prevention of unwanted duplication. May be different from the agency selecting and the agency actively carrying out processes.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: If useful, may be described or summarised at this level. Records the name of the agency responsible for preservation of this collection and relevant dates.
OBJECT: Records the name of the agency responsible for the preservation of this object and the relevant dates.
FILE: Not described at this level.

Examples
COLLECTION: National Library of Australia, 1 July 2000
OBJECT: National Library of Australia, 1 July 2000 -

Repeatable COLLECTION: Yes; OBJECT: Yes

Obligation COLLECTION: Essential; OBJECT: Essential

Remarks Primary level of description is the object. If useful, may be described or summarised at collection level. Information about responsibility should be available at all levels, even if input only at object level.

Element Name 23. Process

Definition All relevant details of any process applied to a digital object or file, including software, specific settings or actions that were required to produce the current manifestation, details of all equipment and

Rationale This element documents what has happened to a particular manifestation of an object. The series of linked records pertaining to manifestations of an object builds up a change history over time. This information is essential to document what preservation methods have been applied to the object and how the various manifestations might differ from each other.

LEVEL COLLECTION OBJECT FILE

Repeatable COLLECTION: Yes; OBJECT: Yes; FILE: Yes

Obligation COLLECTION: Essential if applicable; OBJECT: Essential if applicable; FILE: Essential if applicable

Remarks The entire element, including sub-elements, must be repeatable.

Sub-elements 23.1 Description of Process

23.2 Name of the Agency Responsible for the Process

23.3 Critical Hardware Used in the Process

23.4 Critical Software Used in the Process

23.5 How Process was Carried Out

23.6 Guidelines Specified to Implement Process

23.7 Date and time

23.8 Result

23.9 Process Rationale

23.10 Changes

23.11 Other
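Because the entire Process element, including its sub-elements, must be repeatable, a change history can be pictured as a list of process records attached to a manifestation. The sketch below is an assumption for illustration only: the class and field names are loosely derived from the sub-element names listed above, and the identifier and agency values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessRecord:
    """One occurrence of element 23; field names are illustrative only."""
    description: str              # 23.1
    agency: str                   # 23.2
    critical_hardware: str = ""   # 23.3
    critical_software: str = ""   # 23.4
    how_carried_out: str = ""     # 23.5
    guidelines: str = ""          # 23.6
    date_time: str = ""           # 23.7
    result: str = ""              # 23.8
    rationale: str = ""           # 23.9
    changes: str = ""             # 23.10
    other: str = ""               # 23.11


@dataclass
class Manifestation:
    """Hypothetical holder for the repeatable Process element."""
    identifier: str
    processes: List[ProcessRecord] = field(default_factory=list)

    def change_history(self) -> List[str]:
        """Descriptions of the recorded processes, in the order they were added."""
        return [p.description for p in self.processes]


m = Manifestation("example-id")
m.processes.append(ProcessRecord("Copy from floppy disk to CD-R", "Example Agency"))
m.processes.append(ProcessRecord("Conversion of .wav to .aiff", "Example Agency"))
```

Keeping each process as a complete record, rather than repeating individual sub-elements independently, keeps the agency, date and result of one process together.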

Sub-element Name 23.1 Name of the Process

Definition Name of the process applied.

Rationale To record what process was applied.

LEVEL COLLECTION OBJECT FILE

Scope
COLLECTION: If useful, may be summarised at this level.
OBJECT: Describes the process
FILE: Describes the process

Examples
COLLECTION: 1. Copy from floppy disk to CD-R 2. Copy from publishers' Web site to archive
OBJECT: 1. Move from UNIX to Solaris platform
FILE: 1. Conversion of .wav to .aiff

Repeatable COLLECTION: No; OBJECT: No; FILE: No

Obligation COLLECTION: Essential; OBJECT: Essential; FILE: Essential

Remarks

Sub-element Name 23.2 Agency

Definition The name of the agency responsible for the process.

Rationale Track responsibility for changes to the collection, object or file.