ppt an architecture for information in digital libraries 05 2001

Alternative Architecture
for Information in Digital
Libraries
Onno W. Purbo
Onno@indo.net.id

Reference




http://www.dlib.org/dlib/february97/cnri/02arms1.html
William Y. Arms, Christophe
Blanchi, Edward A. Overly, “An
Architecture for Information in
Digital Libraries,” Corporation for
National Research Initiatives,
Reston, Virginia, February 1997.

The Structure of Information



Digital data → digital library.
Digital objects







Metadata
Unique identifier (handle).

Group of digital objects → set of
digital objects.
Different types of material →
categories.


Components of the Computer
System

Work Flow Example


Search





Select
Retrieval




Z39.50 – list of digital objects
identified by handle.

Repository Access Protocol (RAP)

Display

Information Architecture

Structure of Information in Digital Libraries






Relationship (chapter, index)
Format (SGML, HTML)
Version
Rights & Permissions
Computer System & Network
(dialup vs. broadband).

Basic Principles






User & application programs must be
flexible.
Collections must be
straightforward to manage.
The information architecture must
reflect the economic, social & legal
framework.

Data type, structural
metadata







Data type – technical properties of
data, format & processing.
Structural metadata – type,
version, relationship of digital
material.
Meta-object – reference to a set of
digital objects.

Guidelines for all
categories









All data is given an explicit data type
All metadata is encoded explicitly
Handles are given to individual items
of intellectual property
Meta-objects are used to aggregate
digital objects
Handles are used to identify items
listed in meta-objects

An Example of the Use of
Meta-objects









Scanned photographs
Digital objects for a scanned
photograph
Digital objects for individual versions
Meta-object
Handles for scanned photographs
Depositing a scanned photograph

Digital objects for a
scanned photograph



Low resolution “thumbnail”
High resolution “reference” image

Digital objects for
individual versions



Key metadata.
Used to manage the object in a networked
environment. It includes the handle, and the rights
and permissions associated with the digital object.

Structural metadata.
Includes fields for description, owner, handle of
meta-object, data size, data type (e.g., "jpg"),
version number, date deposited, use
(e.g., "thumbnail"), and the date of last revision.

Image data.
This is the image data.

Meta-object


Key metadata.
Includes the handle, and the rights and permissions
associated with the digital object.

Structural metadata.
Includes a description, the owner, the number of
versions, the date deposited, the use ("meta-object"),
and the date of last revision.

Data about each version.
For each of the three scanned versions (e.g., the
thumbnail), there is a package of information
including the handle of the version, and the
relationship among the versions.

Handles for scanned
photographs





Control identifier: 3a16116r.jpg

Replace the control identifiers by handles, which
provide a unique, persistent, location-independent
name for each item: loc.ndlp.amrlp/3a16116

Terminology to describe handles:
"loc.ndlp.amrlp" is the naming authority
"3a16116" is a locally unique string

For convenience in processing, use sequence
numbers:
loc.ndlp.amrlp/3a16116.1
loc.ndlp.amrlp/3a16116.2
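
The handle terminology above can be illustrated with a small parser. This is only a sketch: it assumes a handle is written as naming authority, "/", locally unique string, and that a trailing ".N" is a sequence number as in the two examples. The names Handle and parse_handle are hypothetical and not part of the handle system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Handle:
    naming_authority: str           # e.g. "loc.ndlp.amrlp"
    local_name: str                 # locally unique string, e.g. "3a16116"
    sequence: Optional[int] = None  # optional sequence number, e.g. 1


def parse_handle(text: str) -> Handle:
    """Split a handle at the first "/" into authority and local string."""
    authority, sep, local = text.partition("/")
    if not sep or not authority or not local:
        raise ValueError(f"not a handle: {text!r}")
    # Assumption: a trailing ".<digits>" on the local string is a sequence number.
    base, dot, suffix = local.rpartition(".")
    if dot and base and suffix.isdigit():
        return Handle(authority, base, int(suffix))
    return Handle(authority, local)
```

For example, parse_handle("loc.ndlp.amrlp/3a16116.2") yields the naming authority "loc.ndlp.amrlp", the local string "3a16116" and the sequence number 2.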

Meta-object identifies 2
images

Depositing a scanned
photograph



Human
Machine

Depositing a scanned
photograph - human




Selection of the material that will
be made into each digital object.
Specification of the metadata for
those fields that require judgment.

Depositing a scanned
photograph - machine






Creation of the meta-object and
the links to other digital objects.
Depositing the digital objects in
the repository.
Registering the handles in the
handle system.
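
The three machine steps above can be sketched as follows. This is an illustration under stated assumptions: plain dictionaries stand in for the repository and the handle system, and the function deposit_photograph and the dictionary layouts are hypothetical, not taken from the original system.

```python
from typing import Dict, List


def deposit_photograph(base_handle: str, versions: Dict[str, bytes],
                       repository: Dict[str, dict],
                       handle_system: Dict[str, str],
                       repository_handle: str) -> List[str]:
    """Run the three automated deposit steps; return the registered handles."""
    # Step 1: create the meta-object and its links to the version objects.
    version_handles = {use: f"{base_handle}.{n}"
                       for n, use in enumerate(versions, start=1)}
    meta_object = {"use": "meta-object", "versions": version_handles}

    # Step 2: deposit the digital objects in the repository.
    repository[base_handle] = meta_object
    for use, data in versions.items():
        repository[version_handles[use]] = {
            "use": use, "meta_object": base_handle, "data": data}

    # Step 3: register every handle in the handle system
    # (here: handle -> handle of the repository that stores it).
    registered = [base_handle, *version_handles.values()]
    for handle in registered:
        handle_system[handle] = repository_handle
    return registered
```

The human steps (selecting material, supplying judgment-based metadata) are the inputs to this function, here reduced to the versions argument.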

Access to a scanned
photograph






Bibliographic entries in search systems
refer to the scanned photograph by the
handle of the meta-object.
If a user requests a summary of the
photograph, the "thumbnail" image is
provided.
If the user requests access to the
photograph without specifying which
version, the "access" image is provided.
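
The access rules above amount to a small lookup on the meta-object. A minimal sketch, assuming the meta-object maps each use ("thumbnail", "access", and so on) to the handle of that version; the function name select_version and the request value "summary" are hypothetical.

```python
from typing import Dict, Optional


def select_version(versions: Dict[str, str],
                   request: Optional[str] = None) -> str:
    """Return the handle of the version that satisfies a user request."""
    if request == "summary":
        return versions["thumbnail"]  # a summary is served by the thumbnail
    if request is None:
        return versions["access"]     # no version specified: the access image
    return versions[request]          # an explicitly named version
```

The pairing of sequence numbers with uses is not given in the slides, so any mapping passed in as versions is an assumption of the caller.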

Technical Information

Digital Object



Key-metadata




The key-metadata is the information stored in
the digital object that is needed to manage the
digital object in a networked environment -- for
example to store, replicate, or transmit the
object without providing access to the content.
This includes terms and conditions, and the
handle.

Digital material


The digital material (or data) comprises a set of
sequences of bits.

Digital Objects Internal
Structure






An element is a bit sequence
comprising an elementary unit of
information. An element has its own
ID.
A package is a collection of elements
and other packages, with its own ID.
A digital object is a package with
key-metadata for use in a networked
environment. The ID is a handle.
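
The element, package and digital object definitions above nest naturally. The sketch below models them as Python dataclasses; the class and field names are hypothetical, and the key-metadata is reduced to a single terms_and_conditions field for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    element_id: str  # local identifier of the element
    data: bytes      # the bit sequence itself


@dataclass
class Package:
    package_id: str
    contents: List[Union[Element, "Package"]] = field(default_factory=list)


@dataclass
class DigitalObject(Package):
    # Key-metadata for use in a networked environment (simplified).
    terms_and_conditions: str = ""

    @property
    def handle(self) -> str:
        # For a digital object, the package ID is the handle.
        return self.package_id


def count_elements(package: Package) -> int:
    """Count the elements in a package, descending into nested packages."""
    total = 0
    for item in package.contents:
        total += count_elements(item) if isinstance(item, Package) else 1
    return total
```

Making DigitalObject a subclass of Package mirrors the statement that a digital object is a package with key-metadata added.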

Data Element

Data element

A data element is any bit-sequence.

Element ID

The element ID is the internal identifier of the element
within the digital object. Unlike a handle, which is
unique and known publicly, the element ID is of local
importance only.

Attributes


Attributes are the information that is needed to
process the element. They include: a role, which
defines the function of the element (such as "DTD" in
the SGML world), and a type, which includes technical
information (such as "jpeg").

A Package

Packages




Packages are used to group or associate
elements and other packages.
A package has a package ID.


If the package is a digital object, the package
ID is a handle. Otherwise, it is the internal
identifier of the package within the digital
object. Unlike a handle, which is unique and
known publicly, such a package ID is of local
importance only. The content of a package
consists of elements and other packages.

Handle & Handle System





The digital library is assembled from a great
variety of components. They include
people, computers, networks, repositories,
databases, search systems, Web servers,
digital objects, elements of objects,
bibliographic records, and many more.
Keeping track of these components requires
a systematic approach to identification.
http://www.handle.net

Typical handle record

Handle record for web

Handle System




To resolve a handle is to present a handle to
the handle system and receive as a reply
information about the item identified.
The handle system is a distributed
computer system, with many computers
distributed across the world. CNRI manages
a global handle registry and there are local
handle services operated by other
organizations, e.g. http://www.handle.net/
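
Resolution through a global registry and local handle services can be pictured with an in-memory toy. This is a sketch only: it assumes the global registry routes a handle to a local service by its naming authority, and the classes GlobalHandleRegistry and LocalHandleService are hypothetical stand-ins, not the real handle system software or its protocol.

```python
from typing import Dict, List, Tuple

Record = List[Tuple[str, str]]  # (data type, value) pairs for one handle


class LocalHandleService:
    """Holds the handle records of one or more naming authorities."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def register(self, handle: str, record: Record) -> None:
        self._records[handle] = record

    def resolve(self, handle: str) -> Record:
        return self._records[handle]


class GlobalHandleRegistry:
    """Routes each handle to the local service for its naming authority."""

    def __init__(self) -> None:
        self._services: Dict[str, LocalHandleService] = {}

    def add_service(self, naming_authority: str,
                    service: LocalHandleService) -> None:
        self._services[naming_authority] = service

    def resolve(self, handle: str) -> Record:
        # The naming authority is the part before the first "/".
        authority = handle.split("/", 1)[0]
        return self._services[authority].resolve(handle)
```

To resolve a handle, a client presents it to the registry and receives the record held by the responsible local service.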

Naming Authority


Handles are created by naming
authorities, administrative units
that are authorized to create and
edit handles.

The Repository

Structure of a Repository






A repository is a system for network-based
storage of and access to digital objects.
All interaction with the repository uses a simple
protocol, known as the Repository Access
Protocol (RAP). RAP has a small number of
fundamental operations, such as "deposit",
which stores a digital object in the repository,
and "access", which provides access to a digital
object.
Thus RAP provides a clearly defined, open
interface for the repository that allows others to
write clients and higher level interfaces.

Structure of Repository



Repository shell

The repository shell is the part of the repository that
interfaces with the outside world. It implements
RAP.

Persistent store

Information in the repository is held in the persistent
store. The persistent store is completely hidden from
the outside.

Object management layer


The object management layer provides an interface
between the services provided by the persistent store
and the object oriented functions required by the
repository shell.
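
The three layers described above can be sketched as three small classes. All names are hypothetical; the persistent store is a dictionary of bytes, and JSON is used only as a convenient stand-in for whatever serialization a real object management layer would use.

```python
import json
from typing import Dict


class PersistentStore:
    """Holds raw bytes; never exposed outside the repository."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, key: str, blob: bytes) -> None:
        self._blobs[key] = blob

    def read(self, key: str) -> bytes:
        return self._blobs[key]


class ObjectManagementLayer:
    """Translates object-level calls into persistent-store calls."""

    def __init__(self, store: PersistentStore) -> None:
        self._store = store

    def save_object(self, handle: str, obj: dict) -> None:
        self._store.write(handle, json.dumps(obj).encode("utf-8"))

    def load_object(self, handle: str) -> dict:
        return json.loads(self._store.read(handle).decode("utf-8"))


class RepositoryShell:
    """The only part visible from outside; a real shell would speak RAP."""

    def __init__(self) -> None:
        self._objects = ObjectManagementLayer(PersistentStore())

    def deposit(self, handle: str, obj: dict) -> None:
        self._objects.save_object(handle, obj)

    def access(self, handle: str) -> dict:
        return self._objects.load_object(handle)
```

A client only ever holds a RepositoryShell, so the storage format can change without affecting callers.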

The Repository Access
Protocol (RAP)













VerifyHandle. Confirm that a handle has been
registered in the handle system.
AccessRepoMeta. Access the repository metadata.
Verify_DO. Confirm that a repository stores a digital
object with a specified handle.
AccessMeta. Access the metadata for a specified
digital object.
Access_DO. Access the digital object.
Deposit_DO. Deposit a digital object in a repository.
Delete_DO. Delete a digital object from a repository.
MutateMeta. Edit the metadata for a digital object.
Mutate_DO. Edit a digital object.
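
The RAP operations listed above map onto a small interface. The sketch below is an in-memory illustration, not the RAP wire protocol: method names follow the list, but the argument and return types are assumptions, a set of strings stands in for the handle system, and the rule that a handle must be registered before deposit is also an assumption.

```python
from typing import Dict, Set


class InMemoryRepository:
    def __init__(self, registered_handles: Set[str], repo_meta: dict) -> None:
        self._registered = registered_handles  # stand-in for the handle system
        self._repo_meta = repo_meta
        self._objects: Dict[str, dict] = {}    # handle -> {"meta", "data"}

    def verify_handle(self, handle: str) -> bool:       # VerifyHandle
        return handle in self._registered

    def access_repo_meta(self) -> dict:                 # AccessRepoMeta
        return dict(self._repo_meta)

    def verify_do(self, handle: str) -> bool:           # Verify_DO
        return handle in self._objects

    def access_meta(self, handle: str) -> dict:         # AccessMeta
        return dict(self._objects[handle]["meta"])

    def access_do(self, handle: str) -> bytes:          # Access_DO
        return self._objects[handle]["data"]

    def deposit_do(self, handle: str, meta: dict, data: bytes) -> None:  # Deposit_DO
        if not self.verify_handle(handle):
            raise KeyError(f"handle not registered: {handle}")
        self._objects[handle] = {"meta": dict(meta), "data": data}

    def delete_do(self, handle: str) -> None:           # Delete_DO
        del self._objects[handle]

    def mutate_meta(self, handle: str, **changes) -> None:  # MutateMeta
        self._objects[handle]["meta"].update(changes)

    def mutate_do(self, handle: str, data: bytes) -> None:  # Mutate_DO
        self._objects[handle]["data"] = data
```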

Handle system to access
DO

Example RAP Work Flow






The handle "loc.ndlp/1234" is sent to the handle
system. It resolves to data type "handle" (HDL),
value "loc/repos1". This is interpreted as
information that the digital object is stored in the
repository identified by the given handle.
The handle "loc/repos1" is sent to the handle
system. It resolves to information of type "RAP".
This is information that the repository implements
RAP. The corresponding data is a reference to a
CORBA Object Request Broker (ORB).
The command "Access_DO (loc.ndlp/1234)" is
now sent to the repository.
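
The work flow above is a two-step resolution followed by one command. A minimal sketch, assuming handle records can be modelled as a dictionary from handle to a (data type, value) pair; the function names, the send callback and the placeholder ORB reference are hypothetical.

```python
from typing import Callable, Dict, Tuple

Records = Dict[str, Tuple[str, str]]  # handle -> (data type, value)


def locate_repository(handle: str, records: Records) -> Tuple[str, str]:
    """Follow HDL records until a RAP record is found.

    Returns (repository handle, reference to the repository's ORB).
    """
    current = handle
    for _ in range(len(records) + 1):  # bound the walk to guard against cycles
        data_type, value = records[current]
        if data_type == "RAP":
            return current, value
        if data_type != "HDL":
            raise ValueError(f"unexpected data type: {data_type}")
        current = value                # value is the handle of the repository
    raise ValueError("handle records form a cycle")


def access_digital_object(handle: str, records: Records,
                          send: Callable[[str, str], bytes]) -> bytes:
    """Resolve the object's repository, then send it the Access_DO command."""
    _repo_handle, orb_reference = locate_repository(handle, records)
    return send(orb_reference, f"Access_DO ({handle})")
```

With records {"loc.ndlp/1234": ("HDL", "loc/repos1"), "loc/repos1": ("RAP", ...)}, the first lookup yields the repository handle and the second yields the RAP information used to reach the repository.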

Benefits of Using Handles


Since the digital object is identified by a
handle, if it is moved to another repository
the only change required is to alter the data
in the first of the handle records in the
figure. Since the repository is identified by a
handle, if the repository is moved to a
different computer or otherwise changed,
but its handle remains the same, altering the
single data item in the second handle record
in the figure is the only change needed, for
all the digital objects stored in the repository.
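
The benefit described above can be checked with two small tables. This is a sketch under stated assumptions: the handles loc.ndlp/1235, loc.ndlp/1236 and loc/repos2 and the host names are invented for illustration, and dictionaries stand in for the two kinds of handle record.

```python
from typing import Dict


def endpoint_for(handle: str, object_records: Dict[str, str],
                 repository_records: Dict[str, str]) -> str:
    """Resolve object handle -> repository handle -> network location."""
    return repository_records[object_records[handle]]


# Hypothetical records: three objects stored in one repository.
object_records = {f"loc.ndlp/{n}": "loc/repos1" for n in (1234, 1235, 1236)}
repository_records = {"loc/repos1": "host-a"}

# Moving the repository to another computer: one record changes,
# and every object stored in it now resolves to the new location.
repository_records["loc/repos1"] = "host-b"

# Moving one object to another repository: only that object's record changes.
repository_records["loc/repos2"] = "host-c"
object_records["loc.ndlp/1234"] = "loc/repos2"
```

In both moves no digital object is renamed; only the data held in a handle record is edited.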

User Interface

User Interface System

Client via CGI-BIN

DO sets as hierarchies

Hierarchies


Level 0:

contains the digitized image, sound, text, or other
data.

Level 1:

is a parent of digital objects of Level 0. Upon
encountering a digital object of this type, the digital
object browser extracts the content of all the
child Level 0 digital objects and displays them in an
indexed list to the user. This type has been used to
display indexes of thumbnail images.

Level 2:


is a parent of digital objects of Level 1.
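
The level scheme above suggests a simple dispatch in the digital object browser. A minimal sketch: the Node class and browse function are hypothetical, and the behaviour for Level 2 (listing the handles of the Level 1 children) is an assumption, since the slides only describe what the browser does for Level 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    handle: str
    level: int
    content: str = ""  # only Level 0 objects carry data
    children: List["Node"] = field(default_factory=list)


def browse(node: Node) -> List[str]:
    """Return the lines a browser might display for a digital object."""
    if node.level == 0:
        return [node.content]
    if node.level == 1:
        # Extract the content of every child Level 0 object as an indexed list.
        return [f"{i}. {child.content}"
                for i, child in enumerate(node.children, start=1)]
    # Assumption: higher levels list the handles of their children.
    return [f"{i}. {child.handle}"
            for i, child in enumerate(node.children, start=1)]
```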