ppt an architecture for information in digital libraries 05 2001
Alternative Architecture
for Information in Digital
Libraries
Onno W. Purbo
Onno@indo.net.id
Reference
http://www.dlib.org/dlib/february97/cnri/02arms1.html
William Y. Arms, Christophe
Blanchi, Edward A. Overly, “An
Architecture for Information in
Digital Libraries,” Corporation for
National Research Initiatives
Reston, Virginia, February 1997.
The Structure of
Information
Digital data: digital library.
Digital objects
Metadata
Unique identifier (handle).
Group of digital objects: set of
digital objects.
Different types of material:
categories.
Components of the Computer
System
Work Flow Example
Search
Select
Retrieval
Z39.50 – list of digital objects
identified by handle.
Repository Access Protocol (RAP)
Display
Information Architecture
Structure of information in digital libraries
Relationship (chapter, index)
Format (SGML, HTML)
Version
Rights & Permissions
Computer System & Network
(dialup vs. broadband).
Basic Principles
User & application programs must be
flexible.
Collections must be
straightforward to manage.
The information architecture must
reflect the economic, social & legal
framework.
Data type, structural
metadata
Data type – technical properties of
data, format & processing.
Structural metadata – type,
version, relationship of digital
material.
Meta-object – reference to a set of
digital objects.
Guidelines for all
categories
All data is given an explicit data type
All metadata is encoded explicitly
Handles are given to individual items
of intellectual property
Meta-objects are used to aggregate
digital objects
Handles are used to identify items
listed in meta-objects
An Example of the Use of
Meta-objects
Scanned photographs
Digital objects for a scanned
photograph
Digital objects for individual versions
Meta-object
Handles for scanned photographs
Depositing a scanned photograph
Digital objects for a
scanned photograph
Low resolution “thumbnail”
High resolution “reference” image
Digital objects for
individual versions
Key metadata.
Used to manage the object in a networked
environment. It includes the handle, and the rights
and permissions associated with the digital object.
Structural metadata.
Includes fields for description, owner, handle of
meta-object, data size, data type (e.g., "jpg"),
version number, date deposited, use
(e.g., "thumbnail"), and the date of last revision.
Image data.
This is the image data.
Meta-object
Key metadata.
Includes the handle, and the rights and permissions
associated with the digital object.
Structural metadata.
Includes a description, the owner, the number of
versions, the date deposited, the use ("meta-object"),
and the date of last revision.
Data about each version.
For each of the three scanned versions (e.g., the
thumbnail), there is a package of information
including the handle of the version, and the
relationship among the versions.
Handles for scanned
photographs
control identifier - 3a16116r.jpg
replace the control identifiers by handles, which
provide a unique, persistent, location independent
name for each item - loc.ndlp.amrlp/3a16116
Terminology to describe handles:
"loc.ndlp.amrlp" is the naming authority
"3a16116" is a locally unique string
For convenience in processing, use sequence
numbers
loc.ndlp.amrlp/3a16116.1
loc.ndlp.amrlp/3a16116.2
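The handle terminology above can be illustrated with a small parser. This is only a sketch: the `Handle` class and `parse_handle` function are hypothetical names, and it assumes that a trailing ".<digits>" on the local string is always a sequence number, which the slides do not state as a rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Handle:
    naming_authority: str            # e.g. "loc.ndlp.amrlp"
    local_name: str                  # locally unique string, e.g. "3a16116"
    sequence: Optional[int] = None   # e.g. 1 for "...3a16116.1"


def parse_handle(text: str) -> Handle:
    """Split 'authority/local[.n]' into its parts (illustrative format only)."""
    authority, sep, local = text.partition("/")
    if not sep or not authority or not local:
        raise ValueError(f"not a handle: {text!r}")
    # Assumption: a trailing '.<digits>' is the sequence number of a version.
    base, dot, tail = local.rpartition(".")
    if dot and base and tail.isdigit():
        return Handle(authority, base, int(tail))
    return Handle(authority, local)


meta = parse_handle("loc.ndlp.amrlp/3a16116")
version = parse_handle("loc.ndlp.amrlp/3a16116.2")
print(meta.naming_authority, meta.local_name, version.sequence)
```

Keeping the naming authority and the local string as separate fields reflects the split described above; the sequence number is optional because the meta-object handle carries none.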
Meta-object identifies 2
images
Depositing a scanned
photograph
Human
machine
Depositing a scanned
photograph - human
Selection of the material that will
be made into each digital object.
Specification of the metadata for
those fields that require judgment.
Depositing a scanned
photograph - machine
Creation of the meta-object and
the links to other digital objects.
Depositing the digital objects in
the repository.
Registering the handles in the
handle system.
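The three machine steps above might be sketched as follows. Everything here is an assumption made for illustration: plain dictionaries stand in for the repository and the handle system, `deposit_photograph` is a hypothetical helper, and the field names are invented rather than taken from the reference paper.

```python
from typing import Dict, List


def deposit_photograph(base_handle: str,
                       versions: Dict[str, bytes],
                       repository: Dict[str, dict],
                       handle_system: Dict[str, str],
                       repository_handle: str) -> dict:
    """Deposit each version, build the meta-object, register all handles."""
    version_entries: List[dict] = []
    for seq, (use, data) in enumerate(versions.items(), 1):
        handle = f"{base_handle}.{seq}"
        # Deposit each version as its own digital object.
        repository[handle] = {"use": use, "meta_object": base_handle, "data": data}
        version_entries.append({"handle": handle, "use": use})
    # Create the meta-object that links the versions together.
    meta_object = {"use": "meta-object", "versions": version_entries}
    repository[base_handle] = meta_object
    # Register every handle so that it points at the repository.
    for handle in [base_handle] + [v["handle"] for v in version_entries]:
        handle_system[handle] = repository_handle
    return meta_object


repo: Dict[str, dict] = {}
handles: Dict[str, str] = {}
deposit_photograph("loc.ndlp.amrlp/3a16116",
                   {"thumbnail": b"small", "access": b"large"},
                   repo, handles, "loc/repos1")
print(sorted(handles))
```

The human steps (selecting material, supplying judgment-based metadata) are deliberately left outside the function; only its inputs reflect them.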
Access to a scanned
photograph
Bibliographic entries in search systems
refer to the scanned photograph by the
handle of the meta-object.
If a user requests a summary of the
photograph, the "thumbnail" image is
provided.
If the user requests access to the
photograph without specifying which
version, the "access" image is provided.
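The access rules above amount to a lookup in the meta-object by "use". A minimal sketch, assuming hypothetical `MetaObject` and `VersionEntry` classes and assuming that a summary request maps to the use "thumbnail"; which sequence number belongs to which version is invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VersionEntry:
    handle: str   # handle of the digital object holding this version
    use: str      # e.g. "thumbnail" or "access"


@dataclass
class MetaObject:
    handle: str
    versions: List[VersionEntry]

    def select(self, requested_use: Optional[str] = None) -> str:
        """Return the handle of the version to deliver."""
        # No version specified: fall back to the "access" image.
        wanted = requested_use or "access"
        for entry in self.versions:
            if entry.use == wanted:
                return entry.handle
        raise LookupError(f"no version with use {wanted!r}")


photo = MetaObject(
    "loc.ndlp.amrlp/3a16116",
    [VersionEntry("loc.ndlp.amrlp/3a16116.1", "thumbnail"),
     VersionEntry("loc.ndlp.amrlp/3a16116.2", "access")],
)
print(photo.select("thumbnail"))  # summary request
print(photo.select())             # no version specified
```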
Technical Information
Digital Object
Digital Object
Key-metadata
The key-metadata is the information stored in
the digital object that is needed to manage the
digital object in a networked environment -- for
example to store, replicate, or transmit the
object without providing access to the content.
This includes terms and conditions, and the
handle.
Digital material
The digital material (or data) comprises a set of
sequences of bits.
Digital Objects Internal
Structure
An element is a bit sequence
comprising an elementary unit of
information. An element has its own
ID.
A package is a collection of elements
and other packages, with its own ID.
A digital object is a package with
key-metadata for use in a networked
environment. The ID is a handle.
Data Element
Data Element
Data element
A data element is any bit-sequence.
Element ID
The element ID is the internal identifier of the element
within the digital object. Unlike a handle, which is
unique and known publicly, the element ID is of local
importance only.
Attributes
Attributes are the information that is needed to
process the element. They include: a role, which
defines the function of the element (such as "DTD" in
the SGML world), and a type, which includes technical
information (such as "jpeg").
A Package
Packages
Packages are used to group or associate
elements and other packages.
A package has a package ID.
If the package is a digital object, the package
ID is a handle. Otherwise, it is the internal
identifier of the package within the digital
object. Unlike a handle, which is unique and
known publicly, such a package ID is of local
importance only. The content of a package
consists of elements and other packages.
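The element, package and digital object definitions above map naturally onto nested data structures. The following is a sketch only: the class and field names are hypothetical, and key-metadata is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Element:
    element_id: str     # local to the enclosing digital object
    role: str           # function of the element, e.g. "DTD"
    data_type: str      # technical information, e.g. "jpeg"
    data: bytes = b""   # any bit sequence


@dataclass
class Package:
    package_id: str
    content: List[Union["Element", "Package"]] = field(default_factory=list)

    def elements(self) -> Iterator[Element]:
        """Yield every element, descending into nested packages."""
        for item in self.content:
            if isinstance(item, Package):
                yield from item.elements()
            else:
                yield item


@dataclass
class DigitalObject(Package):
    # For a digital object the package ID is a handle.
    key_metadata: dict = field(default_factory=dict)


obj = DigitalObject(
    "loc.ndlp.amrlp/3a16116.1",
    [Package("p1", [Element("e1", "image", "jpeg", b"\xff\xd8")]),
     Element("e2", "caption", "text")],
    {"terms": "public"},
)
print([e.element_id for e in obj.elements()])
```

Making `DigitalObject` a subclass of `Package` mirrors the statement that a digital object is a package with key-metadata; only its ID is publicly meaningful.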
Handle & Handle System
Handle & Handle System
The digital library is assembled from a great
variety of components. They include
people, computers, networks, repositories,
databases, search systems, Web servers,
digital objects, elements of objects,
bibliographic records, and many more.
Keeping track of these components requires
a systematic approach to identification.
http://www.handle.net
Typical handle record
Handle record for web
Handle System
To resolve a handle is to present a handle to
the handle system and receive as a reply
information about the item identified.
The handle system is a distributed
computer system, with many computers
distributed across the world. CNRI manages
a global handle registry and there are local
handle services operated by other
organizations, e.g. http://www.handle.net/
Naming Authority
Handles are created by naming
authorities, administrative units
that are authorized to create and
edit handles.
The Repository
Structure of a Repository
A repository is a system for network-based
storage of and access to digital objects.
All interaction with the repository uses a simple
protocol, known as the Repository Access
Protocol (RAP). RAP has a small number of
fundamental operations, such as "deposit",
which stores a digital object in the repository,
and "access", which provides access to a digital
object.
Thus RAP provides a clearly defined, open
interface for the repository that allows others to
write clients and higher level interfaces.
Structure of Repository
Structure of Repository
Repository shell
The repository shell is the part of the repository that
interfaces with the outside world. It implements the
RAP protocol.
Persistent store
Information in the repository is held in the persistent
store. The persistent store is completely hidden from
the outside.
Object management layer
The object management layer provides an interface
between the services provided by the persistent store
and the object oriented functions required by the
repository shell.
The Repository Access
Protocol (RAP)
VerifyHandle. Confirm that a handle has been
registered in the handle system.
AccessRepoMeta. Access the repository metadata.
Verify_DO. Confirm that a repository stores a digital
object with a specified handle.
AccessMeta. Access the metadata for a specified
digital object.
Access_DO. Access the digital object.
Deposit_DO. Deposit a digital object in a repository.
Delete_DO. Delete a digital object from a repository.
MutateMeta. Edit the metadata for a digital object.
Mutate_DO. Edit a digital object.
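To make the operation list concrete, here is a toy in-memory repository offering a subset of the operations. It is an assumption-laden sketch, not the RAP specification: the Python method names, their arguments and return values are invented, and a dictionary stands in for the persistent store.

```python
from typing import Dict, Tuple


class InMemoryRepository:
    """Toy repository shell; a dict stands in for the persistent store."""

    def __init__(self) -> None:
        # handle -> (metadata, data); never exposed directly to callers
        self._store: Dict[str, Tuple[dict, bytes]] = {}

    def deposit_do(self, handle: str, metadata: dict, data: bytes) -> None:
        if handle in self._store:
            raise KeyError(f"already deposited: {handle}")
        self._store[handle] = (dict(metadata), data)

    def verify_do(self, handle: str) -> bool:
        return handle in self._store

    def access_meta(self, handle: str) -> dict:
        # Return a copy so callers cannot edit the stored metadata.
        return dict(self._store[handle][0])

    def access_do(self, handle: str) -> bytes:
        return self._store[handle][1]

    def mutate_meta(self, handle: str, **changes) -> None:
        metadata, data = self._store[handle]
        self._store[handle] = ({**metadata, **changes}, data)

    def delete_do(self, handle: str) -> None:
        del self._store[handle]


repo = InMemoryRepository()
repo.deposit_do("loc.ndlp/1234", {"use": "thumbnail"}, b"image bytes")
repo.mutate_meta("loc.ndlp/1234", version=2)
print(repo.verify_do("loc.ndlp/1234"), repo.access_meta("loc.ndlp/1234"))
```

Clients only ever call the methods, which is the point of a small, open protocol: the storage behind it can change without affecting them.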
Handle system to access
DO
Example RAP Work Flow
The handle "loc.ndlp/1234" is sent to the handle
system. It resolves to data type "handle" (HDL),
value "loc/repos1". This is interpreted as
information that the digital object is stored in the
repository identified by the given handle.
The handle "loc/repos1" is sent to the handle
system. It resolves to information of type "RAP".
This is information that the repository implements
RAP. The corresponding data is a reference to a
CORBA Object Request Broker (ORB).
The command "Access_DO (loc.ndlp/1234)" is
now sent to the repository.
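The two resolution steps above can be traced in a few lines. This is a sketch under stated assumptions: `HANDLE_RECORDS`, `resolve` and `locate` are hypothetical names, the handle system is reduced to a dictionary, and the ORB reference is a placeholder string rather than a real CORBA reference.

```python
from typing import Dict, List, Tuple

# Handle records: handle -> list of (data type, value) pairs.
# The contents mirror the example; the ORB reference is a placeholder.
HANDLE_RECORDS: Dict[str, List[Tuple[str, str]]] = {
    "loc.ndlp/1234": [("HDL", "loc/repos1")],
    "loc/repos1": [("RAP", "<reference to CORBA ORB>")],
}


def resolve(handle: str, data_type: str) -> str:
    """Return the first value of the given type in the handle record."""
    for kind, value in HANDLE_RECORDS[handle]:
        if kind == data_type:
            return value
    raise LookupError(f"{handle} has no {data_type} entry")


def locate(handle: str) -> Tuple[str, str]:
    """Follow object handle -> repository handle -> RAP endpoint."""
    repository = resolve(handle, "HDL")
    endpoint = resolve(repository, "RAP")
    return repository, endpoint


repository, endpoint = locate("loc.ndlp/1234")
print(f"Access_DO (loc.ndlp/1234) goes to {repository} via {endpoint}")
```

Because the object record stores only the repository's handle, the endpoint details live in exactly one place, the repository's own record.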
Benefit Using Handle
Since the digital object is identified by a
handle, if it is moved to another repository
the only change required is to alter the data
in the first of the handle records in the
figure. Since the repository is identified by a
handle, if the repository is moved to a
different computer or otherwise changed,
but its handle remains the same, altering the
single data item in the second handle record
in the figure is the only change needed, for
all the digital objects stored in the repository.
User Interface
User Interface System
Client via CGI-BIN
DO sets as hierarchies
Hierarchies
Level 0:
contains the digitized image, sound, text, or other
data.
Level 1:
is a parent of digital objects of Level 0. Upon
encountering a digital object of this type, the digital
object browser extracts the content of all the
child Level 0 digital objects and displays them in an
indexed list to the user. This type has been used to
display indexes of thumbnail images.
Level 2:
is a parent of digital objects of Level 1.
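The browsing behaviour for the hierarchy levels might look like the sketch below. The `Node` class and `browse` function are hypothetical; the Level 0 and Level 1 branches follow the description above, while showing child handles for Level 2 is an assumption, since the slides do not say how that level is displayed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    handle: str
    level: int          # 0 = content, 1 = parent of level 0, and so on
    content: str = ""   # only meaningful at level 0
    children: List["Node"] = field(default_factory=list)


def browse(node: Node) -> List[str]:
    """Return the lines a browser might show for this object."""
    if node.level == 0:
        return [node.content]
    if node.level == 1:
        # Indexed list of the content of all child level 0 objects.
        return [f"{i}. {child.content}"
                for i, child in enumerate(node.children, 1)]
    # Assumption: higher levels simply list the handles of their children.
    return [child.handle for child in node.children]


album = Node("x/album", 1, children=[
    Node("x/1", 0, "thumbnail of photograph A"),
    Node("x/2", 0, "thumbnail of photograph B"),
])
collection = Node("x/collection", 2, children=[album])
print(browse(album))
print(browse(collection))
```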