Slides for Chapter 8: Distributed File Systems From
Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, Edition 4, © Pearson Education 2005

Storage systems and their properties
Properties compared: sharing, persistence, distributed cache/replicas, consistency maintenance.

Storage system                Example                            Consistency maintenance
Main memory                   RAM                                1
File system                   UNIX file system                   1
Distributed file system       Sun NFS
Web                           Web server
Distributed shared memory     Ivy (DSM, Ch. 18)
Remote objects (RMI/ORB)      CORBA                              1
Persistent object store       CORBA Persistent Object Service    1
Peer-to-peer storage system   OceanStore (Ch. 10)                2

Types of consistency: 1: strict one-copy. ✓: slightly weaker guarantees. 2: considerably weaker guarantees.

File system modules
Directory module: relates file names to file IDs
File module: relates file IDs to particular files
Access control module: checks permission for operation requested
File access module: reads or writes file data or attributes
Block module: accesses and allocates disk blocks
Device module: disk I/O and buffering

File attribute record structure
File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Owner
File type
Access control list

UNIX file system operations
filedes = open(name, mode)
Opens an existing file with the given name.
filedes = creat(name, mode)
Creates a new file with the given name.
Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)
Closes the open file filedes.
count = read(filedes, buffer, n)
Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)
Transfers n bytes to the file referenced by filedes from buffer.
Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)
Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)
Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)
Adds a new name (name2) for a file (name1).
status = stat(name, buffer)
Gets the file attributes for file name into buffer.
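
The UNIX operations above can be tried out through Python's os module, which wraps the same system calls. This is only an illustrative sketch: the file names notes.txt and alias.txt and the function demo_unix_file_ops are made up for the example, and it assumes a file system that supports hard links.

```python
import os
import tempfile

def demo_unix_file_ops(directory):
    """Exercise creat/open, write, lseek, read, close, link, stat and unlink."""
    path = os.path.join(directory, "notes.txt")    # hypothetical file name
    alias = os.path.join(directory, "alias.txt")   # hypothetical second name

    # O_CREAT | O_TRUNC gives creat-like behaviour; the file is opened read-write.
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
    written = os.write(fd, b"hello, world")   # advances the read-write pointer
    pos = os.lseek(fd, 7, os.SEEK_SET)        # absolute move to offset 7
    data = os.read(fd, 5)                     # reads 5 bytes from offset 7
    os.close(fd)

    os.link(path, alias)                      # add a second name for the file
    links = os.stat(path).st_nlink            # reference count from the attributes
    os.unlink(path)                           # removes one name only
    size_via_alias = os.stat(alias).st_size   # file still reachable by other name
    os.unlink(alias)                          # last name removed: file is deleted
    return written, pos, data, links, size_via_alias

with tempfile.TemporaryDirectory() as tmp:
    print(demo_unix_file_ops(tmp))
```

Note how unlink only removes a name: the file's data stays reachable through alias.txt until that name is removed as well.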
File service architecture
[Figure: file service architecture. Client computer: application programs and the client module. Server computer: directory service and flat file service.]

Flat file service operations
Read(FileId, i, n) -> Data, throws BadPosition
If 1 ≤ i ≤ Length(File): Reads a sequence of up to n items from a file starting at item i and returns it in Data.
Write(FileId, i, Data), throws BadPosition
If 1 ≤ i ≤ Length(File)+1: Writes a sequence of Data to a file, starting at item i, extending the file if necessary.
Create() -> FileId
Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId)
Removes the file from the file store.
GetAttributes(FileId) -> Attr
Returns the file attributes for the file.
SetAttributes(FileId, Attr)
Sets the file attributes (only those attributes that are not shaded in Figure 8.3).

Directory service operations
Lookup(Dir, Name) -> FileId, throws NotFound
Locates the text name in the directory and returns the relevant UFID. If Name is not in the directory, throws an exception.
AddName(Dir, Name, FileId), throws NameDuplicate
If Name is not in the directory, adds (Name, File) to the directory and updates the file's attribute record.
If Name is already in the directory: throws an exception.
UnName(Dir, Name), throws NotFound
If Name is in the directory: the entry containing Name is removed from the directory.
If Name is not in the directory: throws an exception.
GetNames(Dir, Pattern) -> NameSeq
Returns all the text names in the directory that match the regular expression Pattern.
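
A minimal in-memory sketch of the two interfaces may help make the position rules concrete. It is not the book's implementation: items are modelled as bytes, UFIDs are plain counters, attributes are omitted (so AddName does not update an attribute record), and the class and method names are chosen for this example only.

```python
import itertools
import re

class BadPosition(Exception): pass
class NotFound(Exception): pass
class NameDuplicate(Exception): pass

class FlatFileService:
    """Toy flat file service; positions are 1-based, as in the operations above."""
    def __init__(self):
        self._files = {}
        self._next_id = itertools.count(1)   # stand-in for real UFID generation

    def create(self):
        ufid = next(self._next_id)
        self._files[ufid] = bytearray()      # new file of length 0
        return ufid

    def read(self, file_id, i, n):
        data = self._files[file_id]
        if not 1 <= i <= len(data):
            raise BadPosition(i)
        return bytes(data[i - 1:i - 1 + n])  # up to n items starting at item i

    def write(self, file_id, i, new_data):
        data = self._files[file_id]
        if not 1 <= i <= len(data) + 1:
            raise BadPosition(i)
        # Overwrites existing items and extends the file if necessary.
        data[i - 1:i - 1 + len(new_data)] = new_data

    def delete(self, file_id):
        del self._files[file_id]

class DirectoryService:
    """Toy directory service mapping (Dir, Name) to a FileId."""
    def __init__(self):
        self._dirs = {}

    def add_name(self, dir_id, name, file_id):
        entries = self._dirs.setdefault(dir_id, {})
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = file_id

    def lookup(self, dir_id, name):
        try:
            return self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None

    def un_name(self, dir_id, name):
        if name not in self._dirs.get(dir_id, {}):
            raise NotFound(name)
        del self._dirs[dir_id][name]

    def get_names(self, dir_id, pattern):
        entries = self._dirs.get(dir_id, {})
        return sorted(n for n in entries if re.fullmatch(pattern, n))

files, dirs = FlatFileService(), DirectoryService()
fid = files.create()
files.write(fid, 1, b"hello")
files.write(fid, 6, b" world")               # i = Length + 1 appends
dirs.add_name("root", "greeting.txt", fid)   # "root" is a made-up directory id
print(files.read(dirs.lookup("root", "greeting.txt"), 7, 5))
```

Because Read and Write name the position explicitly, every request carries all the information the server needs, and the server keeps no read-write pointer on behalf of clients.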
NFS architecture
[Figure: NFS architecture. Client computer: application programs make UNIX system calls to the UNIX kernel, whose virtual file system directs local requests to the UNIX file system (or another file system) and remote requests to the NFS client. Server computer: the NFS server reaches the UNIX file system through the virtual file system in the UNIX kernel. The NFS client and NFS server communicate using the NFS protocol.]

NFS server operations (simplified) – 1
lookup(dirfh, name) -> fh, attr
Returns file handle and attributes for the file name in the directory dirfh.
create(dirfh, name, attr) -> newfh, attr
Creates a new file name in directory dirfh with attributes attr and returns the new file handle and attributes.
remove(dirfh, name) -> status
Removes file name from directory dirfh.
getattr(fh) -> attr
Returns file attributes of file fh. (Similar to the UNIX stat system call.)
setattr(fh, attr) -> attr
Sets the attributes (mode, user id, group id, size, access time and modify time of a file). Setting the size to 0 truncates the file.
read(fh, offset, count) -> attr, data
Returns up to count bytes of data from a file starting at offset. Also returns the latest attributes of the file.
write(fh, offset, count, data) -> attr
Writes count bytes of data to a file starting at offset. Returns the attributes of the file after the write has taken place.
rename(dirfh, name, todirfh, toname) -> status
Changes the name of file name in directory dirfh to toname in directory todirfh.
link(newdirfh, newname, dirfh, name) -> status
Creates an entry newname in the directory newdirfh which refers to file name in the directory dirfh.

NFS server operations (simplified) – 2
symlink(newdirfh, newname, string) -> status
Creates an entry newname in the directory newdirfh of type symbolic link with the value string. The server does not interpret the string but makes a symbolic link file to hold it.
readlink(fh) -> string
Returns the string that is associated with the symbolic link file identified by fh.
mkdir(dirfh, name, attr) -> newfh, attr
Creates a new directory name with attributes attr and returns the new file handle and attributes.
rmdir(dirfh, name) -> status
Removes the empty directory name from the parent directory dirfh. Fails if the directory is not empty.
readdir(dirfh, cookie, count) -> entries
Returns up to count bytes of directory entries from the directory dirfh. Each entry contains a file name, a file handle, and an opaque pointer to the next directory entry, called a cookie. The cookie is used in subsequent readdir calls to start reading from the following entry. If the value of cookie is 0, reads from the first entry in the directory.
statfs(fh) -> fsstats
Returns file system information (such as block size, number of free blocks and so on) for the file system containing a file fh.
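
The cookie mechanism of readdir can be sketched as a paging loop. This is a toy model under stated assumptions, not the NFS wire protocol: count limits the number of entries rather than bytes, the cookie is simply a list index, and ToyNfsServer, DirEntry and list_directory are hypothetical names.

```python
from typing import NamedTuple

class DirEntry(NamedTuple):
    name: str
    fh: int
    cookie: int   # opaque to the client: where the next readdir should resume

class ToyNfsServer:
    def __init__(self, directories):
        # directories maps dirfh -> list of (name, fh) pairs; illustrative state only.
        self._directories = directories

    def readdir(self, dirfh, cookie, count):
        """Return up to count entries starting at cookie (0 means the first entry)."""
        listing = self._directories[dirfh]
        chunk = listing[cookie:cookie + count]
        return [DirEntry(name, fh, cookie + k + 1)
                for k, (name, fh) in enumerate(chunk)]

def list_directory(server, dirfh, count=2):
    """Client side: call readdir repeatedly, resuming from the last cookie."""
    names, cookie = [], 0
    while True:
        entries = server.readdir(dirfh, cookie, count)
        if not entries:
            return names
        names.extend(entry.name for entry in entries)
        cookie = entries[-1].cookie

server = ToyNfsServer({1: [("a", 10), ("b", 11), ("c", 12), ("d", 13), ("e", 14)]})
print(list_directory(server, dirfh=1))
```

The server keeps no per-client position: everything needed to resume is carried in the cookie that the client sends back.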
Local and remote file systems accessible on an NFS client
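[Figure: directory trees of the client, Server 1 and Server 2. Client: (root) contains vmunix and usr; usr contains students, x and staff. Server 1: (root) contains export; export contains people, which holds big, jon and bob. Server 2: (root) contains nfs; nfs contains users, which holds jim, ann, jane and joe. The client's students and staff directories are remote mounts.]
Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.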
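
The effect of the two remote mounts in the note can be illustrated with a small mount table. This is a simplified sketch: a real NFS client resolves path names one component at a time through the virtual file system, whereas the hypothetical resolve function below just rewrites the path prefix.

```python
# Mount points taken from the note above; the table layout itself is an assumption.
MOUNT_TABLE = {
    "/usr/students": ("Server 1", "/export/people"),
    "/usr/staff": ("Server 2", "/nfs/users"),
}

def resolve(path, mounts=MOUNT_TABLE):
    """Map a client path to (server, remote path); local paths give server None."""
    best = None
    for mount_point in mounts:
        if path == mount_point or path.startswith(mount_point + "/"):
            # Prefer the longest matching mount point.
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None, path
    server, exported = mounts[best]
    return server, exported + path[len(best):]

print(resolve("/usr/students/jon"))
print(resolve("/usr/staff/ann"))
print(resolve("/vmunix"))
```

Application programs on the client only ever see the left-hand paths; which server holds the data is hidden behind the mount point.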
Distribution of processes in the Andrew File System
[Figure: workstations and servers connected by a network. Each workstation runs user programs and a Venus process on top of a UNIX kernel; each server runs a Vice process on top of a UNIX kernel.]
File name space seen by clients of AFS
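[Figure: file name space seen by AFS clients. Local: / (root) with tmp, bin, ..., vmunix and cmu. Shared: the sub-tree under cmu, which includes bin. Symbolic links connect local names to the shared space.]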
System call interception in AFS
[Figure: system call interception in AFS. On the workstation, user programs issue UNIX file system calls to the UNIX kernel; the UNIX file system serves local files from the local disk, and non-local file operations are passed to Venus.]

Implementation of file system calls in AFS
Each call is made by the user process; the steps are carried out by the UNIX kernel, Venus and (across the network) Vice.

open(FileName, mode)
UNIX kernel: If FileName refers to a file in shared file space, pass the request to Venus.
Venus: Check list of files in local cache. If not present or there is no valid callback promise, send a request for the file to the Vice server that is custodian of the volume containing the file.
Vice: Transfer a copy of the file and a callback promise to the workstation. Log the callback promise.
Venus: Place the copy of the file in the local file system, enter its local name in the local cache list and return the local name to UNIX.
UNIX kernel: Open the local file and return the file descriptor to the application.

read(FileDescriptor, Buffer, length)
UNIX kernel: Perform a normal UNIX read operation on the local copy.

write(FileDescriptor, Buffer, length)
UNIX kernel: Perform a normal UNIX write operation on the local copy.

close(FileDescriptor)
UNIX kernel: Close the local copy and notify Venus that the file has been closed.
Venus: If the local copy has been changed, send a copy to the Vice server that is the custodian of ...
Vice: Replace the file contents and send a ...
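
The open and close behaviour described above can be sketched as a whole-file cache guarded by callback promises. This is a rough model under stated assumptions: ToyVice and ToyVenus are hypothetical classes, files live in dictionaries rather than on disk, and the server side of close is limited to replacing the file contents.

```python
class ToyVice:
    """Stand-in for a Vice server: holds file contents and logs callback promises."""
    def __init__(self, files):
        self.files = dict(files)
        self.promises = set()     # (client_id, name) pairs logged by the server
        self.fetch_count = 0      # counts whole-file transfers, for illustration

    def fetch(self, client_id, name):
        self.fetch_count += 1
        self.promises.add((client_id, name))   # log the callback promise
        return self.files[name]

    def store(self, client_id, name, data):
        self.files[name] = data                # replace the file contents

class ToyVenus:
    """Stand-in for the Venus cache manager on one workstation."""
    def __init__(self, client_id, vice):
        self.client_id, self.vice = client_id, vice
        self.cache = {}             # name -> local copy of the whole file
        self.valid_promise = set()  # names whose callback promise is still valid
        self.dirty = set()          # names whose local copy has been changed

    def open(self, name):
        # Fetch only if the file is not cached or its promise is no longer valid.
        if name not in self.cache or name not in self.valid_promise:
            self.cache[name] = self.vice.fetch(self.client_id, name)
            self.valid_promise.add(name)
        return name                 # stands in for the local file name

    def read(self, name):
        return self.cache[name]     # served entirely from the local copy

    def write(self, name, data):
        self.cache[name] = data
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:      # changed copies go back to the custodian
            self.vice.store(self.client_id, name, self.cache[name])
            self.dirty.discard(name)

    def break_callback(self, name):
        self.valid_promise.discard(name)   # next open must fetch again

vice = ToyVice({"/cmu/notes": b"v1"})
venus = ToyVenus("ws1", vice)
venus.open("/cmu/notes")
venus.open("/cmu/notes")            # second open is satisfied from the cache
print(vice.fetch_count)
```

Reads and writes never contact the server in this model; only open (on a cache miss or cancelled promise) and close (after a change) cross the network.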
The main components of the Vice service interface
Fetch(fid) -> attr, data
Returns the attributes (status) and, optionally, the contents of file identified by the fid and records a callback promise on it.
Store(fid, attr, data)
Updates the attributes and (optionally) the contents of a specified file.
Create() -> fid
Creates a new file and records a callback promise on it.
Remove(fid)
Deletes the specified file.
SetLock(fid, mode)
Sets a lock on the specified file or directory. The mode of the lock may be shared or exclusive. Locks that are not removed expire after 30 minutes.
ReleaseLock(fid)
Unlocks the specified file or directory.
RemoveCallback(fid)
Informs server that a Venus process has flushed a file from its cache.
BreakCallback(fid)
This call is made by a Vice server to a Venus process. It cancels the callback promise on the relevant file.
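
The SetLock and ReleaseLock semantics can be illustrated with a small lock table. The 30-minute expiry comes from the description above; everything else is an assumption made for this sketch: that several shared locks may coexist while an exclusive lock excludes all others, that callers are identified by a holder argument (which the interface above does not show), and the LockTable class itself.

```python
import time

LOCK_TTL_SECONDS = 30 * 60   # locks that are not removed expire after 30 minutes

class LockTable:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}      # fid -> (mode, {holder: expiry time})

    def _purge(self, fid):
        """Drop expired holders of fid, and the entry itself if none remain."""
        entry = self._locks.get(fid)
        if entry is None:
            return
        mode, holders = entry
        now = self._clock()
        live = {h: t for h, t in holders.items() if t > now}
        if live:
            self._locks[fid] = (mode, live)
        else:
            del self._locks[fid]

    def set_lock(self, fid, mode, holder):
        """Return True if the lock is granted, False if it conflicts."""
        if mode not in ("shared", "exclusive"):
            raise ValueError(mode)
        self._purge(fid)
        expiry = self._clock() + LOCK_TTL_SECONDS
        entry = self._locks.get(fid)
        if entry is None:
            self._locks[fid] = (mode, {holder: expiry})
            return True
        held_mode, holders = entry
        if mode == "shared" and held_mode == "shared":
            holders[holder] = expiry   # assumed: shared locks can coexist
            return True
        return False

    def release_lock(self, fid, holder):
        entry = self._locks.get(fid)
        if entry is not None:
            entry[1].pop(holder, None)
            if not entry[1]:
                del self._locks[fid]

now = [0.0]                                   # fake clock for the demonstration
table = LockTable(clock=lambda: now[0])
print(table.set_lock("f1", "shared", "a"))
print(table.set_lock("f1", "exclusive", "b"))
now[0] = LOCK_TTL_SECONDS + 1                 # the shared lock has expired
print(table.set_lock("f1", "exclusive", "b"))
```

The expiry means that a workstation that crashes while holding a lock cannot block other clients indefinitely.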