Slides for Chapter 8: Distributed File Systems - Chapter 8 slides

  Slides for Chapter 8: Distributed File Systems From

  Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Pearson Education 2005 Storage systems and their properties Sharing Persis- Distributed Consistency Example tence cache/replicas maintenance

  Main memory RAM

  1 File system

  UNIX fle system

  1 Distributed fle system

  Sun NFS Web server

  Web Distributed shared memory

  Ivy (DSM, Ch. 18) Remote objects (RMI/ORB)

  CORBA

  1 Persistent object store

  CORBA Persistent

  1 Object Service

  2 Peer-to-peer storage system Types of consistency:

  OceanStore (Ch. 10) 1: strict one-copy. 3: slightly weaker guarantees. 2: considerably weaker guarantees.

File system modules

  Directory module: relates file names to file IDs File module: relates file IDs to particular files Access control module: checks permission for operation requested

File access module: reads or writes file data or attributes

Block module: accesses and allocates disk blocks Device module: disk I/O and buffering

File attribute record structure

  File length Creation timestamp Read timestamp Write timestamp

  Attribute timestamp Reference count Owner File type

  Access control list UNIX file system operations filedes e= eopennname emoded Opens an existing fle with the given name. filedes e= ecreatnname emoded Creates a new fle with the given name.

  Both operations deliver a fle descriptor referencing the open fle. The mode is read, write or both. status e= eclosenfiledesd

Closes the open fle filedes.

count e= ereadnfiledes ebufer end Transfers n bytes from the fle referenced by filedes to bufer.

  Transfers n bytes to the fle referenced by filedes from bufer. count e= ewritenfiledes ebufer end

  Both operations deliver the number of bytes actually transferred and advance the read-write pointer. pos e= elseeknfiledes eofset e e

  Moves the read-write pointer to ofset (relative or absolute, whenced e depending on whence). status e= eunlinknnamed

Removes the fle name from the directory structure. If the fle has no other names, it is deleted

  status e= elinknname1 ename2d Adds a new name (name2) for a fle (name1). status e= estatnname ebuferd Gets the fle attributes for fle name into bufer.

File service architecture

  Client computer Server computer

  Directory service Application Application program program

  Flat file service Client module Flat file service operations ReadnFileId ei end e-> eData e e — ethrows eBadPosition

  If 1 e≤ ei e≤ eLengthnFiled: Reads a sequence of up to n items from a fle starting at item i and returns it in Data.

  WritenFileId ei eDatad e e — ethrows eBadPosition

If 1 e≤ ei e≤ eLengthnFiled+1: Writes a sequence of Data to a fle, starting at item i, extending the fle if necessary

  

Creatend e-> eFileId Creates a new fle of length 0 and delivers a UFID for it. DeletenFileIdd e Removes the fle from the fle store. GetAttributesnFileIdd e-> eAttr e Returns the fle attributes for the fle.

SetAttributesnFileId eAttrd e Sets the fle attributes (only those attributes that are not shaded in Figure 8.3). Directory service operations LookupnDir eNamed e-> eFileId

  Locates the text name in the directory and returns the — ethrows eNotFound e e relevant UFID. If Name is not in the directory, throws an exception.

  AddNamenDir eName eFileIdd e e If Name is not in the directory, adds (Name, File) to the — e throws eNameDuplicate e e directory and updates the fle’s attribute record.

  If Name is already in the directory: throws an exception.

  UnNamenDir eNamed e e If Name is in the directory: the entry containing Name is — e throws eNotFound e e removed from the directory.

  If Name is not in the directory: throws an exception.

  GetNamesnDir ePatternd e-> eNameSeq e Returns all the text names in the directory that match the regular expression Pattern.

NFS architecture

  Client computer Server computer

  Application Application program program UNIX system calls

  UNIX kernel UNIX kernel

  Virtual file system Virtual file system

  Local Remote

  m te

  UNIX UNIX NFS

  NFS

  ys

  file file

  er s

  client server

  th

  system system

  ile O f

  NFS protocol NFS server operations (simplified) – 1 lookupndirfh enamed e-> efh eattr

  Returns fle handle and attributes for the fle name in the directory dirfh.

  createndirfh ename eattrd e-> e

  Creates a new fle name in directory dirfh with attributes attr and

  newfh eattr returns the new fle handle and attributes. removendirfh enamed e estatus Removes fle name from directory dirfh. getattrnfhd e-> eattr

  Returns fle attributes of fle fh. (Similar to the UNIX stat system call.)

  setattrnfh eattrd e-> eattr

  Sets the attributes (mode, user id, group id, size, access time and modify time of a fle). Setting the size to 0 truncates the fle.

  readnfh eofset ecountd e-> eattr edata Returns up to count bytes of data from a fle starting at ofset.

  Also returns the latest attributes of the fle.

  writenfh eofset ecount edatad e-> eattr

  Writes count bytes of data to a fle starting at ofset. Returns the attributes of the fle after the write has taken place.

  renamendirfh ename etodirfh etonamed

  Changes the name of fle name in directory dirfh to toname in

  • -> estatus directory to todirfh .

  linknnewdirfh enewname edirfh enamed e e

  Creates an entry newname in the directory newdirfh which refers to

  • -> estatus fle name in the directory dirfh.

  Continues on next slide ... NFS server operations (simplified) – 2 symlinknnewdirfh enewname estringd

  Creates an entry newname in the directory newdirfh of type

  • -> estatus

  symbolic link with the value string. The server does not interpret the string but makes a symbolic link fle to hold it.

  readlinknfhd e-> estring

  Returns the string that is associated with the symbolic link fle identifed by fh.

  mkdirndirfh ename eattrd e-> e

  Creates a new directory name with attributes attr and returns the

  newfh eattr new fle handle and attributes. rmdirndirfh enamed e-> estatus Removes the empty directory name from the parent directory dirfh.

  Fails if the directory is not empty.

  readdirndirfh ecookie ecountd e-> e

  Returns up to count bytes of directory entries from the directory

  entries dirfh. Each entry contains a fle name, a fle handle, and an opaque

  pointer to the next directory entry, called a cookie. The cookie is used in subsequent readdir calls to start reading from the following entry. If the value of cookie is 0, reads from the frst entry in the directory.

  statfsnfhd e-> efsstats

  Returns fle system information (such as block size, number of free blocks and so on) for the fle system containing a fle fh.

Local and remote file systems accessible on an NFS client

  jim jane joe ann users students usr vmunix

  Client Server 2 . . . nfs

  Remote mount staff big bob jon people

  Server 1 export (root)

  Remote mount . . . x

  (root) (root)

  Note: The fle system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the fle system mounted at /usr/staf in the client is actually the sub-tree located at /nfs/users in Server 2.

Distribution of processes in the Andrew File System

  Workstations Servers

  Venus User program

  Vice UNIX kernel

  UNIX kernel Venus

  User Network program

  UNIX kernel Vice

  Venus User program

  UNIX kernel UNIX kernel

File name space seen by clients of AFS

  / (root) tmp bin cmu

  . . . vmunix bin

  Shared Local Symbolic links

System call interception in AFS

  UNIX file system calls Non-local file operations

  Workstation Local disk

  User program UNIX kernel Venus

  UNIX file system Venus Implementation of file system calls in AFS User eprocess UNIX ekernel Venus

  Net Vice opennFileName moded

  If FileName refers to a fle in shared fle space, pass the request to Venus.

  Open the local fle and return the fle descriptor to the application.

  Check list of fles in local cache. If not present or there is no valid callback epromise, send a request for the fle to the Vice server that is custodian of the volume containing the fle.

  Place the copy of the fle in the local fle system, enter its local name in the local cache list and return the local name to UNIX.

  Transfer a copy of the fle and a callback promise to the workstation. Log the callback promise. readnFileDescriptor Bufer elengthd

  Perform a normal UNIX read operation on the local copy. writenFileDescriptor Bufer elengthd

  Perform a normal UNIX write operation on the local copy. closenFileDescriptord Close the local copy and notify Venus that the fle has been closed. If the local copy has been changed, send a copy to the Vice server that is the custodian of Replace the fle contents and send a

The main components of the Vice service interface

  Fetchnfidd e-> eattr edata Returns the attributes (status) and, optionally, the contents of fle identifed by the fid and records a callback promise on it. Storenfid eattr edatad Updates the attributes and (optionally) the contents of a specifed fle. Creatend e-> efid Creates a new fle and records a callback promise on it. Removenfidd Deletes the specifed fle.

SetLocknfid emoded Sets a lock on the specifed fle or directory. The mode of the

lock may be shared or exclusive. Locks that are not removed expire after 30 minutes.

  ReleaseLocknfidd Unlocks the specifed fle or directory.

RemoveCallbacknfidd Informs server that a Venus process has fushed a fle from its

cache.

  BreakCallbacknfidd This call is made by a Vice server to a Venus process. It cancels the callback promise on the relevant fle.