
Chapter 6: UNIX Filesystem Layout

6.1 Introduction

In Chapter 5 we discussed the UNIX filesystem primarily from the user's standpoint. UNIX users create, read, write, and purge files, and this view is correct: UNIX filesystems exist to make files accessible to users. But a lot of work goes on behind the scenes to fulfill this logical requirement. That work is done by the UNIX system itself and is mostly hidden from users; UNIX administrators, however, must be aware of it and should understand how it works.

Everybody knows that files reside on disk. They are saved somewhere, and when we need them, we get them. How this works is more mysterious. We use the term filesystem layout to describe how files are organized within the available disk space. UNIX files cannot exist outside UNIX filesystems. The UNIX filesystem is the vehicle for organizing storage resources in a usable way: it merges files into a hierarchy and enables their physical storage, as well as access to the stored files when needed. This is always true, independent of the filesystem type and organization.

Filesystem layout is the main topic of this chapter. A thorough understanding of it is extremely important for successful filesystem management, and once it is understood, many other UNIX issues become much clearer. Filesystem management is crucial for overall UNIX administration. Remember what we said earlier: on UNIX everything is a file or file-like. Files are at the center of UNIX; consequently, managing files is the core of UNIX administration.

Disk space can vary in size, type, characteristics, and even location (remote disk space can be used just like local disk space), and UNIX must respond to all possible situations. The total disk space is usually partitioned into smaller storage entities that are convenient for more flexible use, and a separate UNIX filesystem is created in each storage entity.
To make the created filesystem visible to users, an additional step is required: it must be merged with other filesystems into an overall UNIX directory hierarchy, which we will refer to as the overall UNIX filesystem. Strictly speaking, the overall UNIX filesystem is not a filesystem per se; rather, it is a set of merged filesystems ready for use.

UNIX filesystems are organized on two levels: physical and logical. The physical layout directly reflects the filesystem organization within a storage entity. It takes care of file parameters and maps them onto the corresponding hardware parameters of the storage entity. However, the UNIX filesystem can be organized and managed in a more sophisticated way within a virtual, logical storage space that is built around physical entities. This additional level of abstraction was introduced to make filesystem organization more flexible and powerful. The logical layout of a storage space and its physical counterpart do not necessarily have to be the same. A logical storage space can be spread over part of a disk, over a whole disk, or, as in today's UNIX flavors, over several disks. Nearly any combination of multiple partitions on multiple disks can be put together, which can be extremely powerful performance-wise. Of course, a precise mapping of the logical storage to its physical counterpart is crucial. Once this bidirectional relationship is firmly established, UNIX can manage files on the logical level only.

Logical storage entities are known as logical volumes, and the corresponding system software for their management is known as the logical volume manager (LVM). Logical volumes appeared when disk technology reached the point where disk size, speed, and price ceased to be issues. LVM is a relatively new UNIX topic; for most UNIX flavors it is still an optional piece.

We will use the general term data to refer to the system and user data stored on the disk.
User data is the real data kept in files within the filesystem; system data is the data needed to identify and manage the user data. System data represents a necessary overhead, but from the system standpoint it is crucial for managing the filesystem. The data block is the smallest data unit, and each UNIX file consumes one or more blocks. If all of a file's blocks are known, the file itself can be managed easily; what is required is an additional step that identifies the sequence of blocks making up the file. This is exactly why we organize files into a filesystem. We can look at the filesystem as a kind of umbrella that covers files and provides mechanisms for their use; system data keeps the information needed for their accurate identification and allocation.
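
The split between user data and system data can be made concrete with a small in-memory model. The sketch below is purely illustrative and does not describe any real UNIX filesystem: the class name ToyFilesystem, its methods, and the 8-byte block size are all assumptions chosen for readability. User data is held in fixed-size blocks, while a separate table (the system data) records each file's size and the ordered sequence of blocks that make it up.

```python
import math

BLOCK_SIZE = 8  # bytes per block; unrealistically small, chosen only for illustration


class ToyFilesystem:
    """Hypothetical model: blocks hold user data, block_map holds system data."""

    def __init__(self, total_blocks):
        # User data: a fixed pool of data blocks (None marks a free block).
        self.blocks = [None] * total_blocks
        # System data: file name -> (size in bytes, ordered list of block numbers).
        self.block_map = {}

    def write_file(self, name, payload):
        if name in self.block_map:
            self.delete_file(name)  # release the old blocks before rewriting
        # Following the text, every file consumes at least one block.
        needed = max(1, math.ceil(len(payload) / BLOCK_SIZE))
        free = [i for i, block in enumerate(self.blocks) if block is None]
        if len(free) < needed:
            raise OSError("no space left in toy filesystem")
        chosen = free[:needed]
        for n, block_no in enumerate(chosen):
            self.blocks[block_no] = payload[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE]
        self.block_map[name] = (len(payload), chosen)
        return chosen

    def read_file(self, name):
        # The system data tells us which blocks to read, and in which order.
        size, chosen = self.block_map[name]
        return b"".join(self.blocks[i] for i in chosen)[:size]

    def delete_file(self, name):
        _, chosen = self.block_map.pop(name)
        for block_no in chosen:
            self.blocks[block_no] = None
```

In a four-block instance of this model, writing a 10-byte file, then a 3-byte file, deleting the first, and writing a 17-byte file places the last file in blocks 0, 1, and 3. The blocks are not contiguous, yet the file reads back intact because the recorded block sequence identifies exactly which blocks belong to it and in what order.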

6.2 Physical Filesystem Layout