Phoenix File System

Documentation Version: 1.14

Physical Media Abstraction Layer - This layer abstracts the physical format of the media into equal-sized logical allocation units called Device-Logical Sectors using the following guidelines:

The Physical Media Abstraction Layer determines Device-Logical Sector size, LogicalSectorSize, which is the number of 8-bit bytes each Device-Logical Sector represents of the physical media. In addition, the Physical Media Abstraction layer determines the number of Device-Logical Sectors used to represent the physical media, NumDeviceLogicalSectors; it also determines the function which maps Device-Logical Sectors to physical regions of the media.

It is the sole responsibility of this layer to translate Device-Logical Sectors, used by the Volume Abstraction Layer, to physical regions of the media. This is the only layer that should have to concern itself with the details of accessing the physical media. All higher layers have no knowledge of the workings of the hardware or the physical layout of the media; they perform all operations in terms of Device-Logical Sectors, assuming the above guidelines for Device-Logical Sector properties.

Volume Abstraction Layer - This layer manages partitioning of physical media into segments as well as the grouping of one or more segments into logical volumes. Furthermore, this layer performs the mapping between logical sectors, which are used by all higher-level file system layers, and Device-Logical Sectors.

Logical Sector Allocation - This layer exists to keep track of which logical sectors have been allocated to hold data, which are available, and which are unusable (bad).

Single logical sectors called Free Space Descriptors are used to determine whether a number of sectors are in use or available for use. This is done by using each bit in the Free Space Descriptor to represent whether a single successive logical sector is in use or not (0 indicates the respective logical sector is in use, 1 indicates it is not). The number of logical sectors accounted for per Free Space Descriptor depends on the size of a logical sector (since a Free Space Descriptor is exactly one logical sector in size), but can be determined using the formula

SectorsPerFSD = LogicalSectorSize * 8

Traditionally, PCs have used 512-byte sectors, implying that 512*8, or 4096, sectors could be accounted for per Free Space Descriptor. In addition, it needs to be noted that the logical sectors which are used to hold Free Space Descriptors themselves have to be accounted for like any other logical sector. The number of Free Space Descriptors varies based on the number of logical sectors, NumLogicalSectors. Every logical sector MUST be accounted for using a Free Space Descriptor, and it is not possible for the sectors accounted for by Free Space Descriptors to overlap. Logical sectors used by the Free Space Descriptor Location Table and by the Free Space Descriptors are considered in use and therefore are not available for allocation.
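
As a minimal sketch of the bitmap lookup described above, the following Python computes SectorsPerFSD and tests the bit for one sector. The function names are hypothetical, and the bit order within each byte (least significant bit first) is an assumption; the text above does not specify one.

```python
def sectors_per_fsd(logical_sector_size):
    """Number of logical sectors one Free Space Descriptor accounts for."""
    return logical_sector_size * 8


def is_sector_free(fsd, offset_in_fsd):
    """Return True if the bit for the given sector is 1 (not in use).

    offset_in_fsd is the sector's position within the range this FSD covers.
    Assumption: bit 0 of byte 0 describes the first sector of the range.
    """
    byte_index, bit_index = divmod(offset_in_fsd, 8)
    return ((fsd[byte_index] >> bit_index) & 1) == 1


fsd = bytearray(512)            # all zeros: every sector in use
fsd[0] = 0b00000100             # mark the third sector of the range as free
print(sectors_per_fsd(512))     # 4096
print(is_sector_free(fsd, 2))   # True
```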

The location of each of the Free Space Descriptors is stored as a series of 64-bit Logical Sector numbers in a consecutive series of sectors ideally stored near the beginning of the storage media. This series of sectors is called the Free Space Descriptor Location Table (FSDLT), and each 64-bit entry is the Logical Sector number of the Free Space Descriptor which accounts for a successive series of logical sectors. For example, if LogicalSectorSize is equal to 512 bytes, SectorsPerFSD is 4096. The first entry in the FSDLT then holds the Logical Sector number of the Free Space Descriptor accountable for the first 4096 sectors (numbered 0 through 4095), the next entry holds the Logical Sector number of the Free Space Descriptor accountable for the next 4096 sectors (numbered 4096 through 8191), and so on.
      FSDLT         Free Space Descriptors (FSDs)
      .----.
    1 |  o--------->|100010010...100101|  Allocation map for sectors 0 - 4095
      |----|
    2 |  o--------->|000010101...110001|  Allocation map for sectors 4096 - 8191
      |----|
    3 |  o--------->|100100101...001000|  Allocation map for sectors 8192 - 12287
      |----|
      .    .
      .    .
      |----|
    N |  o--------->|001001000...001001|  Allocation map for sectors 4096*(N-1) - 4096*N-1
      `----'
Where N is the number of entries in the Free Space Descriptor Location Table (equal to NumLogicalSectors / SectorsPerFSD, rounded up so that every logical sector is accounted for; in this example, NumLogicalSectors / 4096 rounded up).
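
The mapping from a logical sector to its Free Space Descriptor can be sketched as follows. The function names are hypothetical, the entry index returned is zero-based (so index 0 corresponds to the row numbered 1 in the diagram), and rounding the entry count up is an assumption made so that a partial final range still gets a descriptor.

```python
def fsd_count(num_logical_sectors, logical_sector_size):
    """Number of FSDLT entries needed to cover every logical sector."""
    spf = logical_sector_size * 8
    return -(-num_logical_sectors // spf)   # ceiling division


def locate_sector(sector, logical_sector_size):
    """Return (zero-based FSDLT entry index, offset of the sector in that FSD)."""
    spf = logical_sector_size * 8
    return divmod(sector, spf)


print(locate_sector(4096, 512))   # (1, 0)
print(locate_sector(8191, 512))   # (1, 4095)
print(fsd_count(10000, 512))      # 3
```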

The Free Space Descriptor Location Table may span as many sectors as it needs in order to hold the locations of all of the Free Space Descriptors, but the sectors must be contiguous. As shown in the table below, the Free Space Descriptor Location Table is very efficient at describing the media, and therefore the FSDLT will be relatively small (for example, only 4 kilobytes per gigabyte of capacity given a LogicalSectorSize of 512 bytes). For this reason, it is suggested that any OS implementing the Phoenix File System load and keep a copy of the entire Free Space Descriptor Location Table in memory for quick reference.

  LogicalSectorSize  SectorsPerFSD    Bytes accounted      FSD location     Sectors accountable  Bytes accountable
                                      for per FSD          entries per      per FSDLT sector     per FSDLT sector
                                                           FSDLT sector
  256 bytes          2,048 sectors    524,288 bytes        32 entries       65,536 sectors       16,777,216 bytes
  512 bytes          4,096 sectors    2,097,152 bytes      64 entries       262,144 sectors      134,217,728 bytes
  1024 bytes         8,192 sectors    8,388,608 bytes      128 entries      1,048,576 sectors    1,073,741,824 bytes
  2048 bytes         16,384 sectors   33,554,432 bytes     256 entries      4,194,304 sectors    8,589,934,592 bytes
  4096 bytes         32,768 sectors   134,217,728 bytes    512 entries      16,777,216 sectors   68,719,476,736 bytes
  8192 bytes         65,536 sectors   536,870,912 bytes    1,024 entries    67,108,864 sectors   549,755,813,888 bytes
  16384 bytes        131,072 sectors  2,147,483,648 bytes  2,048 entries    268,435,456 sectors  4,398,046,511,104 bytes
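
The FSDLT figures above follow directly from the 8-byte entry size. As a hedged sketch (the function names are hypothetical), the per-sector coverage and the total FSDLT size for a given capacity can be computed like this:

```python
ENTRY_SIZE = 8  # each FSDLT entry is a 64-bit Logical Sector number


def fsdlt_sector_coverage(logical_sector_size):
    """Return (entries, sectors, bytes) accounted for by one FSDLT sector."""
    entries = logical_sector_size // ENTRY_SIZE
    sectors = entries * logical_sector_size * 8   # entries * SectorsPerFSD
    return entries, sectors, sectors * logical_sector_size


def fsdlt_size_bytes(capacity_bytes, logical_sector_size):
    """Bytes of FSDLT entries needed to describe media of the given capacity."""
    num_sectors = capacity_bytes // logical_sector_size
    spf = logical_sector_size * 8
    num_fsds = -(-num_sectors // spf)             # ceiling division
    return num_fsds * ENTRY_SIZE


print(fsdlt_sector_coverage(512))     # (64, 262144, 134217728)
print(fsdlt_size_bytes(2**30, 512))   # 4096, i.e. 4 kilobytes per gigabyte
```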


The suggested arrangement for the Free Space Descriptors in relation to the logical sectors they describe is as follows: an even-indexed Free Space Descriptor is located in the first sector it describes, and an odd-indexed Free Space Descriptor is located in the last sector it describes. This maximizes the number of contiguous sectors available for allocation and minimizes the distance, and thus the seek time, between Free Space Descriptors and the data they describe. This is by no means the only way to arrange the Free Space Descriptors; they need not even reside in the region they describe. Whenever possible, however, it is desirable to place Free Space Descriptors near the sectors they account for in order to minimize access times.
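
The suggested placement can be sketched as below. The text does not say whether "even indexed" counts from 0 or from 1, so the numbering base is left as a parameter; the default of 1 follows the row numbers in the FSDLT diagram and is an assumption, as is the function name.

```python
def suggested_fsd_sector(index, logical_sector_size, first_index=1):
    """Suggested logical sector for the FSD with the given index.

    Even-indexed FSDs go in the first sector they describe, odd-indexed
    FSDs in the last. first_index is the number given to the first FSD
    (assumed to be 1, as in the diagram numbering).
    """
    spf = logical_sector_size * 8
    position = index - first_index          # zero-based position of the range
    first = position * spf
    last = first + spf - 1
    return first if index % 2 == 0 else last


# With 512-byte sectors, FSDs 1 and 2 end up in adjacent sectors.
print(suggested_fsd_sector(1, 512))   # 4095
print(suggested_fsd_sector(2, 512))   # 4096
```

Either numbering base places descriptors in adjacent pairs, which is what leaves long contiguous runs of sectors between the pairs.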

Bad Sector Recovery: when a given media is first formatted, the Free Space Descriptors are located only in "good" sectors (i.e. not bad), and bad sectors are marked as used and recorded in a special Bad Sector List described below. Should a sector be found to be bad after formatting, the given sector should be marked as in use in the appropriate Free Space Descriptor and added to the Bad Sector List. However, if the bad sector is detected at the location of a Free Space Descriptor, then major error recovery methods need to be employed: the sector is added to the Bad Sector List and the Free Space Descriptor is moved to a free good sector (or possibly data is moved from a used sector to a free sector and the Free Space Descriptor is then located there, so as to minimize the distance between the Free Space Descriptor and the data it describes).
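
A rough in-memory sketch of the post-format case follows. Everything here is illustrative: the Bad Sector List is modeled as a plain Python list because its on-disk format is not given in this section, the bit order within a byte is assumed to be least significant bit first, and the function name is hypothetical.

```python
def record_bad_sector(sector, fsds, fsd_locations, bad_sectors, logical_sector_size):
    """Mark a sector as in use and add it to the Bad Sector List.

    fsds is a list of bytearrays (one per Free Space Descriptor) and
    fsd_locations lists the logical sector holding each descriptor.
    Returns True when the bad sector currently holds a Free Space
    Descriptor, meaning the caller must also relocate that descriptor.
    """
    spf = logical_sector_size * 8
    fsd_index, offset = divmod(sector, spf)
    byte_index, bit_index = divmod(offset, 8)
    fsds[fsd_index][byte_index] &= ~(1 << bit_index) & 0xFF   # 0 = in use
    if sector not in bad_sectors:
        bad_sectors.append(sector)
    return sector in fsd_locations


fsds = [bytearray(b"\xff" * 512) for _ in range(2)]   # everything free
fsd_locations = [4095, 4096]
bad_sectors = []
needs_move = record_bad_sector(5000, fsds, fsd_locations, bad_sectors, 512)
print(needs_move, bad_sectors)   # False [5000]
```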

File Allocation Layer - This layer exists to group logical sectors into units called files. A file is a collection of information that is logically related. In the Phoenix File System, files are described using a tree structure in which the tree's leaves hold the actual file information. Each node and leaf of the data tree is described using a Tree Node Descriptor:

  Structure of a Tree Node Descriptor:
  Offset   Size           Field Name     Description
    0h     QWORD
	     bit      0 : Type           (1) Descriptor refers to SubTree block (internal node)
					 (0) Descriptor refers to data block (leaf)
	     bits  1-63 : Location       Logical Sector number for block location
    8h     QWORD          Length         If Type is 1, is total number of bytes described by SubTree
					 If Type is 0, is number of bytes in data block
  ----------------
  Total: 16 bytes
Where each block of information is exactly 1 logical sector in size and can hold information either about the file's contents (a tree leaf) or about the location of additional blocks (a tree node). If a sector is a data block, Length bytes of the sector are considered to be part of the file's contents. If a sector is a SubTree block, it is simply a consecutive list of Tree Node Descriptors, and the Length field represents the total number of bytes of the file's information that is described by all data blocks in or under the SubTree.
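
The 16-byte layout above can be packed and unpacked with a short sketch like the following. Little-endian byte order is an assumption (the structure table does not state one), and the function names are hypothetical.

```python
import struct

TND = struct.Struct("<QQ")   # two QWORDs; little-endian is assumed


def pack_tnd(is_subtree, location, length):
    """Build a Tree Node Descriptor: bit 0 = Type, bits 1-63 = Location."""
    if not 0 <= location < (1 << 63):
        raise ValueError("Location must fit in 63 bits")
    first = (location << 1) | (1 if is_subtree else 0)
    return TND.pack(first, length)


def unpack_tnd(raw):
    """Return (is_subtree, location, length) from 16 raw bytes."""
    first, length = TND.unpack(raw)
    return bool(first & 1), first >> 1, length


raw = pack_tnd(True, 1234, 5000)
print(len(raw))          # 16
print(unpack_tnd(raw))   # (True, 1234, 5000)
```

Since each descriptor is 16 bytes, a SubTree block holds LogicalSectorSize / 16 descriptors, for example 32 with 512-byte sectors.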

Sparse files: a region of a file is considered sparse if it contains no data. This can occur, for example, if a new file is created, then a seek is performed to 1000 bytes into the file, and then a single byte is written and the file is closed. The total length of the file would be 1001 bytes, even though only 1 byte of information is actually stored in the file; the first 1000 bytes are considered sparse. The Phoenix File System is efficient in storing files with sparse data by making note of the condition using a Tree Node Descriptor. The Type indicates the region of the file is a data block, since it actually refers to the file's contents. The Length is the number of bytes in the sparse region, and the Location has the special reserved value of 0. (Location value 0 is considered reserved in the File Allocation Layer because logical sector 0 will always hold the Media Descriptor Block for the device.) There cannot exist a sparse subtree, as it would be illogical. Any information read from a sparse file region will always be 0. A sparse region 0 bytes in length is used to indicate that a Tree Node Descriptor is not in use (the entire Tree Node Descriptor is all zeros).
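
To illustrate how sparse regions read back as zeros, here is a minimal sketch that walks a list of Tree Node Descriptors and assembles the file contents. The `read_sector` callback, the tuple representation of a descriptor, and the little-endian layout of SubTree blocks are assumptions for illustration only.

```python
import struct


def read_tree(descriptors, read_sector, sector_size):
    """Concatenate the bytes described by (is_subtree, location, length) tuples."""
    out = bytearray()
    for is_subtree, location, length in descriptors:
        if not is_subtree and location == 0:
            out += bytes(length)                  # sparse (or unused if length 0)
        elif not is_subtree:
            out += read_sector(location)[:length]  # data block
        else:
            block = read_sector(location)          # SubTree block: list of TNDs
            children = []
            for off in range(0, sector_size, 16):
                first, child_len = struct.unpack_from("<QQ", block, off)
                children.append((bool(first & 1), first >> 1, child_len))
            out += read_tree(children, read_sector, sector_size)
    return bytes(out)


# The 1001-byte example: 1000 sparse bytes, then one stored byte in sector 7.
sectors = {7: b"X" + bytes(511)}
data = read_tree([(False, 0, 1000), (False, 7, 1)], sectors.__getitem__, 512)
print(len(data), data[-1:])   # 1001 b'X'
```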

The basic properties of a file are described using a File Node with the following structure:
  Structure of a File Node:
  Offset   Size           Field Name     Description
   00h     DWORD          MagicNumber    Special number to help discern a File Node from other
					 data should the file system become corrupt, equal to
					 31534650h
   04h     DWORD          HardLinks      Number of references made from Directories to this file
   08h     DWORD          Flags          Basic file attributes
	     bit      0 : ArchiveFlag    Set whenever the LastModified field is updated
	     bit      1 : SystemFlag     Indicates file is an operating-system related file
	     bit      2 : HiddenFlag     Indicates file should not be listed in default file listings
	     bit      3 : ReadOnlyFlag   Indicates file cannot be written to or deleted
	     bits   4-7 : (reserved)     must be 0
	     bit      8 : ImmediatePurge Indicates whether file should be immediately purged on delete
	     bits  9-15 : (reserved)     must be 0
					 --- end general flags, begin internal flags ---
	     bits 16-18 : FileType       Type definition for file data
					 000 = generic file
					 001 = directory
					 010 = symbolic link
	     bits 19-21 : (reserved)     must be 0
	     bit     22 : DeletedFlag    Indicates whether or not this file has been deleted
	     bit     23 : PurgeFlag      Indicates whether or not this file is to be purged
             bits 24,25 : InternalFlag   Indicates what file information, if any, is stored
                                         internally in the File Node
					    00 = no internal data
					    01 = internal rights information
					    10 = internal extended attributes
					    11 = internal file data
	     bits 26,27 : (reserved)     must be 0
	     bit     28 : NodeSize       Indicates whether or not the File Node occupies the full
					 logical sector
					     0 = File Node is half the size of the logical sector
					     1 = File Node occupies the entire logical sector
	     bits 29-31 : (reserved)     must be 0
   0Ch     DWORD          Owner          Object ID of owner of this File Node
   10h     DWORD          Creator        Object ID which created this File Node
   14h     DWORD          Modifier       Object ID of user who last modified File Node
   18h     DWORD          Created        Date/Time File Node was created
   1Ch     DWORD          LastModified   Date/Time File Node was last modified
   20h     DWORD          DataAccessed   Date/Time file data last accessed
   24h     DWORD          DataModified   Date/Time file data last modified
   28h     QWORD          FileSize       Total length of file data
   30h     1 TND          Rights         Rights list data tree
   40h     2 TNDs         EAs            Extended Attributes data tree
   60h     7 TNDs         FileData       File contents data tree
   D0h     x BYTES        InternalData   minimum of 48 bytes of space specifically set aside for
					 storing small amounts of data inside the File Node
					 without using data trees. The InternalFlag determines
					 which information, if any, is stored internally.
  ----------------
  Total: 256 bytes minimum
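
As a hedged illustration of the layout above, the fixed fields at offsets 00h through 2Fh can be parsed as follows. Little-endian byte order is assumed (under that assumption the MagicNumber bytes on disk spell "PFS1"), multi-bit fields are assumed to have their highest-numbered bit as the most significant, and all names in the code are hypothetical.

```python
import struct

FILE_NODE_MAGIC = 0x31534650
HEADER = struct.Struct("<10IQ")   # ten DWORDs, then the FileSize QWORD

FILE_TYPES = {0: "generic file", 1: "directory", 2: "symbolic link"}
INTERNAL = {0: "none", 1: "rights", 2: "extended attributes", 3: "file data"}


def decode_flags(flags):
    """Split the Flags DWORD into named fields."""
    return {
        "archive": bool(flags & (1 << 0)),
        "system": bool(flags & (1 << 1)),
        "hidden": bool(flags & (1 << 2)),
        "read_only": bool(flags & (1 << 3)),
        "immediate_purge": bool(flags & (1 << 8)),
        "file_type": FILE_TYPES.get((flags >> 16) & 0b111, "reserved"),
        "deleted": bool(flags & (1 << 22)),
        "purge": bool(flags & (1 << 23)),
        "internal": INTERNAL[(flags >> 24) & 0b11],
        "full_sector": bool(flags & (1 << 28)),
    }


def parse_file_node_header(raw):
    """Check the MagicNumber and return a few decoded header fields."""
    fields = HEADER.unpack_from(raw, 0)
    if fields[0] != FILE_NODE_MAGIC:
        raise ValueError("not a File Node")
    return {
        "hard_links": fields[1],
        "flags": decode_flags(fields[2]),
        "owner": fields[3],
        "file_size": fields[10],
    }


flags = (1 << 3) | (1 << 16) | (3 << 24) | (1 << 28)
raw = HEADER.pack(FILE_NODE_MAGIC, 1, flags, 42, 0, 0, 0, 0, 0, 0, 12) + bytes(208)
node = parse_file_node_header(raw)
print(node["flags"]["file_type"], node["file_size"])   # directory 12
```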

A File Node is always at most 1 logical sector in size and at least 256 bytes in size. As such, the amount of space reserved for internal data within a File Node can vary from 48 bytes to LogicalSectorSize-208 bytes. A File Node may occupy an entire sector or only half of a sector, the latter only being valid for a LogicalSectorSize of 512 bytes or more (since one half of 512 bytes is 256 bytes, the minimum size of a File Node). Furthermore, each File Node is identified using a File Node Number, of which bits 1-63 indicate the logical sector number the File Node resides in, and bit 0 is 0 if the File Node is in the first half of the sector and 1 if the File Node is in the second half of the sector. The following table summarizes the minimum and maximum sizes of File Nodes and the amount of space reserved in each File Node for internal data, based on the size of a logical sector.

  LogicalSectorSize  Supports halving  Minimum File  Maximum File  Minimum Internal  Maximum Internal
                     logical sector    Node Size     Node Size     Data Reserve      Data Reserve
  256 bytes          No                256 bytes     256 bytes     48 bytes          48 bytes
  512 bytes          Yes               256 bytes     512 bytes     48 bytes          304 bytes
  1024 bytes         Yes               512 bytes     1024 bytes    304 bytes         816 bytes
  2048 bytes         Yes               1024 bytes    2048 bytes    816 bytes         1840 bytes
  4096 bytes         Yes               2048 bytes    4096 bytes    1840 bytes        3888 bytes
  8192 bytes         Yes               4096 bytes    8192 bytes    3888 bytes        7984 bytes
  16384 bytes        Yes               8192 bytes    16384 bytes   7984 bytes        16176 bytes
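
The File Node Number encoding and the reserve sizes in the table can be reproduced with a small sketch. The function names are hypothetical; the 208-byte figure is the D0h offset of InternalData in the File Node structure.

```python
INTERNAL_DATA_OFFSET = 0xD0   # 208 bytes of fixed fields precede InternalData


def make_file_node_number(sector, second_half=False):
    """Bits 1-63 hold the logical sector, bit 0 selects the half."""
    return (sector << 1) | int(second_half)


def split_file_node_number(number):
    """Return (logical sector, half) where half is 0 (first) or 1 (second)."""
    return number >> 1, number & 1


def internal_reserve(logical_sector_size):
    """Return (minimum, maximum) bytes reserved for internal data."""
    min_node = max(256, logical_sector_size // 2)
    max_node = logical_sector_size
    return min_node - INTERNAL_DATA_OFFSET, max_node - INTERNAL_DATA_OFFSET


print(split_file_node_number(make_file_node_number(100, True)))   # (100, 1)
print(internal_reserve(512))                                      # (48, 304)
```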


When data is stored internally, the field(s) that would ordinarily be used to store the Tree Node Descriptor(s) for the given data are instead overwritten with a portion of the internal data. This can be safely done since the InternalFlag identifies which data is stored internally, and if the data is being stored internally, then the external data Tree Node Descriptor field(s) are not used, thereby leaving them available to store additional internal data. The space normally reserved for the Tree Node Descriptors is used first in the storage of internal data, before the space specifically reserved for internal data. If the Rights List or Extended Attributes are stored internally, the first two bytes of the internal data are the length of the remaining internal data in bytes; the remainder of the internal data is the actual information to be stored internally. In the case of internal file data, the FileSize field can be consulted to determine the amount of internal data, and as such two bytes are not prepended to the internal data. The following table shows the maximum amount of internal data that can be stored in a File Node utilizing this optimization; it should be noted that this efficient use of File Node space is not a feature that can be optionally implemented, but is a standard component of the Phoenix File System.

  LogicalSectorSize  Maximum Internal  Maximum Internal           Maximum Internal           Maximum Internal
                     Data Reserve      Rights Info                Extended Attributes        File Data
  256 bytes          48 bytes          62 bytes (64 total)        78 bytes (80 total)        160 bytes
  512 bytes          304 bytes         318 bytes (320 total)      334 bytes (336 total)      416 bytes
  1024 bytes         816 bytes         830 bytes (832 total)      846 bytes (848 total)      928 bytes
  2048 bytes         1840 bytes        1854 bytes (1856 total)    1870 bytes (1872 total)    1952 bytes
  4096 bytes         3888 bytes        3902 bytes (3904 total)    3918 bytes (3920 total)    4000 bytes
  8192 bytes         7984 bytes        7998 bytes (8000 total)    8014 bytes (8016 total)    8096 bytes
  16384 bytes        16176 bytes       16190 bytes (16192 total)  16206 bytes (16208 total)  16288 bytes
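
The internal-data capacities follow from adding the size of the overwritten Tree Node Descriptor field(s) to the reserve and, for Rights and Extended Attributes, subtracting the two-byte length prefix. A minimal sketch, with hypothetical names:

```python
INTERNAL_DATA_OFFSET = 0xD0   # fixed fields before InternalData (208 bytes)
TND_SIZE = 16
FIELD_TNDS = {"rights": 1, "eas": 2, "file_data": 7}   # TNDs per field


def max_internal_payload(logical_sector_size, kind):
    """Largest payload that fits when `kind` is stored inside a full-sector File Node."""
    reserve = logical_sector_size - INTERNAL_DATA_OFFSET
    total = reserve + FIELD_TNDS[kind] * TND_SIZE      # reuse the unused TND field(s)
    length_prefix = 0 if kind == "file_data" else 2    # FileSize replaces the prefix
    return total - length_prefix


for kind in ("rights", "eas", "file_data"):
    print(kind, max_internal_payload(512, kind))
# rights 318
# eas 334
# file_data 416
```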


A Rights list is maintained to determine which users or groups of users have access to the information stored in the file data, and Extended Attributes are maintained to give system add-ons a storage area for additional file properties.

Directory files:

Symbolic links: Symbolic links are special files which simply hold the path to another file. Symbolic links can, in this manner, point to other files including generic files, directories, or other symbolic links. Symbolic links, unlike hard links, can refer to files located on volumes other than the one on which they reside. The file data of a symbolic link holds the path to the other file. A symbolic link has its own set of flags and extended attributes, its own owner, its own creator, etc. The Rights for a symbolic link determine access to the link itself; a separate rights access check is performed on the file the symbolic link refers to.

Rights list format: The rights list consists of an array of Rights List Entries, where each entry specifies the rights for the file node for one user or group Object ID. The number of entries in the rights list is determined by the length of the rights list as stored in the file node; simply divide the length of the Rights by the size of a single RightsListEntry to get the number of entries. Each Rights List Entry has the format:
  Structure of a Rights List Entry:
  Offset   Size           Field Name     Description
   00h     DWORD          User           Object ID of user or group to whom the rights pertain
   04h     DWORD          Rights         Determines what rights the User has to this file node
	     bit      0 : Find           user may read file node information
	     bit      1 : Read           user may read file data
					   For generic files  : may read the "contents" of the file
					   For directories    : may scan the directory contents
					   For symbolic links : may see where the link points to
	     bit      2 : ReadRights     user may read the rights information of the file node
	     bit      3 : ChangeRights   user may modify the rights information in the file node
	     bit      4 : ChangeEAs      user may modify the extended attributes of the file node
	     bit      5 : ChangeOwner    user may change the ownership of a file node
	     bit      6 : ChangeFlags    user may modify the file node's general flags
	     bit      7 : Unlink         user may remove a link to this file node
	     bit      8 : Write          user may modify the file data of a generic file
	     bit      9 : Redirect       user may change the file data of a symbolic link
	     bit     10 : Create         user may add an entry to a directory (modify directory file data)
	     bit     11 : Rename         user may rename an entry in a directory (modify directory file data)
	     bit     12 : Remove         user may remove an entry from a directory (modify directory file data)
	     bits 13-23 : (reserved)     must be 0
             bit     24 : Supervisor     user has all rights to the file node
             bits 25-30 : (reserved)     must be 0
             bit     31 : InheritMask    (1) rights list entry is an inherited rights mask
                                         (0) rights list entry contains access rights 
  ----------------
  Total: 8 bytes
Rights for a given user are resolved using the fully qualified path of the file in question. The rights list of the file is scanned for an entry explicitly defining the rights of the user in question; if an entry is found, it completely determines the user's rights. If not, the parent directory's rights list is scanned for an entry explicitly defining the rights of the user; if an entry is found, it determines the user's rights, subject to modification by an inherited rights mask. If an entry is not found, the process continues up the directory tree until the root is reached. If the root is reached and no rights were ever defined, the entire process is repeated for each group the user belongs to. If no rights entries are found to be applicable to the user, the user is presumed to have no rights for the given file node. The fact that directories which constitute the fully qualified path to the file can determine a file's access rights illustrates that rights for a file may be inherited. In addition, this also illustrates how, through the use of inheritance, two directory entries which are linked to the same file node may actually have different access rights. Even though the rights list for the file node is the same for both directory entries (since both entries actually refer to the same file), the inherited rights can differ.
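
The lookup order described above can be sketched as follows. The rights lists are modeled as a dictionary from path to `{object_id: rights}`, which is purely illustrative. Trying groups in the order given and stopping at the first group with an entry is an assumption, since the text does not say how several group matches combine; the function names are also hypothetical.

```python
import posixpath


def ancestors(path):
    """Yield the path itself, then each parent directory up to the root."""
    while True:
        yield path
        if path == "/":
            return
        path = posixpath.dirname(path)


def find_entry(rights_lists, path, object_id):
    """Return (rights, where_found) for the nearest entry, or None."""
    for candidate in ancestors(path):
        rights = rights_lists.get(candidate, {}).get(object_id)
        if rights is not None:
            return rights, candidate
    return None


def resolve_rights(rights_lists, path, user, groups=()):
    """Try the user first, then each group; no entry anywhere means no rights."""
    for object_id in (user, *groups):
        found = find_entry(rights_lists, path, object_id)
        if found is not None:
            return found
    return 0, None


rights_lists = {"/example": {7: 0b11011111}, "/": {7: 0b00001100, 50: 0b00000011}}
print(resolve_rights(rights_lists, "/example/of/rights", 7))
# (223, '/example'): the nearer entry wins over the one at the root
```

When the entry is found on a directory rather than on the file node itself, the result is still subject to the Inherited Rights Mask.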

In addition to defining access rights for user and group objects, Rights List Entries can be used to store Inherited Rights Masks limiting the amount of access that can be inherited from directories composing the fully qualified path to a file node. Usually Inherited Rights Masks are defined for groups of users; however, it is also possible to have Inherited Rights Masks per user which override masks per group. The full Inherited Rights Mask for a file node is determined in a similar manner to a user's access rights for the file node. The effective Inherited Rights Mask is initialized to all bits set, then the rights list of each directory up the directory tree from the file node is scanned for Inherited Rights Mask entries pertaining to the user and any group the user belongs to. For each entry found, the rights mask is logically AND'ed with the current value of the effective Inherited Rights Mask to form the new value for the effective Inherited Rights Mask. Note that only directories higher up the directory tree than the file in question are scanned. When the root of the directory tree is reached, or when the effective Inherited Rights Mask becomes 0, the effective Inherited Rights Mask is complete. This mask is then logically AND'ed with a user's access rights whenever there is not a rights list entry in the file node explicitly defining the user's access to that file node. For example, consider the following set of access rights and Inherited Rights Masks for a given user (for simplicity, this example does not include use of groups):
  File or Directory             Access Rights    Inherited Rights Mask   effective Inherited Rights Mask
  /example/of/rights                                                     11111111...11
  /example/of                                    11110111...11           11110111...11
  /example                      11011111...01    10011111...11           10010111...11
  /                             00001100...00                            10010111...11

  So the given user's access to the file node linked to /example/of/rights would be
                      Access Rights : 11011111...01
    effective Inherited Rights Mask : 10010111...11 AND
                                     --------------
     user's effective access rights : 10010111...01
One thing worth pointing out is that a user's access rights are determined by a single rights list entry, located either in the file node's rights list or in the rights list of a directory which forms part of the fully qualified path to the given file. On the other hand, a user's Inherited Rights Mask is determined by AND'ing the appropriate Inherited Rights Mask entries from each directory that forms the fully qualified path to the given file. Furthermore, an effective Inherited Rights Mask is only applied when the user inherited their rights to a file from a directory above the file (i.e. the given file did not explicitly list access rights for the given user or a group that the user belongs to).
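
The worked example above can be reproduced with a short sketch. It uses only the leading eight digits of each bit pattern as 8-bit values, which is a simplification for illustration (real entries are DWORDs), and the dictionary model and function names are hypothetical.

```python
import posixpath

ALL_RIGHTS = 0xFF   # 8-bit illustration only; real rights fields are 32 bits


def effective_mask(masks, path):
    """AND together the Inherited Rights Masks of every directory above path."""
    mask = ALL_RIGHTS
    directory = posixpath.dirname(path)
    while True:
        mask &= masks.get(directory, ALL_RIGHTS)   # no entry leaves mask unchanged
        if mask == 0 or directory == "/":
            return mask
        directory = posixpath.dirname(directory)


def effective_rights(access, mask, explicit_on_file):
    """The mask applies only when the rights were inherited from a directory."""
    return access if explicit_on_file else access & mask


masks = {"/example/of": 0b11110111, "/example": 0b10011111}
mask = effective_mask(masks, "/example/of/rights")
print(format(mask, "08b"))                                        # 10010111
print(format(effective_rights(0b11011111, mask, False), "08b"))   # 10010111
```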

Extended Attributes format: Two ideas are up for grabs here. The first would treat EAs much like environment variables in DOS, i.e. list them in a "Name=Value" format, where each attribute would be a null-terminated string. Equal signs ('=') that appeared in the Name or Value fields would have to be represented as double equal signs. This approach, however, has several downsides. For one thing, searching the attributes would have to be sequential and would involve string compares, a very time-consuming process. Also, because only strings are stored, a number that would take only 1 byte in binary would need up to three characters. Finally, it is limited in that it requires strings to be associated with other strings.

The second approach would be to store the data in an array of "chunks". Each attribute would consist of one "chunk". Each Extended Attribute would have a length field, an ID field, and data. The ID field would be a unique identifier that determines the format of the data. For instance, the Operating System could reserve ID 1 to signify association, and specify that the data would consist of the path of the program to execute if a user tries to execute a non-executable file. User programs could also use EAs to store data. A programmer could register an ID, and then store whatever data he desired under that ID. IDs would then be given out similar to the way that DLLs are given IDs.
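
To make the second ("chunk") idea concrete, here is one possible encoding. None of the field widths are decided in the text: the 2-byte length, 4-byte ID, little-endian order, and a length that counts only the data bytes are all assumptions made for this sketch.

```python
import struct

CHUNK_HEADER = struct.Struct("<HI")   # assumed: WORD length, DWORD ID


def pack_chunks(attributes):
    """Serialize (ea_id, data) pairs into a flat array of chunks."""
    return b"".join(
        CHUNK_HEADER.pack(len(data), ea_id) + data for ea_id, data in attributes
    )


def unpack_chunks(raw):
    """Parse a flat array of chunks back into (ea_id, data) pairs."""
    attributes = []
    offset = 0
    while offset < len(raw):
        length, ea_id = CHUNK_HEADER.unpack_from(raw, offset)
        offset += CHUNK_HEADER.size
        attributes.append((ea_id, raw[offset:offset + length]))
        offset += length
    return attributes


# ID 1 as the "association" attribute is the example given in the text;
# ID 4096 and the path are made up for illustration.
eas = [(1, b"/bin/viewer"), (4096, bytes([200]))]
encoded = pack_chunks(eas)
print(unpack_chunks(encoded) == eas)   # True
```

Unlike the "Name=Value" form, a reader can skip from chunk to chunk by length and compare numeric IDs, and a small number such as 200 occupies a single data byte.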

The SuperBlock - I'm just going to make notes of the fields that will need to be in the SuperBlock as we think of them; we can go back and define the format later.

  Location of Free Space Descriptor Table
  Bad Block List tree descriptor  (make the BBL a special file node, and have this be a node number)
  Root Directory tree descriptor  (just make it the file node number of the root dir)
  Volume Descriptor List tree descriptor
    VolumeType (Single Partition, Multiple Partitions, Band?)
    Num Partitions In Volume
    Partition List:
       RECORD
          Unique Disk ID (serial number if possible) - QWORD/TBYTE?
          Partition Number on Disk - BYTE/WORD?
  File System Version Number
  Number of bytes of reserved space in File Nodes for internal data
  A Features DWORD, now is reserved(0), but later on bits may indicate advanced features
  Date/Time created
  Volume creator (who partitioned it and/or formatted it)


The Boot Record - The Boot Record is data specific to each partition (whereas the SuperBlock contains Volume-specific data). Like the SuperBlock above, I'm just going to throw some ideas down here:
   NumLogicalSectors (may want this in SuperBlock as well?)
   LogicalSectorSize (may want this in SuperBlock as well?)