Physical Media Abstraction Layer
- This layer abstracts the physical format of the media into equal-sized logical allocation units called Logical Sectors using the following guidelines:
Media Partitioning
- The device is scanned to see if it contains a supported partition table format; if no supported partition table is found, the entire device is considered a single partition. Currently, only a PC-compatible master boot record and partition table is supported; it is located in the first accessible sector, Logical Sector 0. This allows the Phoenix File System on one partition to coexist with any other PC-compatible file system, including FAT, VFAT, NTFS, and HPFS, on another partition of the same physical device. The format of the PC-compatible master boot record and partition table is:
This document only discusses the structure of a partition dedicated to the Phoenix File System. The Phoenix File System subdivides each partition into one or more Segments; Segments are then grouped into Volumes. A single Volume may contain Segments from several different partitions on several different storage devices. Volumes may be used to store files, as virtual memory swap space, or for any other purpose.
Phoenix File System Segment Table entry:
Offset Size Field Name Description
0h 1 QWORD StartSector Starting logical sector of this Segment
8h 1 QWORD Length Number of logical sectors in this Segment
10h 20 BYTES NextDrive DriveID of next Segment in Volume (or 0 if last)
24h 1 BYTE NextSegment Segment number of next Segment in Volume
25h 1 BYTE NumSegments Number of Segments belonging to this same Volume
26h 1 BYTE Sequence Index number of Segment within Volume
27h 1 BYTE Flags
bit 0 : BootVolume Whether the Volume this Segment belongs to is bootable
bit 1 : Interleaved Whether the Segments in this same Volume are interleaved
(also called data striping)
bit 2 : InUse Indicates the System Segment Table entry contains valid information
bit 3 : FinalSegment Indicates the given Segment is the last in its Volume
bits 4-7 : (reserved) must be 0
------------
Total: 40 bytes
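As a rough illustration of the entry layout above, the following sketch decodes one 40-byte Segment Table entry. Little-endian byte order is an assumption (the document targets PC-compatible hardware but does not state it), and the function and dictionary key names are hypothetical.

```python
import struct

# Layout from the table above: two QWORDs, a 20-byte DriveID, four single bytes.
# Little-endian byte order is an assumption, not stated in the specification.
STE_FORMAT = "<QQ20sBBBB"
STE_SIZE = struct.calcsize(STE_FORMAT)  # 40 bytes

FLAG_BOOT_VOLUME = 0x01    # bit 0
FLAG_INTERLEAVED = 0x02    # bit 1
FLAG_IN_USE = 0x04         # bit 2
FLAG_FINAL_SEGMENT = 0x08  # bit 3
RESERVED_MASK = 0xF0       # bits 4-7 must be 0


def parse_segment_table_entry(raw):
    """Decode one Segment Table entry (hypothetical helper)."""
    if len(raw) != STE_SIZE:
        raise ValueError("a Segment Table entry is exactly 40 bytes")
    (start, length, next_drive, next_segment,
     num_segments, sequence, flags) = struct.unpack(STE_FORMAT, raw)
    if flags & RESERVED_MASK:
        raise ValueError("reserved flag bits 4-7 must be 0")
    return {
        "start_sector": start,
        "length": length,
        "next_drive": next_drive,
        "next_segment": next_segment,
        "num_segments": num_segments,
        "sequence": sequence,
        "boot_volume": bool(flags & FLAG_BOOT_VOLUME),
        "interleaved": bool(flags & FLAG_INTERLEAVED),
        "in_use": bool(flags & FLAG_IN_USE),
        "final_segment": bool(flags & FLAG_FINAL_SEGMENT),
    }
```

Rejecting entries with reserved bits set is a design choice of this sketch; an implementation might prefer to ignore them.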
Phoenix File System Segment Table and Bootstrap Loader format:
Offset Size Field Name Description
0h 138 BYTES BootstrapInit Bootstrap loader initialization
8Ah 1 BYTE Flags File system and device flags
bit 0 : LBA use BIOS Interrupt 13h LBA extensions
bits 1-7 : (reserved) must be 0
8Bh 1 BYTE BootSectors Number of logical sectors in the bootstrap loader
8Ch 1 DWORD MagicNumber Special number to help discern a Segment Table from other
data should the file system become corrupt, equal to
31676573h ("seg1")
90h 8 STEs Segments Segment Table
1E0h 1 QWORD CylinderTable Logical sector number where Cylinder Table for device resides
or 0 if no Cylinder Table is present
1E8h 1 WORD SectorSize Logical Sector Size for this device
1EAh 20 BYTES DiskID Serial number used to identify drive
1FEh 1 WORD Signature Boot Record Signature (AA55h)
------------
Total: 512 bytes*
* The Segment Table and Bootstrap Loader sector is always 512 bytes, even if the
device's logical sectors are larger than 512 bytes. Data stored in the remainder
of the Segment Table and Bootstrap Loader block is undefined by the Phoenix File
System and may be used however the operating system wishes.
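The sector layout above can be checked with a short sketch like the following, which validates the MagicNumber and Signature fields and slices out the eight raw Segment Table entries. Little-endian byte order and all function and key names are assumptions made for illustration.

```python
import struct

SEGMENT_MAGIC = 0x31676573    # bytes "seg1" when stored little-endian
BOOT_SIGNATURE = 0xAA55
STE_SIZE = 40                 # size of one Segment Table entry
SEGMENT_TABLE_OFFSET = 0x90
NUM_ENTRIES = 8


def read_segment_table_sector(sector):
    """Validate and split a 512-byte Segment Table sector (hypothetical helper)."""
    if len(sector) < 512:
        raise ValueError("the Segment Table sector is always 512 bytes")
    flags, boot_sectors, magic = struct.unpack_from("<BBI", sector, 0x8A)
    cylinder_table, sector_size = struct.unpack_from("<QH", sector, 0x1E0)
    (signature,) = struct.unpack_from("<H", sector, 0x1FE)
    if magic != SEGMENT_MAGIC:
        raise ValueError("MagicNumber mismatch: not a Segment Table")
    if signature != BOOT_SIGNATURE:
        raise ValueError("missing boot record signature AA55h")
    entries = [
        sector[SEGMENT_TABLE_OFFSET + i * STE_SIZE:
               SEGMENT_TABLE_OFFSET + (i + 1) * STE_SIZE]
        for i in range(NUM_ENTRIES)
    ]
    return {
        "lba": bool(flags & 0x01),
        "boot_sectors": boot_sectors,
        "entries": entries,                 # eight raw 40-byte entries
        "cylinder_table": cylinder_table,   # 0 means no Cylinder Table
        "sector_size": sector_size,
        "disk_id": sector[0x1EA:0x1FE],
    }
```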
The function of the Bootstrap loader initialization is to locate the bootable
PhoenixFS Segment, load the first sector of the bootstrap from the Segment into
memory, and then transfer control of the processor to the bootstrap program. The
BootSectors field should always be at least 1; if it is greater than 1, the
Bootstrap loader initialization should load each sector from the storage device
consecutively into memory. This allows a device with 512-byte sectors and a
BootSectors value of 2 to be practically indistinguishable during the boot process
from a device with 1024-byte sectors, if one were supported, and a BootSectors
value of 1. Again, the format of the data in these extended boot sectors is
undefined and implementation-dependent.
Diagram of system partitions and PhoenixFS Segments:
PhoenixFS Segment Table
(First sector in partition)
Master Boot Record .-----------.
(Logical sector 0) /| bootstrap | Segment
.-------------. / | loader | .------------.
| boot | / |-----------| /| |
| code | / | Segment 0 | / | |
| . | / |-----------| / | |
| . | / | Segment 1 | / | |
| . | / |-----------|/ | |
|-------------|/ | Segment 2 | . .
| partition 0 | |-----------|\ . .
| | | Segment 3 | \ . .
|-------------|\ |-----------| \ | |
| partition 1 | \ | Segment 4 | \ | |
| | \ |-----------| \| |
|-------------| \ | Segment 5 | `------------'
| partition 2 | \ |-----------|
| | \ | Segment 6 |
|-------------| \ |-----------|
| partition 3 | \| Segment 7 |
| | `-----------'
`-------------'
The CylinderTable field holds the location of an optional Cylinder Table, which can be used to
increase file system performance by noting between which sequential logical sectors greater seek
latency occurs. No specific knowledge of the storage device is required; rather, such a table is
constructed by empirically observing seek latencies between adjacent sectors and is stored for use
between file system sessions. The name originates from the fact that most storage devices store
information logically in cylinders broken into sectors, and that seeking from a sector in one
cylinder to a sector in another cylinder takes longer than seeking between two sectors in the same
cylinder. The Cylinder Table would then end up storing the first sector in each cylinder, since the
seek from the last sector of the previous cylinder to it would take longer than a seek between any
two sectors in the same cylinder. However, the Cylinder Table is not restricted to describing only
the beginnings of cylinders; it can be used to indicate any high-latency seek between logically
consecutive sectors. Many SCSI drives can logically remap good sectors over sectors that go bad
during normal operation; any seek to or from such a sector would have much higher latency than
would be expected if the device were not performing the translation.
Volume Abstraction Layer
- This layer manages the grouping of one or more Segments into logical Volumes. Furthermore, this layer makes a collection of Segment Sectors in distinct Segments appear as a single array of Volume Sectors; each Volume Sector represents a portion of the underlying physical media. This is the final layer of sector-based abstraction present in the Phoenix File System. Volumes are the basic unit of all high-level file operations. This layer of abstraction allows Volumes to be independent of the underlying storage devices.
If the Interleaved flag is clear then each Segment is responsible for a consecutive number of Volume Sectors corresponding to the size of the Segment; the order of the Segments is determined by the Sequence value. For example, if two Segments belong to a single Volume, one Segment with 50,000 logical sectors and a Sequence value of 0, the other with 30,000 logical sectors and a Sequence value of 1, and each volume sector is composed of 2 Segment Sectors, then the Volume would consist of 40,000 Volume Sectors. The first 25,000 Volume Sectors, numbered 0 through 24,999, would be located on the first Segment and the last 15,000 Volume Sectors, numbered 25,000 through 39,999, would be located on the second Segment.
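The non-interleaved mapping in the example above can be sketched as follows. The function name is hypothetical, and the sketch assumes that Volume Sectors are laid out contiguously from the start of each Segment and that any leftover Segment Sectors that do not fill a whole Volume Sector are unused; the document does not state either point.

```python
def locate_volume_sector(volume_sector, segment_lengths, sectors_per_volume_sector):
    """Map a Volume Sector number to (Sequence value, first Segment Sector).

    segment_lengths lists each Segment's length in logical sectors,
    ordered by Sequence value.
    """
    if volume_sector < 0:
        raise ValueError("volume sector numbers start at 0")
    first = 0  # first Volume Sector held by the current Segment
    for sequence, length in enumerate(segment_lengths):
        count = length // sectors_per_volume_sector
        if volume_sector < first + count:
            offset = (volume_sector - first) * sectors_per_volume_sector
            return sequence, offset
        first += count
    raise ValueError("volume sector is beyond the end of the Volume")
```

With the example Segments of 50,000 and 30,000 logical sectors and 2 Segment Sectors per Volume Sector, Volume Sector 25,000 maps to Segment Sector 0 of the Segment with Sequence value 1.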
If the Interleaved flag is set then the Phoenix File System scatters data across all the Segments in the Volume in a uniform pattern such that individual read and write operations can be fulfilled cooperatively by all the underlying physical devices providing storage for that Volume. This requires that each Segment in the Volume represent exactly the same number of logical sectors. A volume sector belongs to the Segment whose Sequence value equals the volume sector number modulo NumSegments. For example, if three Segments belong to a single Volume and each volume sector is composed of 2 Segment Sectors, a write to volume sector 3401 would go to the Segment with Sequence value 2, while a write to volume sector 3402 would go to the Segment with Sequence value 0. This is because the volume sector number 3401 modulo the number of Segments in the Volume, 3, yields 2, and the volume sector number 3402 modulo 3 yields 0. Note that the entirety of the volume sector is written to the same Segment, even if the number of Segment Sectors per volume sector is greater than 1. It would be possible to further interleave the Segment Sectors which compose a volume sector if and only if the number of Segments per Volume is a power of 2 and all Segments in the Volume have the same logical sector size. This additional interleaving is not currently supported by the Phoenix File System but may be implemented in a future version.
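A minimal sketch of the interleaved mapping follows. The choice of Segment by modulo comes from the text; the position within the Segment (volume sector number divided by the number of Segments, times the Segment Sectors per volume sector) is an assumption, since the document does not say where in the Segment the data lands. The function name is hypothetical.

```python
def locate_interleaved(volume_sector, num_segments, sectors_per_volume_sector):
    """Map a Volume Sector to (Sequence value, first Segment Sector)."""
    if volume_sector < 0 or num_segments < 1:
        raise ValueError("invalid volume sector or Segment count")
    sequence = volume_sector % num_segments   # stated in the text
    stripe = volume_sector // num_segments    # assumed placement rule
    return sequence, stripe * sectors_per_volume_sector
```

For the three-Segment example, volume sector 3401 lands on Sequence value 2 and volume sector 3402 on Sequence value 0.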
Format of a Volume Descriptor:
Offset Size Field Name Description
0h 4 BYTES BootStubStart Short intrasegment jump to real start of boot stub
4h 1 BYTE StubSectors Number of logical sectors in the boot stub program
5h 1 BYTE Flags
bit 0 : Dirty Set to 1 when the file system is initialized; set to 0 when
the file system is properly shut down.
bits 1-7 : (reserved) must be 0
6h 1 BYTE ClusterSize Log base 2 of size of a Volume Sector in 512-byte units
7h 1 BYTE (reserved) must be 0
8h 1 QWORD VolumeSize Number of Volume Sectors in this Volume
10h 1 QWORD SuperBlock Volume sector number where SuperBlock is located
18h 1 DWORD DateCreated Date/Time Volume was created
1Ch 1 DWORD (reserved) must be 0
20h 64 BYTES VolumeLabel Short name associated with the volume (null terminated string)
60h x BYTES BootStub Minimum of 416 bytes of boot stub program
----------------
Total: 512 bytes minimum
At the point in the boot process when the Volume Descriptor is loaded into memory, it is read as a single
Segment Sector, without knowledge of the number of Segment Sectors per volume sector. The
StubSectors field determines the number of Segment Sectors
the boot stub program occupies; this field must always be at least 1 and must be a multiple of the
number of Segment Sectors per volume sector. The first task of the boot stub should be to load
any additional boot stub sectors into memory consecutively after the first boot stub sector; this should
allow for linear program execution.
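The fixed part of the Volume Descriptor can be decoded with a sketch along these lines. Little-endian byte order, the interpretation of ClusterSize as a shift applied to 512 bytes, and every function and key name here are assumptions for illustration only.

```python
import struct

# Fields from offset 0h up to (not including) BootStub at 60h.
VD_FORMAT = "<4sBBBBQQII64s"
VD_HEADER_SIZE = struct.calcsize(VD_FORMAT)  # 96 bytes (60h)


def parse_volume_descriptor(sector):
    """Decode the first Segment Sector of a Volume (hypothetical helper)."""
    if len(sector) < 512:
        raise ValueError("a Volume Descriptor is at least 512 bytes")
    (_jump, stub_sectors, flags, cluster_size, _reserved1,
     volume_size, superblock, date_created, _reserved2,
     label) = struct.unpack_from(VD_FORMAT, sector)
    if stub_sectors < 1:
        raise ValueError("StubSectors must be at least 1")
    return {
        "stub_sectors": stub_sectors,
        "dirty": bool(flags & 0x01),
        # Assumed reading of ClusterSize: log2 of the size in 512-byte units.
        "volume_sector_bytes": 512 << cluster_size,
        "volume_size": volume_size,
        "superblock": superblock,
        "date_created": date_created,
        "label": label.split(b"\0", 1)[0],   # bytes before the terminator
        "boot_stub": sector[VD_HEADER_SIZE:],
    }
```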
Volume Sector Allocation
- This layer exists to keep track of which volume sectors have been allocated to hold data, which are available, and which are unusable (bad).
FSDLT Free Space Descriptors (FSDs)
.----.
1 | o--------->|100010010...100101| Allocation map for sectors 0 - 4095
|----|
2 | o--------->|000010101...110001| Allocation map for sectors 4096 - 8191
|----|
3 | o--------->|100100101...001000| Allocation map for sectors 8192 - 12287
|----|
. .
. .
|----|
N | o--------->|001001000...001001| Allocation map for sectors 4096*(N-1) - 4096*N-1
`----'
Where N is the number of entries in the
Free Space Descriptor table (equal to NumVolumeSectors / SectorsPerFSD,
or in this example, NumVolumeSectors / 4096).
| VolumeSectorSize | SectorsPerFSD | Bytes accounted for per FSD | FSD location entries per sector of Free Space Descriptor Location Table | Sectors accountable per sector of Free Space Descriptor Location Table | Bytes accountable per sector of Free Space Descriptor Location Table |
| 256 bytes | 2048 sectors | 524,288 bytes | 32 entries | 65,536 sectors | 16,777,216 bytes |
| 512 bytes | 4096 sectors | 2,097,152 bytes | 64 entries | 262,144 sectors | 134,217,728 bytes |
| 1024 bytes | 8,192 sectors | 8,388,608 bytes | 128 entries | 1,048,576 sectors | 1,073,741,824 bytes |
| 2048 bytes | 16,384 sectors | 33,554,432 bytes | 256 entries | 4,194,304 sectors | 8,589,934,592 bytes |
| 4096 bytes | 32,768 sectors | 134,217,728 bytes | 512 entries | 16,777,216 sectors | 68,719,476,736 bytes |
| 8192 bytes | 65,536 sectors | 536,870,912 bytes | 1024 entries | 67,108,864 sectors | 549,755,813,888 bytes |
| 16384 bytes | 131,072 sectors | 2,147,483,648 bytes | 2048 entries | 268,435,456 sectors | 4,398,046,511,104 bytes |
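The figures in the table above follow from simple arithmetic, sketched here. Two rules are inferred from the numbers rather than stated in the text: each Free Space Descriptor is a one-volume-sector bitmap with one bit per volume sector, and each entry in the Free Space Descriptor Location Table is 8 bytes. Function names are hypothetical, and descriptor indexes here are zero-based, whereas the diagram counts from 1.

```python
def fsd_geometry(volume_sector_size):
    """Reproduce one row of the table for a given VolumeSectorSize in bytes."""
    sectors_per_fsd = volume_sector_size * 8               # inferred: 1 bit per sector
    entries_per_table_sector = volume_sector_size // 8     # inferred: 8-byte entries
    sectors_per_table_sector = entries_per_table_sector * sectors_per_fsd
    return {
        "sectors_per_fsd": sectors_per_fsd,
        "bytes_per_fsd": sectors_per_fsd * volume_sector_size,
        "entries_per_table_sector": entries_per_table_sector,
        "sectors_per_table_sector": sectors_per_table_sector,
        "bytes_per_table_sector": sectors_per_table_sector * volume_sector_size,
    }


def locate_allocation_bit(volume_sector, volume_sector_size):
    """Return (zero-based FSD index, bit index within that FSD)."""
    return divmod(volume_sector, volume_sector_size * 8)
```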
File Allocation Layer
- This layer exists to group volume sectors into units called files; a file is a collection of information that is logically related. In the Phoenix File System, files are described using a tree structure in which the tree's leaves hold the actual file information. Each node and leaf of the data tree is described using a Tree Node Descriptor:
Structure of a Tree Node Descriptor:
Offset Size Field Name Description
0h QWORD
bit 0 : Type (1) Descriptor refers to SubTree block (internal node)
(0) Descriptor refers to data block (leaf)
bits 1-63 : Location Volume Sector number for block location
8h QWORD Length If Type is 1, is total number of bytes described by SubTree
If Type is 0, is number of bytes in data block
----------------
Total: 16 bytes
Where each block of information is exactly 1 volume sector in size and can hold information
either about the file's contents (a tree leaf) or about the location of additional blocks (a
tree node). If a sector is a data block, Length bytes
of the sector are considered to be part of the file's contents. If a sector is a SubTree block,
it is simply a consecutive list of Tree Node Descriptors and the Length
field represents the total number of bytes of the file's information that is described by
all data blocks in or under the SubTree.
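The packing of the Type bit and Location into a single QWORD can be sketched as follows. Little-endian byte order and the helper names are assumptions; the bit positions come from the table above.

```python
import struct

TND_FORMAT = "<QQ"   # packed Type/Location word, then Length
TND_SIZE = struct.calcsize(TND_FORMAT)  # 16 bytes


def pack_tnd(is_subtree, location, length):
    """Build a Tree Node Descriptor: bit 0 is Type, bits 1-63 are Location."""
    if not 0 <= location < (1 << 63):
        raise ValueError("Location must fit in 63 bits")
    return struct.pack(TND_FORMAT, (location << 1) | int(bool(is_subtree)), length)


def unpack_tnd(raw):
    """Return (is_subtree, location, length) from a 16-byte descriptor."""
    word, length = struct.unpack(TND_FORMAT, raw)
    return bool(word & 1), word >> 1, length


def descriptors_per_subtree_block(volume_sector_size):
    """A SubTree block is one volume sector of back-to-back descriptors."""
    return volume_sector_size // TND_SIZE
```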
Structure of a File Node:
Offset Size Field Name Description
00h DWORD MagicNumber Special number to help discern a File Node from other
data should the file system become corrupt, equal to
31534650h ("PFS1")
04h DWORD HardLinks Number of references made from Directories to this file
08h DWORD Flags Basic file attributes
bit 0 : ArchiveFlag Set whenever the LastModified field is updated
bit 1 : SystemFlag Indicates file is an operating-system related file
bit 2 : HiddenFlag Indicates file should not be listed in default file listings
bit 3 : ReadOnlyFlag Indicates file cannot be written to or deleted
bits 4-7 : (reserved) must be 0
bit 8 : ImmediatePurge Indicates whether file should be immediately purged on delete
bits 9-15 : (reserved) must be 0
--- end general flags, begin internal flags ---
bits 16-18 : FileType Type definition for file data
000 = generic file
001 = directory
010 = symbolic link
bits 19-20 : Compression reserved for file data compression level; must be 0
bit 21 : Encryption reserved for file data encryption flag; must be 0
bit 22 : DeletedFlag Indicates whether or not this file has been deleted
bit 23 : PurgeFlag Indicates whether or not this file is to be purged
bits 24,25 : InternalFlag Indicates what file information, if any, is stored
internally in the File Node
00 = no internal data
01 = internal rights information
10 = internal extended attributes
11 = internal file data
bits 26,27 : (reserved) must be 0
bit 28 : NodeSize Indicates whether or not the File Node occupies the full
volume sector
0 = File Node is half the size of the volume sector
1 = File Node occupies the entire volume sector
bits 29-31 : (reserved) must be 0
0Ch DWORD Owner Object ID of owner of this File Node
10h DWORD Creator Object ID which created this File Node
14h DWORD Modifier Object ID of user who last modified File Node
18h DWORD Created Date/Time File Node was created
1Ch DWORD LastModified Date/Time File Node was last modified
20h DWORD DataAccessed Date/Time file data last accessed
24h DWORD DataModified Date/Time file data last modified
28h QWORD FileSize Total length of file data
30h 1 TND Rights Rights list data tree
40h 2 TNDs EAs Extended Attributes data tree
60h 7 TNDs FileData File contents data tree
D0h x BYTES InternalData minimum of 48 bytes of space specifically set aside for
storing small amounts of data inside the File Node
without using data trees. The InternalFlag determines
which information, if any, is stored internally.
----------------
Total: 256 bytes minimum
A File Node is always at most 1 volume sector in size and at least 256 bytes in size. As such,
the amount of space reserved for internal data within a File Node can vary from 48 bytes to
VolumeSectorSize-208
bytes. A File Node may occupy an entire sector or only half of a sector, the latter
being valid only for a VolumeSectorSize of 512 bytes or
more (since one half of 512 bytes is 256 bytes, the minimum size of a File Node). Furthermore,
each File Node is identified using a File Node Number, of which bits 1-63 indicate the
volume sector number the File Node resides in, and bit 0 is 0 if the File Node is in the
first half of the sector and 1 if it is in the second half. The
following table summarizes the minimum and maximum sizes of File Nodes and the amount of space
reserved in each File Node for internal data, based on the size of a volume sector.
| VolumeSectorSize | Minimum File Node Size | Maximum File Node Size | Minimum Internal Data Reserve | Maximum Internal Data Reserve |
| 512 bytes | 256 bytes | 512 bytes | 48 bytes | 304 bytes |
| 1024 bytes | 512 bytes | 1024 bytes | 304 bytes | 816 bytes |
| 2048 bytes | 1024 bytes | 2048 bytes | 816 bytes | 1840 bytes |
| 4096 bytes | 2048 bytes | 4096 bytes | 1840 bytes | 3888 bytes |
| 8192 bytes | 4096 bytes | 8192 bytes | 3888 bytes | 7984 bytes |
| 16384 bytes | 8192 bytes | 16384 bytes | 7984 bytes | 16176 bytes |
| VolumeSectorSize | Maximum Internal Data Reserve | Maximum Internal Rights Info | Maximum Internal Extended Attributes | Maximum Internal File Data |
| 512 bytes | 304 bytes | 318 bytes (320 total) | 334 bytes (336 total) | 416 bytes |
| 1024 bytes | 816 bytes | 830 bytes (832 total) | 846 bytes (848 total) | 928 bytes |
| 2048 bytes | 1840 bytes | 1854 bytes (1856 total) | 1870 bytes (1872 total) | 1952 bytes |
| 4096 bytes | 3888 bytes | 3902 bytes (3904 total) | 3918 bytes (3920 total) | 4000 bytes |
| 8192 bytes | 7984 bytes | 7998 bytes (8000 total) | 8014 bytes (8016 total) | 8096 bytes |
| 16384 bytes | 16176 bytes | 16190 bytes (16192 total) | 16206 bytes (16208 total) | 16288 bytes |
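The File Node Number encoding and the size arithmetic behind the tables above can be sketched as follows. The names are hypothetical, and the rule that internal file data may also reuse the 7 FileData descriptors (112 bytes) is inferred from the table values rather than stated in the text.

```python
FIXED_FIELDS = 0xD0     # 208 bytes precede InternalData in a File Node
TND_SIZE = 16           # size of one Tree Node Descriptor
FILE_DATA_TNDS = 7      # descriptors in the FileData field


def decode_file_node_number(number, volume_sector_size):
    """Return (volume sector, byte offset of the File Node in that sector)."""
    volume_sector = number >> 1          # bits 1-63
    second_half = number & 1             # bit 0
    return volume_sector, second_half * (volume_sector_size // 2)


def internal_data_reserve(node_size):
    """Bytes set aside for InternalData in a File Node of node_size bytes."""
    if node_size < 256:
        raise ValueError("a File Node is at least 256 bytes")
    return node_size - FIXED_FIELDS


def max_internal_file_data(node_size):
    """Reserve plus the FileData descriptor space (inferred from the table)."""
    return internal_data_reserve(node_size) + FILE_DATA_TNDS * TND_SIZE
```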
Structure of a Directory entry:
Offset Size Field Name Description
00h QWORD FileNode File node of file associated with entry
08h WORD NameLength Length of file name in Unicode characters
0Ah x bytes Name Name associated with entry
----------------
Total: 10+x bytes minimum
These entries are packed back-to-back in the directory file.
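Walking the packed entries of a directory file might look like the sketch below. The document does not name the Unicode encoding, so this assumes UTF-16LE with NameLength counted in 16-bit code units; byte order and the function name are likewise assumptions.

```python
import struct

ENTRY_HEADER = "<QH"    # FileNode, NameLength
ENTRY_HEADER_SIZE = struct.calcsize(ENTRY_HEADER)  # 10 bytes


def iter_directory_entries(data):
    """Yield (file_node_number, name) for each packed directory entry."""
    offset = 0
    while offset + ENTRY_HEADER_SIZE <= len(data):
        file_node, name_length = struct.unpack_from(ENTRY_HEADER, data, offset)
        offset += ENTRY_HEADER_SIZE
        name_bytes = data[offset:offset + 2 * name_length]  # assumed 2 bytes/char
        if len(name_bytes) != 2 * name_length:
            raise ValueError("directory entry name is truncated")
        yield file_node, name_bytes.decode("utf-16-le")
        offset += 2 * name_length
```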
Structure of a Rights List Entry:
Offset Size Field Name Description
00h DWORD User Object ID of user or group to whom the rights pertain
04h DWORD Rights Determines what rights the User has to this file node
bit 0 : Find user may read file node information
bit 1 : Read user may read file data
For generic files : may read the "contents" of the file
For directories : may scan the directory contents
For symbolic links : may see where the link points to
bit 2 : ReadRights user may read the rights information of the file node
bit 3 : ChangeRights user may modify the rights information in the file node
bit 4 : ChangeEAs user may modify the extended attributes of the file node
bit 5 : ChangeOwner user may change the ownership of a file node
bit 6 : ChangeFlags user may modify the file node's general flags
bit 7 : Unlink user may remove a link to this file node
bit 8 : Write user may modify the file data of a generic file
bit 9 : Redirect user may change the file data of a symbolic link
bit 10 : Create user may add an entry to a directory (modify directory file data)
bit 11 : Rename user may rename an entry in a directory (modify directory file data)
bit 12 : Remove user may remove an entry from a directory (modify directory file data)
bits 13-23 : (reserved) must be 0
bit 24 : Supervisor user has all rights to the file node
bits 25-30 : (reserved) must be 0
bit 31 : InheritMask (1) rights list entry is an inherited rights mask
(0) rights list entry contains access rights
----------------
Total: 8 bytes
Rights for a given user are resolved using the fully qualified path of the file in question.
The rights list of the file is scanned for an entry explicitly defining the rights of the user in question;
if an entry is found, it completely determines the user's rights.
If not, the parent directory's rights list is scanned for an entry explicitly defining the rights of the user;
if an entry is found, it determines the user's rights, subject to modification by an inherited rights mask. If an
entry is not found, the process continues up the directory tree until the root is reached. If the root is
reached and no rights were ever defined, the entire process is repeated for each group the user belongs to.
If no rights entries are found to be applicable to the user, the user is presumed to have no rights for
the given file node. The fact that the directories which constitute the fully qualified path to the file
can determine a file's access rights illustrates that rights for a file may be inherited. It
also illustrates how, through inheritance, two directory entries which are linked to the same
file node may actually have different access rights. Even though the rights list for the file node is the
same for both directory entries (since both entries actually refer to the same file), the inherited rights
can differ.
File or Directory Access Rights Inherited Rights Mask effective Inherited Rights Mask
/example/of/rights 11111111...11
/example/of 11110111...11 11110111...11
/example 11011111...01 10011111...11 10010111...11
/ 00001100...00 10010111...11
So the given user's access to the file node linked to /example/of/rights would be
Access Rights : 11011111...01
effective Inherited Rights Mask : 10010111...11 AND
--------------
user's effective access rights : 10010111...01
One thing worth pointing out is that a user's access rights are determined by a single rights list entry,
located either in the file node's rights list or in the rights list of a directory which forms part of the fully qualified
path to the given file.
On the other hand, a user's Inherited Rights Mask is determined by AND'ing the appropriate Inherited Rights Mask
entries from each directory that forms the fully qualified path to the given file. Furthermore, an
effective Inherited Rights Mask is only applied when the user inherited their rights to a file from a
directory above the file (i.e. the given file did not explicitly list access rights for the given user or a
group that the user belongs to).
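The resolution walk described above can be sketched for a single user as follows. This is only an illustration under stated assumptions: rights lists are passed in as plain Python lists ordered from the file node up to the root, mask entries found on the way up (including the one in the directory that supplies the rights, as in the worked example) are AND'ed together, and group handling is left out because the text does not say how results from several groups combine. All names are hypothetical.

```python
INHERIT_MASK = 1 << 31   # bit 31 marks an inherited rights mask entry
ALL_BITS = 0xFFFFFFFF


def resolve_rights(user, rights_lists):
    """Return the effective rights DWORD for one user.

    rights_lists[0] is the file node's own rights list; each later item is
    the rights list of the next directory up the path, ending at the root.
    Every rights list is a sequence of (user_id, rights_dword) pairs.
    """
    effective_mask = ALL_BITS
    for depth, entries in enumerate(rights_lists):
        explicit = None
        for entry_user, dword in entries:
            if entry_user != user:
                continue
            if dword & INHERIT_MASK:
                effective_mask &= dword      # accumulate masks on the way up
            else:
                explicit = dword
        if explicit is not None:
            # Rights listed on the file itself are used as-is; inherited
            # rights are filtered through the accumulated mask.
            return explicit if depth == 0 else explicit & effective_mask
    return 0   # no applicable entry: the user has no rights
```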
Structure of an Extended Attribute:
Offset Size Field Name Description
00h BYTE NameLength Length of attribute name (x)
01h BYTE Type Base data type stored in EA
0 = Character (8 bits)
1 = Short Integer Value (8 bits)
2 = Integer Value (32 bits)
3 = Integer Value (64 bits)
8 = Floating Point Value (64 bits)
02h WORD ValueLength Length of attribute value data
04h x BYTES Name The name of the Extended Attribute
x+04h y BYTES Value The EA value; format is determined by the data type
----------------
Total: x+y+4 bytes
Where x is equal to the length of the attribute name rounded to a multiple of 4 bytes and
y is equal to the length of the actual attribute value data.
While a value of 0 in the ValueLength field is acceptable,
the NameLength may never be 0.
Floating point numbers are stored as a double-precision ANSI/IEEE Standard 754-1985 binary floating point
value ("64-bit real"). Any number of values of the base data type may be stored in a single extended
attribute's value data (allowing "arrays" of data to be stored in a single extended attribute). Each entry
in the attribute's value can be referenced given an index. The number of entries of the base data type
that are stored in the extended attribute's value can be determined by examining the ValueLength field;
the ValueLength must always be a multiple of the size of the base type stored.
For example, a 9 character null-terminated string can be stored in an extended attribute by specifying a base Type of
character and a ValueLength of 10 (9 characters and a null character).
All strings should be stored null-terminated using the character data type. The file system should provide
API functions to read and write null-terminated strings as extended attribute values. The file
system can also perform bounds checking on requests to read from an index into the value data
that is invalid by comparing the index with the number of entries actually present.
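A sketch of encoding and decoding one Extended Attribute is shown here. It assumes little-endian fields, that NameLength holds the unpadded name length, and that the Name field is zero-padded up to a multiple of 4 bytes; the element sizes are taken directly from the Type list above. The helper names are hypothetical.

```python
import struct

# Element size in bytes for each base Type, as listed in the structure above.
TYPE_SIZES = {0: 1, 1: 1, 2: 4, 3: 8, 8: 8}
EA_HEADER = "<BBH"   # NameLength, Type, ValueLength


def encode_ea(name, ea_type, value):
    """Serialize one Extended Attribute from raw name and value bytes."""
    if not 1 <= len(name) <= 255:
        raise ValueError("NameLength may never be 0 and must fit in a byte")
    if len(value) > 0xFFFF or len(value) % TYPE_SIZES[ea_type]:
        raise ValueError("ValueLength must be a multiple of the base type size")
    padded_name = name + b"\0" * (-len(name) % 4)   # assumed zero padding
    return struct.pack(EA_HEADER, len(name), ea_type, len(value)) + padded_name + value


def decode_ea(raw, offset=0):
    """Return (name, type, value, number of base-type entries)."""
    name_length, ea_type, value_length = struct.unpack_from(EA_HEADER, raw, offset)
    name_start = offset + struct.calcsize(EA_HEADER)
    value_start = name_start + ((name_length + 3) & ~3)
    name = raw[name_start:name_start + name_length]
    value = raw[value_start:value_start + value_length]
    return name, ea_type, value, value_length // TYPE_SIZES[ea_type]
```

The entry count returned by `decode_ea` is what a bounds check on an index into the value data would compare against.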
The SuperBlock
I'm just going to make notes of the fields that will need to be in the SuperBlock as we think of them; we can go back and define the format later.
- Location of Free Space Descriptor Table
- Bad Block List File Node number
- Root Directory File Node number
- File System Version Number
- Number of bytes of reserved space in File Nodes for internal data
- A Features DWORD, currently reserved (0); later on, bits may indicate advanced features
- Date/Time created
- Volume sector around which to locate directories (center of directory band)
- Minimum acceptable percentage of volume sectors that may be free. ?
- Volume creator (who partitioned it and/or formatted it) ?
Backup copies of the SuperBlock are placed immediately after every 16th Free Space Descriptor on the volume. Any writes to the SuperBlock do not complete until all backup copies are also updated.