Contents
Changelog:
- 7 November 2018: Indicate that return value of
FAT_mount
is true on succees and false on error, not 0 or -1. - 9 November 2018: Note that
fat_shell
’slsdir
only prints from the first 1000 directory entries. - 9 November 2018: Mention implementing tracking the current working directory in the “possible order of implementation”; add note that the idea of the step about implementing open/close without pathnames is just about making file descriptor number allocation work and that one might track more than the first cluster number for open files.
- 11 November 2018: Correct
make submit
tomake archive
- 11 November 2018: add note in hints about cluster numbering being more complex than slides
- 12 November 2018: Correct
foo/bar
tobar
as whatfoo/../bar
is equivalent to - 12 November 2018: remove text about directories being required to have a leading
/
, contradicting the idea of paths that are relative to the current working directory. - 13 November 2018: provide updated version of
fat_test.cc
that corrects typos in messages about what the tests are doing (the tests themselves are unchanged).
Your Task
-
Download the skeleton code (link — last updated 1 November 2018) containing library function declarations, the beginnings of some testing code, and a sample Makefile.
- Write a C++ library to read from a FAT32 filesystem, except that
- You may assume filenames only contain 7-bit ASCII characters.
- You may assume that the filesystem is small enough to map into memory, such as with
mmap
. - You do not need to support “long” filenames (filenames longer than 8 characters plus a 3 character extension).
Your library should support a POSIX-like API (but using “file descriptors” your library tracks internally rather than ones tracked by the OS) including the following functions, which are declared in
fat.h
(and much exactly match these prototypes).Whenever a path within the filesystem is required, paths either
- start with a
/
. These paths are relative to the root directory of the filesystem. -
start with a non-
/
character. These paths are relative to the current working directory, which is initially the root directory Each path consists of directory or filenames separated by/
. For paths that name a directory (e.g. forFAT_readdir
), there may be a trailing/
(as in/
, the path of the root directory, or/FooBar/
, a directory calledFooBar
in the root directory). - (required for checkpoint)
bool FAT_mount(const std::string &path)
– open the specified FAT disk image and use it for all subsequentFAT_*
calls. In this function onlypath
is a path on the underlying OS’s filesystem instead of a path in the FAT volume. This should return true if the disk image is successfully opened, and false on any error. int FAT_open(const std::string &path)
– open the specified file. Directories in the path are seperated by “/”. Returns an integer file descriptor (one used by your library, not a real file descriptor on the underlying OS) that your library will accept for other calls. On failure, returns -1 instead; you must fail if you ask to open a file that does not exist. You must support opening at least 128 files at a time; after 128 files are opened you may return an error if more are opened. You may also return an error when a directory is opened.bool FAT_cd(const std::string& name)
– change the current working directory toname
. Returnsfalse
on failure (e.g. no such directory)bool FAT_close(int fd)
– close the specified file descriptor returned byFAT_open
. Returnsfalse
on failure (e.g. file not open).int FAT_pread(int fd, void *buffer, int count, int offset)
– copycount
bytes starting with byte numberoffset
in the previously opened file. Byte offsets start with0
, so reading with an offset of0
and acount
equal to the size of the file will read the entire file. Returns the number of bytes read, which will be less thancount
if the caller asks to read past the end-of-file. Returns -1 if the read fails (e.g.fd
was not opened). Reading 0 bytes succesfully, such as whencount
is 0 or whenoffset
is past the end of file should return0
, not-1
.- (required for checkpoint, but only the path “/” needs to work for checkpoint)
vector<AnyDirEntry> FAT_readdir(const std::string & path)
– return an vector of directory entries, whereAnyDirEntry
is a struct declared in the supplied filefat.h
. These directory entries should be exactly as found on disk. Do not filter out unused entries. In the event of an error, return an empty vector.
You should include a Makefile that builds your library as libfat.a, like our sample Makefile will.
- Create a tar.gz archive like the one our supplied Makefile does with
make archive
, and submit it via the submission site.
Reference
-
Refer to the FAT specification. This specification documents FAT32, FAT16, and FAT12, but you are only required to implement FAT32.
-
https://wiki.osdev.org/FAT may be a convenient reference.
-
You can test with this FAT32 disk image.
Included files
-
We include a
fat.h
which includes the prototypes necessary forAnyDirEntry
.AnyDirEntry
has the exact same layout in memory that the directory entries have on disk, so you do not need to convert field-by-field. -
We include a
fat_internal.h
which includes a struct that has the same layouts as the Bios Parameter Block to keep you from needing to worry about the details of converting that to a C struct. -
The structs declared in
fat.h
andfat_internal.h
are laid out so that values in memory have the exact same layout that have one disk. This allows you to convert the bytes on disk directly to the struct rather than copying variable-by-variable.To make this work, in addition to setting the size of each variable in the struct, we’ve used
__attribute__((packed))
to tell GCC to not add any extra space between each of the values in the struct. -
We include two testing programs that will be built and linked by the supplied Makefile:
-
one called
fat_shell
that lets you have a shell-like interface to the filesystem. Among the commands it supports are:mount FILENAME
lsdir DIRECTORY
— note that this only looks at the first 1000 entries returned by your readdir() (to avoid spamming with too much output)cp FILENAME OUTPUT-FILENAME SIZE
– copy SIZE bytes from a file from within the FAT filesytem to a normal file on diskopen FILENAME
close FILE-DESCRIPTOR-NUMBER
read FILE-DESCRIPTOR-NUMBER SIZE OFFSET
exit
-
one called
fat_test
that runs some fixed tests for the sample FAT32 image linked above. Note that this is not a complete test and does not verify several features we are likely to test you on, like opening multiple files at once, directories with a large number of entries, or filesystem images with different cluster sizes.The original version of the test file has some tests which work correctly but output messages that mistakenly reference a file other than the one being tested; a version that corrects this messages is available here
-
Hints
Viewing FAT filesystems
-
You may find a hex editor like
bless
(which you can install on your VM withsudo apt install bless
) helpful to examine the contents of a FAT32 disk image. -
You can mount a FAT disk image with the mount command. After creating a directory like
/mnt
to mount it to run something likesudo mount the-fat-disk-image.img /mnt
to mount it. You can then unmount it withsudo umount /mnt
. If you’re worried about accidentally modifying the disk image this way, you can also mount it read-only with a command likesudo mount -o ro the-fat-disk-image.img /mnt
.
Possible order of implementation
This is what I used for my reference implementation, you can do something different.
-
Load the BPB (the “header” indicating where the FAT and root directory are) from the disk into the supplied struct.
-
Write a function to read a particular cluster; use this read the first cluster of the root directory (whose location is specified by information in the BPP) and implement
readdir
. -
Write a function to read a particular entry from the FAT. Use this to implement
readdir
for when the root directory takes up multiple clusters. -
Write code for open and close that handles allocating and deallocating file descriptors, tracked by some global data structure. For now, don’t use a path yet, just open a file based on the number of its first cluster (and whatever else you need to track a file descriptor). The purpose of this step is just to have the code which chooses file descriptor numbers working.
-
Write code for open that handles paths in the root directory only and fails if the filename would be in the root directory but does not exist.
-
Write code for pread that handles reading from the first cluster of files only. You will need to modify your open code to store information needed to find this cluster.
-
Write code for pread that handles reading from more than the first cluster of files.
-
Modify
readdir
andopen
to traverse directories. You will probably want to create a shared utility function that looks up directory entries by pathname for both of these. -
Add support for tracking the current the working directory (probably as a cluster number) and implement relative pathname lookups based on this.
C++ tricks
Handy library functions
-
You may find the std::toupper function in
<cctype>
helpful for comparing filenames. -
std::string
has afind
method andsubstr
method you might consider using (see documentation). Or you can simply iterate through character-by-character for string opertaions. -
You can get the size of a file by seeking to the end and reading the location then seeking back to the beginning. You can also use the POSIX functions
fstat
orstat
.
Reading into structs
- Given a
char*
that contains data read from disk that contains the bytes of an array of structsFoo
, you can convert it into aFoo*
by casting. This is how I would recommend reading the array of directory entries from a directory.
Tracking current working directory
- It is probably easiest to keep track of the current working directory by remembering its cluster number.
On paths with .
and ..
-
FAT directories except the root directories should contain
.
and..
entries that point to the itself and the parent directory, so your filesystem should handle paths likefoo/../bar
as equivalent tobar
“for free” — except that:- you will need to treat these paths specially because they don’t have an extension like
.txt
and don’t use the extension part of the filename field of the directory entry - FAT uses cluster number
0
in directory entries to represent the root directory, even if it is stored in a different cluster number.
We do not care if you handle changing directory to
/..
or/.
(though a real OS would do this, even though the root directory does not contain these directory entries) - you will need to treat these paths specially because they don’t have an extension like
Locating clusters
- Our slides suggested that cluster 0 was located at the beginning of the disk. The way the FAT specification specifies to interpret cluster numbers in the FAT data structure is a bit more complex than this.