Building upon the low-level API introduced in an earlier post, today we will take a look at the platform-independent high-level API, which provides the features you would expect from a game engine file system.
Specifically, Molecule’s file system provides the following features:
- Multiple file devices (native, memory, double-buffered, pak, network, etc.)
- Encryption / decryption
- Compression / decompression (ZIP, LZMA, etc.)
- Synchronous / Asynchronous operations
- Aliases
One thing that should be kept in mind is that all of the above should be implemented as orthogonal features. The following is a short list of possible use cases:
- A pak-file can contain both compressed and uncompressed files, hence compression needs to be something which can piggyback onto other functionality.
- Any file, no matter where it comes from, can be encrypted – hence, decryption also needs to be implemented as a piggyback feature.
- The above should work for both synchronous and asynchronous operations.
- A streamed file might be contained in a pak-file.
Following the Law of Demeter, we would like to implement each feature in its own isolated space, but allow clients to build combinations thereof.
This time, I opted not to use policy-based design, because this kind of template-based design often forces you into compile-time decisions, whereas a file system is something where I wanted to have the option of run-time changes, configurations, etc.
Instead, the design I came up with for Molecule is the following:
- A file system supports an arbitrary number of file devices which can be mounted/unmounted to/from the system.
- Each file device takes care of exactly one thing, be it reading from disk, encrypting data, compressing data, sending data over the network, etc.
- A file device is responsible for returning a proper file interface, not the file system itself.
- File devices can be piggybacked onto other file devices, if they need to.
The last bullet point is a very important one, and will be discussed in a minute. Let’s take a look at the relevant parts of the file system API first:
class FileSystem
{
public:
    // ...
    typedef Flags<internal::FileSystemModeFlags> Mode;

    /// Mounts a file device to the file system
    void Mount(FileDevice* device);

    /// Unmounts a file device from the file system
    void Unmount(FileDevice* device);

    /// Opens a file for synchronous operations.
    /// NOTE: A nullptr is returned if no device for opening the file could be found.
    File* Open(const char* deviceList, const char* path, Mode mode);

    /// Opens a file for asynchronous operations.
    /// NOTE: A nullptr is returned if no device for opening the file could be found.
    AsyncFile* OpenAsync(const char* deviceList, const char* path, Mode mode);

    /// Closes a file previously returned by a call to Open()
    void Close(File* file);

    /// Closes a file previously returned by a call to OpenAsync()
    void Close(AsyncFile* file);
    // ...
};
Whenever a file is opened via a call to FileSystem::Open(), a File instance is returned. This interface offers common functionality for reading, writing, seeking, etc., and serves as an abstract base class. Examples of concrete implementations are the following:
- DiskFile – for reading from HDD, DVD, BluRay, etc.
- MemoryFile – for entirely reading a file first, and then just copying from memory. Uses any other File internally.
- CryptoFile – for decrypting/encrypting data upon reading/writing. Uses any other File internally.
As stated above, file devices are responsible for returning a proper File implementation. The way these devices behave is the following:
- A FileDevice is an abstract base class, which offers functionality for opening and closing a file.
- Each file device implementation (e.g. DiskFileDevice, CryptoFileDevice, etc.) takes care of returning the proper File instance.
- The file system walks through the list of mounted file devices, and asks the one corresponding to the device list’s identifier to open a file.
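To make the split between the two base classes more concrete, here is a minimal sketch of how they could fit together. It is not Molecule's actual code: `GetType()`, the device-level `Open()`/`Close()` signatures, and the in-memory `BufferFile`/`BufferFileDevice` pair are assumptions made for illustration, and the `Mode` flags are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical File base class. Public Read()/Write() forward to private
// virtual DoRead()/DoWrite(), which concrete files override.
class File
{
public:
    virtual ~File() {}
    std::size_t Read(void* buffer, std::size_t length) { return DoRead(buffer, length); }
    std::size_t Write(const void* buffer, std::size_t length) { return DoWrite(buffer, length); }

private:
    virtual std::size_t DoRead(void* buffer, std::size_t length) = 0;
    virtual std::size_t DoWrite(const void* buffer, std::size_t length) = 0;
};

// Hypothetical FileDevice base class: opens and closes files, nothing else.
class FileDevice
{
public:
    virtual ~FileDevice() {}
    // Identifier matched against the entries of a device list, e.g. "disk".
    virtual const char* GetType() const = 0;
    virtual File* Open(const char* path) = 0;
    virtual void Close(File* file) = 0;
};

// Illustrative concrete file: appends to and reads from a string owned by the device.
class BufferFile : public File
{
public:
    explicit BufferFile(std::string& storage) : m_storage(storage), m_readPos(0) {}

private:
    std::size_t DoRead(void* buffer, std::size_t length) override
    {
        assert(buffer != nullptr);
        // std::string::copy clamps to the number of characters that are left.
        const std::size_t count = m_storage.copy(static_cast<char*>(buffer), length, m_readPos);
        m_readPos += count;
        return count;
    }
    std::size_t DoWrite(const void* buffer, std::size_t length) override
    {
        m_storage.append(static_cast<const char*>(buffer), length);
        return length;
    }

    std::string& m_storage;
    std::size_t m_readPos;
};

// Illustrative concrete device: the only place that knows about BufferFile.
class BufferFileDevice : public FileDevice
{
public:
    const char* GetType() const override { return "buffer"; }
    File* Open(const char* path) override { return new BufferFile(m_contents[path]); }
    void Close(File* file) override { delete file; }

private:
    std::map<std::string, std::string> m_contents; // one string per path
};
```

Client code only ever sees `FileDevice*` and `File*`, so swapping `BufferFileDevice` for a different device changes nothing on the calling side.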
In order to make it easier to understand, let’s walk through a simple example:
// build a simple file system
FileSystem fs(fsArena, 8);
DiskFileDevice diskDevice;
fs.Mount(&diskDevice);
CryptoFileDevice cryptoDevice;
fs.Mount(&cryptoDevice);
// open a file
File* file = fs.Open("crypto:disk", "test.txt", FileSystem::Mode::WRITE | FileSystem::Mode::RECREATE);
// write something into the file
// ...
fs.Close(file);
Upon the call to FileSystem::Open(), the file system internally walks the list of mounted devices, and checks whose ID matches the first device in the list (“crypto”). It then asks the CryptoFileDevice to Open() the file.
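For this lookup, the device list has to be split into its first identifier and whatever follows. A small sketch of that step, assuming ':' is the only separator; `DeviceListParts` and `SplitDeviceList` are hypothetical names, not part of Molecule's API:

```cpp
#include <cassert>
#include <string>

// Result of splitting a device list such as "zip:crypto:tcp".
struct DeviceListParts
{
    std::string head; // identifier of the device to ask first, e.g. "zip"
    std::string tail; // remaining list for devices further down, e.g. "crypto:tcp"
};

// Splits at the first ':'; a list without ':' yields an empty tail.
inline DeviceListParts SplitDeviceList(const std::string& deviceList)
{
    assert(!deviceList.empty());
    const std::string::size_type colon = deviceList.find(':');
    if (colon == std::string::npos)
    {
        return DeviceListParts{ deviceList, std::string() };
    }
    return DeviceListParts{ deviceList.substr(0, colon), deviceList.substr(colon + 1) };
}
```

The file system would compare `head` against the identifiers of the mounted devices and keep `tail` around for the device it picks.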
And here comes the interesting part – the crypto file device is a piggyback device, which means it never touches files by itself. Instead, it asks other file devices (via the file system) to open the file. This time, the DiskFileDevice (“disk”) is responsible for opening the file, and returns a DiskFile implementation to the caller, which was the CryptoFileDevice.
The CryptoFileDevice in turn takes this DiskFile, and hands it to the CryptoFile implementation, which is returned to the user. Therefore, each time the user calls Read() on the given File, the underlying CryptoFile implementation does something like the following:
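unsigned int CryptoFile::DoRead(void* buffer, unsigned int length)
{
    // read the still-encrypted bytes from whichever File we piggyback onto
    const unsigned int bytesRead = m_file->Read(buffer, length);
    // very simple crypting, applied only to the bytes that were actually read
    char* b = static_cast<char*>(buffer);
    for (unsigned int i=0; i<bytesRead; ++i)
    {
        b[i] ^= 58;
        b[i] ^= 129;
    }
    return bytesRead;
}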
The implementation doesn’t care which File implementation (m_file) it internally uses for reading. It can be any implementation, which makes it possible to arbitrarily piggyback files onto each other, as in the following examples:
// open an encrypted, zipped file, read from the network
File* networkFile = fs.Open("zip:crypto:tcp", "test.txt", FileSystem::Mode::READ);
// open an encrypted file living on the cartridge (e.g. savegames)
File* cartridgeFile = fs.Open("crypto:cartridge", "test.txt", FileSystem::Mode::READ);
As long as each file device implementation which is to be used as a piggyback device just asks the file system to open a file, which in turn asks the remaining mounted devices to do the job, features can be combined endlessly, even with user-provided file devices.
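The whole chain can be sketched end to end in a few dozen lines. Everything here is an illustrative assumption rather than Molecule's implementation: the read-only `File`, the extra `rest` parameter on `FileDevice::Open()`, the in-memory `MemDevice` standing in for a disk device, and the single-byte XOR standing in for real encryption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class FileSystem;
const char kXorKey = 0x5A; // toy key, for illustration only

struct File
{
    virtual ~File() {}
    virtual std::size_t Read(void* buffer, std::size_t length) = 0;
};

struct FileDevice
{
    virtual ~FileDevice() {}
    virtual const char* GetType() const = 0;
    // 'rest' is the device list with this device's own identifier removed.
    virtual File* Open(FileSystem& fs, const char* rest, const char* path) = 0;
};

class FileSystem
{
public:
    void Mount(FileDevice* device) { m_devices.push_back(device); }
    // Asks the device named by the first list entry and hands it the remainder.
    File* Open(const char* deviceList, const char* path)
    {
        const std::string list(deviceList);
        const std::string::size_type colon = list.find(':');
        const std::string head = list.substr(0, colon);
        const std::string rest = (colon == std::string::npos) ? std::string() : list.substr(colon + 1);
        for (FileDevice* device : m_devices)
            if (head == device->GetType())
                return device->Open(*this, rest.c_str(), path);
        return nullptr; // no matching device mounted
    }
    void Close(File* file) { delete file; }
private:
    std::vector<FileDevice*> m_devices;
};

// Terminal file standing in for a DiskFile: serves bytes owned by its device.
class MemFile : public File
{
public:
    explicit MemFile(const std::string& data) : m_data(data), m_pos(0) {}
    std::size_t Read(void* buffer, std::size_t length) override
    {
        const std::size_t count = m_data.copy(static_cast<char*>(buffer), length, m_pos);
        m_pos += count;
        return count;
    }
private:
    const std::string& m_data;
    std::size_t m_pos;
};

struct MemDevice : FileDevice
{
    const char* GetType() const override { return "mem"; }
    File* Open(FileSystem&, const char*, const char* path) override
    {
        const auto it = files.find(path);
        return (it != files.end()) ? new MemFile(it->second) : nullptr;
    }
    std::map<std::string, std::string> files; // path -> raw bytes
};

// Piggyback file: transforms whatever the wrapped File delivers, and owns it.
class XorFile : public File
{
public:
    explicit XorFile(File* inner) : m_inner(inner) { assert(inner != nullptr); }
    ~XorFile() override { delete m_inner; }
    std::size_t Read(void* buffer, std::size_t length) override
    {
        const std::size_t count = m_inner->Read(buffer, length);
        char* bytes = static_cast<char*>(buffer);
        for (std::size_t i = 0; i < count; ++i) bytes[i] ^= kXorKey;
        return count;
    }
private:
    File* m_inner;
};

// Piggyback device: never touches storage, the rest of the chain opens the file.
struct XorDevice : FileDevice
{
    const char* GetType() const override { return "xor"; }
    File* Open(FileSystem& fs, const char* rest, const char* path) override
    {
        File* inner = fs.Open(rest, path);
        return inner ? new XorFile(inner) : nullptr;
    }
};
```

With both devices mounted, `fs.Open("xor:mem", path)` yields an `XorFile` wrapping a `MemFile`, while `fs.Open("mem", path)` returns the raw bytes. Neither `XorFile` nor `XorDevice` mentions `MemFile` anywhere, which is what keeps the features orthogonal.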
Additionally, using this system in conjunction with config variables turns out to be really powerful, and offers a whole new set of possibilities:
ConfigSettingString g_sgDevice("g_sgDevice", "The device used for savegames.", "crypto:cartridge");
ConfigSettingString g_defDevice("g_defDevice", "The default device.", "disk");
// open any file on HDD, DVD, etc.
File* file = fs.Open(g_defDevice, "test.txt", FileSystem::Mode::READ);
// open a savegame
File* savegame = fs.Open(g_sgDevice, "test.txt", FileSystem::Mode::READ);
Because config variables can be set in source code, in configuration files, or via the in-game console, device lists can now be changed on the fly. This is extremely useful during development and debugging.
Developers with a lot of memory available might change their g_defDevice configuration from “disk” to “memory:disk”, resulting in extremely fast loading times. People from the QA department might want to disable encryption of savegames during development, so they can just pull down the in-game console mid-game, change the corresponding variable via “set g_sgDevice disk” and have their unencrypted savegames stored to disk, ready to be attached to a bug in the database. During development, programmers will want to switch between “disk” and “pak:disk” (enabling/disabling big pak-files, because those often cause trouble), which can easily be done using the above.
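How a console line such as “set g_sgDevice disk” could end up changing a device list is sketched below. `ConfigString` is a hypothetical stand-in, not Molecule's ConfigSettingString; the self-registration by name and the `ExecuteConsoleLine()` parser are assumptions made for the sake of the example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a string config variable; it registers itself by name
// so that a console command can find and change it at run time.
class ConfigString
{
public:
    ConfigString(const char* name, const char* value) : m_name(name), m_value(value)
    {
        Registry()[m_name] = this;
    }
    ~ConfigString() { Registry().erase(m_name); }

    // Lets the variable be passed wherever a 'const char* deviceList' is expected.
    operator const char*() const { return m_value.c_str(); }

    // Handles a console line of the form "set <name> <value>".
    // Returns false for any other command or an unknown variable name.
    static bool ExecuteConsoleLine(const std::string& line)
    {
        std::istringstream stream(line);
        std::string command, name, value;
        if (!(stream >> command >> name >> value) || command != "set")
            return false;
        const auto it = Registry().find(name);
        if (it == Registry().end())
            return false;
        assert(it->second != nullptr); // the registry only holds live variables
        it->second->m_value = value;
        return true;
    }

private:
    ConfigString(const ConfigString&) = delete;
    ConfigString& operator=(const ConfigString&) = delete;

    static std::map<std::string, ConfigString*>& Registry()
    {
        static std::map<std::string, ConfigString*> registry;
        return registry;
    }

    std::string m_name;
    std::string m_value;
};
```

Because the variable converts to `const char*` at the moment of the call, the next `Open()` after the console command picks up the new device list without any other code having to change.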
One part of the implementation I haven’t spoken about is the AsyncFile interface. It is somewhat similar to the File interface, but offers facilities for asynchronous operations instead. The underlying piggyback mechanism is exactly the same – OpenAsync() is delegated to the mounted file devices.
That’s all there is to the file system, which concludes today’s post!
What do you think of this, Stefan?
http://fgiesen.wordpress.com/2011/11/21/buffer-centric-io/
I like the approach, but I’m not yet sure whether it would work for the whole filesystem in an engine.
Hey Stefan,
I’m a bit confused here and have some questions. First of all, I like your approach to the file system, but what does the FileDevice do in connection with the File class? And what does the OsFile class from part one do?
Let’s assume I’d like to load a zip file from my HDD. I’d use two file devices: one for loading the zip file from HDD into a buffer, and one for loading the actual zip file, e.g. creating an instance of a ZipFile class inherited from the File class. But what would be the actual task of the FileDevice, besides creating the correct instance of the file class that is returned to the file system? Should it take care of loading all the correct zip headers and stuff? Or should this be done by the ZipFile class itself?
Thanks in advance
The FileDevice implementations simply return an instance of a File, e.g. a DiskFileDevice would return a DiskFile*, a ZipFileDevice would return a ZipFile*. Same goes for asynchronous Files (those derive from a different interface). All high-level code deals with FileDevice* and File* only, though.
With this mechanism, you can piggy-back implementations on top of each other, without having to let them know of each other. In your example, you would have a ZipFileDevice, which is responsible for creating a ZipFile. The ZipFile would have a File* as member which it uses for reading data, but it doesn’t matter if it’s a DiskFile or not. It could also be a NetworkFile, so that zip-files could be read from the network. The file system/devices take care of creating the correct instance, so you can either have “zip:disk” or “zip:tcp”, or something entirely different.
Anyway, the ZipFile would take care of reading headers, decompressing, etc. But it only uses its internal File* member for reading data, so the location where the data actually comes from is transparent to the ZipFile itself.
A note on the OsFile: It’s the only platform-dependent part of the filesystem, and used by the DiskFile and AsyncDiskFile for reading/writing data. If you port the filesystem to a new platform, the OsFile is the only thing that needs to be ported.
Ok,
now I understand the part with the OsFile and DiskFile. Thanks a lot for the reply. Hope to hear about some new articles in the future.
Best regards,
Gavin
Hi Stefan,
I also have one question regarding ZipFile/PackFile. These files are mainly used to group several files together. How do you open a single file from a zip file using this system? Is ZipFileDevice responsible for finding it and creating correct ZipFile for reading only this single file? If so, is it parsing all the headers every time you want to open a file?
Hi Garnold,
I would probably add a pair of Mount/Unmount functions to the ZipFileDevice which can be used for making files inside a zip-file available to the file system. Mount() could simply open the zip-file and parse the headers once, storing an open handle to the file for later access.
Subsequent Open() requests to the file system would then ask the ZipFileDevice to return a ZipFile (as you said).
This means you only have to parse the headers once, and can treat files inside zip-files almost the same as files on disk.
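A rough sketch of the parse-once idea, under heavy assumptions: the archive layout below is a made-up toy format rather than real ZIP, the archive is held in memory instead of being read through a File*, and `ToyArchiveDevice` with its `Read()` and `GetParseCount()` helpers is a hypothetical name chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy archive layout, assumed purely for illustration (this is NOT the ZIP format):
//   "<count>\n", then <count> lines "<name> <size>\n", then all payloads back to back.
class ToyArchiveDevice
{
public:
    ToyArchiveDevice() : m_parseCount(0) {}

    // Parses the directory once and caches where each contained file lives.
    void Mount(const std::string& archiveBytes)
    {
        m_archive = archiveBytes;
        m_directory.clear();
        ++m_parseCount;

        std::istringstream header(m_archive);
        std::string line;
        std::getline(header, line);
        std::size_t offset = line.size() + 1; // running size of the header so far
        std::size_t count = 0;
        std::istringstream countField(line);
        countField >> count;

        std::vector<std::pair<std::string, std::size_t> > entries;
        for (std::size_t i = 0; i < count; ++i)
        {
            std::getline(header, line);
            offset += line.size() + 1;
            std::string name;
            std::size_t size = 0;
            std::istringstream fields(line);
            fields >> name >> size;
            entries.push_back(std::make_pair(name, size));
        }

        // 'offset' now points at the first payload byte.
        for (const auto& entry : entries)
        {
            assert(offset + entry.second <= m_archive.size());
            m_directory[entry.first] = Entry{ offset, entry.second };
            offset += entry.second;
        }
    }

    // Looks the file up in the cached directory; no header parsing happens here.
    bool Read(const std::string& path, std::string& contents) const
    {
        const auto it = m_directory.find(path);
        if (it == m_directory.end())
            return false;
        contents = m_archive.substr(it->second.offset, it->second.size);
        return true;
    }

    int GetParseCount() const { return m_parseCount; }

private:
    struct Entry { std::size_t offset; std::size_t size; };

    std::string m_archive;
    std::map<std::string, Entry> m_directory;
    int m_parseCount;
};
```

In a full implementation, the lookup in `Read()` would presumably live in the device's Open() and hand the cached offset and size to the ZipFile it creates.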