I didn't want to post the direct link on the Community Project Board, so here it is in all of its glory.
Before getting into the explanation: any data hacker can identify the data and pointers associated with a VFS. But to dump and rebuild it, you -WILL- need a beginner's knowledge of programming to make custom utilities. If you know how to do file I/O with binary files, then that alone is enough to write a VFS extractor/rebuilder in most cases.
Here is some general information on the Virtual Filesystem and why it is so important to extract and rebuild it as one of the -VERY FIRST STEPS- in the project.
A Virtual Filesystem is basically a lot of data files packed into one file on a CD/DVD, and it contains all of the data that you want to hack. They can range from as small as ~50MB up to ToD2's 1.41GB and even larger. When extracted, you can have hundreds to thousands of subfiles (or worse, archives with subfiles) to work with. Extracting is key to keeping hacked files organized and teamwork running smoothly, and the subfiles are much more manageable in size. Furthermore, it's absolutely necessary to dissect the VFS for projects with compression or encryption, or where you just need to expand one file by that last elusive byte.
So now that you know what it is, how can you identify this thing, crack it, ooze its goodies out, and then rebuild it?
- Identifying the VFS
This is the easy part. Just search the game disc for a large file. The filename usually suggests whether it's the main game data VFS or (another possibility) a game movie VFS. If there are more than two VFS's in a game, they can usually be cracked by the same method, provided you can finish the next, more difficult step.
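If the disc contents are copied to (or mounted as) a folder, the search for a large file can be scripted. This is only a minimal sketch: the function name largest_files and the folder path in the usage note are made up for illustration.

```python
import os

def largest_files(root, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes.append((os.path.getsize(path), path))
    sizes.sort(reverse=True)  # largest size first
    return sizes[:top]
```

Calling largest_files("disc_dump") on a hypothetical dump folder lists the biggest files first, and the VFS candidates should be near the top.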
- Finding the VFS Pointers
Just as a dialogue string has a pointer that the game uses to locate it, each file in a VFS has its own pointer.
Prime Locations:
- In a separate file of pointers on the CD, usually in the same directory as the VFS file. It should be a small file (< 100KB even for a massive file count) and usually has no other data in it.
- In the VFS file itself. Usually at the top.
- In the EXE file. These ones stump me more often than not. Maybe a good debugger or a pointer table finder would have prevented that. They're either in a structured table or, oddly, the compiler turned them into pointers hardcoded in the ASM.
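A crude pointer table finder is not hard to sketch. The version below is only a heuristic built on assumptions that may not hold for your game: pointers are 32-bit little-endian, stored back to back on 4-byte boundaries, strictly increasing, and (optionally) sector-aligned. The name find_pointer_runs and the min_entries threshold are made up for illustration. It will not catch pointers hardcoded in ASM.

```python
import struct

SECTOR = 2048  # CD data sector size in bytes

def find_pointer_runs(data, min_entries=8, sector_aligned=True):
    """Return (byte_offset, entry_count) for each run of strictly
    increasing 32-bit little-endian values in `data`."""
    n = len(data) // 4
    values = struct.unpack_from("<%dI" % n, data, 0)
    runs = []
    start = None  # index of the first value in the current run
    for i, v in enumerate(values):
        aligned = (not sector_aligned) or v % SECTOR == 0
        if aligned and start is not None and v > values[i - 1]:
            continue  # the run keeps going
        # The run (if any) ended just before index i.
        if start is not None and i - start >= min_entries:
            runs.append((start * 4, i - start))
        start = i if aligned else None
    if start is not None and n - start >= min_entries:
        runs.append((start * 4, n - start))
    return runs
```

Run it over the EXE, the top of the VFS, or any small file sitting next to the VFS, then check each hit by hand in a hex editor. Expect false positives.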
- VFS Pointer Data Structure
These are the primary things you'll be looking for when you do notice a VFS pointer table.
- File start pointer: This is usually a linear pointer from the start of the VFS. I've always seen them as being 4 bytes. They may be different on PS2 because the processor is 64-bit.
- File size: This doesn't always appear, but it makes your work easier. This usually gives the exact filesize, rather than the filesize after it's padded to fill the last sector. (More info below in VFS Pointer Types)
- File name: This is a godsend if the game actually has these. But unfortunately, a lot don't.
- There may be more fields, but the above are the most common by far.
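To make the fields above concrete, here is a minimal extractor sketch for one hypothetical entry layout: a 4-byte little-endian start pointer, a 4-byte size, and a 16-byte NUL-padded name. That layout, and the names ENTRY, parse_table and extract, are assumptions for illustration only. A real game will order and size its fields differently.

```python
import struct
from collections import namedtuple

# Hypothetical entry layout: u32 start, u32 size, 16-byte NUL-padded name.
ENTRY = struct.Struct("<II16s")
Entry = namedtuple("Entry", "offset size name")

def parse_table(table):
    """Decode a pointer table (bytes) into a list of Entry tuples."""
    usable = len(table) - len(table) % ENTRY.size  # drop a partial tail
    entries = []
    for offset, size, raw_name in ENTRY.iter_unpack(table[:usable]):
        name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
        entries.append(Entry(offset, size, name))
    return entries

def extract(vfs, entries):
    """Slice each subfile out of the VFS bytes, keyed by file name."""
    files = {}
    for i, e in enumerate(entries):
        key = e.name or "file_%04d" % i  # fallback for nameless entries
        files[key] = vfs[e.offset:e.offset + e.size]
    return files
```

Both functions work on bytes already read with open(path, "rb"). For a VFS over a gigabyte, you would probably seek and read each subfile instead of loading the whole thing into memory.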
- VFS Pointer Types
- Byte positioned: This type is easy. It points directly to the offset where the file starts. When the files are padded to sector boundaries, it will be a multiple of 2048 bytes.
- Sector positioned: Still easy. Each CD data sector is 2048 (0x800) bytes and the files are padded to that length for more efficient reading. So you multiply the sector position by 2048 to get the file position. These are usually 4 or 2 bytes each.
- Byte positioned with file size data: This type is a tricky find. Some VFS pointer structures put a size value (either how many bytes into the last sector the data extends, or how many padding bytes there are) into the byte positioned pointer itself, in the low 11 bits that would otherwise be 0's because the file is sector-aligned. The game will read the pointer, use & 0xFFFFF800 to chop the bottom bits off and get the file position, then use & 0x7FF to get how much of the last sector the file takes up. This type effectively saves almost half the space (just one entry short of half in ToP PSX) compared to a structure with a 32 bit pointer and a 32 bit size.
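The three pointer types can be sketched in a few lines. Everything here rests on assumptions: files are stored back to back, the next pointer (or a terminating entry) marks where a file's sectors end, and low bits of 0 in the "bytes used" variant mean the last sector is completely full. The function names are made up for illustration.

```python
SECTOR = 2048                       # 0x800 bytes per CD data sector
LOW_MASK = SECTOR - 1               # 0x7FF, the low 11 bits
HIGH_MASK = 0xFFFFFFFF ^ LOW_MASK   # 0xFFFFF800

def sector_to_offset(sector):
    """Sector positioned pointer -> byte offset."""
    return sector * SECTOR

def unpack_pointer(ptr):
    """Packed pointer -> (byte offset, low 11 bits)."""
    return ptr & HIGH_MASK, ptr & LOW_MASK

def file_size(ptr, next_ptr, low_bits_are_padding=False):
    """Exact size of the file at `ptr`, using the next entry as its end."""
    start, low = unpack_pointer(ptr)
    span = (next_ptr & HIGH_MASK) - start  # whole sectors occupied
    if low_bits_are_padding:
        return span - low
    # Assumption: low == 0 means the last sector is completely used.
    return span - SECTOR + low if low else span

def build_packed(files):
    """Rebuild: pad each file to a sector and emit packed pointers
    (the "bytes used in last sector" variant) plus a final end entry."""
    blob, ptrs = bytearray(), []
    for data in files:
        ptrs.append(len(blob) | (len(data) & LOW_MASK))
        blob += data
        blob += b"\0" * (-len(data) % SECTOR)  # pad to sector boundary
    ptrs.append(len(blob))  # terminating entry marks the end of the VFS
    return bytes(blob), ptrs
```

The terminating entry is what lets the last file's size be computed. If a table carries no such entry, the total VFS length would have to stand in for it.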
I hope this answers a lot of questions about the VFS's in general.