Tauwasser
Guest
« Reply #1 on: March 29, 2010, 01:11:34 pm »
I remember the GBA FFTA compression being rather odd, though I didn't look at it in full detail either.
Going by my notes, the stream consists of either literal data or codes that reuse data that has already been written. The control bytes for the flow seem to be laid out as follows:
76543210
CDXXXYYY
Here C is set for codes, and D is set for data (when C is not set). For data, XXXYYY is the number of uncompressed bytes that follow, minus 1. Codes are two bytes long, and CXXX + 3 is the number of bytes to copy from memory. The memory address to copy from is YYY ZZZZZZZZ bytes before the current write offset, where ZZZZZZZZ is the byte that follows the initial code byte.
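For illustration, here is a rough Python sketch of a decoder for the layout described above. It is only a reading of these notes, not verified against the game, and it makes several assumptions: the copy length is taken from the four bits 6..3 (the notes say "CXXX", which is ambiguous, so treating it as DXXX is a guess), the input is decoded until it runs out (the notes don't say how the stream ends), a distance of 0 is rejected, and a control byte with neither C nor D set is treated as an error because the notes don't cover it. The function name `decompress` is made up for this example.

```python
def decompress(src: bytes) -> bytes:
    """Decode a stream using the CDXXXYYY control-byte layout (sketch only)."""
    out = bytearray()
    pos = 0
    while pos < len(src):  # ASSUMPTION: decode until the input is exhausted
        ctrl = src[pos]
        pos += 1
        if ctrl & 0x80:
            # C set: two-byte code that copies from already-written output.
            if pos >= len(src):
                raise ValueError("code is missing its second byte")
            # ASSUMPTION: length field is bits 6..3 (read as DXXX), plus 3.
            length = ((ctrl >> 3) & 0x0F) + 3
            # Distance back from the write offset: YYY ZZZZZZZZ (11 bits).
            distance = ((ctrl & 0x07) << 8) | src[pos]
            pos += 1
            if distance == 0 or distance > len(out):
                raise ValueError("copy distance outside written data")
            start = len(out) - distance
            # Copy byte by byte so overlapping copies repeat recent output.
            for i in range(length):
                out.append(out[start + i])
        elif ctrl & 0x40:
            # D set: XXXYYY + 1 literal bytes follow.
            count = (ctrl & 0x3F) + 1
            chunk = src[pos:pos + count]
            if len(chunk) != count:
                raise ValueError("literal run is truncated")
            out += chunk
            pos += count
        else:
            # Neither C nor D set: not covered by the notes.
            raise ValueError("unknown control byte 0x%02X" % ctrl)
    return bytes(out)
```

For example, under these assumptions the bytes 42 80 41 80 80 02 would be three literal bytes (80 41 80) followed by a code copying 3 bytes from 2 bytes back, giving 80 41 80 41 80 41.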
Note that FFTA uses a very inefficient text encoding, where every (English) character is represented by two bytes, most of which start with 0x80 in the top bits. I think some punctuation, as well as Japanese, used a one-byte representation. Again, these are probably distinguished by the high bit being set or clear, but I didn't look into this. Either way, the script ends up really big, but you can usually find at least three bytes to reuse, since a single space followed by a letter will usually yield the top 0x80 for the next letter.
cYa,
Tauwasser