Hi
I need help: I can't figure out the compression used for the Phantasy Star IV scripts.
Here is an example of the compressed data.
Compressed:
Are yo{FF},u a hunter?{/N}{EE}{FE}h{C0}{C7}{F3}{F7}to ex{E9}{87}{E1}mina{F9}{/n}t{EB} m{E3}"ons{EE}s?{FF}Thank{87}{F0}{D4}v{F0}y{E7}uch{7F}"!{/n}I feel{F3} sa{/n}d{F4}r now.{D4}{F8}JagaN{7C}{B3}{/n}fo{E7}{F2}ssi{80}g{B1}{E4}ce{DC}W{/n}{90}{DC}t's go{E0}g{E0}I{86}{F1}ppen{C0}LQ{94}{4F}{F8}Ib{CA}t{51}A{F7}{79}{72}{B3}o,{6E}{FF}{/n}b 7e{A4}{D0}{C6}A {C7}a{9F}{B8}{F3}{C3}{F0}{4E}b{9B}em{B7}t.{/N}C{7C}I'm{6B}{E1}frigh.%{CE}ned{C3}{50}c{78}f{CC}'t{/n}e{3C}
{D2}{80}b{CD}{5A}a{9D}mF{C0}tre{C7}{B9}pCF{FF}Oh{D6}{51}'{DB}{F8}{90}{D1}omple{C2}dO{A2}{A4}jobLi{F8}I
{A3}{82}a{F9}{F3}P{F8}O{8D}P{/n}r{/n}O{F8}
The uncompressed text from RAM:
Are you a hunter?{/N}Are you here to exterminate{/n}the monsters?{FF}Thank you very much!{/n}I feel much safer now.{FF}Thank you again{/n}for you assistance.{FF}What's going to happen now?{FF}{FA}{DA}C{FA}zB{FA}KAAbout a month ago, monsters{/n}began to appear in the basement.{/N}I'm so frightened, I can't{/n}even think about my research!{FF}Oh, you've completed the job!{/n}Thank you so much!{FF}Thank you again for your{/n}assistance.{FF}What's going to happen now?{FF}{FA}{DA}D{FA}zC{FA}KB{FA}HAYou're the hunter commissioned{/n}by the principal?{/N}A kid like you?!{/N}Are you going to be able to{/n}handle it?{FF}Wow, hey gorgeous!{/N}
I got the above result after building a table. I used /n for the new-line code and /N for the new-message-box code.
I have figured out that some of the codes refer back to earlier characters. For example, the first {EE}{FE} refers to "Are you ".
FF - EE = 11 in hex, which is 17 in decimal; if you go back 17 characters in the uncompressed text, you will find it.
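To make the offset arithmetic concrete, here is a minimal Python sketch of the back-reference as I understand it. It is not the game's actual decoder: copy_back is a made-up helper, and the history counts only the 17 visible characters of "Are you a hunter?" (I have not verified whether the {/N} code also takes a slot in the window, which would shift the distance by one).

```python
def copy_back(history: bytes, distance: int, length: int) -> bytes:
    """Return `length` bytes starting `distance` bytes before the end of `history`.

    Hypothetical helper: illustrates an LZSS-style back-reference only.
    """
    if not 0 < distance <= len(history):
        raise ValueError("distance points outside the history")
    out = bytearray(history)
    start = len(out) - distance
    for i in range(length):
        # Copy byte by byte so a match may overlap bytes it has just produced.
        out.append(out[start + i])
    return bytes(out[len(history):])


# Assumption: only the visible text is counted, not the {/N} control code.
history = b"Are you a hunter?"
distance = 0xFF - 0xEE          # 0x11 = 17, from the first byte {EE}
match = copy_back(history, distance, 8)
print(distance, match)
```

With these assumptions, going back 17 bytes and copying 8 gives "Are you ", which matches what appears in RAM.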
I also found that changing the second byte, {FE}, changes the number of characters copied.
If I change it to {F9} it copies only 3 characters, "Are", and if I change it to {FF} it copies 9 characters, "Are you a". That suggests 3 bits are used for the length.
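The three values I tried fit a simple rule, sketched below in Python. This is only my guess from those three data points: match_length is a made-up name, the "+ 2" bias is inferred rather than confirmed, and I don't know what the upper 5 bits of the byte mean.

```python
def match_length(second_byte: int) -> int:
    """Guess the copy length from the second byte of a two-byte code.

    Assumption: the low 3 bits store (length - 2); the upper 5 bits
    are ignored here because their meaning is unknown.
    """
    return (second_byte & 0x07) + 2


# The three observations from editing the byte in the ROM.
for value in (0xF9, 0xFE, 0xFF):
    print(f"{value:02X} -> {match_length(value)}")
```

If the rule holds, the shortest possible match would be 2 characters and the longest 9.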
From the above I concluded that it is some sort of LZSS compression, but there are other codes I couldn't make sense of, such as the "{FF}," that comes near the beginning of the compressed text.
I don't know how to reverse engineer and analyze the code responsible for the decompression, so I am just trying to compare the data with known compression types.