Author
|
Topic: Release: ScriptCrunch v1.0! (Read 1163 times)
|
Klarth
Guest
|
|
« on: December 04, 2006, 09:53:35 pm » |
|
This is the first version of my command-line utility to generate DTE, Dictionary, Substring, Dictionary+DTE, and Substring+DTE tables. Read the ScriptCrunch.txt included in the zip package to figure out how to use it. In my opinion, it one-ups the current tools several times over. Download it from this page.
Changelog: v1.0
1. Better DTE results: one optimization done. With DTE(128), compression on the text file I tried previously improved by 2.7%, to 38.1%. I mistakenly wasn't checking the last two characters of every line. It seems to do even better on Cless's small test files.
2. Faster substring results: about twice as fast now, and it gained about the same amount of compression as the DTE did. Substring+DTE is a bit better now too, but it is still losing to Dictionary+DTE somehow. :\
3. Accurate mock script insertion, matching the sizes Atlas would produce (minus embedded pointers): done. It previously wasn't encoding things like <LINE>, <END>, etc., which I knew about before the mock insertion idea came to me. I added a new configuration key named IgnoreBlocks; the blocks it names are ignored for the analysis, but not for the encode.
4. Added specific errors for INI parse failures.
5. Fixed the bug with not filling the DTE/Dictionary tables.
6. Added script-like dumping of result tables for easier insertion.
|
|
« Last Edit: December 10, 2006, 10:06:19 am by Klarth »
|
|
|
|
Spikeman
Guest
|
|
« Reply #1 on: December 05, 2006, 02:41:49 am » |
|
I'll probably end up using this for my current project (Rockman.EXE 4.5). I've already implemented the DTE hacks in ASM for most of the font routines; all I need to do now is write a script dumper (maybe I should just try Atlas, I've heard it works great) and translate the script. I actually wrote my own little DTE generator, called "DTECount", but it's probably not nearly as efficient as yours. It just counted the number of each pair, took the highest and added it to the table, recounted, and repeated that until the intended filesize was reached.
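The count, pick, recount loop described above can be sketched roughly as follows. This is only an illustration, not the code of DTECount or ScriptCrunch: the function names are made up, it stops at a fixed table size rather than a target filesize, and it counts overlapping pairs naively.

```python
from collections import Counter

def count_pairs(tokenized):
    """Count adjacent pairs of raw (single-character) tokens."""
    counts = Counter()
    for toks in tokenized:
        for a, b in zip(toks, toks[1:]):
            # Only two raw characters can form a DTE entry; already-merged
            # two-character tokens are skipped.
            if len(a) == 1 and len(b) == 1:
                counts[a + b] += 1
    return counts

def replace_pair(toks, pair):
    """Replace left-to-right occurrences of `pair` with one merged token."""
    out = []
    i = 0
    while i < len(toks):
        if (i + 1 < len(toks)
                and len(toks[i]) == 1 and len(toks[i + 1]) == 1
                and toks[i] + toks[i + 1] == pair):
            out.append(pair)
            i += 2
        else:
            out.append(toks[i])
            i += 1
    return out

def build_dte_table(lines, table_size):
    """Greedy DTE sketch: take the most frequent pair, apply it, recount."""
    tokenized = [list(line) for line in lines]
    table = []
    while len(table) < table_size:
        counts = count_pairs(tokenized)
        if not counts:
            break  # no raw pairs left to merge
        pair, _ = counts.most_common(1)[0]
        table.append(pair)
        tokenized = [replace_pair(toks, pair) for toks in tokenized]
    # One token corresponds to one byte in this simplified size estimate.
    encoded_size = sum(len(toks) for toks in tokenized)
    return table, encoded_size
```

Calling `build_dte_table(["the cat", "the hat"], 2)` returns the chosen pairs and the estimated encoded size, which can be compared against the original character count to get a compression figure.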
|
|
|
|
Klarth
Guest
|
|
« Reply #2 on: December 05, 2006, 03:01:40 am » |
|
That's the same method I use in ScriptCrunch. I don't know of a better method without researching something, or without using some method of picking the best DTE pair based on both the number of entries and the other DTE entries that are affected and reduced. Anything else twists my mind.
And as for Atlas, you can't use it to dump scripts. However, I can probably clean up the Dumping/Insertion Table libs I have (one version for dumping, one for insertion...it was originally one, then things got out of date and I split them) in the next few days and release them. It'd be fine for you as long as you use C++...and it seems to be bugged on g++ from what Cless says, but I don't have g++ to test it since I only use VC++.
|
|
|
|
Spikeman
Guest
|
|
« Reply #3 on: December 05, 2006, 03:08:38 am » |
|
I actually use C#, but I know the basics of C++, probably (hopefully) enough to get it working. Thanks for all your great utilities man!
|
|
|
|
Klarth
Guest
|
|
« Reply #4 on: December 05, 2006, 04:12:45 am » |
|
There may be a bug with it when it comes to specifying a larger sized dictionary/DTE table than what is possible. I'll fix it for the real release.
And Spikeman, if you know enough C++ to locate the script (or use the pointer table to redirect to strings), then it'll be enough to use my Table lib. I'll aim to release it on Thursday or so, to give me some extra time to slack. After all, I was going to release the initial version 5 years ago...what's a few more days? :p
|
|
|
|
RedComet
Guest
|
|
« Reply #5 on: December 05, 2006, 06:13:45 am » |
|
Wow, wordcount just became obsolete.
|
|
|
|
Nightcrawler
Guest
|
|
« Reply #6 on: December 05, 2006, 12:40:33 pm » |
|
How does this program handle Japanese characters? All of my scripts have the Japanese original lines directly above the English line.
On a side note, how does Atlas deal with script like this? I have yet to really use Atlas.
|
|
|
|
RedComet
Guest
|
|
« Reply #7 on: December 05, 2006, 12:41:20 pm » |
|
With Atlas you just have to precede each line with // and it'll ignore it.
|
|
|
|
Cless
Guest
|
|
« Reply #8 on: December 05, 2006, 02:25:49 pm » |
|
Yeah. In my Atlas scripts and custom dumps I usually leave the Japanese and precede it with // to "comment it out"... works well for this program as well.
Well, the example ini seemed designed out of the box to parse Atlas scripts without trouble. Should be able to adapt it to other custom script formats, at least if the lines are preceded with a flag that you could define.
|
|
|
|
Ryusui
Guest
|
|
« Reply #9 on: December 05, 2006, 02:32:22 pm » |
|
Yeah. In my Atlas scripts and custom dumps I usually leave the Japanese and precede it with // to "comment it out"... works well for this program as well.
That's a habit I've picked up on as well. Used to be I'd just rip out the Japanese text and replace it.
|
|
|
|
Klarth
Guest
|
|
« Reply #10 on: December 05, 2006, 04:03:50 pm » |
|
Well, the example ini seemed designed out of the box to parse Atlas scripts without trouble. Should be able to adapt it to other custom script formats, at least if the lines are preceded with a flag that you could define.
Yep, you can define what comments look like. The example inis parse out C-style block comments "/* Comment */", Atlas commands ("#" as the first character on a line), C++ line comments "//", and block comments that are usually used for control codes, line breaks, etc. "<comment>". But if your script files look like:
[Japanese]
Jtext here
[English]
Translation here
you'd have to do something wacky like:
BlockComment=[Japanese],[English]
It seems to work well for DTE, Dictionary, and Dictionary+DTE, but the Substring modes need work because they don't give the amount of compression they should. Does anybody know of an algorithm for doing this? I believe it'd have to be something that not only picks the highest frequency, but also weighs that against the strings it affects. I also need to add a comment type specifically for text that gets ignored for frequency analysis, but still remains in the script for the mock insertions.
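To make the comment rules above concrete, here is a rough sketch of how such filtering could work before frequency analysis. It is not ScriptCrunch's actual parser: the function name, its parameters, and the `#CMD` placeholder line are hypothetical, and only the comment styles themselves come from the post.

```python
import re

def strip_for_analysis(text, block_comments, line_comments, command_prefix="#"):
    """Drop block comments, command lines, and line comments from a script."""
    # Block comments: remove everything from a start marker through its end
    # marker, e.g. ("/*", "*/"), ("<", ">") or ("[Japanese]", "[English]").
    for start, end in block_comments:
        pattern = re.escape(start) + r".*?" + re.escape(end)
        text = re.sub(pattern, "", text, flags=re.DOTALL)
    kept = []
    for line in text.splitlines():
        # A line starting with the command prefix is treated as a command.
        if line.startswith(command_prefix):
            continue
        # Line comments: cut the line at the first marker found.
        for marker in line_comments:
            idx = line.find(marker)
            if idx != -1:
                line = line[:idx]
        if line.strip():
            kept.append(line)
    return "\n".join(kept)

# Hypothetical Atlas-style script; "#CMD" stands in for any command line.
atlas_like = "#CMD\n// Jtext here\nHello<LINE>\nworld /* note */<END>\n"
cleaned = strip_for_analysis(atlas_like, [("/*", "*/"), ("<", ">")], ["//"])

# The "[Japanese] ... [English]" layout, handled as one block comment.
tagged = "[Japanese]\nJtext here\n[English]\nTranslation here\n"
english_only = strip_for_analysis(tagged, [("[Japanese]", "[English]")], [])
```

Treating "[Japanese]" through "[English]" as a single block comment removes the markers together with the Japanese text, leaving only the translation for analysis.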
|
|
« Last Edit: December 05, 2006, 04:15:18 pm by Klarth »
|
|
|
|
Cless
Guest
|
|
« Reply #11 on: December 06, 2006, 01:04:36 pm » |
|
In a version not yet public, the DTE algorithm has been greatly improved and has yielded rates of 2-5% (possibly even slightly more) better compression in my test over the current preview build (with a 128 entry DTE table).
|
|
|
|
Spikeman
Guest
|
|
« Reply #12 on: December 07, 2006, 04:28:08 am » |
|
I added my utility to the database; if it's accepted (which I hope it is), you can compare its results to yours. Since I didn't really use any complicated algorithm, it probably won't have results as good, but it should be easier to use for people not used to DOS, because it has a cool GUI. (Yeah C#!)
|
|
|
|
Klarth
Guest
|
|
« Reply #13 on: December 07, 2006, 02:39:12 pm » |
|
Pfft, GUI programs aren't necessary for things like this. Considering the program won't get used that many times anyways.
Anybody that has completed enough hacking on a project to need a script analysis program like this shouldn't have a problem dealing with a commandline program. It takes 5-10 minutes to read through the readme and understand most of what the program does. I'd like to think of it as a trade...I saved them hours in programming (likely coming out with something of lower quality) for 10 minutes of their time. And that 10 minutes of time saves me hours of programming a GUI.
Everybody wins!
|
|
|
|
Nightcrawler
Guest
|
|
« Reply #14 on: December 07, 2006, 03:07:35 pm » |
|
Pfft to you. I disagree. All of my own utilities are GUI based. I find it quicker when working on multiple games to select the game I am working with via the GUI and what function I want to do to it. Intuitive interface, no reading necessary, and I can do it all with one hand.
Let's not turn this into a GUI vs. Command Line debate. Besides, it takes all of 10-15 minutes to design a simple GUI in an editor. Both ways have their strengths and weaknesses.
|
|
|
|
|