VGMaps
November 24, 2017, 05:49:10 AM *
Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length
 
   Home   Help Search Login Register  
Pages: [1] 2   Go Down
  Print  
Author Topic: PC Game Hacking and Mapping Tutorial: Xargon  (Read 29876 times)
0 Members and 1 Guest are viewing this topic.
Zerker
Newbie
*
Offline Offline

Posts: 37



« on: November 14, 2012, 06:13:37 PM »

After finishing ROTT and the Shadow Caster maps, I decided to tackle something much simpler, but also to document the process as I go along. That game is Xargon, which can be obtained at Classic Dos Games. According to that site, Allen Pilgrim released the full version available as freeware with a very short license, so we will be able to map all three episodes. However, since I already have it on my computer, I will start with Episode 1. Note that the Source Code is released, so that is an option if I get stuck. However, to hopefuly make this tutorial more useful (and fun), I'm planning to do this as a black-box exercise. Obviously, if the game has a source code release, or is well documented, that can help reduce the guesswork.

As with Shadow Caster and ROTT, I will be using Python and the Python Imaging Library (PIL). I will also need a Hex editor to look through the game resources before digging in, so Bless is my editor of choice. Windows users will obviously need something else.

Generally speaking, creating a map via hacking is composed of three phases:
Phase 1: Figure out the map format
Phase 2: Figure out the image data format(s)
Phase 3: Put it all together and make the map (this the real meat of it)

Phases 1 and 2 can be done in either order, but I will start in the order above.

So, the map format. Xargon is split into a number of different files, and I'm going to take a wild guess that the BOARD_##.XR1 files are the maps. Let's look at one in a hex editor:



The first thing to notice is there's definately a repeating pattern. The second thing is that it looks like a 16 bit pattern. This can mean either 2 individual bytes per location, or a single 16 bit number per location. Either way, we know the size of a location, but not the dimensions of the map. We also know that there is no header, because the pattern starts immediately at the top of the file. Given the nature of the game, we will start with the assumption that the game map is simply a direct listing of tiles without any grouping, compression or other tricks. So, let's take the file size and figure out how many tiles we have total:

19255 bytes / 2 = 9627.5. This already tells us that we must have some sort of footer that isn't part of the map data. Scrolling down to the bottom of the file confirms this, as the map ends in the string "TOO LATE TO TURN BACK!". However, the footer is unlikely to be a huge portion of the file, so let's ignore it for now. Taking the square root of the file size gives us the dimensions if the game used square levels: 98.11. This gives us an order-of-magnitude to guess for. I know that the maps aren't square, but let's run with this and see where it gets us.

The next step is to visualize the map to see if we guessed correctly. 8-bit numbers are easiest, as we can go direct to grayscale. 16 bit numbers present us with two options: if we think the numbers present different information, we can try two parallel grayscale images. If we think it's just one 16 bit tile number, we should just pick two out of the three RGB channels and generate a colour image.  I'm going with the latter option initially. I'm also going to cheat a little bit and copy some boilerplate code from my other scripts as far as grabbing the input file and checking for # of input parameters. Here's the basic visualizer script:

Code:
import struct, sys
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargonmap.py [Map File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            mapfile = open(filename, 'rb')
            outname = filename + '.png'

            # Load the map data as a 98 x 98 array of 2-byte positions:
            # This will be switched to proper 16 bit numbers when we
            # actually want to start generating the tile map.
            # struct
            pattern = '<{}B'.format(98*98*2)

            mapdata = struct.unpack(pattern,
                mapfile.read(struct.calcsize(pattern)) )
            mapfile.close()

            # Turn the map data into a list of 3-byte tuples to visualize it.
            # Start by pre-creating an empty list of zeroes then copy it in
            visualdata = [None] * (98*98)
            for index in range(98*98):
                visualdata[index] = (mapdata[index * 2], mapdata[index * 2 + 1], 0)

            # Tell PIL to interpret the map data as a RAW image:
            mapimage = Image.new("RGB", (98, 98) )
            mapimage.putdata(visualdata)
            mapimage.save(outname)

'<{}B' means a little endian pattern of {} bytes, where the actual number is filled in by the format call. Refer to the Python Struct module documention for more information on these strings.

So we run it and get:



That's cute. Obviously wrong too, but there's a clear pattern shift to the image. We should be able to figure out exactly how far we are off on each row. Opening in GIMP and counting the pixel shift shows that we appear to repeat every 64 pixels. Taking our original dimensions, and using 64 as one dimension yields 9627.5/64 = 150.4 for the other. Some of that is going to be footer, but we should be able to clearly see the breakdown when we finish. Our adjusted script becomes:

Code:
import struct, sys
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargonmap.py [Map File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            mapfile = open(filename, 'rb')
            outname = filename + '.png'

            # Load the map data as a 98 x 98 array of 2-byte positions:
            # This will be switched to proper 16 bit numbers when we
            # actually want to start generating the tile map.
            # struct
            pattern = '<{}B'.format(64*150*2)

            mapdata = struct.unpack(pattern,
                mapfile.read(struct.calcsize(pattern)) )
            mapfile.close()

            # Turn the map data into a list of 3-byte tuples to visualize it.
            # Start by pre-creating an empty list of zeroes then copy it in
            visualdata = [None] * (64*150)
            for index in range(64*150):
                visualdata[index] = (mapdata[index * 2], mapdata[index * 2 + 1], 0)

            # Tell PIL to interpret the map data as a RAW image:
            mapimage = Image.new("RGB", (64, 150) )
            mapimage.putdata(visualdata)
            mapimage.save(outname)

Yeilding:



That looks WAY better. But it looks sideways. And we can clearly see the garbage at the bottom is the footer and is not tile data (we'll have to figure it out later; I suspect it is item/monster placement). Opening in GIMP again shows that the map is only 128 pixels tall, so let's fix that dimension. Let's also rotate it -90 degrees.

Code:
import struct, sys
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargonmap.py [Map File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            mapfile = open(filename, 'rb')
            outname = filename + '.png'

            # Load the map data as a 98 x 98 array of 2-byte positions:
            # This will be switched to proper 16 bit numbers when we
            # actually want to start generating the tile map.
            # struct
            pattern = '<{}B'.format(64*128*2)

            mapdata = struct.unpack(pattern,
                mapfile.read(struct.calcsize(pattern)) )
            mapfile.close()

            # Turn the map data into a list of 3-byte tuples to visualize it.
            # Start by pre-creating an empty list of zeroes then copy it in
            visualdata = [None] * (64*128)
            for index in range(64*128):
                visualdata[index] = (mapdata[index * 2], mapdata[index * 2 + 1], 0)

            # Tell PIL to interpret the map data as a RAW image:
            mapimage = Image.new("RGB", (64, 128) )
            mapimage.putdata(visualdata)
            mapimage.rotate(-90).save(outname)

And the final result (for today) is:

Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #1 on: November 15, 2012, 06:09:45 PM »

Yesterday we got the map format decoded. While I can visualize more of the maps the same way, let's go ahead and try to figure out the tile format. Now, Xargon is a 256 colour VGA game, so we're going to need a colour palette. We also need to know what dimension of tile we are looking for. Both of these goals can be accompished by taking a simple screenshot (via Dosbox for me):



Looks like 16 x 16 tiles to me. Looking through the Xargon directory, GRAPHICS.XR1 is the most likely candidate for containing tile information. With the lack of a better idea, I'm going to just assume it contains a whole bunch of RAW 16 x 16 images in order. With a filesize of 697555, and each tile taking up 16*16 = 256 bytes, that is about 2724.8 tiles. Let's cook up a quick looping RAW image importer with PIL and see what we get:

Code:
import struct, sys, os
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            filesize = os.path.getsize(filename)
            graphicsfile = open(filename, 'rb')

            # Use the screenshot to grab the proper colour palette:
            palimage = Image.open('screeny.png')
            palette = palimage.getpalette()

            for tilenum in range(filesize/(16*16)):
                tile = Image.fromstring("P", (16, 16),
                    graphicsfile.read(16*16))
                tile.putpalette(palette)
                tile.save(os.path.join('output', '{:04}.png'.format(tilenum)) )

            # Since the file does not evenly align with the tiles we are
            # attempting to read, explicitly re-read the very end of the
            # file as a tile.

            graphicsfile.seek(filesize - (16*16) )
            tile = Image.fromstring("P", (16, 16),
                graphicsfile.read(16*16))
            tile.putpalette(palette)
            tile.save(os.path.join('output', 'last.png') )

And the results are:


Not a horrible assumption. There are some properly decoded (be it shifted) tiles among there. But there are also some images that look misaligned (i.e. not 16 x 16), and some random pixels (which imply headers). There's also a repeating pattern at the top of the file with random pixels that may be an overall file header (i.e. Images 0000 and 0002). Zooming the image shows that each record appears to be 4 bytes in length.

Scrolling through each of these 4-byte regions and looking at the decoding panel in my hex editor, each region appears to be a simple 32-bit integer, and each number is larger than the previous. (example below)



Let's decode those numbers. I locally commented out the image code and am excluding it from below for brevity. Counting the bytes in the Hex editor, it appears to end at offset 0xF8 for a total of 62 records. Also note that pdb.set_trace() in the below example invokes the python debugger where we can inspect data interactively. This also saves me the trouble writing a print statement for debug data.
Code:
import struct, sys, os, pdb
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            filesize = os.path.getsize(filename)
            graphicsfile = open(filename, 'rb')

            header = '<62L'

            headerdata = struct.unpack(header,
                graphicsfile.read(struct.calcsize(header)) )

            pdb.set_trace()

Running that gives us the the python debugger prompt:
Code:
zerker@Iota:/data/GameFiles/Dos/dosroot/Xargon$ python xargongraphics.py GRAPHICS.XR1
> /data/GameFiles/Dos/dosroot/Xargon/xargongraphics.py(28)<module>()
-> for filename in sys.argv[1:]:
(Pdb) headerdata

At which we can simply type the name of the variable we want to inspect to see its content:
Code:
(Pdb) headerdata
(0, 768, 9359, 14366, 15087, 21646, 22560, 54601, 61481, 68227, 74973, 87197, 94720, 98099, 101737, 124062, 131326, 140144, 148185, 155708, 160382, 163761, 171284, 178030, 197467, 204731, 214326, 235317, 253718, 262277, 266692, 296888, 305411, 309719, 321156, 325931, 343340, 353794, 363293, 374943, 395189, 407614, 0, 415826, 433431, 439659, 0, 451922, 473916, 0, 481227, 485490, 488542, 491213, 554159, 563460, 566188, 597635, 629082, 650040, 662219, 664428)

Keep in mind the file size of this graphics file is 697555. The last number is close, but does not exceed this number. This implies to me that these are all file offsets, probably delimiting the start of a region of data in the file. Since image 0002 also looks like header data, let's look at that too:



Well, it's smaller. Suspiciously looks like half as small, in fact. Let's grab that too and decode it as a series of 16 bit integers.

Code:
import struct, sys, os, pdb
from PIL import Image

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            filesize = os.path.getsize(filename)
            graphicsfile = open(filename, 'rb')

            header = '<62L'

            headerdata = struct.unpack(header,
                graphicsfile.read(struct.calcsize(header)) )

            graphicsfile.seek(0x200)

            header2 = '<62H'

            headerdata2 = struct.unpack(header2,
                graphicsfile.read(struct.calcsize(header2)) )

            pdb.set_trace()

And compare the two headers:

Code:
(Pdb) headerdata
(0, 768, 9359, 14366, 15087, 21646, 22560, 54601, 61481, 68227, 74973, 87197, 94720, 98099, 101737, 124062, 131326, 140144, 148185, 155708, 160382, 163761, 171284, 178030, 197467, 204731, 214326, 235317, 253718, 262277, 266692, 296888, 305411, 309719, 321156, 325931, 343340, 353794, 363293, 374943, 395189, 407614, 0, 415826, 433431, 439659, 0, 451922, 473916, 0, 481227, 485490, 488542, 491213, 554159, 563460, 566188, 597635, 629082, 650040, 662219, 664428)
(Pdb) headerdata2
(0, 8591, 5007, 721, 6559, 914, 32041, 6880, 6746, 6746, 12224, 7523, 3379, 3638, 22325, 7264, 8818, 8041, 7523, 4674, 3379, 7523, 6746, 19437, 7264, 9595, 20991, 18401, 8559, 4415, 30196, 8523, 4308, 11437, 4775, 17409, 10454, 9499, 11650, 20246, 12425, 8212, 0, 17605, 6228, 12263, 0, 21994, 7311, 0, 4263, 3052, 2671, 62946, 9301, 2728, 31447, 31447, 20958, 12179, 2209, 25423)

Interesting. Record sizes maybe? Let's compare differences between offsets for a few adjacent regions to the second set of data to test this theory:
Code:
(Pdb) headerdata[2]-headerdata[1]
8591
(Pdb) headerdata[3]-headerdata[2]
5007
(Pdb) headerdata[4]-headerdata[3]
721

Perfect match. I'm convinced. Now, going by the debug images I already generated, the first region appears to be a whole lot of blue and will be difficult to determine on its own. Let's start with the second region, at offset 9359 (0x248F hex). Since each picture is 256 bytes (0x100 hex), that puts towards the end of picture 0x24 (36 decimal). Hrm, still too much blue. Let's jump ahead until we get to the images. The next is 14366 -> image 56, which is still blue, but the following is 15087 -> 58 which starts to look like something. It also puts is one pixel before the last row of the image, which looks like another header. Since we're offset by 1, and the next image matches the last pixel in this image, I suspect the header is 16 bytes. I'll just copy the bytes from the hex editor:
24 01 00 D0 06 10 0D 90 19 08 04 00 08 10 00 10

Now we need to guess the bit size of each field in this header. Remember that we are little endian. The first four bytes are:
24 01 00 D0

If we think this is 32 bits, that's either -805306076 (signed) or 3489661220 (unsigned), or -8.590234E+09 (floating). None of those are particularly likely (and floats aren't very common for old DOS games in general).

16 bits yields 292, while 8 bits yield 36. Both are possible.

If we go with 36, 01 is unused, and is obviously just a simple 1. Let's convert the whole sequence to simple 8 bit decimal numbers and see if it makes sense:
36 1 0 208 6 16 13 144 25 8 4 0 8 16 0 16

And as 16 bits (unsigned):
292 53248 4102 36877 2073 4 4104 4096

I think it's likely we have either an 8 bit sequence or some sort of hybrid.

I'm scratching my head, so let's get another sample. The next few sample images look like noise, so they may be some other form of data, or just unrecognizably misaligned. Let's jump ahead to the really recognizable tiles around image 293. 293 is offset 75008, and from above, the closest record starts at 74973 (image 292, pixel 221). Looks like another 16 byte header to me. Hex values are:

2F 01 00 7C 0C 3C 18 BC 2F 08 04 00 10 10 00 0F
Decimal:
47 1 0 124 12 60 24 188 47 8 4 0 16 16 0 15

Gah, let's just grab them all and decode this way, then turn it into a CSV.
Code:
import struct, sys, os, pdb, csv
from PIL import Image

def imageheader(fileref, offset):
    fileref.seek(offset)
    headerstruct = '<16B'
    return struct.unpack(headerstruct,
        fileref.read(struct.calcsize(headerstruct)) )


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            filesize = os.path.getsize(filename)
            graphicsfile = open(filename, 'rb')

            header = '<62L'

            headerdata = struct.unpack(header,
                graphicsfile.read(struct.calcsize(header)) )

            graphicsfile.seek(0x200)

            header2 = '<62H'

            headerdata2 = struct.unpack(header2,
                graphicsfile.read(struct.calcsize(header2)) )

            # Create the header records using list comprehension
            imageheaders = [imageheader(graphicsfile, offset) for offset in headerdata if offset > 0]

            # Save as a CSV
            with open('debug.csv', 'wb') as csvfile:
                writer = csv.writer(csvfile)
                writer.writerows(imageheaders)

Code:
128 1   0   0   10  0   18  0   34  2   1   0   8   8   0   1
128 1   0   0   8   0   14  0   26  2   1   0   6   6   0   1
10  1   0   200 0   104 1   168 2   2   1   0   8   8   0   3
36  1   0   208 6   16  13  144 25  8   4   0   8   16  0   16
1   1   0   196 0   132 1   4   3   8   4   0   64  12  0   0
38  1   0   186 31  224 62  32  125 8   0   0   24  40  0   0
27  1   0   28  6   204 11  44  23  8   4   0   8   16  0   120
25  1   0   164 6   228 12  100 25  8   4   0   16  16  0   176
25  1   0   164 6   228 12  100 25  8   4   0   16  16  0   16
47  1   0   124 12  60  24  188 47  8   4   0   16  16  0   15
28  1   0   112 7   112 14  112 28  8   4   0   16  16  0   16
12  1   0   48  3   48  6   48  12  8   4   0   16  16  0   16
13  1   0   116 3   180 6   52  13  8   4   0   16  16  0   250
86  1   0   216 22  88  44  88  87  8   4   0   16  16  0   250
27  1   0   44  7   236 13  108 27  8   4   0   16  16  0   136
33  1   0   196 8   4   17  132 33  8   4   0   16  16  0   16
30  1   0   248 7   120 15  120 30  8   4   0   16  16  0   250
28  1   0   112 7   112 14  112 28  8   4   0   16  16  0   109
17  1   0   132 4   196 8   68  17  8   4   0   16  16  0   25
12  1   0   48  3   48  6   48  12  8   4   0   16  16  0   23
28  1   0   112 7   112 14  112 28  8   4   0   16  16  0   17
25  1   0   164 6   228 12  100 25  8   4   0   16  16  0   34
74  1   0   168 19  40  38  40  75  8   4   0   16  16  0   247
27  1   0   44  7   236 13  108 27  8   4   0   16  16  0   195
36  1   0   144 9   144 18  144 36  8   4   0   16  16  0   250
80  1   0   64  21  64  41  64  81  8   4   0   16  16  0   19
70  1   0   152 18  24  36  24  71  8   4   0   16  16  0   157
32  1   0   128 8   128 16  128 32  8   4   0   16  16  0   23
16  1   0   64  4   64  8   64  16  8   4   0   16  16  0   250
77  1   0   221 30  28  62  216 119 8   0   0   32  28  0   64
36  1   0   80  8   16  21  144 31  8   0   0   12  20  0   0
7   1   0   12  4   252 7   220 15  8   0   0   24  24  0   0
12  1   0   24  11  20  22  208 43  8   0   0   32  28  0   64
8   1   0   224 3   160 7   32  15  8   0   0   32  15  0   21
18  1   0   220 17  112 35  152 70  8   0   0   38  25  0   0
37  1   0   90  10  116 20  172 39  8   0   0   32  16  0   0
36  1   0   56  9   80  18  48  35  8   0   0   16  16  0   0
7   1   0   64  11  0   24  172 44  8   0   0   60  68  0   0
27  1   0   129 20  68  45  192 80  8   0   0   36  22  0   0
30  1   0   195 12  44  25  164 49  8   0   0   30  30  0   0
31  1   0   60  8   252 15  124 31  8   4   0   16  16  0   18
18  1   0   40  17  8   34  200 67  8   0   0   40  24  0   0
23  1   0   28  6   220 11  92  23  8   4   0   16  16  0   120
8   1   0   128 12  160 27  160 49  8   0   0   34  44  0   0
25  1   0   8   22  68  46  244 86  8   0   0   10  16  0   0
16  1   0   96  7   64  15  192 28  8   0   0   16  32  0   0
8   1   0   192 3   96  7   160 14  8   0   0   16  29  0   0
15  1   0   4   3   36  6   92  11  8   0   0   16  16  0   0
16  1   0   168 2   48  5   224 9   8   0   0   16  20  0   0
241 1   0   132 64  68  125 196 246 8   4   0   64  12  0   0
10  1   0   136 9   232 18  168 37  8   0   0   30  30  0   0
7   1   0   220 1   156 3   28  7   8   0   0   16  16  0   16
24  1   0   192 30  224 67  224 121 8   0   0   36  36  0   61
24  1   0   192 30  224 67  224 121 8   0   0   36  36  0   90
21  1   0   116 19  148 38  212 76  8   0   0   40  24  0   0
22  1   0   10  11  124 23  32  43  8   0   0   16  20  0   0
6   1   0   248 1   24  4   152 7   8   0   0   10  16  0   0
16  1   0   64  24  64  48  64  96  8   0   0   32  48  0   0

I'll have to go try some ideas tomorrow. That's enough for today.
« Last Edit: November 15, 2012, 06:11:29 PM by Zerker » Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #2 on: November 16, 2012, 04:22:52 PM »

For today's effort, let's try to see if we can get better decoding of each image record. It may not be perfect, but we should be able to get good-enough decoding to get the map tiles ready for use. It's very hard to guess what every number might mean from a black box perspective, so some of those fields will just have to remain unknowns for now. Usually, the best approach is to start with a number that you think should be present and try to find it.

The most likely candidates I see are the 3rd and 4th last numbers. Since we know the tile images are 16 x 16, we see a lot of records with those numbers. What I'm going to do is split each record up into a series of images, using those dimensions for each series. We'll see how well that turns out. I'm also going to need to clean up my code a bit more to support this, so it's time to group things into classes. The updated script is below:

Code:
import struct, sys, os, pdb, csv
from PIL import Image

def createpath(pathname):
    """ Simple utility method for creating a path only if it does
    not already exist.
    """
    if not os.path.exists(pathname):
        os.mkdir(pathname)

class imagefile(object):
    def __init__(self, filename, palette):
        filesize = os.path.getsize(filename)
        graphicsfile = open(filename, 'rb')

        header = '<62L'
        headerdata = struct.unpack(header,
            graphicsfile.read(struct.calcsize(header)) )

        graphicsfile.seek(0x200)

        header2 = '<62H'
        headerdata2 = struct.unpack(header2,
            graphicsfile.read(struct.calcsize(header2)) )

        # Create the image records using list comprehension
        self.records = [imagerecord(graphicsfile, offset, size, palette)
            for (offset, size) in zip(headerdata, headerdata2) if offset > 0]

    def debug_csv(self, filename):
        with open(filename, 'wb') as csvfile:
            writer = csv.writer(csvfile)
            for recnum, record in enumerate(self.records):
                writer.writerow([recnum, record.offset, record.size] + list(record.header))

    def save(self, outpath):
        createpath(outpath)
        for recnum, record in enumerate(self.records):
            record.save(os.path.join(outpath, '{:02}-{}'.format(recnum, record.offset)))


class imagerecord(object):
    def __init__(self, filedata, offset, size, palette):
        self.offset = offset
        self.size = size

        filedata.seek(offset)
        headerstruct = '<16B'
        self.header = struct.unpack(headerstruct,
            filedata.read(struct.calcsize(headerstruct)) )

        self.width = self.header[12]
        self.height = self.header[13]

        self.images = []

        for tilenum in range(size/(self.width*self.height)):
            tile = Image.fromstring("P", (self.width, self.height),
                filedata.read(self.width*self.height))
            tile.putpalette(palette)
            self.images.append(tile)


    def save(self, outpath):
        createpath(outpath)
        for tilenum, tile in enumerate(self.images):
            tile.save(os.path.join(outpath, '{:04}.png'.format(tilenum)) )


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        # Use the screenshot to grab the proper colour palette:
        palimage = Image.open('screeny.png')
        palette = palimage.getpalette()

        for filename in sys.argv[1:]:
            xargonimages = imagefile(filename, palette)
            xargonimages.debug_csv('debug.csv')
            xargonimages.save('output')

So now we have an imagefile class for the overall file, and an imagerecord for each record in the file's table of contents. I also moved my save CSV code and load/save image code into functions for each class.

Anyone unfamiliar with zip, enumerate or Python classes in general can refer to the corresponding Python documentation sections linked for each. I used one yesterday, but if you are unfamiliar with list comprehensions, I recommend looking that up too. I'm sure I'll use more going forward.

The first few sets of images start to look more like something, but they're still a bit weird. Set #5, however, is where things start to get interesting. Here we have an almost-correct decoding of the player sprite!



Guess the guess wasn't too far off. Since we're shifting between each sub-image, I suspect there may be a small header between each image. Let's see what we can find. I'm going to use image set #9, because it's 16 by 16 and very easy to see image alignment. The first image seems to be shifted by 1 pixel, and the the second image seems to be shifted by 2 pixels. Hrm. 3 pixel header? Let's find this in a hex editor. My debug code nicely names the directories by their offsets, so start from offset 74973 + 16 for header + 256 for first image -1 for first image error gets me to 75244. And this decodes to:
16 16 0

So, it looks like the end of the previous header was actually the start of the first image's header. And it also appears I was accidentally grabbing the first pixel from the image. Let's go fix all this, and decode the headers for each image. Also, looking at record 9, I ended up with 47 images. The header for that record has 47 as the first number. Since I know now that we can have variable size images, I'm going to assume this is the number of images in a record.

On my first attempt, I got the following exception
Code:
Traceback (most recent call last):
  File "xargongraphics.py", line 100, in <module>
    xargonimages = imagefile(filename, palette)
  File "xargongraphics.py", line 46, in __init__
    for (offset, size) in zip(headerdata, headerdata2) if offset > 0]
  File "xargongraphics.py", line 78, in __init__
    filedata.read(width*height))
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1797, in fromstring
    im.fromstring(data, decoder_name, args)
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 590, in fromstring
    d.setimage(self.im)
ValueError: tile cannot extend outside image

Which implies something is mixed up. Debugger time.
Code:
-> tile = Image.fromstring("P", (width, height),
(Pdb) width
1
(Pdb) height
0
(Pdb) self.header
(128, 1, 0, 0, 10, 0, 18, 0, 34, 2)
(Pdb)

Oops. I think I subtracted too many bytes when I shrunk the header. Here's the code with that fixed.
Code:
import struct, sys, os, pdb, csv
from PIL import Image

def createpath(pathname):
    """ Simple utility method for creating a path only if it does
    not already exist.
    """
    if not os.path.exists(pathname):
        os.mkdir(pathname)

class imagefile(object):
    def __init__(self, filename, palette):
        filesize = os.path.getsize(filename)
        graphicsfile = open(filename, 'rb')

        header = '<62L'
        headerdata = struct.unpack(header,
            graphicsfile.read(struct.calcsize(header)) )

        graphicsfile.seek(0x200)

        header2 = '<62H'
        headerdata2 = struct.unpack(header2,
            graphicsfile.read(struct.calcsize(header2)) )

        # Create the image records using list comprehension
        self.records = [imagerecord(graphicsfile, offset, size, palette)
            for (offset, size) in zip(headerdata, headerdata2) if offset > 0]

    def debug_csv(self, filename):
        with open(filename, 'wb') as csvfile:
            writer = csv.writer(csvfile)
            for recnum, record in enumerate(self.records):
                writer.writerow([recnum, record.offset, record.size] + list(record.header))

    def save(self, outpath):
        createpath(outpath)
        for recnum, record in enumerate(self.records):
            record.save(os.path.join(outpath, '{:02}-{}'.format(recnum, record.offset)))


class imagerecord(object):
    def __init__(self, filedata, offset, size, palette):
        self.offset = offset
        self.size = size

        filedata.seek(offset)
        headerstruct = '<12B'
        self.header = struct.unpack(headerstruct,
            filedata.read(struct.calcsize(headerstruct)) )

        self.numimages = self.header[0]
        self.images = []

        for tilenum in range(self.numimages):
            (width, height, unknown) = struct.unpack('<3B',
                filedata.read(3))

            tile = Image.fromstring("P", (width, height),
                filedata.read(width*height))
            tile.putpalette(palette)
            self.images.append(tile)


    def save(self, outpath):
        createpath(outpath)
        for tilenum, tile in enumerate(self.images):
            tile.save(os.path.join(outpath, '{:04}.png'.format(tilenum)) )


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        # Use the screenshot to grab the proper colour palette:
        palimage = Image.open('screeny.png')
        palette = palimage.getpalette()

        for filename in sys.argv[1:]:
            xargonimages = imagefile(filename, palette)
            xargonimages.debug_csv('debug.csv')
            xargonimages.save('output')

And the output for a random record with variable-sized images:


Nailed it!

Now, to top things off, let's mask transparent areas of sprites. It looks like colour 0 (black) is used for transparency. Let's just add a routine to loop through the image data, create a mask, then convert each to RGBA. This will make our lives much easier going forward. I'm actually going to just copy a routine I already wrote for my Shadow Caster maps.

And done...
Code:
import struct, sys, os, pdb, csv
from PIL import Image

def createpath(pathname):
    """ Simple utility method for creating a path only if it does
    not already exist.
    """
    if not os.path.exists(pathname):
        os.mkdir(pathname)

class imagefile(object):
    def __init__(self, filename, palette):
        filesize = os.path.getsize(filename)
        graphicsfile = open(filename, 'rb')

        header = '<62L'
        headerdata = struct.unpack(header,
            graphicsfile.read(struct.calcsize(header)) )

        graphicsfile.seek(0x200)

        header2 = '<62H'
        headerdata2 = struct.unpack(header2,
            graphicsfile.read(struct.calcsize(header2)) )

        # Create the image records using list comprehension
        self.records = [imagerecord(graphicsfile, offset, size, palette)
            for (offset, size) in zip(headerdata, headerdata2) if offset > 0]

    def debug_csv(self, filename):
        with open(filename, 'wb') as csvfile:
            writer = csv.writer(csvfile)
            for recnum, record in enumerate(self.records):
                writer.writerow([recnum, record.offset, record.size] + list(record.header))

    def save(self, outpath):
        createpath(outpath)
        for recnum, record in enumerate(self.records):
            record.save(os.path.join(outpath, '{:02}-{}'.format(recnum, record.offset)))


class imagerecord(object):

    @staticmethod
    def maskimage(inimage):
        """ Masks colour 0 in the given image, turning those pixels
        transparent. Returns the resulting RGBA image.
        """
        tempmask = Image.new('L', inimage.size, 255)
        maskdata = list(tempmask.getdata())
        outimage = inimage.convert("RGBA")

        for pos, value in enumerate(inimage.getdata()):
            if value == 0:
                maskdata[pos] = 0

        tempmask.putdata(maskdata)
        outimage.putalpha(tempmask)
        return outimage

    def __init__(self, filedata, offset, size, palette):
        self.offset = offset
        self.size = size

        filedata.seek(offset)
        headerstruct = '<12B'
        self.header = struct.unpack(headerstruct,
            filedata.read(struct.calcsize(headerstruct)) )

        self.numimages = self.header[0]
        self.images = []

        for tilenum in range(self.numimages):
            (width, height, unknown) = struct.unpack('<3B',
                filedata.read(3))

            tile = Image.fromstring("P", (width, height),
                filedata.read(width*height))
            tile.putpalette(palette)
            self.images.append(self.maskimage(tile))


    def save(self, outpath):
        createpath(outpath)
        for tilenum, tile in enumerate(self.images):
            tile.save(os.path.join(outpath, '{:04}.png'.format(tilenum)) )


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargongraphics.py [Graphics File]
TODO
"""
    else:
        # Use the screenshot to grab the proper colour palette:
        palimage = Image.open('screeny.png')
        palette = palimage.getpalette()

        for filename in sys.argv[1:]:
            xargonimages = imagefile(filename, palette)
            xargonimages.debug_csv('debug.csv')
            xargonimages.save('output')

A quick audit of the map tile groups show that there does not appear to be any inadvertent masking, so it looks like we're done decoding the image data. We have a full set of tiles and sprites that we can use to generate our tile map. There's a few parameters we didn't decode on each image record, but they do not appear to be totally necessary for our purposes.

Here's a sample world map sprite with proper masking:

Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #3 on: November 17, 2012, 08:59:50 AM »

Before we start identifying tiles and creating the image map, there are two things I noticed yesterday about the image data that I want to investigate.

1) The GRAPHICS file table of contents had a few blank entries. My current algorithm skips those positions entirely. However, when we start identifying tile mappings, the fact that there was a field allocated for them may be important. I will therefore update my algorithm to specifically create blank places in order to keep the record alignment accurate. It also appears to have space for more records, so I'm going to at least process those regions in case there are more records in the registered version.

2) Record 4 and Record 49 have a garbage 64*12 image in each of them. Also, record 49 appears to have the wrong colour palette. The interesting thing about a 64*12 image is that it takes up 768 bytes. A full 256 colour palette also takes up 768 bytes, so I suspect these records are actually colour palettes. If so, we don't need our sample screenshot and we can pull the colour palette direct from the game data. It also means we can correct the palette for the Record 49 images, if this is accurate.

Also, as the source files are going to get a bit more plentiful (and of increasing length) I'm also going to change to only posting snippets of code and include the full listing in a zip at the end of this day's post.

Finally, because I'm paranoid, I'm going to put in a warning of a record still has data after we read the last image in that record. I *think* I'm decoding that right, but if there's more data, we don't want to miss it.


For the blank entries, it couldn't be easier. Increasing the size of the record removes the need for the seek command and the IF on the list comprehension:
Code:
header = '<128L'
headerdata = struct.unpack(header,
    graphicsfile.read(struct.calcsize(header)) )
header2 = '<128H'
headerdata2 = struct.unpack(header2,
    graphicsfile.read(struct.calcsize(header2)) )

# Create the image records using list comprehension
self.records = [imagerecord(graphicsfile, offset, size, palette)
    for (offset, size) in zip(headerdata, headerdata2)]

Instead, the IF needs to go inside the image record instead. It also needs to be added to the debug output so we don't get a ton of empty folders. I'll let the debug CSV actually attempt to write out data for these blank records so it's easier to see the breakdown. I'm also going to add my paranoid sanity check to the __init__ method now. Here is the updated imagerecord class:

Code:
class imagerecord(object):

    @staticmethod
    def maskimage(inimage):
        """ Masks colour 0 in the given image, turning those pixels
        transparent. Returns the resulting RGBA image.
        """
        tempmask = Image.new('L', inimage.size, 255)
        maskdata = list(tempmask.getdata())
        outimage = inimage.convert("RGBA")

        for pos, value in enumerate(inimage.getdata()):
            if value == 0:
                maskdata[pos] = 0

        tempmask.putdata(maskdata)
        outimage.putalpha(tempmask)
        return outimage

    def __init__(self, filedata, offset, size, palette):
        self.offset = offset
        self.size = size

        self.images = []

        if offset > 0:
            filedata.seek(offset)
            headerstruct = '<12B'
            self.header = struct.unpack(headerstruct,
                filedata.read(struct.calcsize(headerstruct)) )

            self.numimages = self.header[0]

            for tilenum in range(self.numimages):
                (width, height, unknown) = struct.unpack('<3B',
                    filedata.read(3))

                tile = Image.fromstring("P", (width, height),
                    filedata.read(width*height))
                tile.putpalette(palette)
                self.images.append(self.maskimage(tile))

            # Check to see if we actually loaded all data from this record
            leftover = self.offset + self.size - filedata.tell()
            if leftover > 0:
                print "Record at offset {} has {} bytes unaccounted for.".format(self.offset, leftover)
            elif leftover < 0:
                print "Record at offset {} read {} bytes beyond its boundary.".format(self.offset, -leftover)

        else:
            self.numimages = 0
            self.header = []


    def save(self, outpath):
        if self.numimages > 0:
            createpath(outpath)
            for tilenum, tile in enumerate(self.images):
                tile.save(os.path.join(outpath, '{:04}.png'.format(tilenum)) )

And running again...

Code:
Record at offset 768 has 3 bytes unaccounted for.
Record at offset 9359 has 3 bytes unaccounted for.
Record at offset 14366 has 39 bytes unaccounted for.
Record at offset 15087 has 39 bytes unaccounted for.
Record at offset 21646 has 131 bytes unaccounted for.
Record at offset 22560 has 39 bytes unaccounted for.
Record at offset 54601 has 963 bytes unaccounted for.
Record at offset 61481 has 259 bytes unaccounted for.
Record at offset 68227 has 259 bytes unaccounted for.
Record at offset 74973 has 39 bytes unaccounted for.
Record at offset 87197 has 259 bytes unaccounted for.
Record at offset 94720 has 259 bytes unaccounted for.
Record at offset 98099 has 259 bytes unaccounted for.
Record at offset 101737 has 39 bytes unaccounted for.
Record at offset 124062 has 259 bytes unaccounted for.
Record at offset 131326 has 259 bytes unaccounted for.
Record at offset 140144 has 259 bytes unaccounted for.
Record at offset 148185 has 259 bytes unaccounted for.
Record at offset 155708 has 259 bytes unaccounted for.
Record at offset 160382 has 259 bytes unaccounted for.
Record at offset 163761 has 259 bytes unaccounted for.
Record at offset 171284 has 259 bytes unaccounted for.
Record at offset 178030 has 259 bytes unaccounted for.
Record at offset 197467 has 259 bytes unaccounted for.
Record at offset 204731 has 259 bytes unaccounted for.
Record at offset 214326 has 259 bytes unaccounted for.
Record at offset 235317 has 259 bytes unaccounted for.
Record at offset 253718 has 259 bytes unaccounted for.
Record at offset 262277 has 259 bytes unaccounted for.
Record at offset 266692 has 259 bytes unaccounted for.
Record at offset 296888 has 531 bytes unaccounted for.
Record at offset 305411 has 243 bytes unaccounted for.
Record at offset 309719 has 243 bytes unaccounted for.
Record at offset 321156 has 899 bytes unaccounted for.
Record at offset 325931 has 243 bytes unaccounted for.
Record at offset 343340 has 531 bytes unaccounted for.
Record at offset 353794 has 515 bytes unaccounted for.
Record at offset 363293 has 259 bytes unaccounted for.
Record at offset 374943 has 259 bytes unaccounted for.
Record at offset 395189 has 171 bytes unaccounted for.
Record at offset 407614 has 171 bytes unaccounted for.
Record at offset 415826 has 259 bytes unaccounted for.
Record at offset 433431 has 259 bytes unaccounted for.
Record at offset 439659 has 259 bytes unaccounted for.
Record at offset 451922 has 259 bytes unaccounted for.
Record at offset 473916 has 147 bytes unaccounted for.
Record at offset 481227 has 515 bytes unaccounted for.
Record at offset 485490 has 243 bytes unaccounted for.
Record at offset 488542 has 147 bytes unaccounted for.
Record at offset 491213 has 3 bytes unaccounted for.
Record at offset 554159 has 259 bytes unaccounted for.
Record at offset 563460 has 903 bytes unaccounted for.
Record at offset 566188 has 259 bytes unaccounted for.
Record at offset 597635 has 259 bytes unaccounted for.
Record at offset 629082 has 1299 bytes unaccounted for.
Record at offset 650040 has 1299 bytes unaccounted for.
Record at offset 662219 has 323 bytes unaccounted for.
Record at offset 664428 has 787 bytes unaccounted for.
Record at offset 689851 has 1539 bytes unaccounted for.

Well, we are missing data. Now, is this actual data, or blank regions? Let's investigate one. Offset 54601 is record #7, and according to my debug CSV has a size of 6880. If we have 969 unaccounted bytes, that puts at offset 54601+6880-963 = 60518. And it looks like there's a header for a 24 * 40 pixel image here. This implies that we are off by 1 on the number of images in a region. However, the 3 byte misalignments look a bit too small. Sure enough, investigating one of those shows that it actually is a 0, 0, 0 header. So, let's expand to grab that last image, but provide the ability to ignore an empty image if needed.

Code:
self.numimages = self.header[0] + 1

for tilenum in range(self.numimages):
    (width, height, unknown) = struct.unpack('<3B',
        filedata.read(3))
    if width > 0 and height > 0:
        tile = Image.fromstring("P", (width, height),
            filedata.read(width*height))
        tile.putpalette(palette)
        self.images.append(self.maskimage(tile))

No more warnings, but some of these extra images seem a bit out of place. Hrm, maybe it is just garbage data after all. It doesn't hurt to decode it though.

Now for palette time! This will require delaying loading the picture data a bit later, so we can load the palette explicitly first. Here are the updates to the imagerecord class:

Code:
    def __init__(self, filedata, offset, size):
        self.offset = offset
        self.size = size

        self.images = []
        # Store the file handle for future use
        self.filedata = filedata

        if offset > 0:
            filedata.seek(offset)
            headerstruct = '<12B'
            self.header = struct.unpack(headerstruct,
                filedata.read(struct.calcsize(headerstruct)) )

            self.numimages = self.header[0] + 1

        else:
            self.numimages = 0
            self.header = []

    def loadimages(self, palette, skipimages=0):
        if self.offset > 0:
            self.filedata.seek(self.offset + 12)

            for tilenum in range(self.numimages):
                (width, height, unknown) = struct.unpack('<3B',
                    self.filedata.read(3))
                # Skip past this image if requested (i.e. for palettes)
                if skipimages > 0:
                    self.filedata.seek(width*height, os.SEEK_CUR)
                    skipimages = skipimages - 1
                elif width > 0 and height > 0:
                    tile = Image.fromstring("P", (width, height),
                        self.filedata.read(width*height))
                    tile.putpalette(palette)
                    self.images.append(self.maskimage(tile))

            # Check to see if we actually loaded all data from this record
            leftover = self.offset + self.size - self.filedata.tell()
            if leftover > 0:
                print "Record at offset {} has {} bytes unaccounted for.".format(self.offset, leftover)
            elif leftover < 0:
                print "Record at offset {} read {} bytes beyond its boundary.".format(self.offset, -leftover)

    def getpalette(self):
        """ Loads the first image in this record as a palette."""
        self.filedata.seek(self.offset + 12)
        (width, height, unknown) = struct.unpack('<3B',
            self.filedata.read(3))
        if width*height != 768:
            raise Exception('This image is not a palette!')
        else:
            return self.filedata.read(width*height)

And the main load routine:

Code:
# Create the image records using list comprehension
self.records = [imagerecord(graphicsfile, offset, size)
    for (offset, size) in zip(headerdata, headerdata2)]

palette1 = self.records[5].getpalette()
palette2 = self.records[53].getpalette()

for recnum, record in enumerate(self.records):
    if recnum == 53:
        record.loadimages(palette2, skipimages=1)
    elif recnum == 5:
        record.loadimages(palette1, skipimages=1)
    else:
        record.loadimages(palette1, skipimages=1)

Weird. Everything looks like the correct colours, just VERY dark. Either it's in some strange format, or it's an alternate palette. A quick comparison using GIMP and a Hex editor doesn't show any direct correlation. Bah. It's not worth it right now. I'll keep the code in, but I will limit it for group 53 data and leave the main data using the screenshot for the palette. Let's get to the mapping!

In order to get started on the mapping, I need to refactor how I load the map. Previously, we loaded the map byte-at-a-time to make visualization easy. Now we should load as a series of 16-bit halfwords instead. I should also add a quick routine to generate a map csv so I can see the data values for each tile easily. Before I do that, let's get the previous functionality refactored and try it on a few other maps for comparison.
Code:
import struct, sys, os
from PIL import Image

class xargonmap(object):
    def __init__(self, filename):
        # Grab the map from the file name (sans ext)
        # TODO: Maps may have embedded names. To consider
        (temppath, tempfname) = os.path.split(filename)
        (self.name, tempext) = os.path.splitext(tempfname)

        # Load the map data as a 98 x 98 array of 2-byte positions:
        mapfile = open(filename, 'rb')
        pattern = '<{}H'.format(64*128)

        self.tiles = struct.unpack(pattern,
            mapfile.read(struct.calcsize(pattern)) )
        mapfile.close()

    def debugcsv(self):
        # Remember that the map is height-first. We need to convert to
        # width-first
        pass

    def debugimage(self):
        # Turn the map data into a list of 3-byte tuples to visualize it.
        # Start by pre-creating an empty list of zeroes then copy it in
        visualdata = [None] * (64*128)
        for index in range(64*128):
            visualdata[index] = (self.tiles[index]%256, self.tiles[index]/256, 0)

        # Tell PIL to interpret the map data as a RAW image:
        mapimage = Image.new("RGB", (64, 128) )
        mapimage.putdata(visualdata)
        mapimage.rotate(-90).save(self.name + '.png')


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargonmap.py [Map File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            themap = xargonmap(filename)
            themap.debugcsv()
            themap.debugimage()



Hold the phone. Those red squares are NOT tile data; they are providing some other information about the map. So I think the real tile data must what we are loading into the red layer (note the red squares vary the shade of red), while the green layer must be something else. Let's unroll this into two separate layers (and yes, switch back to loading byte-by-byte).

Updated class:
Code:
class xargonmap(object):
    def __init__(self, filename):
        # Grab the map from the file name (sans ext)
        # TODO: Maps may have embedded names. To consider
        (temppath, tempfname) = os.path.split(filename)
        (self.name, tempext) = os.path.splitext(tempfname)

        # Load the map data as a 98 x 98 array of 2-byte positions:
        mapfile = open(filename, 'rb')
        pattern = '<{}B'.format(64*128*2)

        tempdata = struct.unpack(pattern,
            mapfile.read(struct.calcsize(pattern)) )

        self.tiles = [tileval for index, tileval in enumerate(tempdata) if index%2 == 0]
        self.meta  = [tileval for index, tileval in enumerate(tempdata) if index%2 == 1]
        mapfile.close()

    def debugcsv(self):
        # Remember that the map is height-first. We need to convert to
        # width-first
        pass

    def debugimage(self):
        # Tell PIL to interpret the map data as a RAW image:
        mapimage1 = Image.new("L", (64, 128) )
        mapimage1.putdata(self.tiles)
        mapimage1.rotate(-90).save(self.name + '_tile.png')
        mapimage1 = Image.new("L", (64, 128) )
        mapimage1.putdata(self.meta)
        mapimage1.rotate(-90).save(self.name + '_meta.png')

And images:


Now to add the CSV routine. Note that we need to convert to a row-oriented format for this to work (yay, more list comprehensions):
Code:
def debugcsv(self):
    # Remember that the map is height-first. We need to convert to
    # width-first. This only outputs tile data for now.
    with open(self.name + '.csv', 'wb') as csvfile:
        writer = csv.writer(csvfile)
        for y in range(64):
            writer.writerow([self.tiles[x*64+y] for x in range(128)])

Hrm, I may have forgotten to properly compensate for height-first when generating the images. I think I need to also flip them horizontally as well as rotating. They appear backwards to the CSV output. Let me sanity check by jumping in game quickly and confirming the stage layout.

Yup. Fixing that below:
Code:
    def debugimage(self):
        # Tell PIL to interpret the map data as a RAW image:
        mapimage1 = Image.new("L", (64, 128) )
        mapimage1.putdata(self.tiles)
        ImageOps.mirror(mapimage1.rotate(-90)).save(self.name + '_tile.png')
        mapimage1 = Image.new("L", (64, 128) )
        mapimage1.putdata(self.meta)
        ImageOps.mirror(mapimage1.rotate(-90)).save(self.name + '_meta.png')

I guess we didn't get to actually start the mapping today. However, we have all the information we need to get started on that tomorrow. day4.zip is available for anyone who wants the scripts and debug map data we generated today.
Logged
Revned
Hero Member
*****
Offline Offline

Posts: 1091



« Reply #4 on: November 17, 2012, 02:59:07 PM »

This is great stuff, thanks for posting!

I tried to do something similar with Mega Man 9 before the Dolphin emulator was able to play it, but sadly, the graphics tiles weren't laid out so nicely in the data file. I stared at them in a hex editor for hours and hours before I finally gave up and eventually just fixed Dolphin so I could assemble the maps manually.
Logged

Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #5 on: November 18, 2012, 10:27:57 AM »

Yeah, I'm thinking the black box method is only really practical with old DOS games. The newer games are more likely to be using strange compression schemes and alternate image formats. That said, some games may just use standard image formats (like PNG), so it may be entirely possible to start looking for headers of known image files.

Then again, the entire resource file is probably compressed too.

On to Day 5: the day where I stop teasing and actually create a map.

To support this, we need to start a lookup routine in the graphics file to get the corresponding tile for a map tile index. Unfortunately, we don't know the mapping yet, so we need to pre-populate with debug tiles. Since we found out that the tile index is only an 8-bit number, we should be able to fit it (in hex notation) in the 16 x 16 pixel debug tile. This may make the CSV we made obsolete, but what can you do.

We need a font to generate the text, so I'm going to copy in the Android favourite, DroidSans. Here's the routine to create a simple debug tile (adapted from my Shadow Caster mapper) and the font declaration:

Code:
debugfont = ImageFont.truetype("DroidSans.ttf", 12)

@staticmethod
def debugimage(colour, text):
    """ Creates a 16x16 debug image for tiles:

    colour -- the background colour for the image.
    text -- the text to display
    """
    tempimage = Image.new("RGBA", (16, 16), colour)
    textcolor = (255, 255, 255) if colour[0] < 96 else (0, 0, 0)
    pen = ImageDraw.Draw(tempimage)
    pen.text((1, 2), text, font=imagefile.debugfont, fill=textcolor)
    return tempimage

Here's the addition to the __init__ method for the graphics file:

Code:
# Group by map tile:
self.tilelookup = [self.debugimage( (i, i, i), '{:02X}'.format(i) )
    for i in range(256) ]

And the lookup method, which will change once we figure out real mappings:
Code:
def gettile(self, tilenum):
    return self.tilelookup(tilenum)

Now, we start a new file for generating the map. In this file, we're going to import the other two files we wrote in order to use their features in tandem. This will start out extremely simple: we will create a large Image object to hold the map, then paste every debug tile according to its coordinates. Since the map is a linear set of tiles, we need to use integer division and modulus (i.e. the "remainder") to split that up into X, Y coordinates.

Here's the initial mapper file. I set up the mapper as a class, but you can do it as a standalone set of functions as well. Personally, I like to split it up into phases. the Shadow Caster had initialize, generate and save phases, but I think we can combine the first two in this case.

Code:
import sys
from PIL import Image
from xargonmap import xargonmap
from xargongraphics import imagefile

class xargonmapper(object):
    def __init__(self, graphics, mapdata):
        self.mappicture = Image.new("RGB", (128*16, 64*16) )
        self.name = mapdata.name

        for index, tileval in enumerate(mapdata.tiles):
            # Remember: maps are height first
            (x, y) = (index/64, index%64)
            self.mappicture.paste(graphics.gettile(tileval),
                (x*16, y*16) )

    def save(self):
        self.mappicture.save(self.name + '.png')


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print """Usage: python xargonmapper.py [Graphics File] [Map File(s)...]
TODO
"""
    else:
        xargonimages = imagefile(sys.argv[1])
        for filename in sys.argv[2:]:
            themap = xargonmap(filename)

            print "Generating Map '{}'".format(themap.name)
            mapper = xargonmapper(xargonimages, themap)
            print "Saving Map '{}'".format(themap.name)
            mapper.save()

And here's our first output with no tiles identified:



Remember this screenshot from Day 2?



Well, that's same area. So now we need to go look through our extracted tiles and match a few to IDs. Usually we can figure out a trend to unlock the rest for us; otherwise it can become a bit tedious. I see a few tiles in graphics record #8, so I'm going to go ahead and map those. Actually, I already see a pattern: it looks like map tile values 1 to at least 0A correspond to the first few tiles in group 8. But there appears to be a discontinuity at index 6. GRR. Let's do what we can with it.

Code:
    def gettile(self, tilenum):
        if tilenum >= 2 and tilenum <=5:
            return self.records[8].images[tilenum-2]
        elif tilenum >= 7 and tilenum <=10:
            return self.records[8].images[tilenum-3]
        else:
            return self.tilelookup[tilenum]

Not too bad so far, but I can see this getting awkward after a bit. Let's see what this does to the map:



Okay, that's progress. Now let me go ahead and try to identify some of the other tiles... gah, what is that??



Looks like I was wrong about the 8-bit tiles. This all falls under the red area in the original visualisations. Looking a bit closer at my grayscale meta files in GIMP show that even the non-black regions appear to have a few very-close-but-not-identical colour values. Time so switch back to 16-bit tiles (grumble). I'm not quite sure how to make 4-digits visible on a 16 pixel tile though. I'll try, but I may need to rely more on the CSV.

Code:
debugfont = ImageFont.truetype("DroidSans.ttf", 8)

@staticmethod
def debugimage(index):
    """ Creates a 16x16 debug image for tiles """
    colour = (index%256, index/256, 0)
    tempimage = Image.new("RGBA", (16, 16), colour)
    textcolor = (255, 255, 255) if colour[0] < 96 and colour[1] < 96 else (0, 0, 0)
    pen = ImageDraw.Draw(tempimage)
    pen.text((4, 0), '{:02X}'.format(index/256), font=imagefile.debugfont, fill=textcolor)
    pen.text((4, 8), '{:02X}'.format(index%256), font=imagefile.debugfont, fill=textcolor)
    return tempimage

8-point font FTW. Just readable. It takes a noticeable delay to generate 65535 debug tiles though Sad. Now to adapt my previous identifications (+ a couple more):

Code:
def gettile(self, tilenum):
    if tilenum == 0xC000:
        return self.records[9].images[15]
    elif tilenum == 0xC001:
        return self.records[8].images[24]
    elif tilenum >= 0xC002 and tilenum <=0xC005:
        return self.records[8].images[tilenum-0xC002]
    elif tilenum >= 0xC007 and tilenum <=0xC00A:
        return self.records[8].images[tilenum-0xC003]
    elif tilenum >= 0xC1AF and tilenum <=0xC1B3:
        return self.records[14].images[tilenum-0xC1AF+18]
    elif tilenum >= 0xC0F5 and tilenum <=0xC0F6:
        return self.records[25].images[tilenum-0xC0F5+2]
    elif tilenum >= 0xC0D3 and tilenum <=0xC0D6:
        return self.records[25].images[tilenum-0xC0D3+9]
    elif tilenum >= 0xC2DE and tilenum <=0xC2DF:
        return self.records[25].images[tilenum-0xC2DE+13]
    elif tilenum >= 0xC2E2 and tilenum <=0xC2E3:
        return self.records[25].images[tilenum-0xC2E2+19]
    elif tilenum >= 0xC2EE and tilenum <=0xC2F1:
        return self.records[25].images[tilenum-0xC2EE+31]
    else:
        return self.tilelookup[tilenum]



No more misidentified tiles, but we aren't exactly coming up with much of a trend, even if we were to calculate out the offsets (i.e. -0xC0F5+2 = -0xC0F3, but -0xC0D3+9 = -0xC0CA). It seems that almost every group of tiles, even those in the same record, ends up discontinuous. At this point, we have two options. Either keep identifying tiles manually (and up with some sort of mapping database), or see if there is some information we are missing. Hey now, there's a file called TILES.XR1. Let's see what's in it. Looks like a whole bunch of strings with some supplementary information. It's a very regular file, so it should be fairly simple to decode.

Here's the last record in the file (with ASCII characters substituting the obvious text portion):
81 03 3A 5B 01 00 07 MPBRDG1

The last byte appears to be the string length, so it's just the first 6 to guess about. Since we are dealing with 16-bit tile values, I'm going to decode each of these at 16 bits on a hunch.

Here's our new file:
Code:
import struct, sys, os, pdb, csv
from PIL import Image, ImageFont, ImageDraw

class tilefile(object):
    debugfont = ImageFont.truetype("DroidSans.ttf", 8)

    def __init__(self, filename):
        filesize = os.path.getsize(filename)
        infile = open(filename, 'rb')

        self.tiles = []

        commonheader = '<3HB'
        while infile.tell() < filesize:
            headerdata = struct.unpack(commonheader,
                infile.read(struct.calcsize(commonheader)) )
            stringlen = headerdata[3]
            print headerdata
            self.tiles.append(headerdata[0:3] +
                struct.unpack('<{}s'.format(stringlen), infile.read(stringlen)) )

    def debug_csv(self, filename):
        with open(filename, 'wb') as csvfile:
            writer = csv.writer(csvfile)
            for recnum, tiledata in enumerate(self.tiles):
                writer.writerow((recnum,) + tiledata)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print """Usage: python xargontiles.py [Tiles File]
TODO
"""
    else:
        for filename in sys.argv[1:]:
            xargontiles = tilefile(filename)
            xargontiles.debug_csv('tiles.csv')

And part of the output. First column is the index in the file, the rest is data from the file:
Code:
0   0   18704   1   0
1   1   18432   1   SK1
2   2   18433   1   SK2
3   3   18434   1   SK3
4   4   18435   1   SK4
5   5   18436   1   SK5
6   6   18707   1   CLD1
7   7   18437   1   VL
8   8   18438   1   VR
9   9   18439   0   S1L
10  10  18440   0   S1R
11  11  18708   1   CLD2
12  12  18709   1   CLD3
13  13  18441   1   R1
14  14  18442   1   R2
15  15  18443   1   R3
16  16  18444   1   R4
17  17  18445   3   GS1
18  18  18446   3   GS2
19  19  18447   1   R5
20  20  18448   1   R6
21  21  18449   0   S2
22  22  18450   0   S2C
23  23  18451   0   S2M
24  24  18452   0   S2B
25  25  18453   1   S2U
26  26  18454   1   S2BK
27  27  18710   1   CLD4
28  28  18711   1   CLD5
29  29  18712   1   CLD6

Still no direct correlation to the values we see on the map, as C000 (for example) is 49152 in decimal. Let me do one more thing today. I'm going to re-interpret some of our unknown GRAPHICS record header values as 16-bit numbers.

Code:
0   0   0
1   768     8591    128 1   2560    4608    8704    2   1   0
2   9359    5007    128 1   2048    3584    6656    2   1   0
3   14366   721     10  1   200     360     680     2   1   0
4   15087   6559    36  1   1744    3344    6544    8   4   0
5   21646   914     1   1   196     388     772     8   4   0
6   22560   32041   38  1   8122    16096   32032   8   0   0
7   54601   6880    27  1   1564    3020    5932    8   4   0
8   61481   6746    25  1   1700    3300    6500    8   4   0
9   68227   6746    25  1   1700    3300    6500    8   4   0
10  74973   12224   47  1   3196    6204    12220   8   4   0
11  87197   7523    28  1   1904    3696    7280    8   4   0
12  94720   3379    12  1   816     1584    3120    8   4   0
13  98099   3638    13  1   884     1716    3380    8   4   0
14  101737  22325   86  1   5848    11352   22360   8   4   0
15  124062  7264    27  1   1836    3564    7020    8   4   0
16  131326  8818    33  1   2244    4356    8580    8   4   0
17  140144  8041    30  1   2040    3960    7800    8   4   0
18  148185  7523    28  1   1904    3696    7280    8   4   0
19  155708  4674    17  1   1156    2244    4420    8   4   0
20  160382  3379    12  1   816     1584    3120    8   4   0
21  163761  7523    28  1   1904    3696    7280    8   4   0
22  171284  6746    25  1   1700    3300    6500    8   4   0
23  178030  19437   74  1   5032    9768    19240   8   4   0
24  197467  7264    27  1   1836    3564    7020    8   4   0
25  204731  9595    36  1   2448    4752    9360    8   4   0
26  214326  20991   80  1   5440    10560   20800   8   4   0
27  235317  18401   70  1   4760    9240    18200   8   4   0
28  253718  8559    32  1   2176    4224    8320    8   4   0
29  262277  4415    16  1   1088    2112    4160    8   4   0
30  266692  30196   77  1   7901    15900   30680   8   0   0
31  296888  8523    36  1   2128    5392    8080    8   0   0
32  305411  4308    7   1   1036    2044    4060    8   0   0
33  309719  11437   12  1   2840    5652    11216   8   0   0
34  321156  4775    8   1   992     1952    3872    8   0   0
35  325931  17409   18  1   4572    9072    18072   8   0   0
36  343340  10454   37  1   2650    5236    10156   8   0   0
37  353794  9499    36  1   2360    4688    9008    8   0   0
38  363293  11650   7   1   2880    6144    11436   8   0   0
39  374943  20246   27  1   5249    11588   20672   8   0   0
40  395189  12425   30  1   3267    6444    12708   8   0   0
41  407614  8212    31  1   2108    4092    8060    8   4   0
42  0   0
43  415826  17605   18  1   4392    8712    17352   8   0   0
44  433431  6228    23  1   1564    3036    5980    8   4   0
45  439659  12263   8   1   3200    7072    12704   8   0   0
46  0   0
47  451922  21994   25  1   5640    11844   22260   8   0   0
48  473916  7311    16  1   1888    3904    7360    8   0   0
49  0   0
50  481227  4263    8   1   960     1888    3744    8   0   0
51  485490  3052    15  1   772     1572    2908    8   0   0
52  488542  2671    16  1   680     1328    2528    8   0   0
53  491213  62946   241 1   16516   32068   63172   8   4   0
54  554159  9301    10  1   2440    4840    9640    8   0   0
55  563460  2728    7   1   476     924     1820    8   0   0
56  566188  31447   24  1   7872    17376   31200   8   0   0
57  597635  31447   24  1   7872    17376   31200   8   0   0
58  629082  20958   21  1   4980    9876    19668   8   0   0
59  650040  12179   22  1   2826    6012    11040   8   0   0
60  662219  2209    6   1   504     1048    1944    8   0   0
61  664428  25423   16  1   6208    12352   24640   8   0   0
62  689851  7704    11  1   1616    3548    6332    8   0   0

Not really much better, is it? I did notice something of use, however. The second-last value appears to be a flag to indicate whether a record contains tile data (4), font data (1) or sprite data (0). That's something.

However, I tried something else. Looking at the TILES file, I sorted my output by the third column (the large number) and it appears that that corresponds to the position of a tile in the record file. I looked over some of the WATER tiles and compared them to record #23, and there appears to be a correlation there.

Code:
364 22272   24617   WATERFL1
365 22273   24617   WATERFL2
366 22274   24617   WATERFL3
367 22275   24617   WATERFL4
368 22276   24617   WATER0
369 22277   1       WATERFLS1
370 22278   1       WATERFLS2
371 22279   24617   WATERFLT
574 22280   24617   WATEREND1
575 22281   24617   WATEREND2
576 22282   24617   WATEREND3
577 22283   24617   WATEREND4
578 22284   24617   WATEREND5
579 22285   24617   WATEREND6
580 22286   24617   WATEREND7
581 22287   24617   WATEREND8
582 22288   24617   WATERMST1
583 22289   24617   WATERMST2
584 22290   24617   WATERMST3
585 22291   24617   WATERMST4
586 22292   24617   WATERMST5
587 22293   24617   WATERMST6
588 22294   24617   WATERMST7
589 22295   24617   WATERMST8
590 22296   24617   WATERWAV1
591 22297   24617   WATERWAV2
592 22298   24617   WATERWAV3
593 22299   24617   WATERWAV4
594 22300   24617   WATERWAV5
595 22301   24617   WATERWAV6
596 22302   24617   WATERWAV7
597 22303   24617   WATERWAV8
375 22304   24633   WATERBUBL
376 22312   24617   WATERT1
377 22313   24617   WATERT2
378 22314   24617   WATERT3
379 22315   24617   WATERT4
380 22316   24617   WATERT5
381 22317   24617   WATERT6
382 22318   24617   WATERT7
383 22319   24617   WATERT8
384 22320   24633   SEAWEEDR
385 22324   24633   SEAWEEDL
386 22328   24617   CORALVEGL
387 22329   24617   CORALVEGR
388 22330   24617   CORALTL
389 22331   0       CORALM
390 22332   24617   CORALTR
372 22333   24617   CORALBL
373 22334   24617   CORALBR
374 22335   0       ALGAEB
393 22336   24617   ALGAE1
394 22337   24617   ALGAE2
395 22338   24617   ALGAE3
396 22339   24617   ALGAE4
397 22340   0       CORALM2
398 22341   0       CORALM3
360 22342   0       CORALSHL
361 22343   0       CORALSHD
362 22344   0       CORALT
363 22345   0       CORALT2



So I did some more calculations with the area that appears to correspond to record 25 against my identified values:

Code:
217 245 22787   3   GRS1    C0F5    25  2   49152   22785
218 246 22788   3   GRS2    C0F6    25  3   49152   22785
219 247 22789   1   BMSH
220 248 22790   1   BMSHT
221 249 22791   1   BMSHB
691 754 22792   3   LIMB1
692 755 22793   3   LIMB2
183 211 22794   0   DRT1    C0D3    25  9   49152   22785
184 212 22795   1   DRT2    C0D4    25  10  49152   22785
185 213 22796   1   DRT3    C0D5    25  11  49152   22785
186 214 22797   0   DRT4    C0D6    25  12  49152   22785
695 758 22797   1   DRT4B           12      22785
671 734 22798   3   S6L     C2DE    25  13  49152   22785
672 735 22799   3   S6R     C2DF    25  14  49152   22785
187 215 22800   0   DRTBL           15      22785
188 216 22801   0   DRTBR           16      22785
673 736 22802   0   GRSL            17      22785
674 737 22803   0   GRSR            18      22785
675 738 22804   1   DRTUL   C2E2    25  19  49152   22785
676 739 22805   1   DRTUR   C2E3    25  20  49152   22785
677 740 22806   0   DRTL1           21      22785
693 756 22806   1   DRTL1B          21      22785
678 741 22807   0   DRTR1           22      22785
694 757 22807   1   DRTR1B          22      22785
679 742 22808   0   DRTL2           23      22785
680 743 22809   0   DRTR2           24      22785
681 744 22810   0   DRTB1           25      22785
682 745 22811   0   DRTB2           26      22785
683 746 22812   1   RT1             27      22785
684 747 22813   1   RT2             28      22785
685 748 22814   0   DRTB3           29      22785
686 749 22815   1   RT3             30      22785
687 750 22816   1   DRTRT1  C2EE    25  31  49152   22785
688 751 22817   1   DRTRT2  C2EF    25  32  49152   22785
689 752 22818   1   DRTRT3  C2F0    25  33  49152   22785
690 753 22819   1   DRTRT4  C2F1    25  34  49152   22785

There is definitely a correlation in both directions. Column 2 appears to be some equivalent of the tile ID in the map. Column 3 appears to be some equivalent of the tile ID in the GRAPHICS file. The manually-entered columns, in order, are: actual map tile ID, actual record group, actual record index, difference between column 2 and actual tile ID, difference between column 3 and actual record group. Notice the differences are constant for all the identified tiles in this region. Next session, we will look into the exact relationship across groups.

And finally, day5.zip is available for anyone who wants it.
Logged
TerraEsperZ
Hero Member
*****
Offline Offline

Posts: 2225



« Reply #6 on: November 18, 2012, 10:47:46 AM »

Fascinating. Definitely out of my league though, and not really usable for the console/handheld games I prefer to map, but fascinating none the less Smiley.
Logged

Current project that I really should try to finish:
-Drill Dozer (GBA)
-Sonic 3D Blast (Genesis)
-Naya's Quest (PC)
-Lilly Looking Through (PC)

Pending project:
-A ton of stuff that will never be finished
DarkWolf
Hero Member
*****
Offline Offline

Posts: 621



« Reply #7 on: November 19, 2012, 08:01:42 AM »

Something to note about old DOS games as well: sprites are often stored with some sort of run-length encoding.  It's an outdated method of reducing the size of bitmaps. What it does is translate the drawing of the image into commands similar to:

Move X pixels
New row
Draw X pixels of color

Each game seems to be slightly different in the encoding, but with trial and error, it can be figured out.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #8 on: November 19, 2012, 06:03:11 PM »

Thanks for the feedback guys. To expand on Darkwolf's comment, Xargon is a very simple game with some basic data structures. It makes it a good example, but note that other games may be more difficult. Both games I did prior to this (ROTT and ShadowCaster) had more complicated sprite schemes. ROTT used an RLE encoding scheme as DarkWolf described, while ShadowCaster actually had a simple scheme of specifying the start and end pixel in a column, but not otherwise compressing the image. That said, a lot of games have had some prior investigation, or even a source code release, which can help reduce the guesswork immensely. As I mentioned, the source code for Xargon has been released, but I'm intentionally avoiding it to make this guide (hopefully) more useful.

To start off today's exercise, I'm going to play around in a spreadsheet. Exciting stuff. Basically, for every known sprite thus far, I want to figure out the exact relationship between the TILES file, the Record ID, and the Map ID. One thing I noticed yesterday was that the Record ID column in the TILES file appeared to shift by a 256 boundary for each record. We should be able to use the classic divide and modulus pair to split out the record number. I just need to confirm to make sure we don't start drifting for whatever offset I end up calculating.

Records appear to be easy. The record number is just the TILES entry / 256 - 64, and the entry in the record is the TILES entry % 256 -1. The only two things that we need to pay special attention to are:
1) The "extra" tile we mentioned before (i.e. that I thought might be garbage data) appears to be referred to by position -1. That said, index -1 in Python already refers to the last element in a list, so we don't actually need to handle this explicitly.
2) The tiles file refers to record 49, which does not exist. We will just need to keep an eye out if this happens. I expect this is just placeholder data (maybe for the registered version).

The Tile IDs appear to just be offset by C000, which is pretty simple. I'll need to figure out the tiles that are in the 0 to FF range on the map, but that can wait. I'll just put an IF statement to avoid processing them for now.

So let's expand our xargontiles.py file to do the lookup for us. We have two options for linking together the graphics and tile files:
1) Pass in the Graphics file when creating the Tiles file and store a reference for future use
2) Pass in the Graphics file for the lookup operation to the Tiles file

Since I prefer to keep each class tailored to just the file it is responsible for, I will be doing the second option. The xargonmapper.py is the only file that will know about all the related files and will tie them together.

First, we need to set up xargontile to do easier lookups. I will expand the initialization routine to populate a dictionary of tile to record mappings:

Code:
def __init__(self, filename):
    filesize = os.path.getsize(filename)
    infile = open(filename, 'rb')

    self.tiles = []
    self.lookup = {}

    commonheader = '<3HB'
    while infile.tell() < filesize:
        headerdata = struct.unpack(commonheader,
            infile.read(struct.calcsize(commonheader)) )
        stringlen = headerdata[3]
        self.tiles.append(headerdata[0:3] +
            struct.unpack('<{}s'.format(stringlen), infile.read(stringlen)) )

        self.lookup[headerdata[0]] = headerdata[1]

Then we write the lookup routine itself to use this lookup and take the offsets into account:

Code:
def gettile(self, graphics, tilenum):
    if tilenum < 0xC000:
        return graphics.tilelookup[tilenum]
    else:
        graphindex = self.lookup[tilenum - 0xC000]

        recnum = graphindex / 256 - 64
        recindex = graphindex % 256 - 1

        return graphics.records[recnum].images[recindex]

And this results in a very minor change to xargonmapper:
Code:
def __init__(self, graphics, tiledata, mapdata):
    self.mappicture = Image.new("RGB", (128*16, 64*16) )
    self.name = mapdata.name

    for index, tileval in enumerate(mapdata.tiles):
        # Remember: maps are height first
        (x, y) = (index/64, index%64)
        self.mappicture.paste(tiledata.gettile(graphics, tileval),
            (x*16, y*16) )

Then we run it!



And would you look at that? THAT is progress. But it's not perfect, and it has some glitches we need to investigate. Notably, we haven't looked into the map index 0 -> FF area yet. I also noticed a couple glitches over in this part of the map which we need to investigate further:



But first, the region that doesn't start with C000:



And let's start identifying some tiles. I already have a theory that these are simply direct indicies into the TILES file, so let's test that out with the black tiles (00AD it looks like). 0xAD -> 173 decimal, which has the name of 0NO and record 9, index 16. Sure enough, that tile is totally black. Let's implement it and see how it turns out:

Code:
def gettile(self, graphics, tilenum):
    if tilenum < 0xC000:
        graphindex = self.lookup[tilenum]
    else:
        graphindex = self.lookup[tilenum - 0xC000]

    recnum = graphindex / 256 - 64
    recindex = graphindex % 256 - 1

    return graphics.records[recnum].images[recindex]



Almost there, but we ran into the same glitch again. Let's look into that. I'm just going to open up the debug CSV to grab the tile value that screwed up... and it looks like 174. In the TILES array, 174 is called RNSLDGM... and it's in record 10, index -1. ARGH. Looks like the out-of-place "extra" graphics really are out of place! And I can't figure out a clear reason why the "extra" tile looks wrong. However, this tile in particular looks like record 10, index 3 (although that is also assigned a tile name).

What I'm going to is remove the debug image list from before, since we have the tiles nominally identified. Instead, I will create a much shorter list to handle the cases where we run into a -1 tile and re-direct it to the correct index (if, in fact, such a tile exists). Hopefully we can get something working.

Code:
neg1mapping = {8: (8, 24), 10: (10, 3)}

def gettile(self, graphics, tilenum):
    if tilenum < 0xC000:
        graphindex = self.lookup[tilenum]
    else:
        graphindex = self.lookup[tilenum - 0xC000]

    recnum = graphindex / 256 - 64
    recindex = graphindex % 256 - 1

    # Negative 1 index needs special handling. The correct image does
    # not always appear in the correct spot.
    if recindex == -1:
        if recnum in self.neg1mapping:
            (recnum, recindex) = self.neg1mapping[recnum]
        else:
            return graphics.unknown[recnum]

    return graphics.records[recnum].images[recindex]

And a modification to the debug tile code:
Code:
@staticmethod
def debugimage(index):
    """ Creates a 16x16 debug image for tiles """
    colour = (index, index, index)
    tempimage = Image.new("RGBA", (16, 16), colour)
    textcolor = (255, 255, 255) if index < 96 else (0, 0, 0)
    pen = ImageDraw.Draw(tempimage)
    pen.text((1, 2), '{:02}'.format(index), font=imagefile.debugfont, fill=textcolor)
    return tempimage

So yes, it looks like all the glitched tiles ARE the -1 tiles. Unfortunately, the tile we picked was wrong; the correct tile should have more of a highlight. We also need to find a suitable rope tile, it looks like.



And I can't find either of them! *cries*.

Wait a minute. I am an idiot. Raise your hand if you see the bug below:
Code:
for recnum, record in enumerate(self.records):
    if recnum == 53:
        record.loadimages(palette2, skipimages=1)
    elif recnum == 5:
        record.loadimages(palette1, skipimages=1)
    else:
        record.loadimages(palette1, skipimages=1)

That's right, I ALWAYS skip the first image! GAH! This explains the whole subtract 1 thing (which, obviously now, is wrong). Let's fix both problems and re-run to generate:



Yay! For run, I also ran it on the rest of the maps, and most everything looks okay, except for a few maps:



Looks like there may be some sort of transparency behaviour with the sky colour on other maps? Or a different colour palette? I'll need to investigate when I decode more of the map format. That will be tomorrow's topic.

And finally, day6.zip is available for anyone who wants it. It includes the full output images thus far if anyone wants to see what the current versions of the rest of the maps look like.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #9 on: November 20, 2012, 06:23:00 PM »

Hello and welcome back. Today we will be looking at the rest of the map format. We're looking for two major things to include in our map:
1) some sort of header information that will hopefully give us more information on background colours
2) Monsters and Pickups!

Remember the garbage at the bottom of this image?


Well, let's try to figure out what it means. There's obviously a repeating pattern here, but it's a bit skewed because we were decoding two bytes at a time. From the image, it looks about 15-16 pixels, or probably 31 bytes in length. Let's see if we can find the bounds of this record in a hex editor. The map is 128*64*2 bytes long, so we should start at offset 0x4000.



Looks about right. But now we need to find the bounds of this region, and to figure out what comes before and after it. Let's start at the end of the file instead.



Well, that's very clearly delineated. There is a number followed by a string then the next string, etc. The number appears to be the string length. The previous section also has a clear empty buffer region between, so I'm going to assume that the last record ends with a non-zero number. If I then grab the last record, it looks like:
F0 02 00 00 00 00 10 00 0C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 05

That seems like a fairly likely record alignment. The first two characters correspond to a 16 number of 752, which is a reasonable decoding. The F0 is at offset 0x4A8E, so let's keep going back in intervals of 31 until we reach a reasonable first record. (0x4000 - 0x4A8E) / 0x57.29, so it looks like we have approximately 0x57 (87) records here. My first guess places the start of these records at address 0x4005.

Lining up the first few records:
Code:
80 00 00 00 00 00 18 00 28 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 20 00
90 00 00 00 00 00 10 00 10 00 01 00 00 00 00 00 01 00 00 00 04 00 A6 48 00 00 00 00 3E D0 02
E0 00 00 00 00 00 10 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 3D D0 02
F0 00 00 00 00 00 10 00 10 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 3D 80 06
00 03 00 00 00 00 10 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 3E 80 06
F0 02 00 00 00 00 10 00 10 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 3F 20 00

Looks right to me. That means we have 5 unique bytes:
58 00 00 20 00

0x58 = 88, which appears to be the number of records we have. That leaves us with 0x20 (i.e. 32). Let's look at the headers for a couple different stages set together:
Code:
58 00 00 20 00
58 00 00 20 00
6C 00 00 30 00
AD 00 00 B0 03
B6 00 00 30 00
85 00 00 B0 01
A1 00 00 90 00
B6 00 00 E0 01
9A 00 00 40 07
A1 00 00 90 00
B5 00 00 80 00
2E 00 00 30 07
70 00 00 30 02

Well, I can't figure out a correlation right now, but I think this is a 16 bit number.

Let's assign bit decoding characters for the repeating record. I'm going to grab a few more examples just to get a clearer picture. This may be incorrect, but we need to start with something:
Code:
H     H     H     H     H     H     H     H     H     H     H     B  B  H     H     B  H
80 00 00 00 00 00 18 00 28 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 20 00
90 00 00 00 00 00 10 00 10 00 01 00 00 00 00 00 01 00 00 00 04 00 A6 48 00 00 00 00 3E D0 02
E0 00 00 00 00 00 10 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 3D D0 02
F0 00 00 00 00 00 10 00 10 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 3D 80 06
00 03 00 00 00 00 10 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 3E 80 06
F0 02 00 00 00 00 10 00 10 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 3F 20 00
A0 00 01 00 00 00 00 00 00 00 03 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 48 90 07
B0 00 00 00 00 00 10 00 10 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 A0 07
B0 00 00 00 00 00 10 00 10 00 00 00 00 00 00 00 00 00 00 00 04 00 A5 48 00 00 00 00 48 A0 07
B0 00 00 00 00 00 10 00 10 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 21 40 07
A0 01 00 00 00 00 10 00 10 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 21 50 07
90 01 00 00 00 00 10 00 10 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 21 60 07

Or, condensed, '<11H2B2HBH'. Time to update our map script to decode this information and write it out to a debug CSV so we can hopefully figure something useful out of it (like an object ID and coordinates).

Code:
class xargonmap(object):
    def __init__(self, filename):
        # Grab the map from the file name (sans ext)
        (temppath, tempfname) = os.path.split(filename)
        (self.name, tempext) = os.path.splitext(tempfname)

        # Load the map data as a 64*128 array of 16 bit values:
        mapfile = open(filename, 'rb')
        pattern = '<{}H'.format(64*128)

        self.tiles = struct.unpack(pattern,
            mapfile.read(struct.calcsize(pattern)) )

        # Decode the object header then the object list
        objheader = '<HBH'
        objrecord = '<11H2B2HBH'

        (numobjs, blank, self.unknown) = struct.unpack(objheader,
            mapfile.read(struct.calcsize(objheader)) )

        self.objs = [struct.unpack(objrecord,
            mapfile.read(struct.calcsize(objrecord)) )
            for i in range(numobjs)]

        # There always appears to be a 0x5E spacer region
        mapfile.read(0x5E)

        # Capture any strings until the end of the file
        self.strings = []
        sizebytes = mapfile.read(2)
        while (len(sizebytes) == 2):
            (stringlen,) = struct.unpack('<H', sizebytes)
            self.strings.append(mapfile.read(stringlen))
            mapfile.read(1)
            sizebytes = mapfile.read(2)

        mapfile.close()


    def debugcsv(self):
        # Remember that the map is height-first. We need to convert to
        # width-first. This only outputs tile data for now.
        with open(self.name + '.csv', 'wb') as csvfile:
            writer = csv.writer(csvfile)
            for y in range(64):
                writer.writerow([self.tiles[x*64+y] for x in range(128)])

        # Next, output the object list:
        with open(self.name + '_objs.csv', 'wb') as csvfile:
            writer = csv.writer(csvfile)
            writer.writerows(self.objs)

        # Finally, the header and strings list:
        with open(self.name + '_info.csv', 'wb') as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow([self.unknown] + self.strings)

And a subset of the object csv output:

Code:
176 0   0       16  16  4   0   0   0   0   0   0   0   0   0   17  1952
176 0   0       16  16  0   0   0   0   0   4   165 72  0   0   72  1952
176 0   0       16  16  4   0   0   0   0   0   0   0   0   0   33  1856
416 0   0       16  16  2   0   0   0   0   0   0   0   0   0   33  1872
400 0   0       16  16  2   0   0   0   0   0   0   0   0   0   33  1888
400 0   0       16  16  2   0   0   0   0   0   0   0   0   0   33  1904
416 0   0       16  16  2   0   0   0   0   0   0   0   0   0   21  1936
240 0   0       16  16  0   0   0   0   0   0   0   0   0   0   7   1584
788 9   65535   184 8   0   0   0   0   0   4   55  68  0   0   22  1584
752 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1600
736 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1600
704 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1584
720 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1744
704 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1760
720 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1744
736 0   0       16  12  0   0   0   0   0   0   0   0   0   0   22  1760
752 0   0       16  12  0   0   0   0   0   0   0   0   0   0   21  1440
464 0   0       16  16  0   0   0   0   0   0   0   0   0   0   72  1936
160 0   0       32  16  11  0   0   0   0   0   0   0   0   0   51  160
128 1   0       80  16  0   0   0   0   0   0   0   0   0   0   51  1824

Well, whatever is in that 65535's column, it is probably signed. However, it's unlikely that data will be relevant for us (since it's normally zero). We need to pick out likely coordinate and item ID columns. The easiest way to do this is to locate a known object (or pattern of objects) in the game world and try to find their representation.

This classic screenshot provides some useful objects. All of the same type in close proximity:


Opening up the pixelmap image in GIMP, drawing in the item locations, then checking their coordinates results in the following zero-based indicies:
(3, 11) (3, 12) (2, 13) (4, 13)

1-based:
(4, 12) (4, 13) (3, 14) (5, 14)

And I'm coming up pretty empty. I'm not even sure what could look like coordinates or an index in the data I have. Nothing is obviously a coordinate system, and a direct index would require a number between 0 and 8192, and nothing goes anywhere near this value. But I have a strange idea that the indicies could possibly be given in absolute pixel coordinates. Let's divide the first and last columns by 16 and see if we see any patterns.

And.... (drum roll):
Code:
176 0   0   16  16  0   0   0   0   0   0   0   0   0   0   33  32 | 11  2
208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  48 | 13  3
176 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  64 | 11  4
208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  48 | 13  3

Hrm, not a perfect match, but close. Also, every entry appears to be within the bounds for a 64 x 128 map, and appears sane. I'm not going to implement today, but I have a theoretical decoding. I will number the columns below:
Code:
1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
176 0   0   16  16  0   0   0   0   0   0   0   0   0   0   33  32 | 11  2
208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  48 | 13  3
176 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  64 | 11  4
208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  48 | 13  3

Column 1: Y coordinate * 16
Column 17: X coordinate * 16
Columns 4 and 5: Nominal sprite dimensions
Column 16: Sprite type

And a few tenative identifications:
17 : Player Start
25: Wandering Monster thing
33 : Lolly Pop pickup

The nominal sprite dimensions entry I guessed because I noticed the coordinates 2, 8 would put something pretty close to the start of the level and the dimensions were 24, 40 (same as the player sprite). Also, the first-level monster sprite dimensions are 38, 25, which shows up several times in this file.

day7.zip is available for anyone who wants it.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #10 on: November 21, 2012, 06:02:06 PM »

Looking at the decoding from yesterday again, I think I misaligned the record start after all. I think the 20 00 at the start of the record area is actually the X coordinate for the first record. That means that the trailing number of 05 is its own thing. Let's move that around and see if that fixes our "almost correct" alignment for the pickups at the start. Remember, we expected either
0-based:
(3, 11) (3, 12) (2, 13) (4, 13)

or 1-based:
(4, 12) (4, 13) (3, 14) (5, 14)

And we get:
Code:
32  208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  2   13
48  176 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  3   11
64  208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   33  4   13
48  192 0   0   16  16  5   0   0   0   0   0   0   0   0   0   59  3   12

Coordinates match up, but the item ID is off for the last one. Looks like we are off by TWO fields. Let's fix that again and get:

Code:
33  32  208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   2   13
33  48  176 0   0   16  16  5   0   0   0   0   0   0   0   0   0   3   11
33  64  208 0   0   16  16  5   0   0   0   0   0   0   0   0   0   4   13
33  48  192 0   0   16  16  5   0   0   0   0   0   0   0   0   0   3   12

I also switched the half-words to signed due to the strange 65535 entries. Nothing appears to "normally" approach the 16 bit boundary, so this should be safe. I also changed how I skip/capture the apparently unused map region. Hopefully I can find a pattern for the maps with different background colours, when I get there. Here's the updated code:

Code:
# Decode the object header then the object list
objrecord = '<B12h2B2H'

(numobjs,) = struct.unpack('<H', mapfile.read(2) )

self.objs = [struct.unpack(objrecord,
    mapfile.read(struct.calcsize(objrecord)) )
    for i in range(numobjs)]

# There always appears to be a 0x61 byte unknown region between
# the records and strings. Let's just collect it as bytes for now.
unknownregion = '<97B'
self.unknown = struct.unpack(unknownregion,
    mapfile.read(struct.calcsize(unknownregion)) )

Okay, now we think we have a reasonable decoding, and a few sample mappings (corrected due to the misalignment):
0 - Player start
33 - Lolly Pop
25 - Monster

Unfortunately, pickup/monster/object sprites are typically defined in code due the additional logic that is typically needed. It's unlikely we will find a direct mapping like we did for the Tiles. That said, there usually isn't an overwhelming number of interactable objects in a game, so we should be able to identify them as-we-go. We know there aren't more than 256 of them, after all. In order to track the mappings, we should start a new Python file for the sprite database. This will contain a lookup table for record entries, and a method to grab the correct sprite for a given sprite ID. We will also re-use the debug code from unknown tiles to provide placeholders for unknown sprites.

Finally, we need to make a decision on how sprites get drawn into the world. There are two options: have the mapper handle everything and add additional logic there, or defer processing to the sprite file, and pass in a reference to the map-in-progress. I'm going to choose the later, because it helps keep all the sprite processing in one place. Here's the first cut at the sprite file:

Code:
lookup = {  0: (6, 9),
           25: (35, 2),
           33: (37, 5)
         }

def drawsprite(mappicture, graphics, spriterecord):
    (spriteid, x, y) = spriterecord[0:3]
    (width, height) = spriterecord[5:7]

    if spriteid in lookup:
        (recnum, imagenum) = lookup[spriteid]
        spriteimage = graphics.records[recnum].images[imagenum]
    else:
        spriteimage = graphics.unknown[spriteid]

    # When pasting masked images, need to specify the mask for the paste.
    # RGBA images can be used as their own masks.
    mappicture.paste(spriteimage, (x, y), spriteimage)

And the addition to the mapper:

Code:
for objrecord in mapdata.objs:
    spritedb.drawsprite(self.mappicture, graphics, objrecord)



Looking good. A couple things to note that we will need to take care of:
1) There are some objects that appear functional but not visual in nature (object IDs 17 and 63). We could provide markup to indicate what we think they mean, but I think a better approach would be to simply not draw anything.
2) The player start appears slightly in the ground. We will either need to figure out an alignment algorithm to correct this sort of thing, or make a special case for the player sprite.
3) The sprites have Dimensions. We should take this into account for our debug images to make them easier to identify by shape as well as location.

To make this sort of thing easier, I'm going to expand our spritedb file into an actual class which will keep a copy of the corresponding sprite images. This way it can generate debug images on-the-fly when a sprite is not found, and we can make specific entries for empty sprites. To consolidate functionality, I made the debug image function return empty sprites if called with 0 width and height.

Code:
class spritedb(object):
    def __init__(self, graphics):
        self.sprites = {}

        # Simple sprite mapping:
        for (spriteid, recnum, imagenum) in [(0, 6, 9), (25, 35, 2),
                (33, 37, 5)]:
            self.sprites[spriteid] = graphics.records[recnum].images[imagenum]

        # Empty sprites:
        # For future reference, possible meanings are:
        # 17: Respawn point
        # 63: Start??
        for spriteid in [17, 63]:
            self.sprites[spriteid] = graphics.debugimage(spriteid, 0, 0)

        # Cache a reference to the graphics object for future use
        self.graphics = graphics

    def drawsprite(self, mappicture, spriterecord):
        (spriteid, x, y) = spriterecord[0:3]
        (width, height) = spriterecord[5:7]

        if spriteid not in self.sprites:
            self.sprites[spriteid] = self.graphics.debugimage(spriteid, width, height)

        # When pasting masked images, need to specify the mask for the paste.
        # RGBA images can be used as their own masks.
        mappicture.paste(self.sprites[spriteid], (x, y), self.sprites[spriteid])

Updated debug image function:
Code:
def debugimage(index, width, height):
    """ Creates a debug image for sprites """
    if width > 0 and height > 0:
        colour = (index, index, index)
        tempimage = Image.new("RGBA", (width, height), colour)
        textcolor = (255, 255, 255) if index < 96 else (0, 0, 0)
        pen = ImageDraw.Draw(tempimage)
        pen.text((width/2 - 7, height/2 - 6), '{:02}'.format(index),
            font=imagefile.debugfont, fill=textcolor)
        return tempimage
    else:
        # 1 pixel transparent image
        return Image.new("RGBA", (width, height))



And now, we get to identifying. Time to fire up DosBox. Those 51 sprites look like movable clouds, so I'll go find their sprite first. Once I get in-game, I'm going to go ahead and start identifying world map sprites too.

Hrm, it actually looks like the map goes by different sprite IDs!



That's fine, we can maintain two lookups. I'm going to go ahead and split it, then identify a few sprites. Then again, that doesn't look to be enough info. For instance, the 88 sprites on the map screen appear to be mountains, but mountains have two sides. I think there is some sort of sub-ID going on here. And I think I found it:
Code:
88  288     112 0   0   48  53  2   0   0   0   0   0   0   0   0   0
88  336     112 0   0   48  53  3   0   0   0   0   0   0   0   0   0
88  1232    592 0   0   48  53  6   0   0   0   0   0   0   0   0   0
88  768     112 0   0   48  53  2   0   0   0   0   0   0   0   0   0
88  816     112 0   0   48  53  3   0   0   0   0   0   0   0   0   0
88  976     176 0   0   48  53  0   0   0   0   0   0   0   0   0   0
88  1024    176 0   0   48  53  1   0   0   0   0   0   0   0   0   0

I think it's that column that counts 2, 3, 6, 2, 3, 0, 1. Next I'm going to need to expand the sprite identification to take that into account. But that's a task for tomorrow. I'll also need to figure out some way to clearly indicate the sub ID in the debug image. I may need to move back to 8 point font Smiley

day8.zip is available for anyone who wants it.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #11 on: November 22, 2012, 06:27:28 PM »

Hello folks. Let's get things restructured to allow item varients. I'm also going to introduce two new classes to make things more flexable: an object record class so I don't need to manually specify decoding every time, and a sprite class. The sprite class will be used to allow custom handling for certain sprites, like the player sprite (which needs an offset).

The objrecord class gets added to the xargonmap file:
Code:
class objrecord(object):
    def __init__(self, record):
        self.rawdata = record
        (self.sprtype, self.x, self.y) = record[0:3]
        (self.width, self.height, self.subtype) = record[5:8]

With minor changes to init and debug output methods to compensate.

Next we need to update our sprite db and graphics file (i.e. debug image) to support the two identifiers. Let's do the debug image first. Note that I'm changing this to draw a slightly larger image if needed, and make the image rectangle semi-transparent. I also stopped bothering to pick different shades of gray.
Code:
@staticmethod
def debugimage(index, subindex, width, height):
    """ Creates a debug image for sprites """
    if width > 0 and height > 0:
        # Provide sufficient space to display text
        imgwidth = max(width, 32)
        imgheight = max(height, 16)
        tempimage = Image.new("RGBA", (imgwidth, imgheight))
        pen = ImageDraw.Draw(tempimage)
        pen.rectangle(((0, 0), (width, height)), fill=(64, 64, 64, 128))
        pen.text((imgwidth/2 - 15, imgheight/2 - 6), '{}:{}'.format(index,subindex),
            font=imagefile.debugfont, fill=(255,255,255))
        return tempimage
    else:
        # 1 pixel transparent image
        return Image.new("RGBA", (width, height))

And the spritedb changes for both the two identifiers, and the dedicated sprite class. I also get rid of the two record structures, since that appears to be (another) misinterpretation.

Code:
class spritedb(object):
    def addsprite(self, sprtype, subtype, sprite):
        if sprtype not in self.sprites:
            self.sprites[sprtype] = {}
        self.sprites[sprtype][subtype] = sprite

    def __init__(self, graphics):
        self.sprites = {}
        self.mapsprites = {}

        # Manually-defined sprites (i.e. special handling needed
        self.addsprite(0, 4, sprite(graphics.records[6].images[9], yoffs=-8))

        # Simple sprite mapping. Stage sprites, then Map sprites
        for (sprtype, subtype, recnum, imagenum) in [(25, 0, 35, 2), # Monsters
                (33, 5, 37, 5), # Pickups
                (51, 0, 36, 33), # Clouds
                (5, 0, 47, 8)]: # Map Player
            self.addsprite(sprtype, subtype, sprite(graphics.records[recnum].images[imagenum]))

        # Empty sprites:
        # For future reference, possible meanings are:
        # 17-1: Respawn point
        # 63-3: Start??
        for sprtype, subtype in [(17,1), (63,3), (19,0)]:
            self.addsprite(sprtype, subtype, sprite(graphics.debugimage(sprtype, subtype, 0, 0)))

        # Cache a reference to the graphics object for future use
        self.graphics = graphics

    def drawsprite(self, mappicture, objrec):
        if objrec.sprtype not in self.sprites or \
                objrec.subtype not in self.sprites[objrec.sprtype]:
            self.addsprite(objrec.sprtype, objrec.subtype, sprite(
                self.graphics.debugimage(objrec.sprtype, objrec.subtype,
                objrec.width, objrec.height)))

        self.sprites[objrec.sprtype][objrec.subtype].draw(mappicture, objrec)

class sprite(object):
    def __init__(self, image, xoffs=0, yoffs=0):
        self.image = image
        self.xoffs = xoffs
        self.yoffs = yoffs

    def draw(self, mappicture, objrec):
        # When pasting masked images, need to specify the mask for the paste.
        # RGBA images can be used as their own masks.
        mappicture.paste(self.image, (objrec.x +self.xoffs,
            objrec.y +self.yoffs), self.image)



Excellant. Now it's time to ACTUALLY get decoding things, since we won't end up with overlapping definitions any more. Things a pretty straightforward at this point. Here's the properly identified first section of the map:



And a bit further into level 1:



But it looks like our interior room has a special sprite for the in-game text.



That's going to require something special to implement. Also, we're going to want to indicate what is INSIDE a present, since we can decode that information. That too, will require special handling. With the architecture we have now, we can simply set up subclasses of the sprite class to draw more complicated information. Unfortunately, we're out of time for today, so that will have to be tomorrow.

day9.zip is available.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #12 on: November 23, 2012, 06:20:24 PM »

Good evening. Time to tackle the two enhancements we wanted yesterday. Enhancement #1 is for present contents. For these I'm just going to expand the capabilities of the basic sprite class and not even make a separate class. I will simply add an optional parameter for the content sprite. When drawing the sprite, if contents are assigned, they will be drawn immediately above the sprite itself.

Code:
class sprite(object):
    def __init__(self, image, xoffs=0, yoffs=0, contents=None):
        self.image = image
        self.xoffs = xoffs
        self.yoffs = yoffs
        self.contents = contents

    def draw(self, mappicture, objrec):
        # When pasting masked images, need to specify the mask for the paste.
        # RGBA images can be used as their own masks.
        mappicture.paste(self.image, (objrec.x +self.xoffs,
            objrec.y +self.yoffs), self.image)

        if contents != None:
            # Place contents immediately above the current sprite
            mappicture.paste(self.contents, (objrec.x +self.xoffs,
                objrec.y +self.yoffs - self.contents.size[1]), self.contents)


And minor changes to the database:
Code:
# Presents have contents:
for (sprtype, subtype, recnum, imagenum, crecnum, cimagenum) in [
        (26, 2, 37, 27, 37, 6), # Treasure (Cherry)
        (26, 0, 37, 27, 37, 33) # Treasure (Health)
        ]:
    self.addsprite(sprtype, subtype, sprite(graphics.records[recnum].images[imagenum],
        contents=graphics.records[crecnum].images[cimagenum]))



That was easy. Now for text. There are two approaches we can use for text:
1) Use the raw text images we have to reproduce the in-game text
2) Find a close match and just go with it

That said, what we got directly out of the reference file appears a bit strange and would need some post-processing to get it to appear correct. The second option is certainly simpler, and I will go with that for now. As long as I can find a fixed-width font with reasonably similar weight and size, it should be pretty equivalent. For reference, here is the direct font data from the GRAPHICS file:



Stage 1 is easy, but we need to figure out the general solution for picking the correct text string. For that, we will refer to stages 3 and 33 (aka the ending).

Let me include the records for object 7 from each stage together below (stage # is the first column). I will also number the columns from the original file as a zero-based index along top.
Code:
    0   1       2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
1   7   1584    788 9   -1  184 8   0   0   0   0   0   4   55  68  0   0
3   7   1824    560 6   -1  112 8   0   0   0   0   0   4   95  72  0   0
3   7   1856    576 6   -1  32  8   0   0   0   0   0   4   94  72  0   0
33  7   1104    36  7   -1  160 8   0   0   0   0   0   4   71  72  0   0
33  7   1440    36  7   -1  160 8   0   0   0   0   0   4   69  72  0   0
33  7   1120    130 7   -1  120 8   0   0   0   0   0   4   55  72  0   0
33  7   1456    130 7   -1  120 8   0   0   0   0   0   4   44  72  0   0
33  7   112     484 7   -1  136 8   0   0   0   0   0   4   31  72  0   0
33  7   112     578 7   -1  120 8   0   0   0   0   0   4   25  72  0   0
33  7   144     708 7   -1  56  8   0   0   0   0   0   4   24  72  0   0
33  7   80      48  6   -1  176 8   0   0   0   0   0   4   5   72  0   0
33  7   112     64  6   -1  112 8   0   0   0   0   0   4   3   72  0   0
33  7   112     96  7   -1  112 8   0   0   0   0   0   4   1   72  0   0
33  7   512     820 8   -1  120 8   0   0   0   0   0   4   250 71  0   0

We've already identified indices 0 to 2 and 5 - 7, although the obvious choice of the subtype does not appear to be used to select the specific string to be displayed. The only one that appears different for each string is column 13, but that doesn't make much sense on its own. That said, I think column 3 might be the colour index, of the first 16 colours in the palette. 9 is blue, 7 is white, 6 is yellow, 8 is dark gray.

Hrm, giving it a bit more of a look, it's possible that it might be part of a 16-bit number. The wrap around from 1 72 to 250 71 seems to imply that. Let me fix my decoding and try again. Here are just the numbers, (+ some in hex)
Code:
1   17463   0x4437
3   18527   0x485F
3   18526   0x485E
33  18503
33  18501
33  18487
33  18476
33  18463
33  18457
33  18456
33  18437
33  18435
33  18433
33  18426

I don't get it. I'm just going to have to hack something together. The ONLY pattern I think I can figure out is that these numbers are decreasing compared to the order of the strings. I also noticed that apparently object #17 also uses strings, as does object 6. Object 17 appears to be for internal use, so I won't actually draw those, but object 6 appears to be another string region (different font maybe?). Either way, here's what my hacked up approach will be:
1) Look for all numbers in this column, and collect them in a list in descending order.
2) When I have a string to grab, I will pull the string that corresponds to the order within the strings array corresponding to this number.

That should work for maps 1 and 3, but I have no idea how well it will sort itself out for map 33. At least it will get us un-stuck for now. First, let's add the new fields to the record:

Code:
class objrecord(object):
    def __init__(self, record):
        self.rawdata = record
        (self.sprtype, self.x, self.y, self.colour) = record[0:4]
        (self.width, self.height, self.subtype) = record[5:8]
        self.stringref = record[13]

Now, let's get that lookup table implemented:
Code:
# String reference lookup table. This is a bit of a hack for now.
# Sort all known string references in reverse order:
self.stringlookup = [record.stringref for record in self.objs if record.stringref > 0]
self.stringlookup.sort(reverse=True)

And a lookup method to get the correct string:
Code:
def getstring(self, stringref):
    strindex = self.stringlookup.index(stringref)
    return self.strings[strindex]

To get the right text colour, I will need a palette lookup method for the graphics object:
Code:
def getcolour(self, index):
    return tuple(self.palette[index*3:index*3+3])

To be able to do the text lookup, I need to pass a reference to the map into my sprite draw routine. Then I need to actually write the text sprite class to take advantage of all this new infrastructure. For the font, I'll just go with DroidSansMono for now. I'll pick 10 point font for object 7 and 8 point font for object 6. Here's how the text sprite class ended up:

Code:
class textsprite(sprite):
    def __init__(self, font, graphics):
        self.font = font
        self.graphics = graphics

    def draw(self, mappicture, objrec, mapdata):
        pen = ImageDraw.Draw(mappicture)
        pen.text((objrec.x, objrec.y), mapdata.getstring(objrec.stringref),
                font=self.font, fill=self.graphics.getcolour(objrec.colour))

More importantly, it works!



And even works for Board 33!



But the font is a bit too thin. Let me see if I can find a monospaced font that has a dedicated bold version. I'll also tweak the alignment a bit.

Here's FreeMonoBold, with a bit larger size. It'll do for a while. It's not that great, so I think we will want to get some handling for the ingame font. But that can wait now that we have the actual text display working nicely.



Now on to identifying items. Stage 1 is done, so stage 2 is next.

But it looks like we hit a snag on level 2. There is a centipede monster in this stage, which is stored as a series of segments in the graphics file, so we can't just draw a single sprite. We need a method for creating a compound image to draw him properly. Also, sprite 73 appears to be a hidden pickup. We will want to indicate these some way (i.e. by making them semi-transparent, probably). I will tackle all this tomorrow. Both cases should be just a matter of pre-processing the images before creating the corresponding sprite object and shouldn't require their own types of sprites.

day10.zip is available.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #13 on: November 24, 2012, 01:31:56 PM »

Today I'm going to handle the slightly more complicated sprites we want to handle. First, here's a screenshot of the centipede monster we want to re-create from the individual segment sprites:



I count a head, six segments, then a tail. The map data tells me that the bounding box is 76 x 22, although the actual sprite dimensions vary from 16x20 (head) to 8x17 (segment) to 12x8 (tail). However, each segment appears to connect directly to the next with no padding on the sprite image itself, so that will make things easy. Adding up the required widths adds up to exactly 76, so there's some additional confirmation. Let's make up a method for making a composite sprite. I'm adding this to the graphics file, since it owns the original images. It could have also been added to the spritedb file, but it would need a copy of the graphics file, so that seems somewhat silly.

Code:
def compositeimage(self, dimensions, imgrequests):
    tempimage = Image.new("RGBA", dimensions)
    for (x, y, recnum, imgnum) in imgrequests:
        pasteimage = self.records[recnum].images[imgnum]
        tempimage.paste(pasteimage, (x, y), pasteimage)
    return tempimage

Code:
self.addsprite(52, 7, sprite(graphics.compositeimage((76, 22), [(0, 0, 52, 0),
    (16, 5, 52, 1), (24, 5, 52, 2), (32, 5, 52, 3), (40, 5, 52, 4),
    (48, 5, 52, 5), (56, 5, 52, 6), (64, 7, 52, 7)] )))

And that worked out well enough:


On to transparency. Again, I'll add this to the graphics file, although it does not really need to go there. It's going to be a static method that simply takes the image to make transparent and the desired alpha value. We're simply going to use the multiply channel operation to perform this:

Code:
@staticmethod
def semitransparent(inimage, alpha):
    alphaimage = Image.new("RGBA", inimage.size, (255, 255, 255, alpha))
    return ImageChops.multiply(inimage, alphaimage)

And expanding our pickup population loop:

Code:
# Pickups appear to be in the same order as their corresponding record.
# There are two types of pickups: normal and hidden.
for subtype in range(24):
    self.addsprite(33, subtype, sprite(graphics.records[37].images[subtype]))
    self.addsprite(73, subtype, sprite(graphics.semitransparent(
        graphics.records[37].images[subtype], 128) ))

Looking good:


Stage 2 is done now. On to stage 3.

Well now. Stage 3 had a black background (that sometimes flashes with lightning). That is a far cry from the blue the map generates for it. Taking a look at the colour palette of the screenshot, and it is indeed different. Board 6 and 7 also look like they should have black backgrounds. However, looking at the map data doesn't seem to imply anything different between those maps and the maps with blue backgrounds. I think I'm just going to have to by the map number for this.

Oh yeah, and re-architect my graphics and sprite routines to allow switching palettes. Piece of cake, right?

Also, we have another problem in stage 3:


Alternate present boxes! Back to the object csv:
Code:
26  864     800 0   0   16  16  0   0   0   2   0   0   0   0   0
26  560     784 1   0   16  16  2   0   0   3   0   0   0   0   0
26  608     768 1   0   16  16  1   0   0   3   0   0   0   0   0
26  496     720 1   0   16  16  2   0   0   3   0   0   0   0   0
26  528     736 1   0   16  16  2   0   0   3   0   0   0   0   0
26  128     112 0   0   16  16  13  0   0   4   0   0   0   0   0
26  112     128 0   0   16  16  11  0   0   4   0   0   0   0   0
26  144     112 0   0   16  16  11  0   0   4   0   0   0   0   0
26  160     96  0   0   16  16  11  0   0   4   0   0   0   0   0
26  176     128 0   0   16  16  11  0   0   4   0   0   0   0   0
26  192     112 0   0   16  16  13  0   0   4   0   0   0   0   0
26  864     272 0   -1  16  16  0   0   0   0   0   0   0   0   0
26  1936    144 0   -1  16  16  0   0   0   0   0   0   0   0   0
26  1824    896 0   -1  16  16  11  0   0   0   0   0   0   0   0
26  1472    896 1   -1  16  16  2   0   0   0   0   0   0   0   0
26  1856    528 0   -1  16  16  11  0   0   0   0   0   0   0   0
26  1376    608 1   0   16  16  2   0   0   5   0   0   0   0   0
26  1328    624 1   0   16  16  13  0   0   5   0   0   0   0   0
26  1280    608 1   0   16  16  13  0   0   5   0   0   0   0   0
26  1264    624 1   0   16  16  2   0   0   5   0   0   0   0   0
26  1168    624 1   0   16  16  2   0   0   5   0   0   0   0   0
26  1040    736 1   0   16  16  2   0   0   5   0   0   0   0   0
26  1936    992 0   -1  16  16  0   0   0   0   0   0   0   0   0
26  272     944 0   -1  16  16  0   0   0   0   0   0   0   0   0
26  1312    992 1   -1  16  16  4   0   0   0   0   0   0   0   0
26  704     816 1   0   16  16  12  0   0   3   0   0   0   0   0

I'm going to guess it's that column that goes 2, 3, 4, 0, 5, 3. To avoid totally re-architecting my sprite algorithm for ONE special case, I'm going to create a new type of sprite for treasure boxes. And yank the contents handling from the normal sprite class, since it doesn't get used in any other context.

Code:
class treasuresprite(sprite):
    def __init__(self, graphics, contents):
        # Create a lookup of possible boxes
        self.types = {0 : graphics.records[37].images[25],
            1 : graphics.debugimage('T', 1, 16, 16),
            2 : graphics.debugimage('T', 2, 16, 16),
            3 : graphics.records[37].images[27],
            4 : graphics.debugimage('T', 4, 16, 16),
            5 : graphics.debugimage('T', 5, 16, 16)
            }
        self.xoffs = 0
        self.yoffs = 0
        self.contents = contents

    def draw(self, mappicture, objrec, mapdata):
        # Pick the correct image then use the parent routine to draw the box
        self.image = self.types[objrec.info]
        super(treasuresprite, self).draw(mappicture, objrec, mapdata)

        # Place contents immediately above the current sprite
        mappicture.paste(self.contents, (objrec.x +self.xoffs,
            objrec.y +self.yoffs - self.contents.size[1]), self.contents)

Well, that works and all, but it misidentifies several treasure boxes. Hrm, comparing map 1 to this map, it appears that map 1 always uses '3' in the 'colour' field, while this map uses 0 and 1. Let's assume it's the colour field instead and re-do this.



Fixed that. Next is the palette thing. To do this, we need to defer the masking operation in the graphics file. We should store the images in index format initially, then create the masked version on-demand. Since we already have other functions directly referencing graphics.records[n].images[m], we should store the raw data in a different location:
Code:
tile = Image.fromstring("P", (width, height),
    self.filedata.read(width*height))
tile.putpalette(palette)
self.origimages.append(tile)
self.images.append(self.maskimage(tile))

Then we need a method of switching palettes:
Code:
def changepalette(self, palette):
    for imagepos, image in enumerate(self.origimages):
        image.putpalette(palette)
        self.images[imagepos] = self.maskimage(image)

Finally, some control at the graphics file level to switch between a choice of palettes. I will also add a check to save time by not switching to the same palette if it is already loaded:

Code:
def changepalette(self, palnum):
    if self.activepal != palnum:
        self.activepal = palnum
        for record in self.records:
            record.changepalette(self.palette[self.activepal])

And don't forget the getcolour method:

Code:
def getcolour(self, index):
    return tuple(self.palette[self.activepal][index*3:index*3+3])

Then we just need to change palettes for each map. We could go by the map name, but I noticed the first byte in the "unknown" region of the map data appears to be the map number. I'm going to go by that.

Code:
if mapdata.mapnum in [3, 6, 7, 10]:
    graphics.changepalette(1)
else:
    graphics.changepalette(0)
sprites = spritedb(graphics)

And it works as expected. Now we don't have the garish blue background in the dark levels. Time to continue identifying sprites in level 3.

Well, I ran into another interesting thing. Sprite 12 can sometimes be visible! We need to determine what decides this and adjust accordingly. Checking the object CSV, it looks like it has a 1 in what I called the "colour" column. Looks like I should expand my treasure box sprite into something that can also handle other similar variable sprites. I will then have to make the contents optional again. Let's see how that turns out:

Code:
class variablesprite(sprite):
    def __init__(self, imagelookup, contents=None):
        # Create a lookup of possible boxes
        self.types = imagelookup
        self.xoffs = 0
        self.yoffs = 0
        self.contents = contents

    def draw(self, mappicture, objrec, mapdata):
        # Pick the correct image then use the parent routine to draw the box
        self.image = self.types[objrec.colour]
        super(variablesprite, self).draw(mappicture, objrec, mapdata)

        # Place contents immediately above the current sprite
        if self.contents != None:
            mappicture.paste(self.contents, (objrec.x +self.xoffs,
                objrec.y +self.yoffs - self.contents.size[1]), self.contents)

The actual lookup gets pulled out into the sprite db prior to creating the treasure boxes:
Code:
# Treasures (+ contents)
treasurelookup = {0 : graphics.records[37].images[24],
    1 : graphics.records[37].images[25],
    2 : graphics.debugimage('T', 2, 16, 16),
    3 : graphics.records[37].images[27] }

for (sprtype, subtype, crecnum, cimagenum) in [
        (26, 0, 37, 33), # Health
        (26, 1, 37, 2), # Grapes
        (26, 2, 37, 6), # Cherry
        (26, 4, 37, 14), # Orange
        (26, 11, 30, 28), # Emerald
        (26, 12, 48, 2), # Nitro!
        (26, 13, 36, 29) # Empty
        ]:
    self.addsprite(sprtype, subtype, variablesprite(treasurelookup,
        contents=graphics.records[crecnum].images[cimagenum]))

And the switches. FYI: 30, 19 is an empty sprite
Code:
# Switches:
self.addsprite(12, 0, variablesprite({
    0 : graphics.records[30].images[19],
    1 : graphics.records[51].images[0]}))

Oh yeah, and there's a series of hidden platforms that are exposed by this switch. Let's create them as a composite AND semi-transparent sprite:

Code:
self.addsprite(11, 0, sprite(graphics.semitransparent(
    graphics.compositeimage((32, 16), [(0, 0, 25, 14),
    (16, 0, 25, 15)]), 128) ))

And here's how the switch and hidden platforms look:


With that, and a few more sprites identified, stage 3 is done. day11.zip is available.
Logged
Zerker
Newbie
*
Offline Offline

Posts: 37



« Reply #14 on: November 25, 2012, 09:47:47 AM »

Good morning folks. When doing the switches and treasure boxes yesterday, I came to a realization. The column that I first attempted as the treasure box appearance actually seems to be the link between a switch and what it affects. Why don't we try drawing this identifier into the world to clearly indicate the effects of switches? If we don't like how it turns out, we can always disable it again.

To do this, I'm simply going to create a new method in the spritedb file to draw this onto the map. Since this is NOT debug text, we're going to want to draw this in a way that is more visible. I'm going to re-use the code from my Shadow Caster map to draw it first offset a couple times in black, then in white to get a bordered effect.

Code:
def drawsprite(self, mappicture, objrec, mapdata):
    if objrec.sprtype not in self.sprites or \
            objrec.subtype not in self.sprites[objrec.sprtype]:
        self.addsprite(objrec.sprtype, objrec.subtype, sprite(
            self.graphics.debugimage(objrec.sprtype, objrec.subtype,
            objrec.width, objrec.height)))

    self.sprites[objrec.sprtype][objrec.subtype].draw(mappicture, objrec, mapdata)

    if objrec.info != 0:
        self.drawlabel(mappicture, (objrec.x -8, objrec.y -8), str(objrec.info))


def drawlabel(self, mappicture, coords, text):
    # Draw the text 5 times to create an outline
    # (4 x black then 1 x white)
    pen = ImageDraw.Draw(mappicture)
    for offset, colour in [( (-1,-1), (0,0,0) ),
            ( (-1,1), (0,0,0) ),
            ( (1,-1), (0,0,0) ),
            ( (1,1), (0,0,0) ),
            ( (0,0), (255,255,255) )]:
        pen.text((coords[0] + offset[0], coords[1] + offset[1]),
            text, font=self.markupfont, fill=colour)



Well, it seems to show up in too many places. I'm going to comment it out for now, but let's keep it in the back of our minds. We may still want something similar, but only for specific scenarios. This also appears to provide links for doorways, which is always handy.

In any case, I'm just going to go ahead and identify items in stage 4 now. Though I'm getting really tired of going in and out of subfolders to try to find sprites. I'm going to flatten my export structure from the graphics file, and remove the record offset.

Code:
    def save(self, outpath):
        createpath(outpath)
        for recnum, record in enumerate(self.records):
            record.save(outpath, recnum)
Code:
def save(self, outpath, recnum):
    if self.numimages > 0:
        createpath(outpath)
        for tilenum, tile in enumerate(self.images):
            tile.save(os.path.join(outpath, '{:02}-{:04}.png'.format(recnum, tilenum)) )

The first few sprites weren't anything special, although I found the sprite number of an illusionary wall, which I made semi-transparent. However, I ran into the following location on the map:



Which doesn't contain any unknown sprites in my current map. This must be another object I'm not drawing. Let me comment out my hidden objects and find which one it is. But even turning off every identified sprite STILL doesn't draw anything in this location. It doesn't make any sense. I will finish mapping this stage, then double-check my map decoding to make sure I'm not missing any sprites somehow.

I also ran into ceiling sprites, which are apparently the same ID as floor spikes. Probably differentiated by the "colour" field. I'm going to go ahead and rename that field to appearance. Though looking at the objs file, it appears to be the next field one over. I'll tentatively call this one "direction" and update the variablesprite class to specify which field to look up. I'll have to use access the raw underlying python field dictionary to make it work, though.

Code:
class variablesprite(sprite):
    def __init__(self, imagelookup, contents=None, field='apperance'):
        # Create a lookup of possible boxes
        self.types = imagelookup
        self.xoffs = 0
        self.yoffs = 0
        self.contents = contents
        self.field = field

    def draw(self, mappicture, objrec, mapdata):
        # Pick the correct image then use the parent routine to draw the box
        self.image = self.types[objrec.__dict__[self.field]]
        super(variablesprite, self).draw(mappicture, objrec, mapdata)

        # Place contents immediately above the current sprite
        if self.contents != None:
            mappicture.paste(self.contents, (objrec.x +self.xoffs,
                objrec.y +self.yoffs - self.contents.size[1]), self.contents)
Code:
# Spikes:
self.addsprite(59, 0, variablesprite({
    0 : graphics.records[36].images[28],
    1 : graphics.records[36].images[32]},
    field='direction'))

Looking over the objects file, I figured out what was going on. Those bouncing ball traps reported themselves as having no size in the objects file! This malfunctioned by creating an invisible debug image. I'm going to go ahead and switch to always generate a debug image. For any sprites I did not want to draw, I'm going to redirect to record 36, image 28 instead (i.e. an empty sprite).

With that, stage 4 is complete.



day12.zip is available. I've moved the CSVs and flat maps into a sub-folder in order to better organize things.
Logged
Pages: [1] 2   Go Up
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.20 | SMF © 2013, Simple Machines Valid XHTML 1.0! Valid CSS!