Monday 8 September 2008

STPPC2X - "Mines" is now working!

Ah-ha!

Right, FINALLY. I got you, you mines-bast**d. FIXED. FIXED. FIXED. Version 1.0 will have working "mines"! Guaranteed.



Okay, for some reason, the particular version of gcc on ARM that the Open2x development environment uses does not treat an relatively-undocumented (and technically "broken") use of memset on the ARM platform the same as it does on x86. I think it's the alignment of ARM but I can't be bothered to check the exact technical details know that I have found it.

void * memset ( void * ptr, int value, size_t num);
Sets the first num bytes of the block of memory pointed by ptr to the specified value (interpreted as an unsigned char).


Notice the "unsigned char" bit. So when you do what mines.c was doing: memset(a pointer, -2, a number) it works perfectly on x86 (and all the other supported platforms, including MacOS, Palm, etc.) but does this on ARM:

"-1 -2 -2 -2 -1 -2 -2 -2 -1 -2 -2 -2"


Which buggers up mines! The fix is not to use memset but to fill the array manually - you get the proper result and mines no longer crashes but "just works". There were three places, only in mines.c, that do similar things. It seems to work fine for the other uses which are mostly -1's.

This fix was found by the following method:

1) Write original author of mines.c a long begging letter with lots of technical details on the problem
2) Recieve reply with basically says "No idea, can you print out the contents of X at line Y for me to look at?"
3) Edit code to print out contents of X at line Y on both PC and GP2X.
4) Gather debug logs from both platforms and compare.
5) Draft another long begging letter which says "This is what I got".
6) Just before hitting Send with large debug logs attached, notice that there is a pattern to the what-should-be-empty array.
7) On a hunch, gradually keep shifting the debug check higher and higher in the code, eventually placing it immediately after a memset.
8) Spot the problem (memset arrays aren't actually uniformly memset)
9) Google the definition of memset which includes the "unsigned char" line.
10) BELLS RING.
11) Search for all memsets with negative parameters and replace with for() loops.
12) Test on PC and GP2X.
13) Rewrite email to Simon Tatham, including a patch and problem description and press Send.
14) Kick self repeatedly and hard.

Anyway... it's fixed. Mines works. Finally. Can't believe it was something so stupid. Multiple human error, because obviously nobody who's touched the mines code had ever noticed the error before and only a compile on an ARM-based device showed it up. Patch winging its way to Simon Tatham for inclusion in the main code. Sigh of relief. Celebration game of mines.

Now, who's got the champagne?

1 comment:

BATTLEFINCH said...

good work. what a weird error.