View Issue Details

ID: 0002513
Project: Dwarf Fortress
Category: Miscellaneous Crashes
View Status: public
Last Update: 2010-09-24 05:55
Reporter: smariot
Assigned To: Logical2u
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: no change required
Platform: 64bit Linux
OS: Gentoo
OS Version: over 9000!!!!!!!
Product Version: 0.31.08
Summary: 0002513: Fails to start.
Description: I updated my system, and Dwarf Fortress stopped working; it dumps the following message to stderr:

*** glibc detected *** ./libs/Dwarf_Fortress: realloc(): invalid next size: 0x0b01dc48 ***
======= Backtrace: =========
/lib32/libc.so.6(+0x6c5d1)[0xf70705d1]
/lib32/libc.so.6(+0x71d35)[0xf7075d35]
/lib32/libc.so.6(realloc+0xdd)[0xf707601d]
//usr/lib32/opengl/nvidia/lib/libGL.so.1(+0x2f831)[0xf61c8831]
(plus some junk I don't expect you'll find useful)


I noticed it worked properly under Valgrind, so I theorized it might be caused by some strange memory alignment issue.

After some playing around, I came up with this bash script:


#!/bin/bash
gcc -x c - -shared -fpic -m32 -o /tmp/df-workaround.so << EOF
#include <stdlib.h>
void *malloc(size_t size) {
  return realloc(0, size);
}
EOF
LD_PRELOAD=/tmp/df-workaround.so ./libs/Dwarf_Fortress


This hack obviously shouldn't work, but for whatever reason, it does. It makes no sense, and I'm very tempted to blame it on glibc. I'm using 2.11.2 if you want to try to reproduce this.
Tags: No tags attached.

Relationships

related to 0002997 resolved Logical2u malloc memory corruption (heap dump)

Activities

user6

2010-06-29 12:56

  ~0009315

Have you tried updating your graphics drivers?

smariot

2010-06-29 13:01

reporter   ~0009317

Thinking about it, it probably really is a memory alignment issue, and my hack fixes it for no other reason than that it manages to re-arrange things so that the offending allocation falls in a safe location entirely by chance.

smariot

2010-06-29 13:26

reporter   ~0009318

Last edited: 2010-06-29 13:57

My drivers are the latest version available in portage, but admittedly, that's about two months behind the bleeding edge stuff on Nvidia's site.

I concede that it could in fact be the driver's fault, but I'm not quite willing to try installing them - I expect their installer will try to do something stupid like overwrite my Mesa OpenGL headers.

-- Edit --

I extracted the 256.35 Nvidia OpenGL libraries from the bleeding edge driver and set DF to use them - it crashes in exactly the same way. I have no intention of attempting to compile and use the 256.35 kernel module.

smariot

2010-07-11 15:27

reporter   ~0009965

0.31.10 works without modification for me, unlike 0.31.08, which still explodes if I don't replace malloc.

However, I'm not convinced it's actually fixed, and instead have decided to assign a 98.4% probability to it being fixed if Toady can get to version 0.31.16 without it breaking again.

oliver

2010-07-11 15:51

reporter   ~0009967

That class of error usually means something somewhere has overrun its allocated block, trashing the heap control data. Horrible type of error to find, because it could be caused by essentially anything before the crash, not necessarily code in the immediate call path.

As you say, your realloc hack probably just juggles alignment so through random chance there happens to be a bit of "safe" memory at the location in question, and the overrun doesn't hurt anything. Maybe you could try running with Electric Fence / DUMA?
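For illustration, here is a minimal sketch of how an overrun past the end of a block can be detected. It uses canary bytes placed after the usable region, which is a simpler technique than the inaccessible guard pages that Electric Fence and DUMA rely on. The names `guarded_alloc`, `guard_intact`, `GUARD_SIZE`, and `GUARD_BYTE` are hypothetical and not part of any library or of DF.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16    /* hypothetical canary width, in bytes */
#define GUARD_BYTE 0xA5  /* hypothetical canary fill pattern */

/* Allocate `size` usable bytes followed by GUARD_SIZE canary bytes.
 * Returns NULL on failure or if the padded size would overflow. */
void *guarded_alloc(size_t size) {
    if (size > SIZE_MAX - GUARD_SIZE) {
        return NULL;
    }
    unsigned char *p = malloc(size + GUARD_SIZE);
    if (p == NULL) {
        return NULL;
    }
    /* Fill only the trailing canary; the usable region is left as-is. */
    memset(p + size, GUARD_BYTE, GUARD_SIZE);
    return p;
}

/* Return 1 if the canary after the block is untouched, 0 if any
 * canary byte has been overwritten (i.e. the block was overrun). */
int guard_intact(const void *block, size_t size) {
    assert(block != NULL);
    const unsigned char *guard = (const unsigned char *)block + size;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (guard[i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}
```

A caller would check `guard_intact(buf, size)` before freeing a block; a return of 0 means something wrote past the end. Unlike a guard page, this only reports the damage after the fact and cannot say which code caused it.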

Baughn

2010-07-12 02:05

manager   ~0010011

If that was the problem, Valgrind really should have caught it. I regularly run DF under Valgrind for testing, too.

I saw libGL in your backtrace. Does this crash also happen in 2D mode?

smariot

2010-07-12 11:47

reporter   ~0010052

Last edited: 2010-07-12 18:46

Valgrind didn't have any serious complaints about DF, and it should be more than capable of detecting a buffer overflow.

2D mode seems to be the default; at least, my init.txt file contained [PRINT_MODE:2D] when I looked at it.

I tried changing it to [PRINT_MODE:TEXT], and it still breaks, still with libGL.so.1 on the bottom of the stack. It really is using text mode, though: it shows up on my terminal when I use the malloc hack.

---

I created a new user on my computer, with the intention of maybe letting you use my computer remotely, but it works under the new user. >_<

And no amount of fiddling is breaking it.

---

Copied .bashrc to new user and sourced it, in case it was an environment variable. Still works.
Tried using original user remotely. Still broken.
Deleted ~/.* under original user. Still broken.
Killed X and started DF from console in text mode as original user. **Works**
Did a proper graphical login under new user. Still works.

In conclusion, I can't make this stupid bug manifest itself reliably. The game was obviously programmed to explode when USER=smariot, X is running, and the minor version is a power of two, for no other reason than to mess with my head.

---

It occurred to me that the new user wasn't part of the audio group on my computer, and that therefore the sound system wasn't being initialized. I added the new user to the audio group, and now it crashes under that user. Woot!

smariot

2010-09-24 01:04

reporter   ~0012938

The previous 6 versions have all worked properly for me, so I'm inclined to conclude that whatever the problem was with 0.31.08 has since been corrected.

Logical2u

2010-09-24 05:55

manager   ~0012942

I'll hold off on marking it 'fixed' for now, but if it's not occurring any more, I'll at the very least resolve it. Reopen this if it starts up again.

The child issue of this one seems to be somewhat different, so I've replaced that relationship.

Issue History

Date Modified Username Field Change
2010-06-29 12:45 smariot New Issue
2010-06-29 12:56 user6 Note Added: 0009315
2010-06-29 13:01 smariot Note Added: 0009317
2010-06-29 13:26 smariot Note Added: 0009318
2010-06-29 13:57 smariot Note Edited: 0009318
2010-07-11 15:27 smariot Note Added: 0009965
2010-07-11 15:51 oliver Note Added: 0009967
2010-07-12 02:05 Baughn Note Added: 0010011
2010-07-12 11:47 smariot Note Added: 0010052
2010-07-12 13:20 smariot Note Edited: 0010052
2010-07-12 13:52 smariot Note Edited: 0010052
2010-07-12 18:46 smariot Note Edited: 0010052
2010-08-10 06:09 user6 Relationship added parent of 0002997
2010-09-24 01:04 smariot Note Added: 0012938
2010-09-24 05:53 Logical2u Relationship replaced related to 0002997
2010-09-24 05:55 Logical2u Note Added: 0012942
2010-09-24 05:55 Logical2u Status new => resolved
2010-09-24 05:55 Logical2u Resolution open => no change required
2010-09-24 05:55 Logical2u Assigned To => Logical2u