View Issue Details

ID: 0010061
Project: Dwarf Fortress
Category: Technical -- General
View Status: public
Last Update: 2019-02-12 12:32
Reporter: Solra Bizna
Assigned To: lethosor
Priority: normal
Severity: crash
Reproducibility: always
Status: resolved
Resolution: duplicate
Platform: Linux
OS: Debian
OS Version: Stretch
Product Version: 0.43.05
Summary: 0010061: This save is seconds away from crashing DF
Description: I'm running DF on Debian Linux. A thousand or so ticks after loading this save, the game crashes.
Steps To Reproduce:
1. Load save.
2. Wait. DF will crash soon after the yak is slaughtered, if not before.
Additional Information: Save is here: http://dffd.bay12games.com/file.php?id=12542

64-bit, PRINT_MODE 2D: Xlib abort with a multithreading-related error message... unless I'm running DF through gdb, in which case there's a floating point exception
64-bit, dfstream: Floating point exception
32-bit, PRINT_MODE 2D: Floating point exception

Changing Z-levels seems to slightly alter the timing of the crash.

All of the floating point exceptions take place with a backtrace similar to this:

#0  0x08c749a1 in ?? ()
#1  0x08d87946 in ?? ()
#2  0x0897d591 in ?? ()
#3  0x0897dad5 in ?? ()
#4  0x083f93e2 in ?? ()
#5  0xf7b29ea1 in interfacest::loop() ()
   from /home/sbizna/df_linux32/libs/libgraphics.so
#6  0x08665d4f in mainloop() ()
#7  0xf7b0cb92 in enablerst::async_loop() ()
   from /home/sbizna/df_linux32/libs/libgraphics.so
#8  0xf7b0cf8d in call_loop(void*) ()
   from /home/sbizna/df_linux32/libs/libgraphics.so
#9  0xf7efb155 in ?? () from /usr/lib/i386-linux-gnu/libSDL-1.2.so.0
#10 0xf7f3f048 in ?? () from /usr/lib/i386-linux-gnu/libSDL-1.2.so.0
#11 0xf735e2da in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#12 0xf781691e in clone () from /lib/i386-linux-gnu/libc.so.6
Tags: 0.44.09, 0.44.12

Relationships

duplicate of 0008410 (resolved, lethosor): Crash due to zero-size weasel
has duplicate 0010859 (resolved, lethosor): Constant Crashes

Activities

Solra Bizna

2016-11-03 12:59

reporter   ~0036028

With the help of dwarf_fortress_unfuck I did a little investigating. The exceptions occur within interfacest::loop()'s call to currentscreen->logic(). I hacked in a SIGFPE handler that prints the type of error that occurred (for diagnostic purposes), and throws a catchable exception, then wrapped the currentscreen->logic() call in a try/catch block for that exception. With this in place, if currentscreen->logic() triggers a SIGFPE, it is simply skipped until the next time the loop runs.

From this, I learned two things:

1. As I suspected, the SIGFPE is being caused by an integer divide by zero. (The si_code is FPE_INTDIV.)
2. With this hack in place, instead of crashing, the game... "freezes". The interface is responsive, and it seems to believe it's still processing ticks, but creatures stop moving. Days still seem to pass normally, and at random intervals the game "unfreezes" for a few dozen ticks.
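For readers who want to try a similar diagnostic, here is a minimal sketch of a SIGFPE guard. It is not the code used in the note above: instead of throwing a C++ exception from the signal handler, it swaps in sigsetjmp/siglongjmp, which is the more conventional way to leave a handler on POSIX systems. The names describe_fpe and run_guarded are hypothetical, and the step callback merely stands in for the currentscreen->logic() call.

```cpp
#include <cassert>
#include <cstdio>
#include <setjmp.h>
#include <signal.h>
#include <string>

// Jump target used to escape from the signal handler back into run_guarded().
static sigjmp_buf g_fpe_jump;
// si_code recorded by the most recent SIGFPE.
static volatile sig_atomic_t g_last_fpe_code = 0;

// Translate a SIGFPE si_code into a readable description.
std::string describe_fpe(int code) {
    switch (code) {
        case FPE_INTDIV: return "integer divide by zero";
        case FPE_INTOVF: return "integer overflow";
        case FPE_FLTDIV: return "floating-point divide by zero";
        case FPE_FLTOVF: return "floating-point overflow";
        case FPE_FLTUND: return "floating-point underflow";
        case FPE_FLTRES: return "floating-point inexact result";
        case FPE_FLTINV: return "invalid floating-point operation";
        case FPE_FLTSUB: return "subscript out of range";
        default:         return "other (si_code " + std::to_string(code) + ")";
    }
}

// Handler: record the cause, then jump back out of the faulting call.
static void on_sigfpe(int, siginfo_t* info, void*) {
    g_last_fpe_code = info->si_code;
    siglongjmp(g_fpe_jump, 1);
}

// Runs step() once. Returns true if it completed, false if a SIGFPE cut it
// short. Caveat: siglongjmp skips destructors of objects created inside
// step(), so this is a diagnostic hack, not a fix.
bool run_guarded(void (*step)()) {
    assert(step != nullptr);

    struct sigaction sa = {}, old = {};
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;          // ask for siginfo_t so si_code is available
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, &old);

    // Second argument 1: save the signal mask so SIGFPE is unblocked again
    // after the jump out of the handler.
    if (sigsetjmp(g_fpe_jump, 1) != 0) {
        std::fprintf(stderr, "SIGFPE skipped: %s\n",
                     describe_fpe(g_last_fpe_code).c_str());
        sigaction(SIGFPE, &old, nullptr);
        return false;
    }

    step();
    sigaction(SIGFPE, &old, nullptr);
    return true;
}
```

Calling run_guarded once per loop iteration reproduces the "skip this tick's logic and carry on" behaviour. A portable way to exercise it is to pass a callback that calls raise(SIGFPE); in that case the reported si_code is not FPE_INTDIV, because the signal was sent by software rather than by a faulting divide instruction.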

Interestingly, if I stop blocking the river (by pulling the northmost lever in Asob's office), fluid flow continues during the "freeze".

The outpost liaison is visiting the fort. I suspected he might be the cause of the problem, so I ordered my crummy militia to bludgeon him to death. This didn't prevent the divide by zero from eventually occurring, and _did_ cause a loyalty cascade.

Deleting all of my work orders in the new manager didn't change anything either.

If I just power through the "freezes", the SIGFPEs eventually stop happening.

lethosor

2018-04-12 09:22

manager   ~0038154

Last edited: 2018-04-18 08:54

Confirmed in vanilla 0.44.09 on OS X.
For what it's worth, it appears to crash consistently at 0x0000000100c966e1 (at least 3 times now), although that's probably not very useful.
This was brought to my attention by https://github.com/DFHack/dfhack/issues/1257 , which may be the same issue as this.

lethosor

2018-08-08 09:39

manager   ~0038709

Save from 0.44.12 (from 0010859): http://dffd.bay12games.com/file.php?id=13950

risusinf

2019-02-07 02:19

reporter   ~0039194

Last edited: 2019-02-10 21:36

Both saves (the one from the OP and the one from the comment above) stop crashing after
[DFHACK]# exterminate weasel
which is very similar to 0008410. It says something about zero body size; how do I check that?

Also see 0010253

lethosor

2019-02-12 12:27

manager   ~0039208

Last edited: 2019-02-12 12:31

Highlighting the weasel and running "lua ~unit.body.size_info" in DFHack prints
size_cur               	 = 0
size_base              	 = 1
area_cur               	 = 0
area_base              	 = 1
length_cur             	 = 0
length_base            	 = 21


Killing the weasel with DFHack stops the crash. From running "for _,u in ipairs(world.units.all) do if u.body.size_info.size_cur == 0 then print(u.id) end end" in Lua, this is the only zero-sized unit.
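The same check can be sketched outside DFHack. The C++ below is illustrative only: SizeInfo and Unit are hypothetical stand-ins that mirror just the field names printed above, not DF's or DFHack's real structures, and zero_sized_unit_ids is an invented helper name.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the fields shown by "lua ~unit.body.size_info".
struct SizeInfo {
    int32_t size_cur, size_base;
    int32_t area_cur, area_base;
    int32_t length_cur, length_base;
};

// Hypothetical unit record: just an id plus its size info.
struct Unit {
    int32_t id;
    SizeInfo size_info;
};

// Collect the ids of all units whose current size is zero. Any code that
// later divides by size_cur would hit an integer divide by zero for these.
std::vector<int32_t> zero_sized_unit_ids(const std::vector<Unit>& units) {
    std::vector<int32_t> ids;
    for (const Unit& u : units) {
        if (u.size_info.size_cur == 0) {
            ids.push_back(u.id);
        }
    }
    return ids;
}
```

With the values printed above (size_cur = 0, size_base = 1), the weasel would be the only id returned for this save.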

Thanks for investigating! I'll close this as a duplicate of 0008410.

Issue History

Date Modified Username Field Change
2016-11-02 11:47 Solra Bizna New Issue
2016-11-03 12:59 Solra Bizna Note Added: 0036028
2018-04-12 09:22 lethosor Note Added: 0038154
2018-04-12 09:22 lethosor Assigned To => lethosor
2018-04-12 09:22 lethosor Status new => confirmed
2018-04-12 09:22 lethosor Tag Attached: 0.44.09
2018-04-18 08:54 lethosor Note Edited: 0038154
2018-08-08 09:38 lethosor Relationship added has duplicate 0010859
2018-08-08 09:39 lethosor Note Added: 0038709
2018-08-08 09:39 lethosor Tag Attached: 0.44.12
2019-02-07 02:19 risusinf Note Added: 0039194
2019-02-10 21:36 risusinf Note Edited: 0039194
2019-02-12 12:27 lethosor Note Added: 0039208
2019-02-12 12:31 lethosor Note Edited: 0039208
2019-02-12 12:32 lethosor Relationship added duplicate of 0008410
2019-02-12 12:32 lethosor Status confirmed => resolved
2019-02-12 12:32 lethosor Resolution open => duplicate