December 23, 2009

Unbreaking Mono on FreeBSD 6.4

Categories: FreeBSD, Mono.

A few days ago, ports/140916 ("lang/mono (2.4.2.3) installation fails") was opened. The problem had already been reported a few times, but with too little feedback for me to diagnose it, let alone provide a fix. This time, however, the reporter noticed that it only happened for him on FreeBSD 6.4. I ran the build on a fresh FreeBSD 6.4 install and could trigger the error:

arthur# portsnap fetch
[...]
arthur# portsnap extract
[...]
arthur# cd /usr/ports/lang/mono
arthur# make -V PKGNAME
mono-2.4.2.3_1
arthur# make
[...]
if test -w ../mcs; then :; else chmod -R +w ../mcs; fi
cd ../mcs && gmake NO_DIR_CHECK=1 PROFILES='net_1_1 net_2_0 net_3_5 net_2_1' CC='cc' all-profiles
gmake[3]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake profile-do--net_1_1--all profile-do--net_2_0--all profile-do--net_3_5--all profile-do--net_2_1--all
gmake[4]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake PROFILE=basic all
gmake[5]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[6]: *** [build/deps/basic-profile-check.exe] Error 1
gmake[6]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
*** The compiler 'false' doesn't appear to be usable.
*** Trying the 'monolite' directory.
gmake[7]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[8]: *** [build/deps/basic-profile-check.exe] Error 138
gmake[8]: Entering directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
*** The contents of your 'monolite' directory may be out-of-date
*** You may want to try 'make get-monolite-latest'
gmake[8]: *** [do-profile-check-monolite] Error 1
gmake[8]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[7]: *** [do-profile-check] Error 2
gmake[7]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[6]: *** [do-profile-check-monolite] Error 2
gmake[6]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[5]: *** [do-profile-check] Error 2
gmake[5]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[4]: *** [profile-do--basic--all] Error 2
gmake[4]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[3]: *** [profiles-do--all] Error 2
gmake[3]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/mcs'
gmake[2]: *** [all-local] Error 2
gmake[2]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3/runtime'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/usr/ports/lang/mono/work/mono-2.4.2.3'
gmake: *** [all] Error 2
*** Error code 1

Stop in /usr/ports/lang/mono.
*** Error code 1

Stop in /usr/ports/lang/mono.

A first look sends us down the wrong path: *** The compiler 'false' doesn't appear to be usable. This is actually not an error message: the Mono C# compiler (mcs(1)) is written in C#, so some bootstrapping may be required if no version of Mono is present on the system. The mono port adds EXTERNAL_MCS=false to MAKE_ARGS so that it is built exactly the same way regardless of whether an mcs(1) is available in the path.

Fortunately, I was testing on an old machine and could see that something ran for a couple of seconds before failing. Ctrl+T in tcsh(1) told me more:

load: 0.08  cmd: mono 10910 [ksesigwait] 0.13u 0.10s 0% 24492k
load: 0.08  cmd: mono 10910 [ksesigwait] 0.13u 0.10s 0% 24964k

The good news is that some Mono application is running. But wait, what is this ksesigwait state? A first look at the FreeBSD Kernel Cross Reference reveals that it is defined in kern/kern_kse.c, and the matching man page is kse(2). According to Wikipedia, KSEs were mandatory when introduced, were made optional at kernel build time in the 7.0 release, and were removed in the 8.0 release, with a compatibility library provided.

The second piece of good news is that a 22MB mono.core file was written in the mcs directory.

arthur# gdb -q /usr/ports/lang/mono/work/mono-2.4.2.3/mono/mini/mono mono.core
Core was generated by `mono'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/local/lib/libgthread-2.0.so.0...done.
Loaded symbols for /usr/local/lib/libgthread-2.0.so.0
Reading symbols from /usr/local/lib/libglib-2.0.so.0...done.
Loaded symbols for /usr/local/lib/libglib-2.0.so.0
Reading symbols from /usr/local/lib/libintl.so.8...done.
Loaded symbols for /usr/local/lib/libintl.so.8
Reading symbols from /usr/local/lib/libiconv.so.3...done.
Loaded symbols for /usr/local/lib/libiconv.so.3
Reading symbols from /usr/local/lib/libpcre.so.0...done.
Loaded symbols for /usr/local/lib/libpcre.so.0
Reading symbols from /lib/libm.so.4...done.
Loaded symbols for /lib/libm.so.4
Reading symbols from /lib/libpthread.so.2...done.
Loaded symbols for /lib/libpthread.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x2856bf2a in signalcontext () from /lib/libc.so.6
[New Thread 0x8319c00 (sleeping)]
[New Thread 0x8319800 (sleeping)]
[New Thread 0x8319600 (LWP 100098)]
[New Thread 0x82fd000 (runnable)]
[New LWP 100058]
(gdb) thread apply all bt

Thread 5 (LWP 100058):
#0  0x2856bf2a in signalcontext () from /lib/libc.so.6
#1  0x28527fc4 in pthread_mutexattr_init () from /lib/libpthread.so.2
#2  0x28303480 in ?? ()

Thread 4 (Thread 0x82fd000 (runnable)):
#0  0x285eb0ac in __vfprintf () from /lib/libc.so.6
#1  0x2855ff90 in vasprintf () from /lib/libc.so.6
[...]
#33814 0x081464b8 in mono_runtime_invoke (method=0x82fd81c, obj=0x0, params=0xbfbfe510, exc=0x0) at object.c:2401
#33815 0x0814785b in mono_runtime_exec_main (method=0x82fd81c, args=0x28657f60, exc=0x0) at object.c:3301
#33816 0x0814736c in mono_runtime_run_main (method=0x82fd81c, argc=2, argv=0xbfbfe74c, exc=0x0) at object.c:3089
#33817 0x080cdf7d in mono_jit_exec (domain=0x28653ee0, assembly=0x833c880, argc=3, argv=0xbfbfe748) at driver.c:924
#33818 0x080ce0e8 in main_thread_handler (user_data=0xbfbfe690) at driver.c:972
#33819 0x080cfc3a in mono_main (argc=6, argv=0xbfbfe73c) at driver.c:1647
#33820 0x08058f90 in main (argc=6, argv=0xbfbfe73c) at main.c:34

Thread 3 (Thread 0x8319600 (LWP 100098)):
#0  0x2852f5f3 in pthread_testcancel () from /lib/libpthread.so.2
#1  0x28527fc4 in pthread_mutexattr_init () from /lib/libpthread.so.2
#2  0x28303480 in ?? ()

Thread 2 (Thread 0x8319800 (sleeping)):
#0  0x28528097 in pthread_mutexattr_init () from /lib/libpthread.so.2
#1  0x28521ade in _nanosleep () from /lib/libpthread.so.2
#2  0x28521c42 in nanosleep () from /lib/libpthread.so.2
#3  0x081e32ca in collection_thread (unused=0x0) at collection.c:34
#4  0x28520449 in pthread_create () from /lib/libpthread.so.2
#5  0x285deecb in _ctx_start () from /lib/libc.so.6

Thread 1 (Thread 0x8319c00 (sleeping)):
#0  0x28528097 in pthread_mutexattr_init () from /lib/libpthread.so.2
#1  0x2852822b in pthread_mutexattr_init () from /lib/libpthread.so.2
#2  0x2852c839 in _pthread_cond_wait () from /lib/libpthread.so.2
#3  0x2852cd82 in pthread_cond_wait () from /lib/libpthread.so.2
#4  0x081e86d8 in _wapi_handle_timedwait_signal_handle (handle=0x1d05, timeout=0x0, alertable=0, poll=0) at handles.c:1605
#5  0x081e846e in _wapi_handle_wait_signal_handle (handle=0x1d05, alertable=0) at handles.c:1548
#6  0x08205f2e in WaitForSingleObjectEx (handle=0x1d05, timeout=4294967295, alertable=0) at wait.c:205
#7  0x081608a0 in finalizer_thread (unused=0x0) at gc.c:1061
#8  0x081806f0 in start_wrapper (data=0x8326380) at threads.c:623
#9  0x08200367 in thread_start_routine (args=0x832c230) at threads.c:286
#10 0x08223adf in GC_start_routine (arg=0x2865aec0) at pthread_support.c:1382
#11 0x28520449 in pthread_create () from /lib/libpthread.so.2
#12 0x285deecb in _ctx_start () from /lib/libc.so.6
#0  0x2852f5f3 in pthread_testcancel () from /lib/libpthread.so.2

Yes, you read that right: the thread 4 backtrace runs to frame #33820! Looks like we have a stack overflow... Since the problem occurs on FreeBSD 6 (where KSE is enabled by default) but neither on FreeBSD 7 (where KSE is disabled by default) nor on FreeBSD 8 (no KSE at all), it is likely to be related to the thread implementation.
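
Counting frames by eye is tedious, so here is a small sketch of a filter that reports how deep each thread's stack is. It assumes the output of "thread apply all bt" has been saved to a text file; the function name bt_depth is mine and not part of gdb:

```sh
# bt_depth: read a saved "thread apply all bt" listing on stdin and
# print the number of frames seen for each thread.
bt_depth() {
    awk '
        # "Thread 4 (Thread 0x82fd000 (runnable)):" starts a new thread
        /^Thread [0-9]+ /        { t = $2 }
        # "#33820 0x08058f90 in main ..." is a frame line; frames count from 0
        /^#[0-9]+ / && t != ""   { n = substr($1, 2) + 0
                                   if (n + 1 > d[t]) d[t] = n + 1 }
        END { for (t in d) print "thread " t ": " d[t] " frames" }
    ' | sort
}
```

Something like "bt_depth < bt.txt" (bt.txt being a hypothetical file holding the listing) prints one line per thread, and a stack that is tens of thousands of frames deep stands out immediately.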

Let's launch gmake(1) one more time and hit Ctrl+Z to suspend everything while mono is filling up its stack, in order to get some hints about the exact command being executed. Reducing it to a simple test is then trivial, e.g.:

arthur# pwd
/usr/ports/lang/mono/work/mono-2.4.2.3/mcs
arthur# env MONO_PATH="$PWD/class/lib/monolite" ../mono/mini/mono --config ../runtime/etc/mono/config ./class/lib/monolite/mcs.exe
Bus error (core dumped)
arthur# 

Now we know what is being run. Let's get some info about this binary file:

arthur# ldd ../mono/mini/mono
../mono/mini/mono:
	libgthread-2.0.so.0 => /usr/local/lib/libgthread-2.0.so.0 (0x28304000)
	libglib-2.0.so.0 => /usr/local/lib/libglib-2.0.so.0 (0x28309000)
	libintl.so.8 => /usr/local/lib/libintl.so.8 (0x283c9000)
	libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x283d2000)
	libpcre.so.0 => /usr/local/lib/libpcre.so.0 (0x284c7000)
	libm.so.4 => /lib/libm.so.4 (0x284fc000)
	libpthread.so.2 => /lib/libpthread.so.2 (0x28512000)
	libc.so.6 => /lib/libc.so.6 (0x28537000)

pthread(3) lists all the thread libraries available on the system.

On my FreeBSD 8.0-STABLE machine, ldd(1) reports (mono-2.6 instead of 2.4):

marvin# ldd `which mono`
/usr/local/bin/mono:
	libgthread-2.0.so.0 => /usr/local/lib/libgthread-2.0.so.0 (0x8008b4000)
	libglib-2.0.so.0 => /usr/local/lib/libglib-2.0.so.0 (0x8009b8000)
	libicui18n.so.38 => /usr/local/lib/libicui18n.so.38 (0x800b74000)
	libintl.so.8 => /usr/local/lib/libintl.so.8 (0x800dca000)
	libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x800ed3000)
	libpcre.so.0 => /usr/local/lib/libpcre.so.0 (0x8010cd000)
	libm.so.5 => /lib/libm.so.5 (0x8011fd000)
	libthr.so.3 => /lib/libthr.so.3 (0x80131c000)
	libc.so.7 => /lib/libc.so.7 (0x801434000)
	libicuuc.so.38 => /usr/local/lib/libicuuc.so.38 (0x80166e000)
	libicudata.so.38 => /usr/local/lib/libicudata.so.38 (0x80189f000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x80248e000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x802699000)
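
The interesting difference between the two listings is the thread library. A minimal sketch of a filter that picks it out of ldd(1) output follows; thread_lib is a name I made up, and the set of library names it looks for (libpthread, libkse, libthr, libc_r) is my assumption about what a FreeBSD system may ship:

```sh
# thread_lib: read ldd(1) output on stdin and print the name of any
# thread library found in the first column.
thread_lib() {
    awk '$1 ~ /^lib(pthread|kse|thr|c_r)\.so/ { print $1 }'
}
```

Used as "ldd ../mono/mini/mono | thread_lib", it should print libpthread.so.2 for the FreeBSD 6.4 listing and libthr.so.3 for the FreeBSD 8.0 one.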

So let's try to use libthr on FreeBSD 6. This can be achieved using libmap.conf(5): instead of /lib/libpthread.so(.2) we want to use /usr/lib/libthr.so(.2):

arthur# cat >> /etc/libmap.conf << EOT
libpthread.so.2  libthr.so.2
libpthread.so    libthr.so
EOT
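
Note that these two lines remap libpthread for every dynamically linked program on the machine. libmap.conf(5) also supports constraint sections in square brackets that limit a mapping to particular executables, so a narrower variant might look like the sketch below. It is untested: it assumes that a [mono] basename constraint matches both the freshly built binary and the installed one, and add_mono_libmap is just a hypothetical helper name:

```sh
# add_mono_libmap: append a mapping to the given libmap file that only
# applies to executables whose basename is "mono" (assumption, see above).
add_mono_libmap() {
    cat >> "$1" << 'EOT'
[mono]
libpthread.so.2  libthr.so.2
libpthread.so    libthr.so
EOT
}
```

It would be called as "add_mono_libmap /etc/libmap.conf" in place of the global mapping.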

then...

arthur# make
[...]
arthur# make tests
[...]
363 test(s) passed. 0 test(s) did not pass.
[...]
arthur# echo $?
0

\o/
