abort openmp application
Dear All,
On Mac OS X 10.7 Lion, various customers encountered sudden aborts of our Huygens application. Investigation of the log reports showed that the aborts happen at OpenMP pragmas, especially when the OpenMP team is closed and reopened in rapid succession. The abort can also be reproduced with the following small test application, which uses the OpenMP pragmas inside a loop and in this way simulates the rapid closing and reopening of the OpenMP team:
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>

#define NR_RUNS 1000
#define NR_SUMS 20

int main (int argc, char *argv[]) {
    int runIndex, sumIndex;
    int sum;

    for (runIndex = 0; runIndex < NR_RUNS; runIndex++) {
        sum = 0;
        #pragma omp parallel for reduction(+:sum) shared(runIndex) private(sumIndex)
        for (sumIndex = 1; sumIndex < NR_SUMS; sumIndex++) {
            sum += sumIndex;
            printf("Run: %d; thread: %d; index: %d; sum: %d\n",
                   runIndex, omp_get_thread_num(), sumIndex, sum);
            fflush(stdout);
        }
        printf("Final sum: %d\n", sum);
        fflush(stdout);
    }
    return 0;
}
This application was compiled with the default GCC version for Lion: i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1.
A typical log of such a crash is:
Process: omp_test [878]
Path: /Users/*/omp_test
Identifier: omp_test
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: bash [850]
Date/Time: 2012-03-07 14:57:22.720 +0100
OS Version: Mac OS X 10.7.3 (11D50d)
Report Version: 9
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib   0x00007fff83362bca __psynch_cvwait + 10
1   libsystem_c.dylib        0x00007fff85fd4274 _pthread_cond_wait + 840
2   omp_test                 0x000000010c560d6b gomp_sem_wait + 59
3   omp_test                 0x000000010c560e4c gomp_barrier_wait_end + 108
4   omp_test                 0x000000010c560b6c gomp_team_end + 44
5   omp_test                 0x000000010c560fa4 main + 84 (omp_test.c:21)
6   omp_test                 0x000000010c55fbc4 start + 52
Thread 1:
0   libsystem_kernel.dylib   0x00007fff83362bca __psynch_cvwait + 10
1   libsystem_c.dylib        0x00007fff85fd4274 _pthread_cond_wait + 840
2   omp_test                 0x000000010c560d6b gomp_sem_wait + 59
3   omp_test                 0x000000010c560e4c gomp_barrier_wait_end + 108
4   omp_test                 0x000000010c560b16 gomp_thread_start + 310
5   libsystem_c.dylib        0x00007fff85fd08bf _pthread_start + 335
6   libsystem_c.dylib        0x00007fff85fd3b75 thread_start + 13
Thread 2:
0   libsystem_kernel.dylib   0x00007fff83362bca __psynch_cvwait + 10
1   libsystem_c.dylib        0x00007fff85fd4274 _pthread_cond_wait + 840
2   omp_test                 0x000000010c560d6b gomp_sem_wait + 59
3   omp_test                 0x000000010c560e4c gomp_barrier_wait_end + 108
4   omp_test                 0x000000010c560b0e gomp_thread_start + 302
5   libsystem_c.dylib        0x00007fff85fd08bf _pthread_start + 335
6   libsystem_c.dylib        0x00007fff85fd3b75 thread_start + 13
Thread 3:
0   libsystem_kernel.dylib   0x00007fff83362bca __psynch_cvwait + 10
1   libsystem_c.dylib        0x00007fff85fd4274 _pthread_cond_wait + 840
2   omp_test                 0x000000010c560d6b gomp_sem_wait + 59
3   omp_test                 0x000000010c560e4c gomp_barrier_wait_end + 108
4   omp_test                 0x000000010c560b16 gomp_thread_start + 310
5   libsystem_c.dylib        0x00007fff85fd08bf _pthread_start + 335
6   libsystem_c.dylib        0x00007fff85fd3b75 thread_start + 13
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x00007fff71f77960 rcx: 0x00007fff6c15eb28 rdx: 0x0000000000000000
rdi: 0x00007fefe04000f8 rsi: 0x0000000000000300 rbp: 0x00007fff6c15ebe0 rsp: 0x00007fff6c15eb28
r8: 0x0000000000000000 r9: 0x0000000000000060 r10: 0x0000000000000000 r11: 0x0000000000000202
r12: 0x0000000000000000 r13: 0x0000000000000300 r14: 0x00007fefe0400118 r15: 0x00007fefe0400110
rip: 0x00007fff83362bca rfl: 0x0000000000000202 cr2: 0x000000010bc3a000
Logical CPU: 0
Binary Images:
0x10c55f000 - 0x10c561fff +omp_test (??? - ???) <0EB26D40-2B53-3947-8B5E-8FB5C2655034> /Users/*/omp_test
0x7fff6c15f000 - 0x7fff6c193baf dyld (195.6 - ???) <0CD1B35B-A28F-32DA-B72E-452EAD609613> /usr/lib/dyld
0x7fff81f23000 - 0x7fff81f27fff libdyld.dylib (195.5.0 - compatibility 1.0.0) <380C3F44-0CA7-3514-8080-46D1C9DF4FCD> /usr/lib/system/libdyld.dylib
0x7fff81f28000 - 0x7fff81f2afff libquarantine.dylib (36.2.0 - compatibility 1.0.0) <48656562-FF20-3B55-9F93-407ACA7341C0> /usr/lib/system/libquarantine.dylib
0x7fff81f86000 - 0x7fff81fc8ff7 libcommonCrypto.dylib (55010.0.0 - compatibility 1.0.0) <BB770C22-8C57-365A-8716-4A3C36AE7BFB> /usr/lib/system/libcommonCrypto.dylib
0x7fff8332b000 - 0x7fff83339fff libdispatch.dylib (187.7.0 - compatibility 1.0.0) <712AAEAC-AD90-37F7-B71F-293FF8AE8723> /usr/lib/system/libdispatch.dylib
0x7fff83341000 - 0x7fff8334bff7 liblaunch.dylib (392.35.0 - compatibility 1.0.0) <8F8BB206-CECA-33A5-A105-4A01C3ED5D23> /usr/lib/system/liblaunch.dylib
0x7fff8334c000 - 0x7fff8336cfff libsystem_kernel.dylib (1699.24.8 - compatibility 1.0.0) <C56819BB-3779-3726-B610-4CF7B3ABB6F9> /usr/lib/system/libsystem_kernel.dylib
0x7fff840a4000 - 0x7fff840a4fff libkeymgr.dylib (23.0.0 - compatibility 1.0.0) <61EFED6A-A407-301E-B454-CD18314F0075> /usr/lib/system/libkeymgr.dylib
0x7fff846df000 - 0x7fff846fcfff libxpc.dylib (77.18.0 - compatibility 1.0.0) <26C05F31-E809-3B47-AF42-1460971E3AC3> /usr/lib/system/libxpc.dylib
0x7fff8474a000 - 0x7fff84753ff7 libsystem_notify.dylib (80.1.0 - compatibility 1.0.0) <A4D651E3-D1C6-3934-AD49-7A104FD14596> /usr/lib/system/libsystem_notify.dylib
0x7fff84754000 - 0x7fff84781fe7 libSystem.B.dylib (159.1.0 - compatibility 1.0.0) <7BEBB139-50BB-3112-947A-F4AA168F991C> /usr/lib/libSystem.B.dylib
0x7fff84bcb000 - 0x7fff84bccff7 libsystem_blocks.dylib (53.0.0 - compatibility 1.0.0) <8BCA214A-8992-34B2-A8B9-B74DEACA1869> /usr/lib/system/libsystem_blocks.dylib
0x7fff85dea000 - 0x7fff85df0fff libmacho.dylib (800.0.0 - compatibility 1.0.0) <165514D7-1BFA-38EF-A151-676DCD21FB64> /usr/lib/system/libmacho.dylib
0x7fff85f82000 - 0x7fff8605ffef libsystem_c.dylib (763.12.0 - compatibility 1.0.0) <FF69F06E-0904-3C08-A5EF-536FAFFFDC22> /usr/lib/system/libsystem_c.dylib
0x7fff8609f000 - 0x7fff860a4fff libcache.dylib (47.0.0 - compatibility 1.0.0) <1571C3AB-BCB2-38CD-B3B2-C5FC3F927C6A> /usr/lib/system/libcache.dylib
0x7fff860e1000 - 0x7fff860e2ff7 libremovefile.dylib (21.1.0 - compatibility 1.0.0) <739E6C83-AA52-3C6C-A680-B37FE2888A04> /usr/lib/system/libremovefile.dylib
0x7fff87177000 - 0x7fff8717dff7 libunwind.dylib (30.0.0 - compatibility 1.0.0) <1E9C6C8C-CBE8-3F4B-A5B5-E03E3AB53231> /usr/lib/system/libunwind.dylib
0x7fff874e6000 - 0x7fff87521fff libsystem_info.dylib (??? - ???) <35F90252-2AE1-32C5-8D34-782C614D9639> /usr/lib/system/libsystem_info.dylib
0x7fff88f7c000 - 0x7fff88f83fff libcopyfile.dylib (85.1.0 - compatibility 1.0.0) <0AB51EE2-E914-358C-AC19-47BC024BDAE7> /usr/lib/system/libcopyfile.dylib
0x7fff89181000 - 0x7fff89189fff libsystem_dnssd.dylib (??? - ???) <998E3778-7B43-301C-9053-12045AB8544D> /usr/lib/system/libsystem_dnssd.dylib
0x7fff89e40000 - 0x7fff89e45fff libcompiler_rt.dylib (6.0.0 - compatibility 1.0.0) <98ECD5F6-E85C-32A5-98CD-8911230CB66A> /usr/lib/system/libcompiler_rt.dylib
0x7fff8b7b8000 - 0x7fff8b7b9fff libdnsinfo.dylib (395.7.0 - compatibility 1.0.0) <37FEFE78-BCB5-37EC-8E99-747469BCA4C7> /usr/lib/system/libdnsinfo.dylib
0x7fff8b7ba000 - 0x7fff8b7bbfff libunc.dylib (24.0.0 - compatibility 1.0.0) <337960EE-0A85-3DD0-A760-7134CF4C0AFF> /usr/lib/system/libunc.dylib
0x7fff8be72000 - 0x7fff8be77ff7 libsystem_network.dylib (??? - ???) <5DE7024E-1D2D-34A2-80F4-08326331A75B> /usr/lib/system/libsystem_network.dylib
0x7fff8bfaa000 - 0x7fff8bfaefff libmathCommon.A.dylib (2026.0.0 - compatibility 1.0.0) <FF83AFF7-42B2-306E-90AF-D539C51A4542> /usr/lib/system/libmathCommon.A.dylib
0x7fff8c799000 - 0x7fff8c79aff7 libsystem_sandbox.dylib (??? - ???) <5087ADAD-D34D-3844-9D04-AFF93CED3D92> /usr/lib/system/libsystem_sandbox.dylib
External Modification Summary:
Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
Calls made by all processes on this machine:
    task_for_pid: 527
    thread_create: 0
    thread_set_state: 0
VM Region Summary:
ReadOnly portion of Libraries: Total=50.0M resident=28.2M(56%) swapped_out_or_unallocated=21.8M(44%)
Writable regions: Total=21.9M written=96K(0%) resident=180K(1%) swapped_out=0K(0%) unallocated=21.8M(99%)
REGION TYPE          VIRTUAL
===========          =======
MALLOC                 12.2M
MALLOC guard page        16K
STACK GUARD            56.0M
Stack                  9752K
__DATA                  472K
__LINKEDIT             47.6M
__TEXT                 2492K
shared memory            12K
===========          =======
TOTAL                 128.2M
The crash log shows that the abort occurs while the main thread is waiting for the worker threads to finish (thread 0 is in gomp_team_end, inside gomp_barrier_wait_end). This is typically the case in the crash logs we have seen. However, not every run of this program leads to an abort.
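To help narrow down whether the repeated opening and closing of the team is the trigger, a variant along the following lines could be compared against the test application above. It is only a sketch under that assumption: the parallel region is opened once around the run loop, so the team is created and ended a single time, while the summation itself is unchanged. The function name count_correct_runs and its parameters are hypothetical and not part of the original test; the printf calls are left out to keep it short. Whether this variant avoids the abort on Lion is not something this sketch establishes.

```c
/* Hypothetical diagnostic variant: one OpenMP team for all runs.
 * Returns the number of runs whose reduced sum equals 1 + 2 + ... + (nrSums - 1).
 * Compiles with or without OpenMP support; without it the pragmas are ignored. */
int count_correct_runs(int nrRuns, int nrSums) {
    int expected = (nrSums - 1) * nrSums / 2;
    int correct = 0;   /* shared; only written inside a single construct */
    int sum = 0;       /* shared; target of the reduction */
    int runIndex, sumIndex;

    /* The team is opened once here instead of once per run. */
    #pragma omp parallel private(runIndex, sumIndex)
    {
        for (runIndex = 0; runIndex < nrRuns; runIndex++) {
            /* One thread resets the shared sum; implicit barrier follows. */
            #pragma omp single
            sum = 0;

            /* Work-sharing loop inside the already open team; the implicit
             * barrier at its end guarantees the reduction is complete. */
            #pragma omp for reduction(+:sum)
            for (sumIndex = 1; sumIndex < nrSums; sumIndex++) {
                sum += sumIndex;
            }

            /* One thread checks the result; implicit barrier follows,
             * so no thread resets sum before the check is done. */
            #pragma omp single
            {
                if (sum == expected) {
                    correct++;
                }
            }
        }
    }
    return correct;
}
```

Called as count_correct_runs(NR_RUNS, NR_SUMS) from main, this should return NR_RUNS when every run completes. If the aborts disappear with this variant but persist with the original loop, that would point towards the team start and end path rather than the reduction itself.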
These aborts were not found on previous Mac OS X versions (Leopard, Tiger), nor on Linux or Windows.
Could you please indicate what could be the problem here?
Many thanks in advance,
Kind regards,
Hans Blom,
Scientific Volume Imaging
llvm-gcc (xcode)-OTHER, Mac OS X (10.7.3)