Want to highlight a helpful answer? Upvote!

Did someone help you, or did an answer or User Tip resolve your issue? Upvote by selecting the upvote arrow. Your feedback helps others! Learn more about when to upvote >

Looks like no one’s replied in a while. To start the conversation again, simply ask a new question.

Lost events bug in kqueue?

Hi all,


I have been trying to track down some odd behavior in a multithreaded mac os x program in which the program gets stuck waiting for events on files forever. I wrote a small C program that exhibits the same problem and I think that this may be causing the problem in my original program, too.


The test program is a tiny "web server" and you can find it here: https://github.com/AndreasVoellmy/epollbug/blob/master/kqueueserver.c


The test program simply creates NUM_WORKER+1 threads where 1 thread runs a loop to accept new connections and the other NUM_WORKER "worker" threads actually serve the connections. Note that the accept thread distributes the accepted TCP connections among the different workers and the workers serve disjoint sets of connections. Each worker thread runs a loop that uses kevent64 to wait for any of its sockets to be readable and uses the EV_ONESHOT flag when adding sockets to the watched files.


I test the program using the http server benchmarking program weighttp (http://redmine.lighttpd.net/projects/weighttp/wiki). I run it like this:


./weighttp -n 100 -c 10 -t 4 -k "10.12.0.8:8080"


where the IP address and port are where my test program are runnning. -n is the number of requests to make, -c is the number of concurrent requests to make and -t is the number of threads that weighttp will use to generate the http requests. weighttp makes the requests but gets stuck with some responses never coming back. For example, on a run with the above parameters, I get:


weighttp - a lightweight and simple webserver benchmarking tool



starting benchmark...

spawning thread #1: 3 concurrent requests, 25 total requests

spawning thread #2: 3 concurrent requests, 25 total requests

spawning thread #3: 2 concurrent requests, 25 total requests

spawning thread #4: 2 concurrent requests, 25 total requests

progress: 8% done

progress: 16% done

progress: 24% done

progress: 32% done

progress: 40% done

progress: 48% done

progress: 56% done

progress: 64% done

progress: 72% done

progress: 80% done

progress: 88% done

progress: 96% done



finished in 728 sec, 373 millisec and 310 microsec, 0 req/s, 0 kbyte/s

requests: 100 total, 100 started, 100 done, 95 succeeded, 5 failed, 0 errored

status codes: 95 2xx, 0 3xx, 0 4xx, 0 5xx

traffic: 30685 bytes total, 16340 bytes http, 14345 bytes data


In other words, 5 requests failed to get responses. On the test program (i.e. server) side, after 10 seconds, the test program prints out (using ioctl) which sockets have any data ready. Sure enough, I see that 5 sockets have a request waiting (weighttp requests are 90 bytes long):


socket 6 has 90 bytes of data ready

socket 8 has 90 bytes of data ready

socket 10 has 90 bytes of data ready

socket 12 has 90 bytes of data ready

socket 14 has 90 bytes of data ready


I would really appreciate getting some advice and help on this problem. I may be misunderstanding the semantics of kqueue operations and flags, or this may be a kernel bug. By the way, this program is based on a very similar program I wrote, which uncovered a bug in Linux, which has since been fixed (http://lkml.indiana.edu/hypermail/linux/kernel/1301.0/00065.html). To Linux's credit, the bug only appeared using lots of cores, whereas on os x it appears to occur using just two cores and two threads!


Regards,

Andi


P.S. in case it helps I also took a "sample" of the kqueueserver program at the point that it was waiting on the sockets while the sockets have data ready. Here is the result:


Sampling process 17982 for 3 seconds with 1 millisecond of run time between samples

Sampling completed, processing symbols...

Analysis of sampling kqueueserver (pid 17982) every 1 millisecond

Process: kqueueserver [17982]

Path: /Users/andre/repos/epollbug/kqueueserver

Load Address: 0x10b22b000

Identifier: kqueueserver

Version: ??? (???)

Code Type: X86-64 (Native)

Parent Process: bash [17853]


Date/Time: 2013-02-06 17:22:26.909 -0500

OS Version: Mac OS X 10.8.2 (12C3012)

Report Version: 7


Call graph:

2665 Thread_568458 DispatchQueue_1: com.apple.main-thread (serial)

+ 2665 start (in libdyld.dylib) + 1 [0x7fff9099c7e1]

+ 2665 main (in kqueueserver) + 85 [0x10b22cb15]

+ 2665 acceptLoop (in kqueueserver) + 334 [0x10b22c6ee]

+ 2665 __accept (in libsystem_kernel.dylib) + 10 [0x7fff8eff9996]

2665 Thread_568459

+ 2665 thread_start (in libsystem_c.dylib) + 13 [0x7fff85c63181]

+ 2665 _pthread_start (in libsystem_c.dylib) + 327 [0x7fff85c76742]

+ 2665 workerLoop (in kqueueserver) + 95 [0x10b22c8cf]

+ 2665 kevent64 (in libsystem_kernel.dylib) + 10 [0x7fff8effad2a]

2665 Thread_568460

2665 thread_start (in libsystem_c.dylib) + 13 [0x7fff85c63181]

2665 _pthread_start (in libsystem_c.dylib) + 327 [0x7fff85c76742]

2665 workerLoop (in kqueueserver) + 95 [0x10b22c8cf]

2665 kevent64 (in libsystem_kernel.dylib) + 10 [0x7fff8effad2a]


Total number in stack (recursive counted multiple, when >=5):


Sort by top of stack, same collapsed (when >= 5):

kevent64 (in libsystem_kernel.dylib) 5330

__accept (in libsystem_kernel.dylib) 2665


Binary Images:

0x10b22b000 - 0x10b22cfff +kqueueserver (??? - ???) <10550045-081D-3F86-A765-EE0D08E3F398> /Users/andreas/repos/epollbug/kqueueserver

0x7fff6ae2b000 - 0x7fff6ae5f93f dyld (210.2 - ???) <A40597AA-5529-3337-8C09-D8A014EB1578> /usr/lib/dyld

0x7fff857d2000 - 0x7fff85800ff7 libsystem_m.dylib (??? - ???) <B434BE5C-25AB-3EBD-BAA7-5304B34E3441> /usr/lib/system/libsystem_m.dylib

0x7fff859c0000 - 0x7fff859c0fff libkeymgr.dylib (25.0.0 - compatibility 1.0.0) <CC9E3394-BE16-397F-926B-E579B60EE429> /usr/lib/system/libkeymgr.dylib

0x7fff85ac4000 - 0x7fff85b13ff7 libcorecrypto.dylib (??? - ???) <CE0C29A3-C420-339B-ADAA-52F4683233CC> /usr/lib/system/libcorecrypto.dylib

0x7fff85c62000 - 0x7fff85d2efe7 libsystem_c.dylib (825.25.0 - compatibility 1.0.0) <8CBCF9B9-EBB7-365E-A3FF-2F3850763C6B> /usr/lib/system/libsystem_c.dylib

0x7fff8614d000 - 0x7fff86158fff libsystem_notify.dylib (98.5.0 - compatibility 1.0.0) <C49275CC-835A-3207-AFBA-8C01374927B6> /usr/lib/system/libsystem_notify.dylib

0x7fff86159000 - 0x7fff86161fff liblaunch.dylib (442.26.2 - compatibility 1.0.0) <2F71CAF8-6524-329E-AC56-C506658B4C0C> /usr/lib/system/liblaunch.dylib

0x7fff87246000 - 0x7fff8724dfff libcopyfile.dylib (89.0.0 - compatibility 1.0.0) <876573D0-E907-3566-A108-577EAD1B6182> /usr/lib/system/libcopyfile.dylib

0x7fff87437000 - 0x7fff8745cff7 libc++abi.dylib (24.4.0 - compatibility 1.0.0) <E7BD9363-1D25-3551-A68A-2E2FF6ABECD7> /usr/lib/libc++abi.dylib

0x7fff89df0000 - 0x7fff89df8ff7 libsystem_dnssd.dylib (??? - ???) <F0AE5B8E-C7CC-3393-927F-B65DB04016AA> /usr/lib/system/libsystem_dnssd.dylib

0x7fff89df9000 - 0x7fff89e61ff7 libc++.1.dylib (65.1.0 - compatibility 1.0.0) <20E31B90-19B9-3C2A-A9EB-474E08F9FE05> /usr/lib/libc++.1.dylib

0x7fff8a2a7000 - 0x7fff8a3bf92f libobjc.A.dylib (228.0.0 - compatibility 1.0.0) <90D31928-F48D-3E37-874F-220A51FD9E37> /usr/lib/libobjc.A.dylib

0x7fff8a8ff000 - 0x7fff8a901fff libquarantine.dylib (52.0.0 - compatibility 1.0.0) <4BE2E642-A14F-340A-B482-5BD2AEFD9C24> /usr/lib/system/libquarantine.dylib

0x7fff8af64000 - 0x7fff8af65ff7 libremovefile.dylib (23.1.0 - compatibility 1.0.0) <DBBFAF35-AC78-3856-92F6-6E4FD9DF14A2> /usr/lib/system/libremovefile.dylib

0x7fff8bc91000 - 0x7fff8bc9ffff libcommonCrypto.dylib (50000.0.0 - compatibility 1.0.0) <DA98AE56-BC3F-3AD0-9660-8CC79E4AECE6> /usr/lib/system/libcommonCrypto.dylib

0x7fff8c0db000 - 0x7fff8c111fff libsystem_info.dylib (??? - ???) <4FFCA242-7F04-365F-87A6-D4EFB89503C1> /usr/lib/system/libsystem_info.dylib

0x7fff8c61d000 - 0x7fff8c61eff7 libSystem.B.dylib (169.3.0 - compatibility 1.0.0) <365477AB-D641-389D-B8F4-A1FAE9657EEE> /usr/lib/libSystem.B.dylib

0x7fff8c6a9000 - 0x7fff8c6abff7 libunc.dylib (25.0.0 - compatibility 1.0.0) <92805328-CD36-34FF-9436-571AB0485072> /usr/lib/system/libunc.dylib

0x7fff8c6ac000 - 0x7fff8c6b2ff7 libunwind.dylib (35.1.0 - compatibility 1.0.0) <21703D36-2DAB-3D8B-8442-EAAB23C060D3> /usr/lib/system/libunwind.dylib

0x7fff8c9de000 - 0x7fff8ca00ff7 libxpc.dylib (140.41.0 - compatibility 1.0.0) <FAC04D8B-680E-325F-8F0C-DD69859D0E01> /usr/lib/system/libxpc.dylib

0x7fff8de1f000 - 0x7fff8de25fff libmacho.dylib (829.0.0 - compatibility 1.0.0) <BF332AD9-E89F-387E-92A4-6E1AB74BD4D9> /usr/lib/system/libmacho.dylib

0x7fff8efe8000 - 0x7fff8f003ff7 libsystem_kernel.dylib (2050.20.9 - compatibility 1.0.0) <EC0A9F5B-C9F5-336B-A7DD-49A718042F39> /usr/lib/system/libsystem_kernel.dylib

0x7fff8f02e000 - 0x7fff8f033fff libcache.dylib (57.0.0 - compatibility 1.0.0) <65187C6E-3FBF-3EB8-A1AA-389445E2984D> /usr/lib/system/libcache.dylib

0x7fff8f14e000 - 0x7fff8f15cff7 libsystem_network.dylib (??? - ???) <0D99F24E-56FE-380F-B81B-4A4C630EE587> /usr/lib/system/libsystem_network.dylib

0x7fff8f1aa000 - 0x7fff8f1abfff libsystem_blocks.dylib (59.0.0 - compatibility 1.0.0) <D92DCBC3-541C-37BD-AADE-ACC75A0C59C8> /usr/lib/system/libsystem_blocks.dylib

0x7fff8f71c000 - 0x7fff8f731ff7 libdispatch.dylib (228.23.0 - compatibility 1.0.0) <D26996BF-FC57-39EB-8829-F63585561E09> /usr/lib/system/libdispatch.dylib

0x7fff8f732000 - 0x7fff8f733ff7 libsystem_sandbox.dylib (??? - ???) <3C3B03CF-C525-3CB3-8557-62E91B93AC95> /usr/lib/system/libsystem_sandbox.dylib

0x7fff8f742000 - 0x7fff8f747fff libcompiler_rt.dylib (30.0.0 - compatibility 1.0.0) <08F8731D-5961-39F1-AD00-4590321D24A9> /usr/lib/system/libcompiler_rt.dylib

0x7fff8fd39000 - 0x7fff8fd85ff7 libauto.dylib (??? - ???) <73CDC482-16E3-3FC7-9BB4-FBA2DA44DBC2> /usr/lib/libauto.dylib

0x7fff9099a000 - 0x7fff9099dff7 libdyld.dylib (210.2.3 - compatibility 1.0.0) <F59367C9-C110-382B-A695-9035A6DD387E> /usr/lib/system/libdyld.dylib

0x7fff91130000 - 0x7fff91131ff7 libdnsinfo.dylib (453.18.0 - compatibility 1.0.0) <E7595861-ECF9-336E-9901-BED2620FAA80> /usr/lib/system/libdnsinfo.dylib

Sample analysis of process 17982 written to file /dev/stdout

MacBook Pro, OS X Mountain Lion (10.8.2)

Posted on Feb 6, 2013 2:51 PM

Reply

There are no replies.

Lost events bug in kqueue?

Welcome to Apple Support Community
A forum where Apple customers help each other with their products. Get started with your Apple ID.