"not responding" means what, exactly?

Question

Level 5

4,936 points

As we all know, Activity Monitor (as well as the Force Quit dialog, and I think other places) shows those nifty "(not responding)" suffixes highlighted in red when a process is working overtime or simply hung. I'm curious what triggers that, and how it can be captured from unix.

Here's what I think I know (from a lot of time spent on google and looking at man pages):

the 'not responding' cue means that the app is not responding to system calls for roughly 2 seconds (with no explanation of what system calls are being tried or how they fail)
that Activity Monitor mirrors the unix top utility (though there's nothing in top that correlates to 'not responding' - 'stuck' has a different meaning)
that nothing in the ps utility corresponds to 'not responding'
that this is one of those things that's going to drive me batty until I figure it out.

So what does 'not responding' in Activity Monitor *actually* mean, in unix terms? I'm not looking for a way to kill processes (I know how to do that), and the obvious things like top's 'stuck' indicator are wrong, and not what I'm after. I just want to know how I can leverage/mimic/replicate that 'not responding' thing that seems so simple in Activity Monitor. Where-oh-where in unix (or anywhere else) can this result be produced?

DankeDankeDanke... 😀

MacBook Pro, OS X Mountain Lion (10.8.2)

Posted on Dec 7, 2012 10:34 PM

Answer 1

Best reply

BobHarris

Level 9

52,961 points

Dec 8, 2012 6:38 AM in response to twtwtw

"Not Responding" is NOT a Unix concept. If the process can still be killed (Force Quit), then it is still alive as far as Unix is concerned.

It is my guess that "Not Responding" means the application is not pulling items from its "Event Queue", so it will not see the "Quit" event, or any "Key Clicks", "Button Clicks", etc....

From a Unix perspective there are only 2 kinds of processes that cannot be killed. Those that are already dead (Zombies), and a process that is stuck in the kernel, and the kernel code is not checking to see if any kill signals have been issued against the process.

A Zombie is a process "In Name Only", as all its resources have been released, and it just has a small process structure that is waiting for its parent process to collect the process' termination status. A Zombie process CANNOT consume CPU time, it has no real memory to speak of, nor can it do any I/O. If there are Zombie processes, they are generally the result of a program that was not paying attention and did not issue Unix 'wait()' calls to collect the status from a subprocess it created. If the parent process exits without collecting its child processes' completion status, all the child processes get inherited by the Mac OS X /sbin/launchd process (PID 1) and /sbin/launchd will issue the wait() calls (on other Unix systems PID 1 is called 'init', but it provides the same orphaned child status collection services). So to have a Zombie, the parent must still exist, but not care enough to collect the child process completion status.
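The zombie life cycle described above can be sketched in a few lines of C. This is only an illustration under the assumption of a POSIX system; the helper name zombie_then_reap is made up for this example. It forks a child that exits at once, uses waitid() with WNOWAIT to wait for the exit without collecting the status (so the child stays a zombie), checks that the process entry still exists, and then reaps it with waitpid().

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a short-lived child, observe it as a zombie,
 * then reap it. Returns the child's exit code, or -1 on failure.
 * *seen_before_reap / *seen_after_reap report whether the process entry
 * existed before and after the parent collected the status. */
int zombie_then_reap(int exit_code, int *seen_before_reap, int *seen_after_reap)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);               /* child: terminate immediately */

    /* Wait until the child has exited but leave its status uncollected
     * (WNOWAIT). From this point on the child is a zombie. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;

    /* Signal 0 delivers nothing; it only tests whether the process exists. */
    *seen_before_reap = (kill(pid, 0) == 0);

    /* This is the wait() call a careless parent forgets to make. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* Once reaped, the process entry is gone. */
    *seen_after_reap = (kill(pid, 0) == 0);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling zombie_then_reap(7, &before, &after) should hand back 7, with before set and after cleared: the zombie is visible only until the parent collects its status.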

The stuck-in-the-kernel process is a full process that has most likely tripped over some kind of kernel bug. It can consume CPU (its most likely talent), tie up RAM, and cannot be killed via Force Quit (kill -9, which cannot be ignored). However, kill signals are delivered as a process is leaving kernel space and returning to user space. So if a process is stuck in kernel space, and does not check to see if a signal has been delivered (the only way signals are honored in kernel code), and does not return to user space, then it cannot be killed. However, this also signifies a bug in the kernel, as properly written kernel code should either quickly do its thing and return, or if it needs to wait for something, check for signals and gracefully back out.
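The "kill -9 cannot be ignored" point can be checked from user space. The sketch below assumes a POSIX system, and both helper names are invented for this example. can_ignore() asks the kernel whether a signal may be set to "ignore", and kill_child_ignoring_sigterm() starts a child that ignores SIGTERM and reports which signal finally terminated it. It does not reproduce a process stuck in the kernel, which needs a kernel bug.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel accepts SIG_IGN for signo,
 * 0 if it refuses. The previous disposition is restored afterwards. */
int can_ignore(int signo)
{
    struct sigaction ign, old;
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    if (sigaction(signo, &ign, &old) != 0)
        return 0;                        /* refused, as for SIGKILL */
    sigaction(signo, &old, NULL);
    return 1;
}

/* Hypothetical helper: fork a child that ignores SIGTERM, send it SIGTERM
 * and then SIGKILL, and return the signal that terminated it (-1 on error). */
int kill_child_ignoring_sigterm(void)
{
    int ready[2];
    if (pipe(ready) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(SIGTERM, SIG_IGN);            /* the polite signal is ignored */
        if (write(ready[1], "x", 1) != 1)    /* tell the parent we are set up */
            _exit(1);
        for (;;)
            pause();                         /* sleep until a signal arrives */
    }

    close(ready[1]);
    char byte;
    ssize_t got = read(ready[0], &byte, 1);  /* wait for the child's setup */
    close(ready[0]);

    if (got == 1)
        kill(pid, SIGTERM);                  /* ignored by the child */
    kill(pid, SIGKILL);                      /* cannot be caught or ignored */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On an ordinary system can_ignore(SIGTERM) should succeed and can_ignore(SIGKILL) should fail, while the child is reported as terminated by SIGKILL.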

If you have an unkillable process, the only way to cure that is to reboot.

A "Not Responding" app can generally be killed, so it is neither a Zombie nor an unkillable Unix process. Thus my conclusion that "Not Responding" is not a Unix concept, and why I suspect it is an application not checking its "Event Queue".

An Event Queue is a user mode GUI framework concept and has nothing to do with Unix, which is why there is no 'ps', no 'top', no anything Unixie that will report about "Not Responding".

I've been writing Unix kernel code since '95 for different Unix kernels, and I've been a Unix applications developer since '85. "Not Responding" is a higher level abstraction that Apple has implemented, and I think it is related to their GUI frameworks and "Event Queues". But it is all a guess based on knowing it is not Unix.

Answer 2

Dec 8, 2012 7:44 AM in response to twtwtw

I agree with Bob, not responding seems pretty much to indicate an app that has stopped responding to the GUI. It in no way indicates a process that is hung or has stopped working. I don't think there is a UNIX equivalent.

I see this all the time with Aperture, when doing some heavy duty image processing it will stop responding to the GUI and Activity Monitor will report it as Not Responding but after some period of time it will go back to behaving normally and Activity Monitor will have removed the Not Responding label from it.

So all you need is the code to Activity Monitor to see what it is checking and then ...

oh wait, never mind (and yet another example of why open source beats the pants off proprietary software 👿 )

Answer 3

Linc Davis

Level 10

209,535 points

Dec 8, 2012 8:01 AM in response to twtwtw

Activity Monitor has nothing to do with top or ps. The monitoring process is "activitymonitord".

Answer 4

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 8:05 AM in response to BobHarris

Bob,

I don't really need a lecture in unix, though I appreciate the effort and will note that that's probably the best explanation of a stuck process that I've seen on the web. I know NR is not a unix concept, I'm simply trying to find a way to get the same result. It would be useful to be able to get this information: often processes that are frozen in the GUI level seem to be running smoothly from a unix perspective.

The event queue information is interesting. I suppose if I really wanted to I could pull out Xcode and scramble up a utility given that information, assuming it actually works that way. I'll look into that further. But I'm still curious whether there's a built-in handle for this.

Answer 5

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 8:08 AM in response to Frank Caggiano

actually, I'm not certain source code for Activity Monitor would do it - the 'not responding' thing appears in the 'Force Quit' window as well, making me think they are both tapping into some system-wide service. but your point on open source software - as dreadfully tangential as it might be - is well taken. 😝

Answer 6

Dec 8, 2012 8:16 AM in response to twtwtw

"Not responding" means that the application is not responding to system events from the GUI.

These events are how user input arrives within the application, for instance.

This means that the human-facing part of the application — the GUI-facing bits — will appear to be stuck, or actually are stuck.

That is, the application is in a state that's colloquially known as "beachballed".

The documentation on iOS is a shade clearer on what's going on here, in that a watchdog timer fires when an application is too slow to launch, too slow to terminate, or too slow to respond to system events being sent to the application by the GUI.

This is usually an issue on the main thread within the application. Too much is going on, or what's going on is taking too long.

The application may not actually be stuck (as Frank Caggiano comments), but then the programmer has issues with what's happening on the main thread, or there's a low-level error within the system.
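A toy model may make the main-thread point concrete. The sketch below is not Apple's mechanism, only a simulation under stated assumptions: one thread services events strictly in order, events are listed in the order they were posted, and the 2-second threshold is the figure quoted in the original question. The Event type and both function names are made up for this example.

```c
#include <stddef.h>

/* One GUI event as seen by a single-threaded event loop (times in seconds). */
typedef struct {
    double posted_at;     /* when the event was placed on the queue */
    double handler_time;  /* how long the application's handler runs */
} Event;

/* Simulate a main thread that services events strictly in order and
 * return the longest time any event sat in the queue before pickup.
 * Assumes the array is sorted by posted_at. */
double worst_pickup_delay(const Event *events, size_t count)
{
    double clock = 0.0;   /* simulated time on the main thread */
    double worst = 0.0;

    for (size_t i = 0; i < count; i++) {
        if (clock < events[i].posted_at)
            clock = events[i].posted_at;        /* idle until it arrives */

        double delay = clock - events[i].posted_at;
        if (delay > worst)
            worst = delay;

        clock += events[i].handler_time;        /* handler blocks the loop */
    }
    return worst;
}

/* An observer with a fixed patience would flag the app once an event
 * waits longer than the threshold. */
int looks_unresponsive(const Event *events, size_t count, double threshold)
{
    return worst_pickup_delay(events, count) > threshold;
}
```

With a 5-second handler started at time 0 and a click posted at time 1, the click waits 4 seconds, so the model flags the app even though it is busy doing useful work, which matches the Aperture behaviour described earlier in the thread.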

BobHarris is correct on a "pure" Unix setup; traditional Unix apps don't directly run as "clients" of the operating system. (This whole area is a little subtle, and has to do with the flow of control within the application. Within a Unix application, the application has control over what's going on and is at the top of the activity "heap", once it's been launched. Within an OS X GUI environment, the application responds to stuff (events) that the GUI dispatches to it.)

The closest (common) Unix equivalent to this sort of flow of control is probably a stuck X Windows connection, where the app does respond to events provided by the X software. The X busy cursor (or a stuck X cursor) is similar to the OS X beachball. The X terminology is reversed from what folks might expect, so the rough analog would inherently have to involve sort of a client-specific problem within the X Windows server (rare, but happens), or a problem in the X client application (rather more common).

Depending on your particular goal here, the developer documentation or Amit Singh's OS X Internals Book would be your next step. These will have details on the main thread, and on what's going on inside OS X.

Answer 7

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 8:23 AM in response to MrHoffman

My 'goal' is mostly just annoyed curiosity. From an AppleScripter's perspective, this problem crops up occasionally - determining programmatically if an app you're trying to target is unresponsive. If there's a direct way to do it that would be great; if there's a way to get to the information otherwise I'd put that on my Xcode projects queue for the next time I get irritated by the problem.

Answer 8

Dec 8, 2012 8:26 AM in response to twtwtw

Building on what Linc posted, I looked for and found activitymonitord; there is even a man page for it:

ACTIVITYMONITORD(8) BSD System Manager's Manual ACTIVITYMONITORD(8)

NAME

activitymonitord -- Activity Monitor daemon

SYNOPSIS

activitymonitord

DESCRIPTION

activitymonitord provides services for the Activity Monitor application.

There are no configuration options to activitymonitord. Users should not run activitymonitord manually.

Mac OS X

Not very helpful but perhaps a place to start.

BTW:

dreadfully tangential

nice turn of phrase 😀

Answer 9

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 9:34 AM in response to Frank Caggiano

Frank Caggiano wrote:

BTW:

dreadfully tangential
nice turn of phrase 😀

I aim to please! 😉

Answer 10

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 10:04 AM in response to twtwtw

For anyone who's interested, I worked out a hackish solution (asking something out loud always gives a fresh perspective). I'd still like a better answer to the question, because this isn't very elegant, but it seems to do the job and at least makes what I'm thinking about clearer.

try
    with timeout of 2 seconds
        -- 'activate', 'quit' and 'run' will work here as well, if that's what you're asking the app to do
        -- 'idle' has the advantage of (maybe) not doing anything if the app isn't frozen
        tell application "App Name" to idle
    end timeout
    display dialog "App is available"
on error errstr number errnum
    if errnum = -1712 then
        -- apple event timeout; implies 2 seconds unresponsive
        -- change time interval or loop around and count instances for longer checks
        display dialog "App is frozen"
    end if
end try

Note this will only work with apps that respond to the core apple events, and that asking an app to idle may have unintended side effects if the app is doing something specific with its idle calls.

Answer 11

Hiroto

Level 5

7,461 points

Dec 8, 2012 3:57 PM in response to twtwtw

Hello twtwtw,

As I understand, the said "not responding" state = "rainbow cursor" state.

#!/bin/bash
otool -tV '/Applications/Utilities/Activity Monitor.app/Contents/MacOS/Activity Monitor'

reveals two symbols which seem relevant:

_CGSEventIsAppUnresponsive
_CGSEventAppUnresponsiveStatus

which I think indicate "not responding" means not responding to Window Server events.

Both are private APIs of Core Graphics and you may google them for some information.

To feed my own curiosity I wrote a command line utility using CGSEventIsAppUnresponsive() as listed below.

In my brief tests under 10.5.8, it works as expected.

(In testing, I start the OS Combo updater and leave the dialogue asking for the admin password alone, which eventually puts the installer in the "not responding" state. Then I check its pid in Activity Monitor and run this utility with the pid. It prints 1 as intended.)

Here's the source code in C.

main.c

---------------------------------------------------------------------

/*    
    file
        main.c
    
    function
        check whether the application with the given process id is responding or not
        and print 1 if unresponsive, 0 otherwise.
    
    compile
        gcc -framework ApplicationServices -o unresponsive main.c
    
    usage e.g.
        ./unresponsive 354
*/

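#include <ApplicationServices/ApplicationServices.h>
#include <libgen.h>     /* basename */
#include <stdbool.h>
#include <stdio.h>      /* fprintf, printf */
#include <stdlib.h>     /* atoi */

/* Private Core Graphics symbols, so they are declared by hand here. */
typedef int CGSConnectionID;
extern bool CGSEventIsAppUnresponsive(CGSConnectionID cid, const ProcessSerialNumber *psn);
extern CGSConnectionID _CGSDefaultConnection(void);

int
main (int argc, char * argv[])
{
    if ( argc != 2 ) {
        fprintf(stderr, "Usage: %s <Unix Process ID>\n", basename(argv[0]));
        return 1;
    }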
    OSStatus err;
    ProcessSerialNumber psn;
    pid_t pid = (pid_t) atoi(argv[1]);
    
    err = GetProcessForPID(pid, &psn);
    if ( err ) {
        fprintf(stderr, "Failed to get PSN for pid %d: error = %d\n", pid, err);
        return 2;
    }
    
    CGSConnectionID cid = _CGSDefaultConnection();    
    bool b = CGSEventIsAppUnresponsive(cid, &psn);
    printf("%d\n", b ? 1 : 0);
    return 0;
}

---------------------------------------------------------------------

Hope this feeds your curiosity as well 😉

H

Answer 12

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 5:55 PM in response to Hiroto

wow, that is excessively undocumented; I can't even find the header files it's located in. But yes, it satisfies my curiosity, and adds one more piece of mostly useless information to my ever-growing stockpile. I'll mark this as solved, because unless the guy who wrote the code weighs in I think we've squeezed this topic dry. Thanks all!

Answer 13

etresoft

Level 8

46,068 points

Dec 8, 2012 6:23 PM in response to twtwtw

I didn't write the code but I can provide some additional information. BobHarris kind of hinted at it. Any application that has an official, high-level run loop (NSRunLoop) must respond to messages within a certain period of time. The system sends no-op messages to applications that have run loops just to see what happens. If it doesn't get a response, it marks the application as "non-responsive". That is why well-designed applications shouldn't do too much work in an event handler. I assume you could do the same thing just by creating a low-level event and posting it to a given application. If you don't get a response, it is unresponsive. Those private functions probably do just that.
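The probe-and-wait idea can be tried out without any window server at all. The following is a portable stand-in, not the mechanism OS X uses: it assumes a POSIX system and replaces the GUI event queue with a Unix socket pair, and the names fake_app and responds_within are invented for this example. A child process plays the application and is busy for a while before it reads its "queue"; the parent posts a no-op byte and gives up if no answer arrives within the timeout.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Child side: stay busy for busy_ms, then answer one queued ping. */
static void fake_app(int fd, int busy_ms)
{
    struct timespec busy = { busy_ms / 1000, (busy_ms % 1000) * 1000000L };
    nanosleep(&busy, NULL);     /* stands in for a long-running event handler */

    char c;
    if (read(fd, &c, 1) == 1 && write(fd, &c, 1) == 1)
        _exit(0);
    _exit(1);
}

/* Hypothetical probe: returns 1 if the fake application answers a no-op
 * message within timeout_ms, 0 if it does not, -1 if setup fails. */
int responds_within(int busy_ms, int timeout_ms)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        fake_app(sv[1], busy_ms);            /* never returns */
    }
    close(sv[1]);

    int answered = 0;
    char ping = '?';
    if (write(sv[0], &ping, 1) == 1) {       /* post the no-op message */
        struct pollfd pfd = { .fd = sv[0], .events = POLLIN };
        if (poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & POLLIN))
            answered = (read(sv[0], &ping, 1) == 1);
    }

    kill(pid, SIGKILL);                      /* clean up the fake application */
    waitpid(pid, NULL, 0);
    close(sv[0]);
    return answered;
}
```

An application that is idle should answer almost immediately, while one that is busy for 1.5 seconds should miss a 200 ms deadline. The child is a perfectly healthy process the whole time, which is the distinction this thread keeps returning to.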

Answer 14

Dec 8, 2012 7:11 PM in response to twtwtw

actually, I'm not certain source code for Activity Monitor would do it

Given that Hiroto found the symbols for his code by using otool on Activity Monitor, having the source code would, it appears, have been very useful.

regards

BTW Hiroto, a marvelous bit of reverse engineering, great work.

Answer 15

twtwtw Author

Level 5

4,936 points

Dec 8, 2012 7:28 PM in response to etresoft

Yes, that's pretty much what I figured, and if I were going to code it myself that's the approach I'd take. I was just looking for some nice, hidden-but-public tool that would do it for me. And yes, that's because I'm lazy. 🙂
