panic: thread_call group 'high' reached max thread cap (500): active: 1, blocked: 499, idle: 0 @thread_call.c:409

New Mac, hoping there would be fewer panics. As soon as the 7-day AppleCare+ period expires, the panics resume. This one is new to me --- all my previous ones related to sleep/wake and some iSCSI kernel extensions I no longer run.

I did enable loading of third-party kernel extensions: MacFuse and, IIRC, Keybase.


Side question: does loading an x86 kext into an aarch64 kernel even work? Rosetta in the kernel?


Back to the main question: are others seeing this? I saw some prior reports of this panic, some going back to 2019, with no resolution I can see.


Here's the full report:



Mac Studio

Posted on Jul 4, 2022 6:51 PM

Question marked as Top-ranking reply

Posted on Jul 4, 2022 7:52 PM

Did you use Migration Assistant to bring over Apps from a prior Mac?


It is entirely possible this is being caused by failing RAM. Run Apple Diagnostics.

The Mx processors were not around in 2019, so if the problem is a kernel extension that you aren't aware has been transitioned to Mx compatibility, it is not the same error as in 2019. Similar, but not the same. In general, I'd make a clone backup of all your software, then wipe and reinstall the OS, and slowly reinstall components to see whether the issue is reproducible. If your system ever shows a black screen with white lettering, either on a blank background or on top of the existing image, you have had a kernel panic, which is the error indicated by the log.


Kernel panics can be caused by kernel extensions that are corrupt or incompatible, or by internal hardware that has failed. Isolating the cause can be time-consuming, but if you can find the exact culprit, you can send the program's author a bug report.


8 replies

Jul 4, 2022 9:55 PM in response to BobHarris

I am not doing anything unusual. I wasn't using the machine when it karked. Just had a few browser tabs open and a few VScode instances open and a few Terminal.app tabs open plus Emacs. The VScode instances had some attached rust analysis processes, and I imagine various Emacs buffers also had rust analysis processes attached.


Which `sysctl` parameters in particular do you suggest? https://opensource.apple.com/source/xnu/xnu-7195.81.3/osfmk/kern/thread_call.c has a panic at line 409; the code below is an excerpt. In general, manifest constants cannot be adjusted by `sysctl`, so unless there is a newer version of this file that happens to have the same panic at line 409 but has been changed to allow `sysctl` to adjust `THREAD_CALL_GROUP_MAX_THREADS`, there is nothing to tune. I also checked all of the `sysctl -a` output and cannot find anything remotely applicable.


#define THREAD_CALL_GROUP_MAX_THREADS   500

...

static bool
thread_call_group_should_add_thread(thread_call_group_t group)
{
    if ((group->active_count + group->blocked_count + group->idle_count) >= THREAD_CALL_GROUP_MAX_THREADS) {
        panic("thread_call group '%s' reached max thread cap (%d): active: %d, blocked: %d, idle: %d",
            group->tcg_name, THREAD_CALL_GROUP_MAX_THREADS,
            group->active_count, group->blocked_count, group->idle_count);
    }
    ...
}


I'd be interested in determining what all 499 blocked threads are blocked on. Clearly, the code expects 500 threads to be sufficient to handle the "high" queue, and rather than just returning `false` and leaving the work item queued, it panic()s.


I'm not at all familiar with this code, so any comments I make regarding it are likely to be rubbish.


Jul 4, 2022 8:44 PM in response to a brody

I did not use Migration Assistant. I did (several) full Time Machine backups of the old machine onto external SSDs, then set up the new machine, added my user, and copied $HOME from the backup into ~/FromLust (`lust` was the old machine) on the new machine. I'll eventually pull things from ~/FromLust as I need them.


At this point, I can't see a strong indication that this has anything to do with kernel extensions. It turns out the `Keybase` kext does have an arm64 variant:

bash-3.2# pwd
/Library/Filesystems/kbfuse.fs/Contents/Extensions/11/kbfuse.kext/Contents/MacOS
bash-3.2# file *
kbfuse: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit kext bundle x86_64] [arm64e:Mach-O 64-bit kext bundle arm64e]
kbfuse (for architecture x86_64):	Mach-O 64-bit kext bundle x86_64
kbfuse (for architecture arm64e):	Mach-O 64-bit kext bundle arm64e


Nothing really indicates failing memory. I can't figure out whether or not the Studio has ECC support. I have been buying only ECC-based machines for decades now, when possible. My TrashCanPro had 64GB of ECC memory, but I never saw any "single bit corrected" messages nor any "uncorrectable error" messages; then again, I'm sure they would be lost in the torrent of system messages anyway (except the "uncorrectable" one, natch).

Jul 5, 2022 8:32 AM in response to Damon Permezel

You have more running than the average user:

  • LittleSnitch
  • Bitwarden
  • keybase
  • iStat Menus
  • MacFuse (not sure what file systems you have plug-ins for)
  • VMware Fusion
  • MacPorts
  • GPG tools
  • Dropbox
  • 1Password
  • Alfred
  • MagnetLauncher
  • And you are running VScode (not exactly a lightweight app)
  • Emacs (an operating system that happens to edit files)


You have a lot more going on than the typical Mac user.


As for what sysctl value to change, I do not know. I generally look at the names, and guess/experiment.


sysctl -a


The things I have played with (not related to your issue) are:


  • kern.vm_page_free_target=262144
  • vm.vm_page_background_target=262144
  • kern.vm_page_free_min=3500


  • kern.maxproc=2500
  • kern.maxprocperuid=1500
  • kern.tty.ptmx_max=255


  • kern.maxfiles
  • kern.maxfilesperproc


These are not recommendations, just some sysctl values I've played with in the past. At the moment I DO NOT have any sysctl settings active. The VM changes were from when I had a company 16GB Intel Mac; I have a company M1 Pro model at the moment. The max files values were from back in the Snow Leopard (10.6) days, as were the max proc values.

Jul 12, 2022 12:00 AM in response to a brody

I believe Keybase.app bundles its own MacFuse version. Not sure why.

If I ever reinstall pCloud Drive, it uses MacFuse.

If I ever reinstall CryptoMator, it uses MacFuse.

When I eventually reinstall Borg, I might again want MacFuse, for `borg mount`.

There are probably all sorts of other "drive" apps I might or might not reinstall which would want to use it.

I never want to have much of anything to do with NTFS formatted drives.


Currently, I have turned full security back on and am not loading third-party kernel extensions. I use Keybase mainly for the `git` integration with `pass`, so I don't care about losing the filesystem part.


None of this diversion regarding MacFuse has anything to do with the panic from above.

As I indicated, it would be helpful to display all the "blocked" work items; printing them just before the panic() call would make sense. It may be time for me to figure out (again) how to attach a kernel debugger to this box.


If I ever get a repeat of this panic, I will enable kernel debugging, wait for another panic, and see if I can dump the relevant data structures and, if possible, get per-thread stack traces for all the blocked dedicated service threads. It may well be reasonable for them to be blocked. Some users of `thread_call` are delay timers. It's not clear whether those land in the "high" group (the default group when the call struct is created), but it is certainly possible to have more than 500 ACK timers, for example, or more than 500 Mach port timers, active at once. I see no reason to panic in that case, other than to defer writing the code to handle it.

It does seem expensive to me to have one kernel thread dedicated per active timer.

Again: not at all familiar with this code, so anything I say is bound to be rubbish.


