Running LLMs on M4 Max: more GPU resources, future plans
I'm able to run some LLMs on my Mac with an M4 Max, though they are noticeably slow because they need more resources than the machine provides. On Intel-based Macs, we could expand GPU resources externally with an eGPU. As I understand it, the Apple Silicon design does not permit this.
Since running LLMs locally is becoming commonplace, I wonder what other approaches might be available on this platform to expand those resources, or whether that will become possible.
MacBook Pro 16″, macOS 15.3