There is no user control over “VRAM” allocation, and the allocation is dynamic.
Some background: the M1 Max doesn’t have the concept of dedicated VRAM. Dedicated VRAM arises in GPU and system designs that lack a fast path from the GPU to main memory. In such a design, the GPU needs to keep data local for performance reasons, so it has its own memory. A hardware cache, if you will.
The M1 Max has a fast path to main (unified) memory: 400 GB/s, shared between the GPU and CPU. For comparison, that’s close to the PlayStation 5’s memory bandwidth of 448 GB/s.
Viewed simplistically, the GPU can use as much of the unified memory pool as it needs, minus what macOS and the non-GPU parts of the active apps require for their own work. When passing data around, the GPU doesn’t need to copy anything to a separate memory: it can pass a pointer to the existing “VRAM” data, which already lives in main (unified) memory.
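You can see this dynamic sizing directly through Metal. The sketch below (assuming macOS and a Metal-capable device; the printed values will vary by machine) queries how much memory the OS will let the GPU use — on Apple Silicon this is just a large fraction of unified memory, not a fixed VRAM carve-out:

```swift
import Metal

// Query the default GPU. On an M1 Max this is the integrated GPU sharing
// unified memory with the CPU.
if let device = MTLCreateSystemDefaultDevice() {
    print("GPU: \(device.name)")
    // true on Apple Silicon: GPU and CPU share one physical memory pool
    print("Unified memory: \(device.hasUnifiedMemory)")
    // An OS-chosen hint, not a hardware partition; it scales with total RAM
    let gib = Double(device.recommendedMaxWorkingSetSize) / Double(1 << 30)
    print(String(format: "Recommended max working set: %.1f GiB", gib))
    // How much this process has currently allocated on the device
    print("Currently allocated: \(device.currentAllocatedSize) bytes")
}
```

Note that `recommendedMaxWorkingSetSize` is a soft guideline the system computes, which is exactly the point: there is no fixed number to configure.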
In practice, you’ll want to profile the ML app and see where it is actually spending both its time and its memory.