Mac Pro and ECC Ram, basic questions

Hi all,

One of the reasons I bought a used Mac Pro was the ECC RAM.


When motherboard, firmware, and OS work perfectly together, ECC RAM errors can not only be detected, they can also be corrected completely transparently, invisible to the user.


But other scenarios can be implemented too...


- The system can abort all user processes and shut down immediately.

- The system can say "RAM error detected, decide for yourself whether to a) continue, b) shut down, c)..."

- The system can correct the RAM error and continue, and maybe somewhere in the system a "RAM error counter" is incremented to give an early warning such as "RAM module xyz has errors, please replace it".
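
As a rough illustration of the third scenario (correct, count, and warn early), here is a minimal Python sketch. It is not how OS X or the Mac Pro firmware does it; the class name `EccErrorLog`, the module labels, and the warning threshold are all made up for the example.

```python
from collections import Counter

# Hypothetical value: corrected errors per module before a warning is raised.
WARN_THRESHOLD = 3


class EccErrorLog:
    """Toy model of a 'correct, count, and warn' ECC policy."""

    def __init__(self, warn_threshold=WARN_THRESHOLD):
        self.warn_threshold = warn_threshold
        self.corrected = Counter()  # corrected-error count per module

    def report(self, module, correctable):
        """Return the action taken for one reported RAM error."""
        if not correctable:
            return "halt"      # first scenario: stop before data is damaged
        self.corrected[module] += 1
        if self.corrected[module] >= self.warn_threshold:
            return "warn"      # early warning: this module should be replaced
        return "continue"      # silent correction, nothing shown to the user


log = EccErrorLog(warn_threshold=2)
for _ in range(2):
    print(log.report("DIMM 1", correctable=True))
```

The point of the counter is that a single corrected error is harmless, but a module that keeps producing them is worth replacing before it produces an uncorrectable one.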


So my question: what will actually happen in a Mac Pro (2010/5,1, with OS X 10.11.4) when a RAM error is detected?


Greetings from Germany

Chris

Mac Pro, OS X El Capitan (10.11)

Posted on May 5, 2016 1:13 AM

Question marked as Best reply

Posted on May 5, 2016 1:49 AM

If there are only a few errors, the user is not even notified.

However, if there are so many errors that they can't be corrected, that usually results in a kernel panic with many banks reporting errors.

Sometimes errors will be detected during the self-test on startup, and sometimes a diagnostic LED will be lit next to the bad memory stick. Also, if you open System Profiler/About this Mac and go to the Memory section, the status will not say OK.



User uploaded file

6 replies
May 5, 2016 9:43 AM in response to Christian Stueben

Single-bit errors in a word are detected by fast logic in the Xeon processor and are corrected, in Hardware, in one extended memory cycle. No software intervention is required, except that at low priority in the background, an error will usually be tabulated. No extra in-line software is executed and it does NOT generate an Interrupt (either of which would slow your computing).


One subtle twist is that the corrected data are NOT rewritten to RAM, so reading that same location again later will require another correction (unless the corrected version is still in the processor cache).


Double-bit errors are detected, but most are not correctable. Uncorrectable errors will cause your Mac to HALT with a kernel panic reporting a machine check (error overflow or uncorrected error, status 4 or 5), almost always detected simultaneously by multiple processors. This is by design, to keep such errors from poisoning your data.
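
To make "single-bit errors are corrected, double-bit errors are only detected" concrete, here is a toy Python sketch of a SECDED (single error correct, double error detect) code. It is an extended Hamming code over only 4 data bits, chosen so it fits on one screen; real ECC DIMMs protect 64 data bits with 8 check bits, and this is not a claim about the exact code the Xeon memory controller uses. The function names `encode` and `decode` are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (a list of bits).

    Index 0 holds an overall parity bit. Indexes 1..7 form a Hamming(7,4)
    codeword with check bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the word is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of set bits
    parity_ok = sum(code) % 2 == 0
    if syndrome == 0 and parity_ok:
        status = "ok"
    elif not parity_ok:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped, cannot repair.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
word[5] ^= 1                # one flipped bit: silently repaired
print(decode(word))
word[2] ^= 1                # a second flipped bit: detected, not repaired
print(decode(word))
```

The "uncorrectable" branch is the situation in which the Mac halts with a kernel panic rather than handing bad data to software.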


The kernel panic reports are not identical, but are still fairly distinctive.

May 5, 2016 11:36 AM in response to Grant Bennet-Alder

The error correction logic is used very aggressively at Startup. ANY errors that occur (correctable or not) during the very short time of the Power-ON Self Test will cause that module-slot to be declared "Empty" and it will not be used by Mac OS X.


If some memory seems to be "missing", running


Apple menu > About this Mac > Report > Memory


(the same report lllaass posted above) may show those slots as "Empty".


The next time you start your Mac, the same test will be run again, without referencing any History, and the BAD modules may then test OK and be used. This does not mean they have healed! They should still be replaced.

May 5, 2016 11:34 AM in response to Grant Bennet-Alder

Hi Grant,

When bits flip, that does not mean the RAM module is defective. It is normal (though unwanted) behaviour that can be caused by heat, natural background radiation, or cosmic radiation. Specialists will surely be able to name more causes. Such a bit flips once, and then never again.

In a normal PC environment, such single-bit errors must be expected once or twice a month. But data in RAM is NOT always read again, so they go undetected: neither by ECC logic nor by you noticing damaged pictures, databases, or MP3s.

In workstations with much more RAM, you should expect these spontaneous RAM errors (not caused by any defect) about once a day. Unfortunately, RAM contents in workstations are used more often than in typical personal computer use, so the consequences are more severe.
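
A back-of-the-envelope Python sketch of why more RAM means more flips. It assumes, purely for illustration, that spontaneous flips scale linearly with installed RAM and arrive independently (a Poisson model). The 4 GB and 64 GB sizes and the per-GB rate are hypothetical numbers picked to match the "once or twice a month" figure above, not measurements.

```python
import math


def expected_flips(ram_gb, flips_per_gb_per_day, days):
    """Expected number of spontaneous bit flips, assuming linear scaling."""
    return ram_gb * flips_per_gb_per_day * days


def prob_at_least_one(expected):
    """Chance of at least one flip under a Poisson model."""
    return 1.0 - math.exp(-expected)


# Hypothetical PC: 4 GB seeing about 1.5 flips per 30 days.
rate = 1.5 / 30 / 4   # flips per GB per day (assumed, not measured)

pc_month = expected_flips(4, rate, 30)
ws_day = expected_flips(64, rate, 1)   # hypothetical 64 GB workstation

print(f"PC, 30 days: {pc_month:.2f} expected flips")
print(f"Workstation, 1 day: {ws_day:.2f} expected flips, "
      f"P(at least one) = {prob_at_least_one(ws_day):.2f}")
```

Under these assumptions the daily figure for the workstation comes out near one flip per day, which is the order of magnitude I mean above.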


And that is the reason why I asked "what will happen in a Mac Pro". I have seen many different implementations of how ECC errors are handled, and I didn't know the "Mac Pro way". I simply wanted to know whether choosing the used Mac Pro was a good decision. And yes, your and lllaass's remarks told me it was a good decision 😉


Yes, RAM modules can be defective too. But that is a different chapter, one I hopefully never have to read.


Thank you for your remarks on how OS X will react to multiple (and thereby uncorrectable) errors.


Greetings from Germany

Chris
