How to Determine Which DIMM to Replace Given a Kernel Panic Report
I have a dual-CPU, mid-2010 Mac Pro with 32 GB of Kingston RAM (eight 4 GB modules). About once a week, it suffers a kernel panic that appears to be caused by a RAM problem. Here is a typical report:
Mon Jul 18 10:01:26 2011
Machine-check capabilities (cpu 17) 0x0000000000001c09:
family: 6 model: 44 stepping: 2 microcode: 15
Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
9 error-reporting banks
threshold-based error status present
extended corrected memory error handling present
Machine-check status 0x0000000000000004:
machine-check in progress
MCA error-reporting registers:
IA32_MC0_STATUS(0x401): 0x0000000000000800 invalid
IA32_MC1_STATUS(0x405): 0x0000000000000800 invalid
IA32_MC2_STATUS(0x409): 0x0000000000000000 invalid
IA32_MC3_STATUS(0x40d): 0x0000000000000000 invalid
IA32_MC4_STATUS(0x411): 0x0000000000000000 invalid
IA32_MC5_STATUS(0x415): 0x0000000000000000 invalid
IA32_MC6_STATUS(0x419): 0x0000000000000000 invalid
IA32_MC7_STATUS(0x41d): 0x0000000000000000 invalid
Package 1 logged:
IA32_MC8_STATUS(0x421): 0xfe0000800001009f valid
Channel number: 15 (unknown)
Memory Operation: read
Machine-specific error: Read ECC
COR_ERR_CNT: 4
Status bits:
Processor context corrupt
ADDR register valid
MISC register valid
Error enabled
Uncorrected error
Error overflow
IA32_MC8_ADDR(0x422): 0x0000000408aa4340
IA32_MC8_MISC(0x423): 0x29d7aef000040383
DIMM: 0
Channel: 1
Syndrome: 0x29d7aef0
Package 0 logged:
IA32_MC8_STATUS(0x421): 0x0000000000000000 invalid
Every report points to DIMM 0, Channel 1, and shows an address of the form 0x00000004xxxxxxxx. My question is: if I want to replace the suspect DIMM, which one is it?
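As a cross-check on the DIMM and Channel values, the raw MC8 registers in the report can be decoded by hand. Below is a minimal Python sketch of that decoding. It assumes the bit layout that the Intel Software Developer's Manual documents for the integrated memory controller on Nehalem/Westmere-class Xeons (flags in bits 63-57 of the status register, the memory-controller error code in bits 15-0, Read ECC in bit 16; DIMM in bits 17-16, Channel in bits 19-18 and Syndrome in bits 63-32 of the MISC register). The function and key names are made up for illustration and are not taken from Apple's kernel.

```python
# Sketch only: decode the MC8 registers quoted in the panic report.
# Assumption: bit positions follow the Intel SDM layout for
# Nehalem/Westmere memory controller errors, not Apple's kernel source.

def decode_mc8_status(status: int) -> dict:
    """Split IA32_MC8_STATUS into the fields shown in the report."""
    return {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "enabled": bool((status >> 60) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        # MCA error code, format 0000 0000 1MMM CCCC
        "mca_channel": status & 0xF,          # 15 means "not specified"
        "mca_memory_op": (status >> 4) & 0x7,  # 1 means memory read
        "read_ecc": bool((status >> 16) & 1),
    }


def decode_mc8_misc(misc: int) -> dict:
    """Split IA32_MC8_MISC into DIMM, channel and ECC syndrome."""
    return {
        "dimm": (misc >> 16) & 0x3,
        "channel": (misc >> 18) & 0x3,
        "syndrome": (misc >> 32) & 0xFFFFFFFF,
    }


def address_in_gib(addr: int) -> float:
    """Express the failing physical address as an offset in GiB."""
    return addr / 2**30


if __name__ == "__main__":
    print(decode_mc8_status(0xFE0000800001009F))
    print(decode_mc8_misc(0x29D7AEF000040383))
    print(round(address_in_gib(0x0000000408AA4340), 2), "GiB")
```

If that layout is right, the decoded values agree with the report: DIMM 0, Channel 1, an uncorrected read ECC error, at a physical address a little past the 16 GiB mark. Note that the DIMM and Channel numbers are the memory controller's own zero-based indices within one CPU package; the sketch says nothing about which numbered slot on the tray they correspond to.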
The User Guide shows eight memory slots organized in two banks, each adjacent to a CPU: slots 1-4 are next to one CPU, and slots 5-8 are next to the other. I filled all of them myself. System Profiler lists DIMMs 1-8, but the kernel panic report seems to blame DIMM 0, perhaps DIMM 0 in Channel 1. Here is a representation of the slot layout in the User Guide:
slot 5    CPU
slot 6    slot 4
slot 7    slot 3
slot 8    slot 2
CPU       slot 1
latch     latch
The bottom of the diagram is toward the side of the tray that has the latches that release the tray. The kind of answer I am hoping for is something like "Replace the DIMM in Slot 1," or perhaps "Replace the DIMMs in Slot 1 and Slot 5."
Mac Pro, Mac OS X (10.6.8)