
Mac Pro 2009 hangs intermittently

I have a Mac Pro (early 2009) that has lived a very cushy life because I don't use it a lot. On 10.8.5 (Mountain Lion) it started hanging intermittently. So I installed an SSD and did a clean install of Mountain Lion on it and ran updates. It still hangs intermittently, so the OS installation and disk drives aren't the problem.


Here is the trace. What is the likely issue?


Thanks,

CJ


Interval Since Last Panic Report: 53 sec

Panics Since Last Report: 1

Anonymous UUID: 4D4C7BB1-ECDA-A646-91B0-47B042CFF6AD


Sun Oct 4 03:22:26 2015

Machine-check capabilities 0x0000000000001c09:

family: 6 model: 26 stepping: 5 microcode: 17

Intel(R) Xeon(R) CPU E5520 @ 2.27GHz

9 error-reporting banks

threshold-based error status present

extended corrected memory error handling present

Processor 0: no valid machine-check state

Processor 1: no valid machine-check state

Processor 2: no valid machine-check state

Processor 3: no valid machine-check state

Processor 4: no valid machine-check state

Processor 5: no valid machine-check state

Processor 6: no valid machine-check state

Processor 7: no valid machine-check state

Processor 8: machine-check status 0x0000000000000004:

machine-check in progress

MCA error-reporting registers:

IA32_MC0_STATUS(0x401): 0x0000000000000800 invalid

IA32_MC1_STATUS(0x405): 0xbe00000000400e0f valid

MCA error code: 0x0e0f

Model specific error code: 0x0040

Other information: 0x00000000

Threshold-based status: Undefined

Status bits:

Processor context corrupt

ADDR register valid

MISC register valid

Error enabled

Uncorrected error

IA32_MC1_ADDR(0x406): 0x000000e1e7e1e000

IA32_MC1_MISC(0x407): 0x0000000001000000

IA32_MC2_STATUS(0x409): 0x0000000000000000 invalid

IA32_MC3_STATUS(0x40d): 0x0000000000000000 invalid

IA32_MC4_STATUS(0x411): 0x0000000000000000 invalid

IA32_MC5_STATUS(0x415): 0x0000000000000000 invalid

IA32_MC6_STATUS(0x419): 0x0000000000000000 invalid

IA32_MC7_STATUS(0x41d): 0x0000000000000000 invalid

IA32_MC8_STATUS(0x421): 0x0000000000000000 invalid

Processor 9: machine-check status 0x0000000000000004:

machine-check in progress

MCA error-reporting registers:

IA32_MC0_STATUS(0x401): 0x0000000000000800 invalid

IA32_MC1_STATUS(0x405): 0xbe00000000400e0f valid

MCA error code: 0x0e0f

Model specific error code: 0x0040

Other information: 0x00000000

Threshold-based status: Undefined

Status bits:

Processor context corrupt

ADDR register valid

MISC register valid

Error enabled

Uncorrected error

IA32_MC1_ADDR(0x406): 0x000000e1e7e1e000

IA32_MC1_MISC(0x407): 0x0000000001000000

IA32_MC2_STATUS(0x409): 0x0000000000000000 invalid

IA32_MC3_STATUS(0x40d): 0x0000000000000000 invalid

IA32_MC4_STATUS(0x411): 0x0000000000000000 invalid

IA32_MC5_STATUS(0x415): 0x0000000000000000 invalid

IA32_MC6_STATUS(0x419): 0x0000000000000000 invalid

IA32_MC7_STATUS(0x41d): 0x0000000000000000 invalid

IA32_MC8_STATUS(0x421): 0x0000000000000000 invalid

Processor 10: machine-check status 0x0000000000000004:

machine-check in progress

MCA error-reporting registers:

IA32_MC0_STATUS(0x401): 0x0000000000000800 invalid

IA32_MC1_STATUS(0x405): 0xbe00000000400e0f valid

MCA error code: 0x0e0f

Model specific error code: 0x0040

Other information: 0x00000000

Threshold-based status: Undefined

Status bits:

Processor context corrupt

ADDR register valid

MISC register valid

Error enabled

Uncorrected error

IA32_MC1_ADDR(0x406): 0x000000e1e7e1e000

IA32_MC1_MISC(0x407): 0x0000000001000000

IA32_MC2_STATUS(0x409): 0x0000000000000000 invalid

IA32_MC3_STATUS(0x40d): 0x0000000000000000 invalid

IA32_MC4_STATUS(0x411): 0x0000000000000000 invalid

IA32_MC5_STATUS(0x415): 0x0000000000000000 invalid

IA32_MC6_STATUS(0x419): 0x0000000000000000 invalid

IA32_MC7_STATUS(0x41d): 0x0000000000000000 invalid

IA32_MC8_STATUS(0x421): 0x0000000000000000 invalid

Processor 11: machine-check status 0x0000000000000004:

machine-check in progress

MCA error-reporting registers:

IA32_MC0_STATUS(0x401): 0x0000000000000800 invalid

IA32_MC1_STATUS(0x405): 0xbe00000000400e0f valid

MCA error code: 0x0e0f

Model specific error code: 0x0040

Other information: 0x00000000

Threshold-based status: Undefined

Status bits:

Processor context corrupt

ADDR register valid

MISC register valid

Error enabled

Uncorrected error

IA32_MC1_ADDR(0x406): 0x000000e1e7e1e000

IA32_MC1_MISC(0x407): 0x0000000001000000

IA32_MC2_STATUS(0x409): 0x0000000000000000 invalid

IA32_MC3_STATUS(0x40d): 0x0000000000000000 invalid

IA32_MC4_STATUS(0x411): 0x0000000000000000 invalid

IA32_MC5_STATUS(0x415): 0x0000000000000000 invalid

IA32_MC6_STATUS(0x419): 0x0000000000000000 invalid

IA32_MC7_STATUS(0x41d): 0x0000000000000000 invalid

IA32_MC8_STATUS(0x421): 0x0000000000000000 invalid

Processor 12: machine-check st

Model: MacPro4,1, BootROM MP41.0081.B07, 8 processors, Quad-Core Intel Xeon, 2.26 GHz, 32 GB, SMC 1.39f5

Graphics: NVIDIA GeForce GT 120, NVIDIA GeForce GT 120, PCIe, 512 MB

Memory Module: DIMM 1, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A445A533531323732505A3147344631

Memory Module: DIMM 2, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A425A53353132373250593147344431

Memory Module: DIMM 3, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A425A53353132373250593147344431

Memory Module: DIMM 4, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A425A53353132373250593147344431

Memory Module: DIMM 5, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A425A53353132373250593147344431

Memory Module: DIMM 6, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A425A53353132373250593147344431

Memory Module: DIMM 7, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A445A533531323732505A3147344631

Memory Module: DIMM 8, 4 GB, DDR3 ECC, 1066 MHz, 0x802C, 0x33364A445A533531323732505A3147344631

AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x8E), Broadcom BCM43xx 1.0 (5.106.98.100.17)

Bluetooth: Version 6.1.7f5 15859, 3 service, 21 devices, 3 incoming serial ports

Network Service: Wi-Fi, AirPort, en2

PCI Card: NVIDIA GeForce GT 120, sppci_displaycontroller, Slot-1

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-2@8,4,0

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-2@8,5,0

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-2@8,6,0

PCI Card: pci1a00,1, sppci_othermultimedia, Slot-4

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-3@4,4,0

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-3@4,5,0

PCI Card: pci1057,3410, sppci_othermultimedia, Slot-3@4,6,0

Serial ATA Device: HL-DT-ST DVD-RW GH41N

Serial ATA Device: Samsung SSD 850 EVO 500GB, 500.11 GB

Serial ATA Device: Hitachi HDS722020ALA330, 2 TB

Serial ATA Device: Hitachi HDS723030ALA640, 3 TB

Serial ATA Device: WDC WD740GD-00FLC0, 74.36 GB

USB Device: Keyboard Hub, apple_vendor_id, 0x1006, 0xfd500000 / 3

USB Device: Kensington Expert Mouse, 0x047d (Kensington), 0x1020, 0xfd530000 / 7

USB Device: Apple Keyboard, apple_vendor_id, 0x0220, 0xfd520000 / 6

USB Device: hub_device, 0x0409 (NEC Corporation), 0x005a, 0xfd300000 / 2

USB Device: v125w, 0x03f0 (Hewlett Packard), 0x3307, 0xfd310000 / 5

USB Device: hub_device, 0x0409 (NEC Corporation), 0x005a, 0xfd340000 / 4

USB Device: hub_device, apple_vendor_id, 0x9102, 0x1a200000 / 2

USB Device: hub_device, apple_vendor_id, 0x9118, 0x1a210000 / 3

USB Device: iLok, 0x088e, 0x5036, 0x1a211000 / 5

USB Device: Studio Display, apple_vendor_id, 0x9218, 0x1a213000 / 4

USB Device: BRCM2046 Hub, 0x0a5c (Broadcom Corp.), 0x4500, 0x5a100000 / 2

USB Device: Bluetooth USB Host Controller, apple_vendor_id, 0x8215, 0x5a110000 / 3

FireWire Device: built-in_hub, 800mbit_speed

Mac Pro, OS X Mountain Lion (10.8.5)

Posted on Oct 4, 2015 3:45 AM

Question marked as Best reply

Posted on Oct 4, 2015 11:02 AM

A machine check typically means a hardware problem. Since multiple CPUs are reporting problems, that typically points to a memory problem. Open System Profiler

OS X: About System Information and System Profiler - Apple Support

and look under Memory: do all sticks say OK?

You have to periodically check by doing this again since the display only updates when you open profiler.
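
For anyone reading the trace by hand, here is a rough sketch of how the IA32_MC1_STATUS value in the panic log breaks down. The bit positions follow the IA32_MCi_STATUS layout in Intel's architecture manuals; the function name and the returned dictionary are just illustrative choices, and the sketch only names the flags and does not interpret what the error code means.

```python
# Sketch: split an IA32_MCi_STATUS value into its architectural fields.
# Flag names mirror the wording used in the panic log where possible.
STATUS_FLAGS = {
    63: "valid",
    62: "overflow",
    61: "Uncorrected error",
    60: "Error enabled",
    59: "MISC register valid",
    58: "ADDR register valid",
    57: "Processor context corrupt",
}


def decode_mc_status(value):
    """Return the error codes and the names of the flag bits that are set."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (value >> bit) & 1]
    return {
        "mca_error_code": value & 0xFFFF,                    # bits 15:0
        "model_specific_error_code": (value >> 16) & 0xFFFF,  # bits 31:16
        "flags": flags,
    }


decoded = decode_mc_status(0xBE00000000400E0F)
print(hex(decoded["mca_error_code"]), decoded["flags"])
```

Every processor in the posted log that reports a valid bank shows the same status value and the same IA32_MC1_ADDR, which is consistent with one shared fault rather than several independent ones.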


Oct 4, 2015 11:02 AM in response to bithead2

lllaass has it right. This is a RAM Memory error problem. As is typical, the 'which module' information is inconclusive.


The error-correction Hardware built into the Xeon Processor can correct single-bit errors on the fly. Double-bit errors cause a halt with a kernel panic (like the one you posted) to avoid poisoning your data. This is what it might look like if you are accumulating errors on a DIMM:


[Image: user-uploaded graphic from anandtech.com]


The DIMMs accumulating errors are Bad, and must be replaced, because one additional error in the same word and the machine halts on a kernel panic again.
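
To make the single-bit versus double-bit distinction concrete, here is a toy sketch of a "correct one, detect two" (SECDED) code. It uses an extended Hamming code on 4 data bits, which is an assumption chosen for brevity: real ECC memory protects much wider words, and this is not the code the Xeon's memory controller actually uses. The function names are made up for the example.

```python
# Toy SECDED sketch: extended Hamming code over 4 data bits.
# Layout (positions 1..7): p1 p2 d1 p4 d2 d3 d4, plus an overall parity bit.

def encode(data):
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p4, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit
    return word + [overall]


def decode(word):
    bits = list(word)
    # The syndrome is the XOR of the (1-based) positions holding a 1 bit.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos
    overall = 0
    for bit in bits:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and repair it.
        index = syndrome - 1 if syndrome else 7
        bits[index] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: cannot be repaired.
        return None, "uncorrectable"
    return [bits[2], bits[4], bits[5], bits[6]], status


word = encode([1, 0, 1, 1])
one_flip = list(word)
one_flip[4] ^= 1
two_flips = list(one_flip)
two_flips[1] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

The "uncorrectable" branch is the toy equivalent of the halt described above: the decoder knows the word is wrong but cannot tell which bits flipped, so it refuses to hand back data.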

Oct 4, 2015 11:29 AM in response to bithead2

That report is STATIC, so it must be invoked again to get new data.


Remember that the Hardware is watching each and every Read from RAM Memory, and noting any corrections that need to be made. Eventually, it will spot the problem.
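
Since the report has to be regenerated each time, one option is to script the check. The sketch below runs `system_profiler SPMemoryDataType` and pulls out each slot's Status line. The parsing is an assumption about the text layout (slot headings like "DIMM 1:" followed by an indented "Status:" line) and may need adjusting for other models or OS versions; the function names are only illustrative.

```python
import re
import subprocess


def read_report():
    """Generate a fresh memory report (only works on a Mac)."""
    return subprocess.run(
        ["system_profiler", "SPMemoryDataType"],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_memory_status(report):
    """Map each slot heading (assumed to look like 'DIMM 1:') to its status."""
    statuses = {}
    slot = None
    for line in report.splitlines():
        text = line.strip()
        heading = re.fullmatch(r"(DIMM \d+):", text)
        if heading:
            slot = heading.group(1)
        elif slot and text.startswith("Status:"):
            statuses[slot] = text.split(":", 1)[1].strip()
    return statuses


def suspect_slots(report):
    """Return the slots whose status is anything other than OK."""
    return [s for s, status in parse_memory_status(report).items()
            if status != "OK"]
```

Calling `suspect_slots(read_report())` every so often while the machine is in normal use gives the same information as reopening the Memory pane by hand.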


At Startup, the Error Correction Hardware is used very aggressively. ANY problem, correctable or not, causes the module slot to be declared "Empty". Such modules are Bad, but the next time one is tested it may not fail and could be used again.


Half Interval Search (dividing the suspect group in half again and again) is a good method for finding which group contains the fault, and successively refining until you find the module.


One downside is that removing half the memory makes a very subtle change to memory timing.
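
The half-interval procedure can be sketched as below. It assumes exactly one bad module and a test that reliably fails whenever the bad module is installed, neither of which is guaranteed with intermittent faults. `group_fails` is a hypothetical stand-in for "install only these modules, use the machine for a while, and see whether it panics or reports errors", and the sketch ignores any rules about which slots must be populated.

```python
def find_bad_module(modules, group_fails):
    """Narrow a list of modules down to the single one that causes failures.

    modules     -- list of module labels, e.g. ["DIMM 1", ..., "DIMM 8"]
    group_fails -- hypothetical callback: True if the machine misbehaves
                   with only the given modules installed
    """
    suspects = list(modules)
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if group_fails(half):
            suspects = half                    # fault is in the tested half
        else:
            suspects = suspects[len(half):]    # fault is in the other half
    return suspects[0]


# Simulated run with a pretend fault in DIMM 4.
dimms = [f"DIMM {n}" for n in range(1, 9)]
print(find_bad_module(dimms, lambda group: "DIMM 4" in group))
```

With eight modules this takes three rounds of swapping, but each round is only as trustworthy as the time you give the machine to fail.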

Oct 4, 2015 12:55 PM in response to bithead2

Since the Hardware error correction is watching every read from RAM, running an artificial memory test is not any more productive than running the things you usually run.


In addition, most memory tests will pass with flying colors (because they don't understand that memory error correction is fixing any problems that may be occurring).


It certainly would be nice if, once a memory error occurred, the module would just fall over dead and be immediately detectable. But that is not the nature of these problems. Sometimes you can run detection software for days and find nothing. As you said, it is highly variable.

Oct 4, 2015 1:48 PM in response to Grant Bennet-Alder

OK here is what I did. I found this: OS X Mountain Lion: Use Apple Diagnostics or Apple Hardware Test


Then I ran those tests using the CD that came with my Mac. After 20 minutes it said:


Apple Hardware Test has detected an error.


4MEM/9/40000006: 0x712cf298


It seems to have quit testing right there.


Does this mean on the Memory page that "DIMM 4" as identified in the System Profiler utility is the bad one? There are 8 4GB DIMMs installed.


Thanks, Chris

Oct 14, 2015 6:30 PM in response to lllaass

OK here is the deal. I swapped all the RAM out and put in the original RAM. Still locked up. So it isn't the memory. Installed a new SSD and new OS. Still locks up. So I took it to the Genius Bar and they ran some tests that said MCP Die sensor was a problem. They took the machine in and ran more tests. What they say is that there is a broken temperature cable and that the processor tray has to be replaced for $299. For a broken cable? Anyway, is it possible that the fix is as simple as finding the broken cable and fixing it?


Thanks,

CJ

Oct 14, 2015 7:20 PM in response to bithead2

put in the original RAM. Still locked up. So it isn't the memory.

That does not follow from the evidence you presented. That just says that there may be problems in the memory you put back, or you may have additional problems.


A Bad sensor cable would fail the diagnostic, and the fans would run at high speed all the time. If the cable is bad, it is like one black wire, so yes, the fix should be easy. Apple does repair-by-replacement; they don't solder anything.
