Pheidius1

Q: Cursed GTX 980 TI

My GTX 980 Ti, flashed by MacVidCards, is cursed. First, it did not boot at all when plugged into internal power, which is not the norm for this card. Then, once running on external power, it began to kernel panic (KP), with the Nvidia web drivers listed first in the backtrace and a slew of references to Nvidia or Apple IO following. All the KP logs read virtually the same, save for whether CPU 0 or CPU 1 initiated the panic. The time to boot chime was about thirty seconds. Now there is no boot chime at all; it just hangs in POST. On a PC I would pull the card, reseat all the RAM, and pull all the other cards. I would likely uninstall and reinstall the drivers. My PCIe slot configuration has the GTX in slot 1, an OWC E2 SSD blade card in slot 2, a RocketRAID 4322 in slot 3, and a Sonnet USB 3.0 card in slot 4. This configuration works fine with the original ATI HD "I am too slow" card. Any other ideas?

Mac Pro 2010, OS X El Capitan (10.11.4), 12 core 3.43, 128 GB

Posted on Jul 4, 2016 5:07 PM

  • by Grant Bennet-Alder

    Jul 4, 2016 7:28 PM in response to Pheidius1
    Level 9 (60,976 points)
    Desktops

    So how much power, on which connectors, is it looking for, and how much have you provided? Two six-pins? One six, one eight? Two eights?

  • by Pheidius1

    Jul 4, 2016 10:28 PM in response to Grant Bennet-Alder
    Level 1 (15 points)
    Desktops

    I gave it one 6-pin and one 8-pin off a 750-watt power supply. This is looking like another RMA, as the card has just repeated the same behavior in a Windows PC. In the process of troubleshooting, however, I ran AHT and also got a memory error that reads 4mem/62/40000006 0x71076318. Does this mean that the DIMM in slot 4 is throwing an error? It doesn't err on the quick test and does not light up any LED on the board. I guess I will pull the tray and reseat the memory. This may be why I have been getting such slow POST times.

  • by lllaass (Solved answer)

    Jul 5, 2016 12:39 AM in response to Pheidius1
    Level 10 (189,016 points)
    Desktops

    You asked, "Does this mean that the DIMM in slot 4 is throwing an error?"

    No. All error codes start with a 4, so the leading 4 is not a slot number.
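As a rough illustration of that point, here is a minimal Python sketch that splits the error string quoted in this thread into its visible pieces. The function name `split_aht_code` and the dictionary keys are made up for this example, the layout is inferred only from the single code posted above, and the meaning of the numeric fields is not documented in this thread, so they are left uninterpreted.

```python
def split_aht_code(code):
    """Split an AHT-style error string into its visible parts.

    Assumption: the layout is '4<component>/<field>/<field> <trailer>',
    inferred from the one code quoted in this thread.
    """
    head, _, trailer = code.strip().partition(" ")
    parts = head.split("/")
    if not parts[0].startswith("4"):
        raise ValueError("expected the code to start with a 4")
    return {
        "prefix": "4",               # common prefix, not a slot number
        "component": parts[0][1:],   # e.g. 'mem' for a memory error
        "fields": parts[1:],         # meaning not documented here
        "trailer": trailer.strip(),  # trailing hex value, left as-is
    }
```

Calling `split_aht_code("4mem/62/40000006 0x71076318")` separates the `4` prefix from the `mem` component, which is all the code by itself tells you: a memory error, with no slot identified.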

    http://www.macissues.com/2014/03/21/how-to-run-and-interpret-apples-hardware-tests-on-your-mac/

    I have not seen a way to narrow down which DIMM or pair of DIMMs caused the error.

    I would try pulling out half the RAM and running AHT again. Then repeat as necessary to find the good and bad DIMMs.
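The halving procedure can be sketched in Python as a recursive search. This is only an outline of the logic: `passes_test` is a hypothetical callback standing in for the manual step of installing just that group of modules and running AHT, and the sketch assumes a faulty module fails every run and that any group of modules is a valid configuration for the machine, neither of which is guaranteed in practice.

```python
def find_bad_dimms(dimms, passes_test):
    """Return the modules that fail, by repeatedly halving the set.

    dimms: list of module labels, e.g. ["DIMM 1", "DIMM 2", ...]
    passes_test: hypothetical callback; True if the given group of
                 modules passes the hardware test when installed alone.
    """
    if passes_test(dimms):
        return []                # whole group is good, stop here
    if len(dimms) == 1:
        return list(dimms)       # a single failing module is isolated
    mid = len(dimms) // 2
    # Test each half separately so that more than one bad module is found.
    return (find_bad_dimms(dimms[:mid], passes_test)
            + find_bad_dimms(dimms[mid:], passes_test))
```

With eight modules and one bad one, this takes about seven test runs instead of eight individual ones; the saving grows when most of the memory is good, because a passing half is cleared in a single run.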

  • by Pheidius1

    Jul 5, 2016 1:30 AM in response to lllaass
    Level 1 (15 points)
    Desktops

    Oh joy. This has certainly been a slow start. All this memory is brand-new OWC memory. That makes two brand-new cards that have now failed in this machine, and at least one DIMM. It must not be a fatal error, as there is no red LED showing a flatlined DIMM and no memory-associated kernel panics.

  • by Grant Bennet-Alder (Helpful)

    Jul 5, 2016 5:01 PM in response to Pheidius1
    Level 9 (60,976 points)
    Desktops

    The Mac Pro with Xeon processors uses error-correcting (ECC) memory. Single-bit errors are corrected by hardware on the fly and tabulated by a background process. Accumulated errors since the last startup are available in this STATIC report (new information requires the report to be invoked again) from

    Apple menu > About This Mac > System Report > Memory:

    [Image: eccerrors2.jpg, graphic from anandtech.com]

    Only uncorrectable errors cause a kernel panic (a machine check), often with multiple processors reporting the error.

    At startup, error correction is applied very aggressively: any errors that occur during the few seconds of memory testing cause the modules involved to have their slots declared "empty", and Mac OS does not use them. This state is reset at the next startup (they don't stay on the bad list).
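The same memory report can also be read from the command line with `system_profiler SPMemoryDataType`. Below is a minimal Python sketch that scans that report for slots shown as empty or with a status other than OK. The function names are invented for this example, and the parsing assumes the report lists each slot under a `DIMM n:` heading followed by `Size:` and `Status:` lines; the exact field names may differ between machines and OS versions.

```python
import subprocess


def read_live_report():
    """Fetch the memory report text (macOS only)."""
    result = subprocess.run(
        ["system_profiler", "SPMemoryDataType"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_memory_report(text):
    """Return {slot_name: {field: value}} from the report text.

    Assumption: each slot starts with a 'DIMM ...:' heading and is
    followed by 'Key: Value' lines.
    """
    slots = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and not value and key.startswith("DIMM"):
            current = key            # start of a new slot section
            slots[current] = {}
        elif sep and value and current is not None:
            slots[current][key.strip()] = value
    return slots


def suspicious_slots(slots):
    """List slots reported as empty or with a status other than OK."""
    return [
        name for name, fields in slots.items()
        if fields.get("Size") == "Empty" or fields.get("Status", "OK") != "OK"
    ]

# Usage on the Mac itself:
#   print(suspicious_slots(parse_memory_report(read_live_report())))
```

Because the report is static, the script has to be run again after each restart to see the current state; a slot that should be populated but shows up in the list is one the startup test has set aside for that boot.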

  • by Pheidius1

    Jul 5, 2016 4:51 PM in response to Grant Bennet-Alder
    Level 1 (15 points)
    Desktops

    Well, that was fun. I just checked that, and the current boot was missing both DIMM 1 and DIMM 2. I reseated DIMM 1 and it was recognized. Rebooted and DIMM 2 was recognized. Now I am back to 128 GB again.

  • by Grant Bennet-Alder (Helpful)

    Jul 5, 2016 7:22 PM in response to Pheidius1
    Level 9 (60,976 points)
    Desktops

    Modules found to be bad will need to be replaced soon.

    They do not suddenly become fine after a restart; they just get tested again, and the fault is not always detected the next time around.

  • by Pheidius1

    Jul 5, 2016 7:21 PM in response to Grant Bennet-Alder
    Level 1 (15 points)
    Desktops

    Concur. OWC offers a good warranty, but I need to wait until they are failing consistently to effectively get them replaced. I need for them to fail OWC testing as well.

  • by Grant Bennet-Alder

    Jul 6, 2016 6:32 AM in response to Pheidius1
    Level 9 (60,976 points)
    Desktops

    "I need to wait until they are failing consistently to effectively get them replaced."

    What?

    "I need for them to fail OWC testing as well."

    Nonsense on both counts. You know which modules are failing. Send them for exchange now. OWC will put them in a memory tester to confirm; you do not need to do their work for them. They will send you working replacements.