Feel free to redistribute; I waive any copyright on the post below:
I would like to raise a matter of principle about how Apple handles these problems, having gone through the full MBP 2011 GPU issue cycle twice.
In short, I have reason to believe that widespread issues, apparently including this one, are systematically:
(a) not acknowledged,
(b) not communicated, in symptoms and resolution, within the service/support organization,
(c) only handled reactively by design and
(d) only taken care of in a transparent and proactive way once pressure from too many publicly reported incidents has reached significant proportions, as in prior instances such as the buzzing first Intel motherboards, which I experienced first hand.
Here are my experiences in 5 acts, substantiating this proposition through observations of how my 2 rMBP 2011 cases were handled:
Act 1: multi-month "diagnosis" phase
- When a rMBP 2011 started to show graphics-related issues in October 2012, the first symptoms being frequent, _reproducible crashes triggered by multi-monitor hot-plugging, I:
* searched public information -> no obvious pattern through the google index
* suspecting software, then filed a diligently described bug report against the OS at http://bugreport.apple.com
* a week later the bug screeners asked for post-crash sysdiagnose output
* I could provide no sysdiagnose dumps, though, as the machine no longer processed the Shift-Control-Option-Command-Period trigger after a crash. Looking for telling log entries in /var/log/system.log (I am unix-macosx fluent) did not yield cause candidates over weeks.
Act 2: Trying to diagnose the issue myself for 4 Months
For the next 4 months I tried diligently to diagnose the computer's crashes, spending significant time on it in the process. With sometimes several crashes per day, and reproducible crashes on monitor hot-plugs at a rate of around 7 out of 10 tries, there was plenty of opportunity to try things. Still, I suspected software, considering there was still no obvious problem pattern visible through the google index. My interaction with the SW-Dev side of the Apple business also did not hint at hardware as a cause candidate. So besides the obvious hope that each OS service dot release would include a fix, I continued trying to find kernel module driver culprits, running HW diagnostics (yes, you never know), watching the trigger pattern and, for the most part, the OS logs. In hindsight there were cues pointing to a GPU issue, as freezes appeared more often with GPU-related operations, but hot-plugging remained the only reproducible fault. Fast forward 4 months to the 3rd act.
Act 3: 1st motherboard swap
Having thoroughly exhausted my options on how to handle sometimes several crashes per day, I at last contacted Apple support. Not wanting to appear to be looking for a quick, uninformed, chancy HW swap, I informed phone support of the issue history, including the bug report# logged against the OS. I of course made note of the amount of time I had already spent on it.
* I asked twice whether there was anything in the customer support database(s) on a related issue; the answer was negative.
The calls ended with the ball in my court: reinstall again and re-contact if the issue persists. Still, I would rate the handling on the part of phone support as perfectly right; after all, I had only just contacted them, and the person on the other end of the line clearly did not have more information than what the google index had provided to me (or to Apple support, maybe, as there was a hint that they look for things there too).
About 5 weeks later I finally gave up on getting the notebook's crash-proneness resolved myself and contacted Apple phone support again, feeling that pressing for a hardware resolution was sensible, having _really done everything possible from my side. The again dedicated!! 1st line phone service person on the ticket (I would again rate the phone service response highly, and this is not meant sarcastically) promised to talk to the repair shop if necessary. He even followed up by email to ask how things were going.
Getting a motherboard swap from the repair shop was no self-starter. I had talked the service counter through the multi-month ordeal. Yet I got the sense that I had to be grateful to them for swapping the motherboard, as the evidence of a HW issue was not water-tight. So I received the rMBP back with a new motherboard. The serial stayed the same (I checked and asked); it seems to be standard procedure to flash a new board with the previous serial. Following the repair, graphics-related issues persisted, although slightly different, which leads to act #4.
Act 4: Towards the 2nd motherboard swap
The machine showed _fewer (then how do you categorize the repair?) monitor hot-swap issues over HDMI/DisplayPort/VGA-over-DisplayPort than before, at a rate of maybe one crash every 3 days; they persisted, though. I logically(?) concluded that the issue was software after all. Feeling bad about having caused an "unnecessary" motherboard swap, with an OK board landing in electronics scrap, I kept quiet and endured. The certain arrival of the next major release, 10.9, with a hoped-for fix made enduring bearable. Desolation set in when 10.9 did not resolve the crashes. I am not sure how this would have gone on had the HW not subsequently died.
Attaching a 30" Apple Cinema Display through an Apple DisplayPort adapter, which had always worked (not counting visible HF noise on the dual-link DVI), started to fail. First it would fail to sync every so often; a few weeks later attaching it and getting a picture failed completely. I cross-tested with other MacBooks and a Mac Pro, with no issue there. Within days of losing the ability to attach the Apple display, the rMBP started to fail on booting with the built-in display only; it got stuck in the boot process with too little verbosity in the console messages to tell where. Yet again, despite this being exceedingly unlikely at that stage, I wanted to rule out software and spent time (...) trying to find which OS startup initialization was the last to run.
Act 5: the 2nd swap
* Finally, another new symptom showed up which made it utterly clear that this was a graphics-path HW issue: booting a rescue system off external media (conclusively ruling out any software cause) produced the audible sound in late boot, as for an OK system being rescue-booted, but no screen. I brought it in for repairs, explained the clarity of the case and how easy it was to reproduce, and expected a quick turn-around without issues. However, a few days into the repair I was called and asked to provide the purchase receipt. Given that the machine had been purchased abroad by a relative, that was not an easy undertaking. A week into the repair I could provide a level of purchase documentation, and the repair shop decided to ask Apple for a motherboard swap authorization (at t+7 calendar days into the repair). I of course provided information about the _long history of problems and the prior swap, and for simplicity the repair was logged with the same shop that had done the first repair. The Apple response was negative, though.
* The issue now was that the system was already more than 1 year old. While a 2-year extended warranty is statutory law in the EU, the warranty right is against the seller and not the manufacturer. The machine had been bought in Japan.
* I recontacted my Apple service person from the 1st swap. Unfortunately, the email bounced, and calling in showed that he was not with the company any more. I then went through 1st line phone support again, explaining -- this time with hard facts -- that it _IS a HW issue. Through the serial number it was quickly triaged (...) to be out of the 1yr manufacturer warranty window. I explained that it needed repairs, the reasonableness of the request given the history of the case, and so on. 1st level handed it to a 2nd level manager who clearly was authorized to grant repairs which de jure do not have coverage. Again, the handling was professional. Nevertheless, I was in the pleading role, and for a net 30 minutes of talking to the 1st line staffer and the manager on the phone it looked as if things could easily go the wrong way. When 2nd level hinted she would probably approve, but "first needs to talk to the repair shop just to be correct", I knew I was close to a 2nd repair, and to avoiding the de-facto total loss of the machine, since paying to replace a broken motherboard would probably not have been economical. Following such a path or procedure (?) was perfectly correct. The motherboard was finally swapped for "GPU problems" shortly thereafter (total time for the repair approx. 9 calendar days). The experience left me with the notion, apparently intended by the manufacturer, that I should be grateful for the free repair. While I stayed cordial throughout the whole process, my gratitude stayed within the dry limit of having saved a 1.5 yr. old machine that would have been scrapped had the motherboard not been replaced for free.
Knowing now that many people are affected, and surmising that there is an organization within Apple that, having the data (see Epilogue), knew all along, poses a question: why were at least the 1st and 2nd line of service, let alone the external repair shops, left out in the cold information-wise on this problem? I leave it partially to the reader what to make of it.
Epilogue
Again, I want to underline that the cases were handled very well within the scope of the involved people's roles (Apple support 1st and 2nd line, repair shop and possibly development in software ticket screening). In other words, if someone in Apple management reads this -- and had I provided the case #s -- the right reaction would NOT be to reprimand the involved employees; they did everything right given the information and, probably, the guidelines given to them, responsively and professionally. Had I rated them on one of the ubiquitous post-contact customer satisfaction surveys, I would have given them a perfect score. So that is not where the problem is.
Still, the 2 issues were grossly mishandled, knowing now that discrete GPU issues in this model type are widespread.
There is an organization within Apple which, with little lag, probably sees clearly the data on problem reports and repair incidents around the world. With near certainty, those are being tracked meticulously, as substantial costs are involved. They probably have all the lab resources to reproduce issues and find root causes, if not alone then together with e.g. the GPU manufacturers. If they have more information and see high fault incidence, and are able to extrapolate the number of affected machines, as it is simple sample-to-population statistics, what do they do? Apparently, this part of Apple, or Apple as such, does not proactively act on clearly significant field hardware problems that are affecting very many machines. This is a conjecture based on my experiences above, and please don't sue me on it.
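To illustrate the sample-to-population point: below is a minimal sketch, with purely hypothetical numbers (I have no access to any real repair data), of how a failure rate observed in a sample of machines could be extrapolated to a whole production run. It uses a Wilson score interval for the proportion; the function names and all figures are assumptions made up for illustration, not anything known about how Apple tracks this.

```python
import math


def wilson_interval(failures, sample_size, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    Returns (low, high) as fractions between 0 and 1.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= failures <= sample_size:
        raise ValueError("failures must be between 0 and sample_size")
    p_hat = failures / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p_hat + z ** 2 / (2 * sample_size)) / denom
    half_width = (
        z
        * math.sqrt(
            p_hat * (1 - p_hat) / sample_size
            + z ** 2 / (4 * sample_size ** 2)
        )
        / denom
    )
    return centre - half_width, centre + half_width


def extrapolate(failures, sample_size, population):
    """Scale the interval for the failure rate up to a population count."""
    low, high = wilson_interval(failures, sample_size)
    return round(low * population), round(high * population)


if __name__ == "__main__":
    # Hypothetical figures: 60 GPU failures among 2,000 tracked machines,
    # extrapolated to a made-up population of 1,000,000 units sold.
    low, high = extrapolate(failures=60, sample_size=2_000, population=1_000_000)
    print(f"Estimated affected machines: between {low} and {high}")
```

The estimate is only as good as the sample: it assumes the tracked machines are representative of all units sold, which repair data (skewed towards machines that are brought in) may well not be.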
- Cheers,
an early 2011 rMBP owner with 2 motherboard swaps