818166 Views, 2,269 Replies. Latest reply: Feb 21, 2015 4:45 AM by oGr3
Some more info for those who are considering returning their machines and those who asked about Windows issues.
After much stress testing on a Windows partition, I can confirm that I was not able to get my MacBook to freeze under Windows, even with sustained temps of 100 degrees plus on the CPU and graphics-intensive tasks running.
This all points to driver/firmware issues under OS X, as indicated above.
For people who want to test this under Windows for themselves, here is a simple program that will max out 8 cores:
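using System.Threading.Tasks;
class Program
{
    static void Main(string[] args) { Parallel.For(0, 8, Run); } // start 8 workers
    // Busy loop; the original loop body is missing from the post, so this one is assumed.
    private static void Run(int thread) { int a = 0; while (true) a++; }
}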
Good to know, Lithast. I really hope this is a Mac software problem that can be rectified by a software update. If you are having no problems in a Windows environment, then it makes me feel better in a sick, sadistic way :P
Come on Apple, this is a pro machine for pro users... step up and deliver what you're shovelling!!
I bought a 2.2GHz i7 MacBook Pro with the AMD 6750, and my laptop locks up every time I close World of Warcraft after playing for one or two hours... I called Apple, and they told me to get a replacement machine. The replacement machine has the same locking-up problem.
This is really unacceptable for an EXPENSIVE laptop such as the MacBook Pro. I am simply done with crapple products.....
Be careful drawing conclusions from off-the-shelf memory tests.
Intel's CPUs run a very sophisticated training algorithm at boot-up to tune the timing relationship between the memory controller and the DRAMs. At 1333, the timing is very tight, and the timing relationships move slightly with temperature and voltage. They have a margin system whereby meeting the timing at one temperature suggests the timing will be met at all temperatures.
When the Nehalem processor came out, which has 3 memory channels as opposed to 2, the timing appeared to be solid. However, over time it was discovered that the processor socket produced a cross-talk issue between some bits of two of the memory channels. This was both board-layout specific and also depended on hitting a very difficult-to-produce data pattern. A somewhat hacked algorithm was added to try to shift the two channels in such a way as to minimize the impact. The issue was not completely avoided, but the fix did in effect bring the noise margin to an acceptable level for most cases. Standard memory tests (which ran for ~1 day) did not hit the conditions necessary to reveal the problem.
This is not to say that standard memory tests tell you nothing about the memory; they are very good when they indicate a failure. I don't rely on them when they indicate a pass, I just feel a lot better.
First, your memory is probably fine.
Just a note about hardware tests. There are two outcomes from a hardware test:
The first outcome is that the test returns an error: This is generally a good indication there is something wrong with the hardware.
The second outcome is that the hardware passes the test: this is generally a good indication the test didn't turn up anything wrong. Notice the wording there. Passing a test *does not mean there is no problem*.
Think of it like a pregnancy test: If it says a woman is pregnant then she probably is. If it says she's not she still could be and didn't wait long enough to test.
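To put rough numbers on that, here is a worked example with made-up figures (none of them come from this thread): suppose half of the machines have the fault, the test catches a faulty machine only 60% of the time, and it never fails a healthy machine. Under those assumptions, Bayes' rule gives the chance that a machine which passed is still faulty:

```latex
P(\text{fault} \mid \text{pass})
  = \frac{P(\text{pass} \mid \text{fault})\, P(\text{fault})}
         {P(\text{pass} \mid \text{fault})\, P(\text{fault})
          + P(\text{pass} \mid \text{no fault})\, P(\text{no fault})}
  = \frac{0.4 \times 0.5}{0.4 \times 0.5 + 1.0 \times 0.5}
  = \frac{0.2}{0.7} \approx 0.29
```

So with these assumed numbers, a pass only lowers the odds of a fault from 50% to roughly 29%; it does not rule the fault out.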
If anyone wants to definitively test whether their machine is affected, use an app like "SmallLuxGPU", set the cores to 8 + GPU and run the interactive test. This will draw a ray-traced image (stressing the GPU and CPU to 100%) and almost immediately bring the system to a halt. To those who believe this is a firmware issue related to the fans kicking in late: download smcFanControl and max the fans out BEFORE the test, and you'll still experience this issue once the CPU core temp approaches 100C. All it does is delay the hang by at most ~15s. I have a replacement in the mail and hope they solved this... maybe a bad batch of thermal paste? Too much thermal paste applied? Or a driver related to the GPU may be at fault, since audio continues in the background. I'm not too hopeful though, as I was able to reproduce the issue on a stock 2.3 quad i7 on display at the Apple store. This demonstration almost floored the Genius.
In summary, the SmallLuxGPU scenario may not be common in real life, but real-world applications like video calling in Skype also crash the machine after 20-30 mins. Converting a video and running VMware... bad idea. Backing up your hard drive while watching a video... bad idea again. Come on Apple, I'm even nervous to update the machine when 10.6.7 comes out, for fear of it crashing during the update.
Chullo - you hit it dead on man... That would explain why all the other tweaks help but don't solve.
Frustrated - this definitely seems more of a hardware nightmare than a software/firmware/programmatic issue.
So do they issue a software update to limit cycles and keep things within range thus slowing down machine performance so that they don't have to replace everyone's systems - or do they start mailing all of us new boxes?
I've got 3 systems here, 1 x 2.3 and 2 x 2.2 and the 2.3 system fails within 30 seconds, the 2.2's fail within 2-3 minutes. The heavy tests don't crash my wife's 2009 box. Time to get on eBay and try to buy my old 2010 back?
In the famous words of Lee Corso, "Not so fast, my friend!"
I just ran SmallLuxGPU for 10 minutes total: 5 minutes on "integrated only" and 5 minutes on "discrete only". Sure, my system struggled to do anything but that raytrace... but it never lost stability. I was able to end both tests with the "esc" key, and the temps went right back to normal.
During the test, I had no problem opening and closing other apps. Temps never got above 90C on my machine for the CPU and the GPU never got higher than 72C.
Your test is inconclusive...unless I was supposed to test it for longer than 10 minutes.
But really... the issues people are having are not temperature-dependent. My computer randomly locks up when the CPU is under 60C and I'm not doing anything special at all, just as it does if I'm exiting a game or something. It's completely random. Looking for a problem exclusively secondary to temperature is barking up the wrong tree.
Heck, I can game in Boot Camp, where the temperatures get up to 99C sustained... and never have any stability issues.
It's tough, man. After this new unit comes in I'll decide what I'm going to do. As the Genius said, if we opt for a replacement, a new 2-week grace period kicks in to return the machine for a 100% refund. That'll give Apple about a month to figure something out. The sad thing is I love how the machine operates, but it makes absolutely no sense to have a $2500 machine that basically becomes a paperweight when Skyping! Not to mention the more demanding tasks that make up a typical week for me.
iFixit confirms that the new machines have a very generous application of thermal paste [ http://www.cultofmac.com/ifixits-2011-macbook-pro-teardown-better-repairability-but-may-be-prone-to-overheating/83649 ]. So that could be the Achilles' heel of everything, since too much thermal paste basically reduces heat conduction between the processor and the heat sink. This means it takes a lot longer for the heat to be transferred to the heat sink and stay on the heat sink side. The CPU appears to handle the increased temperature OK (since audio plays, etc.), but the GPU appears to have an issue with this. I'll run a test in the future tracking the GPU temp, and also a test using only the integrated chip.
Adrian -- I like your results. They basically indicate that not all machines are affected, which is welcome news to us. The test will run properly on any machine that isn't affected by a heating issue, since it generates the maximum heat possible from the CPU and GPU (i.e. running at 100%). My test needs to be run for about 1 min; if the machine survives that long without any external cooling, then you can basically pass your machine as O.K. I hope my replacement machine behaves like yours. Any other issues would be immaterial and can be addressed by a firmware update.
Btw, Adrian, please confirm that you ran it at 8 cores + GPU. I was demonstrating it to the Genius and was baffled why the machine was surviving, then I realized I was running at 2 cores. Secondly, I forgot to give you the image size of 1024*768. So the settings are: SmallLuxGPU at 8 cores + GPU at 1024*768 in interactive mode. This brings my 2.2 to a halt in about 10s and the stock 2.3 on display at the Apple store (in A/C) in about 25s.
Message was edited by: Chullo
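Since the worker count and the run length both matter for the test described above, here is a minimal C# sketch of a timed, CPU-only load with a configurable number of workers. It is not SmallLuxGPU and does not touch the GPU; the class name TimedCpuLoad, the method Run, and the 60-second default are all hypothetical choices made for this illustration. It uses dedicated threads rather than Parallel.For so that exactly the requested number of busy loops runs at once.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; CPU load only, no GPU work.
static class TimedCpuLoad
{
    // Runs `workers` busy loops until `duration` has elapsed and returns
    // the total number of loop iterations across all workers.
    public static long Run(int workers, TimeSpan duration)
    {
        long total = 0;
        // Deadline in Stopwatch ticks; GetTimestamp is static and safe to call from any thread.
        long deadline = Stopwatch.GetTimestamp()
                        + (long)(duration.TotalSeconds * Stopwatch.Frequency);
        var threads = new Thread[workers];
        for (int i = 0; i < workers; i++)
        {
            threads[i] = new Thread(() =>
            {
                long local = 0;
                while (Stopwatch.GetTimestamp() < deadline) local++;
                Interlocked.Add(ref total, local);
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return total;
    }

    static void Main()
    {
        // One worker per logical core (e.g. 8 on a quad-core CPU with Hyper-Threading).
        int workers = Environment.ProcessorCount;
        long n = Run(workers, TimeSpan.FromSeconds(60));
        Console.WriteLine($"{workers} workers ran {n} iterations in 60 s");
    }
}
```

Printing the worker count makes it harder to repeat the 2-core mistake mentioned above, and the fixed duration means the run ends on its own if the machine survives.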
Adrian - I'm wondering if different machines are affected to different degrees. I have 3 brand spanking new machines here and all 3 fail in <4 minutes. (I also just got my AMEX statement and now get to pay for them and have a credit for the next couple of months - ugh)...
My main system (2.3GHz) is locking up daily and has been since I received it. One day it locked up 11 times - no joke, I've got a counter now. During some freezes usage did feel more limited while others were more intense - which is making this difficult to figure out.
But consistently that test (or me spawning 16 simultaneous YouTube vids) brings on the pain.
I do suppose that we are all grabbing at straws here. I, like everyone else, just need something that works, because unfortunately I can't bill my clients for reboots and lost work. And I am also scrambling to catch up on lost time spent reinstalling from scratch, troubleshooting with techs and engineers, and the other OCD time spent tinkering and trying to nail down what the heck to avoid so as not to lose more work.
Anyhow, I'm tempted to run over to the Apple store and find a machine like yours that doesn't overheat during this test. You ought to go out and buy a lottery ticket while all of us other saps struggle...
Could be, gentlemen. Just to make sure, I ran it again, this time for about 800 seconds. Here are 2 screen shots (full resolution, 1920x1200):
One shows the render going for 412 seconds, the other shows only 53. The second should have shown around 800, but I decided to change the rendering angle (just to do something) and it reset my numbers. During the test, I was listening to music on the open music player. No problem changing songs or anything, though the whole system was crawling. You'll also note the temperature in the bar at the top. The system never got hotter than 91C. After 800 seconds (approx. 13 minutes), I'd assume it won't get any hotter than that.
Still... it's the unpredictability of it all. Mine was frozen when I woke up this morning and it froze again maybe an hour after one of my recent posts here. It's been fine since. I've been mixing it up between integrated and discrete. idk, guys.