Hi, I'm back. As I mentioned, I'm stubborn.
The reason people are seeing issues now is that Lion/ML (through their drivers) are calling on the GPU more heavily: a lot more smooth scrolling, animation, etc. In my case it was happening in SL as well.
Andew, I went and pored over all of the links that you provided in your fix to see if I could glean anything more. And I think I did. I read everything. Then I read everything associated with the tools for the BIOS adjustments and flashes to get a better understanding of what I was attempting. Here's what I came up with...
The most prominent and consistent theory in the reports was that this card "jumps" modes, trying to save resources (RAM/energy) by clocking itself down. It is during these jumps that things get out of sync with the drivers, causing the card to cycle and resync itself. This was the cause of the artifacts, and eventually of a communication loop between the driver and the card, each trying to resync, that led to crashes.
In my observations, on my machine, this happens in both directions - when it is idle/low use then needs to ramp up (for example when a window opens it needs to call animation) and then the reverse, when it is in high mode (playing a video) and then goes quiet (switching to typing text in Mail for example).
So it occurred to me to try something.
I went back into the RBE editor. It shows 4 modes there for the various states (reg, power saving, etc.) that are editable. Two were similar, two different.
So I adjusted the settings and made the first three (00, 01, 02) all the same: 698/1000, which was the 00 default. The last one was similar but not exact, and I left it alone for now. I saved, flashed, and rebooted.
And so, guess what?
For the first time in months, I've been up and running for 24 hours without a crash. Very stable.
Not perfect yet, though. I am still getting the occasional grey squares, but they are predictable: I can make them happen by switching from what I called low to high use above. And I can clear them by switching to the Dashboard or changing Spaces.
So what I'm going to do is continue to test my theory.
First I will use the machine for the day and cycle it on/off a few times to make sure that the smooth running is not an anomaly. Then I will go back into RBE and make that last setting the same as all the others. If that clears up the artifacts I will be good, or at the very least I can then try to play with the settings globally (the same everywhere) and see if I can get the clocking to behave. The only inconsistency would be that second setting, 01, where the voltage is different from the other three (all are at 1.0 except that one, which is 0.9). I'm not comfortable changing that one until I know more.
For the sake of completeness, the temperature in the machine is about the same: no major increase in heat, just a few degrees.
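To keep track of what still differs between the four states, here is a rough sketch of the table as I understand it. This is only a bookkeeping aid, not anything RBE produces. I'm assuming 698/1000 reads as core/memory clocks in MHz, the field names are my own, and the clock value for state 03 is a made-up placeholder, since all I've said about it is that it is "similar but not exact".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    gpu_mhz: int    # assumed: first number in "698/1000"
    mem_mhz: int    # assumed: second number in "698/1000"
    volts: float


# States 00-02 use the values from this post.
# State 03's clocks are a HYPOTHETICAL placeholder, not the real values.
states = {
    "00": PowerState(698, 1000, 1.0),
    "01": PowerState(698, 1000, 0.9),
    "02": PowerState(698, 1000, 1.0),
    "03": PowerState(700, 1000, 1.0),  # placeholder clocks
}


def differences(states, reference="00"):
    """Return, per state, the fields that differ from the reference state."""
    ref = states[reference]
    fields = ("gpu_mhz", "mem_mhz", "volts")
    result = {}
    for name, state in states.items():
        changed = [f for f in fields if getattr(state, f) != getattr(ref, f)]
        if changed:
            result[name] = changed
    return result


print(differences(states))
```

With those placeholder numbers it flags the voltage on 01 and the clock on 03, which are exactly the two things I still have to decide about.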
I will report back later.
A