tensorflow-metal on M1: runs for 16 minutes, then hangs
Yesterday I seemed to succeed in installing the components needed to run TensorFlow/Keras on my M1 MacBook Pro. I started with another recipe, but this is the one that seemed to work: “Getting Started with tensorflow-metal PluggableDevice” (Tensorflow Plugin - Metal - Apple Developer).
I was able to run a simple model training run (on MNIST), which took about 2 minutes and seemed to work fine.
This morning I modified a Jupyter notebook from my own project and launched a training run that I expected to take about 17 hours. After 16 minutes and 45 seconds it just hung. I tried again and got the same result, at exactly the same point as near as I could tell: epoch 2/100, batch (sub-batch?) 2493/4000 of the training run. This evening I rebooted the laptop and tried again. Same result.
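To put those numbers in one place, here is a rough back-of-the-envelope check in plain Python. It assumes that "epoch 2/100, batch 2493/4000" means epoch 1 finished all 4000 batches before the hang, which I have not verified. The variable and function names are just ones I made up for this sketch.

```python
# Figures from the run described above. Reading "epoch 2/100, batch 2493/4000"
# as "epoch 1 completed all 4000 batches" is an assumption.
BATCHES_PER_EPOCH = 4000
TOTAL_EPOCHS = 100
EXPECTED_HOURS = 17           # my estimate for the full run
HANG_EPOCH = 2                # 1-based epoch shown when the run hung
HANG_BATCH = 2493             # batch counter shown when the run hung
HANG_SECONDS = 16 * 60 + 45   # elapsed wall-clock time at the hang


def batches_completed(epoch, batch, per_epoch):
    """Total batches processed, given a 1-based epoch and the batch counter."""
    return (epoch - 1) * per_epoch + batch


done = batches_completed(HANG_EPOCH, HANG_BATCH, BATCHES_PER_EPOCH)

# Batches per second actually observed up to the hang.
observed_rate = done / HANG_SECONDS

# Batches per second implied by the 17-hour estimate for the whole run.
expected_rate = TOTAL_EPOCHS * BATCHES_PER_EPOCH / (EXPECTED_HOURS * 3600)

print(f"batches completed before the hang: {done}")
print(f"observed rate: {observed_rate:.2f} batches/s")
print(f"rate implied by the estimate: {expected_rate:.2f} batches/s")
```

If that reading is right, the run was progressing at roughly the pace I expected up to the moment it stalled, so it looks like a sudden hang after about 6,500 batches rather than a gradual slowdown.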
Most of this is new to me so it is easy to believe I made some mistake during the installation. But I don’t have much to go on. Any hints/suggestions would be appreciated.
MacBook Pro Apple Silicon