I think you are right that the marker "stays". I don't have any special insight into how Apple coded this, but it appears to me that you initially attach the audio to a certain frame in a certain clip. Later you can drag the audio forward or backward, and it may remain attached to the initial frame but be given an offset. For example, the audio is attached to frame five of clip three, but it actually begins at an offset of -30, that is, 30 frames to the left of the anchor frame.
If you experiment with it, I suspect you'll find that deleting the clip that contains the anchor frame deletes the audio, too, but I don't have time to test this right now.
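
To make the guess concrete, here is a rough Python sketch of the anchor-plus-offset idea. It is purely my assumption about the model, not how Apple actually coded it, and every name in it (AudioPin, Timeline, audio_start, delete_clip) is made up for illustration. Frames are counted from 0 within each clip, and I've assumed three clips of 100 frames each.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AudioPin:
    """Hypothetical model: audio anchored to one frame of one clip, plus an offset."""
    name: str
    clip_id: int       # clip that contains the anchor frame
    anchor_frame: int  # frame within that clip (0-based)
    offset: int = 0    # negative means the audio starts left of the anchor


@dataclass
class Timeline:
    """Hypothetical timeline: clip id -> length in frames, kept in playback order."""
    clip_lengths: Dict[int, int]
    pins: List[AudioPin] = field(default_factory=list)

    def clip_start(self, clip_id: int) -> int:
        # Sum the lengths of all clips that come before the requested one.
        start = 0
        for cid, length in self.clip_lengths.items():
            if cid == clip_id:
                return start
            start += length
        raise KeyError(clip_id)

    def audio_start(self, pin: AudioPin) -> int:
        # Absolute start = where the anchor frame sits, shifted by the offset.
        return self.clip_start(pin.clip_id) + pin.anchor_frame + pin.offset

    def delete_clip(self, clip_id: int) -> None:
        # Assumed behavior: deleting a clip also drops any audio anchored to it.
        del self.clip_lengths[clip_id]
        self.pins = [p for p in self.pins if p.clip_id != clip_id]


# Audio attached to frame five of clip three, dragged 30 frames to the left.
timeline = Timeline(clip_lengths={1: 100, 2: 100, 3: 100})
pin = AudioPin(name="music", clip_id=3, anchor_frame=5, offset=-30)
timeline.pins.append(pin)
print(timeline.audio_start(pin))  # 175
```

With this model, dragging the audio only changes the offset while the anchor stays put, which would explain why the marker "stays". Calling timeline.delete_clip(3) removes the pin along with the clip, which is the behavior I'd expect but haven't verified in the application.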