Basically, on Windows, Microsoft's System.Windows.Forms (SWF) implementation taps into the various window messages that the OS sends to the application.
Mono has to use the different messaging system of each OS that it supports and yet produce the same end-user effects.
My understanding is that they have managed to do this for the common Linux/Unix systems, which use X11 as their underlying messaging system, but not yet for the native OSX GUI. (If you look at the mono web pages there is a brief mention of something going on in this area, but I suspect that the information is rather old and probably out of date.)
So, unless you (or someone else) want to get into the depths of this and write the native OSX 'driver', X11 is necessary, at least for now.
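To give a rough idea of what such a 'driver' has to do, here is a minimal Python sketch. It is only an illustration: the class and function names are made up and are not mono's actual code. The only real names in it are the X11 event types (Expose, KeyPress) and the Windows messages (WM_PAINT, WM_KEYDOWN) that they roughly correspond to.

```python
from dataclasses import dataclass

# Windows-style message names that the SWF layer expects (small subset).
WM_PAINT = "WM_PAINT"
WM_KEYDOWN = "WM_KEYDOWN"


@dataclass
class Message:
    """Hypothetical platform-neutral message handed to the SWF layer."""
    name: str
    window_id: int


class X11Driver:
    """Hypothetical driver: turns X11 event types into Windows-style messages."""

    _EVENT_MAP = {"Expose": WM_PAINT, "KeyPress": WM_KEYDOWN}

    def translate(self, event_type, window_id):
        # Events with no Windows-style equivalent in the map are dropped.
        name = self._EVENT_MAP.get(event_type)
        return Message(name, window_id) if name else None
```

A native OSX driver would need the same kind of mapping, but from the OSX event system instead of X11.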
The 'normal' terminal within OSX does not start the X11 display system, but the X11 terminal does. I've found that you can start the X11 terminal and then close it again (the X11 daemons etc. will continue to run in the background), and the mono SWF code then works well.
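If you want to guard against launching without X11, something along these lines might help. It is a sketch under assumptions: it treats a non-empty DISPLAY environment variable as a hint that an X11 session is available (which is not a guarantee that the server is actually running), it assumes the `mono` command is on your PATH, and the function names are my own.

```python
import os
import subprocess


def x11_display(env=None):
    """Return the DISPLAY value if one is set, otherwise None."""
    env = os.environ if env is None else env
    value = env.get("DISPLAY", "").strip()
    return value or None


def run_swf_app(exe_path, env=None):
    """Launch a mono SWF program, refusing to start if DISPLAY is not set."""
    if x11_display(env) is None:
        raise RuntimeError("DISPLAY is not set; start X11 first")
    # Assumes the 'mono' launcher is installed and on PATH.
    return subprocess.call(["mono", exe_path])
```

For example, `run_swf_app("MyApp.exe")` (with `MyApp.exe` standing in for your own program) would raise an error in a terminal where DISPLAY is missing rather than letting mono fail later.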
BTW, mono's SWF is now built on top of the System.Drawing namespace, and this is where the interaction with the underlying system occurs.
Hope this explains the situation...
Susan