We discuss the synergistic effects that can be obtained in an integrated multimodal interface framework comprising, on the one hand, a visual language-based modality and, on the other, natural language analysis and generation components. Besides a visual language with high expressive power, the framework includes a cross-modal translation mechanism that enables mutual illumination of interface language syntax and semantics. Special attention has been paid to addressing problems of robustness and pragmatics through unconventional methods that aim to give the user control of the discourse management process.