X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers