Multimodal Understanding of Sentiment and Emotion (MUSE): a comparative study of emotion and sentiment recognition in conversations on MELD (the Multimodal EmotionLines Dataset), covering baseline architectures, alternating training strategies, and joint multimodal learning with context modeling.