Image Data
- Image Classification and Recognition – (object classification, scene recognition)
Multimodal Data
- Vision-Language Tasks – (image-text retrieval, captioning, visual question answering (VQA))
Speech Data
- Automatic Speech Recognition (ASR) – (transcription, multilingual ASR, code-switching)
Text Data
- Text Classification – (sentiment, emotion, hate speech, topic, intent detection)