The 10th International Congress on Information and Communication Technology, held concurrently with the ICT Excellence Awards (ICICT 2025), will take place in London, United Kingdom | February 18-21, 2025.
Authors - Ishaan Bhattacharjee, Pranav H P, Harish Satheesh, Disha Jain, Bhaskarjyoti Das

Abstract - Satirical content is notoriously difficult to detect; even humans often struggle to discern satire from genuine news. While significant strides have been made in computationally modeling textual satire using supervised learning, the challenge of detecting satire in multimodal content, which combines both text and images, remains largely unexplored. In our research, we aim to address this gap by leveraging existing frameworks and tools to detect and differentiate multimodal satire from true news content. Satire builds on two key factors: knowledge and incongruity. Knowledge has two parts: local knowledge that is resident in the image and text, and global contextual knowledge that is not part of the content. Incongruity typically occurs between the first and second parts of the text. In this work, we present a three-step framework. First, we investigate multimodal frameworks such as BLIP, relying on their global knowledge without explicitly modeling the incongruity. Second, we model incongruity by focusing on the semantic gap between the two parts of the text content, using a large language model for knowledge enhancement and next-sentence prediction. Finally, we combine the above two models, utilizing local knowledge, global knowledge, and incongruity, to offer class-leading performance. The investigations described in this work offer novel insights into the detection of satire in complex, multimodal content.
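The final step of the framework combines the multimodal (BLIP-based) model with the text-incongruity model. One simple way such a combination could work is late fusion of the two detectors' satire probabilities; the sketch below is a minimal illustration under assumed weights and function names, not the authors' actual implementation:

```python
# Hypothetical late-fusion sketch: combine a BLIP-style multimodal satire
# probability with a text-incongruity satire probability.
# The weights and threshold below are illustrative assumptions only.

def fuse_scores(p_multimodal: float, p_incongruity: float,
                w_multimodal: float = 0.6, w_incongruity: float = 0.4) -> float:
    """Weighted average of the two detectors' satire probabilities."""
    return w_multimodal * p_multimodal + w_incongruity * p_incongruity


def classify(p_multimodal: float, p_incongruity: float,
             threshold: float = 0.5) -> str:
    """Label a (text, image) pair as satire or genuine news
    based on the fused probability."""
    fused = fuse_scores(p_multimodal, p_incongruity)
    return "satire" if fused >= threshold else "genuine"
```

In practice the two component scores would come from a fine-tuned multimodal encoder and an incongruity scorer respectively, and the fusion weights could be learned rather than fixed.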