The 10th International Congress on Information and Communication Technology, held concurrently with the ICT Excellence Awards (ICICT 2025), will take place in London, United Kingdom, February 18 - 21, 2025.
Wednesday February 19, 2025 11:45am - 1:15pm GMT
Authors - Francisco Seipel-Soubrier, Jonathan Cyriax Brast, Eicke Godehardt, Jorg Schafer
Abstract - We propose a proof-of-concept architecture for automated video summarization and evaluate its performance, addressing the challenges posed by the increasing prevalence of video content. The research focuses on a multi-modal approach that integrates audio and visual analysis techniques to generate comprehensive video descriptions. Evaluation of the system across various video genres revealed that while video-based large language models show improvements over image-only models, they still struggle to capture nuanced visual narratives, resulting in generalized output for videos without a strong speech-based narrative. The multi-modal approach demonstrated the ability to generate useful short summaries for most video types, but for speech-heavy videos in particular it offers minimal advantages over speech-only processing. The generation of textual alternatives and descriptive transcripts showed promise. While results are primarily stable for speech-heavy videos, future investigation into refinement techniques and potential advancements in video-based large language models may improve performance.
Virtual Room C London, United Kingdom
