Long-Form Video Understanding through Multi-Modal Large Language Models

Date:

Invited talk at Tsinghua University on long-form video understanding with multimodal LLMs.