Arxiv Compvid | Shristi Das Biswas

Our new work on efficient video understanding has been accepted to WACV 2026! The paper, "Learning Unified Spatio-temporal Representations for Efficient Compressed Video Understanding", proposes a lightweight framework that learns video representations directly from compressed videos. This avoids the need for decompression and achieves state-of-the-art performance with up to 15x faster inference! 🎉