Compute-efficient video models operating directly on compressed streams; strong accuracy–latency trade-offs.