Fine-grained video retrieval for multi-clip video

Publication
Proc. Workshop on Closing the Loop Between Vision and Language (CLVL) at ICCV