BERT representations for video question answering

Publication
Proc. IEEE Winter Conference on Applications of Computer Vision