[1] Wu, P.Y. and Mebane, W.R. 2022. MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks. Computational Communication Research. 4, 1 (May 2022), 275–322.