SANG, H.; HAI, G. A Framework: Region-Frame-Attention-Compact Bilinear Pooling Layer Based S2VT For Video Description. European Journal of Applied Sciences, v. 7, n. 4, p. 17-30, 8 Sep. 2019.