Sang, Haifeng, and Ge Hai. 2019. “A Framework: Region-Frame-Attention-Compact Bilinear Pooling Layer Based S2VT For Video Description.” European Journal of Applied Sciences 7 (4): 17–30. https://doi.org/10.14738/aivp.74.6717.