A Framework: Region-Frame-Attention-Compact Bilinear Pooling Layer Based S2VT For Video Description