Top-down Spatial Attention for Visual Search: Novelty Detection-Tracking Using Spatial Memory with a Mobile Robot


  • Nevrez Imamoglu Chiba University
  • Enrique Dorronzoro University of Seville, Chiba University
  • Masashi Sekine Chiba University
  • Kahori Kita Chiba University
  • Wenwei Yu Chiba University



Novelty Detection and Tracking, Visual Attention Computational Models, Space based Spatial Saliency, Robotics, At-Home Monitoring


Assistive robotics technologies have a growing impact on at-home monitoring services that support daily life. One of the main research goals is to develop an autonomous mobile robot that can detect, track, observe, and analyze a subject of interest in an indoor environment. The main challenge in such daily monitoring applications, and thus in visual search, is that the robot must track the subject successfully under severely varying conditions. Recent color and depth image based visual search methods can handle some of these problems, such as changing illumination and occlusion, but they generally require a large amount of training data and check the whole scene, with high redundancy, to find the region of interest. Therefore, inspired by the idea that spatial memory can reveal novel regions that serve as attention points, as in the Human Visual System (HVS), we propose a simple and novel algorithm that integrates Kinect and Lidar (Light Detection And Ranging) sensor data to detect and track novelties using the robot's environment map as a top-down approach, without requiring a large amount of training data. Novelty detection and tracking are then achieved using a space-based saliency map that represents the novelty in the scene. Experimental results demonstrate that the proposed visual attention based scene analysis can handle the conditions stated above and achieves high accuracy in novelty detection and tracking.
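
The idea of using a stored environment map as spatial memory can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract gives no algorithmic detail. It assumes the map is a 2D grid whose cells are marked free or not free, that the Kinect and Lidar measurements have already been transformed into map coordinates, and that a sensed point counts as novel when it falls in a cell the map remembers as free. The names novelty_saliency, attention_point, free_mask, origin_xy, and resolution are hypothetical.

```python
import math

def novelty_saliency(points_xy, free_mask, origin_xy, resolution):
    """Build a normalized grid of novelty counts (illustrative assumption only).

    points_xy : iterable of (x, y) sensed points in map coordinates (metres)
    free_mask : list of rows; True where the stored map marks the cell as free
    origin_xy : map coordinates of the corner of cell (row 0, col 0)
    resolution: cell size in metres
    """
    n_rows, n_cols = len(free_mask), len(free_mask[0])
    counts = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y in points_xy:
        # Convert metric coordinates to integer grid indices.
        col = math.floor((x - origin_xy[0]) / resolution)
        row = math.floor((y - origin_xy[1]) / resolution)
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        # A point is treated as novel when the map remembers its cell as free.
        if inside and free_mask[row][col]:
            counts[row][col] += 1.0
    peak = max(max(r) for r in counts)
    if peak > 0:
        # Scale to [0, 1] so the grid can be read as a saliency map.
        counts = [[c / peak for c in r] for r in counts]
    return counts

def attention_point(saliency):
    """Return (row, col) of the most salient cell, or None if nothing is novel."""
    best_value, best_cell = 0.0, None
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            if value > best_value:
                best_value, best_cell = value, (r, c)
    return best_cell

# Toy example: a 5 x 5 map whose first row is a known wall.
free_mask = [[False] * 5] + [[True] * 5 for _ in range(4)]
points = [(0.25, 0.05), (0.25, 0.25), (0.26, 0.27), (0.45, 0.35)]
saliency = novelty_saliency(points, free_mask, (0.0, 0.0), 0.1)
print(attention_point(saliency))
```

In this toy setting the point that lands on the known wall is ignored, and the cell that collects the most unexpected points becomes the attention point. A real system would also need sensor calibration, noise filtering, and tracking over time, none of which this sketch attempts.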


How to Cite

Imamoglu, N., Dorronzoro, E., Sekine, M., Kita, K., & Yu, W. (2014). Top-down Spatial Attention for Visual Search: Novelty Detection-Tracking Using Spatial Memory with a Mobile Robot. European Journal of Applied Sciences, 2(5), 36–53.