Hierarchical sparse representation for object recognition

Authors

  • Toru Nakashika Graduate School of System Informatics, Kobe University
  • Takeshi Okumura Graduate School of System Informatics, Kobe University
  • Tetsuya Takiguchi Graduate School of System Informatics, Kobe University
  • Yasuo Ariki Graduate School of System Informatics, Kobe University

DOI:

https://doi.org/10.14738/tmlai.21.95

Abstract

Recently, generic object recognition that achieves human-like vision has been looked to for use in robot vision, automatic image categorization, and image retrieval. In object recognition, semi-supervised learning, which combines a large amount of unlabeled training data with a small amount of labeled data, is regarded as an effective way to reduce the burden of manual annotation. However, the unlabeled data used in semi-supervised models can contain outliers that negatively affect parameter estimation during training. Such outliers often cause over-fitting, especially when only a small amount of training data is used. A second problem with conventional methods arises when labeling an image based on a super-pixel representation: because feature extraction relies on mono-scale segmentation, weakly discriminative image features and variation in object scale decrease the recognition accuracy. In this paper, we propose an object recognition method that addresses both problems. For the former, our method prevents the over-fitting associated with semi-supervised approaches by using sparse representation to suppress outliers in the data. For the latter, we employ a Tree Conditional Random Field to construct a hierarchical structure of the image. Experimental results on two datasets confirm the effectiveness of our method.
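
As a rough illustration of how sparse representation can flag outliers, the sketch below codes a sample over a small dictionary of labeled feature vectors using plain matching pursuit and treats a large relative residual as a sign that the sample is poorly explained by the labeled data. This is not the algorithm from the paper: the choice of matching pursuit, the function names (`matching_pursuit`, `outlier_score`), the toy vectors, and the idea of scoring by residual are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def matching_pursuit(x, atoms, n_nonzero=2):
    """Greedy sparse code of x over unit-norm atoms.

    Returns (coeffs, residual) after n_nonzero greedy steps.
    """
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlation of the current residual with every atom.
        scores = [dot(residual, d) for d in atoms]
        # Pick the atom that explains the most remaining energy.
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

def outlier_score(x, atoms, n_nonzero=2):
    """Relative residual norm: near 0 = well explained, near 1 = not explained."""
    _, residual = matching_pursuit(x, atoms, n_nonzero)
    return math.sqrt(dot(residual, residual)) / math.sqrt(dot(x, x))

# Hypothetical toy data: two labeled feature vectors form the dictionary.
labeled = [normalize(v) for v in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])]
inlier = [2.0, 1.0, 0.0]    # lies in the span of the labeled vectors
outlier = [0.0, 0.1, 3.0]   # mostly outside that span

print(outlier_score(inlier, labeled))
print(outlier_score(outlier, labeled))
```

In a semi-supervised setting, a score like this could be used to down-weight or discard unlabeled samples before parameter estimation; the threshold and the sparsity level `n_nonzero` would have to be tuned for real features.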

Published

2014-02-10

How to Cite

Nakashika, T., Okumura, T., Takiguchi, T., & Ariki, Y. (2014). Hierarchical sparse representation for object recognition. Transactions on Engineering and Computing Sciences, 2(1), 46–60. https://doi.org/10.14738/tmlai.21.95
