Masking the Backgrounds to Produce Object-Focused Images and Comparing that Augmentation Method to Other Methods

Authors

  • Ahmad Hammoud, Global University
  • Ahmad Ghandour, American University of Beirut

DOI:

https://doi.org/10.14738/tmlai.103.12245

Keywords:

Machine Learning, Computer Vision, Object Classification, Image Augmentation

Abstract

Image augmentation is a powerful way to expand existing image datasets. This paper presents a novel method for creating a variation of an existing image, called an Object-Focused Image (OFI): an image that keeps only the labeled object and sets everything else to white. The paper elaborates on the OFI approach, explores its efficiency, and compares the validation accuracy of 780 notebooks. The testbed uses a subset of the ImageNet dataset (8,000 images in 14 classes) and incorporates all 26 models available in Keras. Each model is tested before augmentation and after applying 9 different categories of augmentation methods, giving 260 notebooks. Each of these 260 notebooks is tested in 3 scenarios: scenario A (ImageNet weights are not used and the network layers are trainable), scenario B (ImageNet weights are used and the network layers are trainable), and scenario C (ImageNet weights are used and the network layers are not trainable). The experiments show that using OFI images along with the original images can be better than other augmentation methods in 16.4% of the cases. The OFI method also helped some models learn when they could not learn with other augmentation methods. The experiments further showed that kernel filters and color space transformations are among the best data augmentation methods.
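The OFI idea and the size of the experiment grid described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary object mask is already available (the abstract does not say how the mask is obtained), and the names `make_object_focused_image`, `SCENARIOS`, `N_MODELS`, and `N_DATA_VARIANTS` are hypothetical. Reading the 260 notebooks as 26 models times 10 data variants (no augmentation plus 9 augmentation categories) is an inference from the numbers in the abstract.

```python
import numpy as np

WHITE = 255  # fill value for an 8-bit image (assumption: uint8 RGB input)


def make_object_focused_image(image: np.ndarray, object_mask: np.ndarray) -> np.ndarray:
    """Return a copy of `image` with every pixel outside `object_mask` set to white.

    image:       uint8 array of shape (H, W, 3).
    object_mask: array of shape (H, W), truthy where the labeled object is.
                 How this mask is produced is outside the scope of this sketch.
    """
    if image.shape[:2] != object_mask.shape:
        raise ValueError("mask and image must share height and width")
    ofi = image.copy()                      # leave the original image untouched
    ofi[~object_mask.astype(bool)] = WHITE  # whiten the background in all channels
    return ofi


# The three training scenarios from the abstract, written as plain flags.
SCENARIOS = {
    "A": {"imagenet_weights": False, "trainable": True},
    "B": {"imagenet_weights": True, "trainable": True},
    "C": {"imagenet_weights": True, "trainable": False},
}

N_MODELS = 26            # Keras models reported in the abstract
N_DATA_VARIANTS = 1 + 9  # assumed: no augmentation + 9 augmentation categories

notebooks_per_scenario = N_MODELS * N_DATA_VARIANTS
total_runs = notebooks_per_scenario * len(SCENARIOS)

# Tiny demonstration: a 4x4 black image with a 2x2 reddish "object" in the middle.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[1:3, 1:3] = (200, 30, 30)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

ofi = make_object_focused_image(image, mask)
print(ofi[0, 0], ofi[1, 1])                 # background pixel, object pixel
print(notebooks_per_scenario, total_runs)
```

In this sketch the OFI image would be added to the training set alongside the original, matching the abstract's description of using OFI images "along with the original images".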

Published

2022-05-12

How to Cite

Hammoud, A., & Ghandour, A. (2022). Masking the Backgrounds to Produce Object-Focused Images and Comparing that Augmentation Method to Other Methods. Transactions on Machine Learning and Artificial Intelligence, 10(3), 1–24. https://doi.org/10.14738/tmlai.103.12245