Content-Based Image Retrieval : A Comprehensive Study

Authors

  • Abhishek Swaroop  Head of Department, Department of Information Technology, Bhagwan Parshuram Institute of Technology, New Delhi, Delhi, India
  • Aman  Student, Department of Information Technology & Engineering, GGSIPU, Bhagwan Parshuram Institute of Technology, Affiliated to GGSIPU, New Delhi, India
  • Amit Rawat  Student, Department of Information Technology & Engineering, GGSIPU, Bhagwan Parshuram Institute of Technology, Affiliated to GGSIPU, New Delhi, India
  • Ashwin Giri  Student, Department of Information Technology & Engineering, GGSIPU, Bhagwan Parshuram Institute of Technology, Affiliated to GGSIPU, New Delhi, India
  • Hardik Gothwal  Student, Department of Information Technology & Engineering, GGSIPU, Bhagwan Parshuram Institute of Technology, Affiliated to GGSIPU, New Delhi, India

DOI:

https://doi.org/10.32628/CSEIT1952275

Keywords:

Content-Based Image Retrieval, Convolutional Neural Networks, Feature Representation

Abstract

Learning effective feature representations and similarity measures is crucial to the retrieval performance of a content-based image retrieval (CBIR) system. Despite extensive research efforts over many years, it remains one of the most challenging open problems and significantly hinders the success of real-world CBIR systems. The key challenge has been attributed to the well-known "semantic gap" that exists between the low-level image pixels captured by machines and the high-level semantic concepts perceived by humans. Among various techniques, machine learning has been actively investigated as a possible direction for bridging the semantic gap in the long run. Motivated by the recent success of deep learning techniques in computer vision and other applications, this paper attempts to address an open problem: whether deep learning offers hope for bridging the semantic gap in CBIR, and how much improvement in CBIR tasks can be achieved by exploring state-of-the-art deep learning methods for learning feature representations and similarity measures. Specifically, we investigate a deep learning framework applied to CBIR tasks through an extensive set of empirical studies, examining a state-of-the-art deep learning technique (Convolutional Neural Networks) for CBIR tasks in varied settings. From our empirical studies, we found some encouraging results and summarized some important insights for future research.
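
To make the two ingredients named in the abstract concrete (a feature representation for each image and a similarity measure between representations), the following is a minimal sketch of the retrieval step only. It assumes each image has already been mapped to a fixed-length feature vector, for example by a CNN; the vectors, image names, and the choice of cosine similarity are illustrative assumptions and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def retrieve(query_vec, database, top_k=3):
    """Rank database images by similarity to the query, highest first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional features; real CNN features would typically
# have hundreds or thousands of dimensions.
database = {
    "beach_01": [0.9, 0.1, 0.0],
    "beach_02": [0.8, 0.2, 0.1],
    "forest_01": [0.1, 0.9, 0.2],
    "city_01": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]
results = retrieve(query, database, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In this sketch the quality of the ranking depends entirely on how well the feature vectors capture semantic content, which is the gap the abstract asks whether deep learning can narrow.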

Published

2019-04-30

Section

Research Articles

How to Cite

[1]
Abhishek Swaroop, Aman, Amit Rawat, Ashwin Giri, Hardik Gothwal, "Content-Based Image Retrieval : A Comprehensive Study", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 2, pp. 1073-1081, March-April 2019. Available at doi: https://doi.org/10.32628/CSEIT1952275