Data Annotation in Large-Scale Datasets with Supervision

Authors

  • T. Satya Kiranmai, Assistant Professor, Computer Science and Engineering, CMR College of Engineering and Technology, Kandlakoya, Medchal, Telangana, India
  • Swetha Koduri, Assistant Professor, Information Technology, Malla Reddy College of Engineering and Technology, Maisammaguda, Dhulapally, Telangana, India

Keywords:

Multi-Set, Annotation, Image Dataset

Abstract

We present an approach for effectively using millions of images with noisy annotations, together with a small subset of cleanly annotated images, to learn powerful image representations. A common way to combine clean and noisy data is to first pretrain a network on the large noisy dataset and then fine-tune it on the clean dataset. We show that this approach does not fully exploit the information contained in the clean set. We therefore demonstrate how to use the clean annotations to reduce the noise in the large dataset before fine-tuning the network on both the clean set and the full set with reduced noise. The approach comprises a multi-task network that jointly learns to clean noisy annotations and to classify images accurately. We evaluate our approach on the recently released Open Images dataset, which contains 9 million images, multiple annotations per image, and more than 6000 unique classes. As the small clean set of annotations, we use a quarter of the validation set, with 40k images. Our results show that the proposed approach clearly outperforms direct fine-tuning across all major categories of classes in the Open Images dataset. Furthermore, our approach is particularly effective for the large number of classes whose annotations have a wide range of noise levels (20-80% false positive annotations).
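
The joint objective described in the abstract can be pictured with the short Python sketch below. It is not the authors' implementation. The additive per-class correction, the clipping to [0, 1], the absolute-error cleaning loss, the unweighted sum of the two losses and every function name are assumptions made for illustration only. The sketch shows one plausible reading: images from the clean subset supervise both tasks with their verified labels, while images with only noisy labels supervise the classifier through the cleaned labels.

```python
import math


def clean_labels(noisy, correction):
    """Apply a per-class correction to noisy labels and clip to [0, 1].

    Assumption: the cleaning task predicts an additive correction.
    """
    return [min(1.0, max(0.0, n + c)) for n, c in zip(noisy, correction)]


def cleaning_loss(cleaned, verified):
    """Mean absolute error between cleaned and human-verified labels."""
    return sum(abs(c - v) for c, v in zip(cleaned, verified)) / len(cleaned)


def classification_loss(predicted, target, eps=1e-7):
    """Mean binary cross-entropy over classes (multi-label setting)."""
    total = 0.0
    for p, t in zip(predicted, target):
        p = min(1.0 - eps, max(eps, p))  # avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(predicted)


def joint_loss(predicted, noisy, correction, verified=None):
    """Combine both tasks for one image (hypothetical, unweighted sum)."""
    cleaned = clean_labels(noisy, correction)
    if verified is not None:
        # Clean-subset image: verified labels supervise both tasks.
        return (cleaning_loss(cleaned, verified)
                + classification_loss(predicted, verified))
    # Noisy-only image: the cleaned labels act as the classifier target.
    return classification_loss(predicted, cleaned)


if __name__ == "__main__":
    predicted = [0.9, 0.2, 0.1]    # classifier outputs for three classes
    noisy = [1.0, 1.0, 0.0]        # second label is a false positive
    correction = [0.0, -0.9, 0.0]  # hypothetical cleaning output
    print(joint_loss(predicted, noisy, correction, verified=[1.0, 0.0, 0.0]))
    print(joint_loss(predicted, noisy, correction))
```

In a real system the correction and the predictions would come from trained network heads that share image features. Here they are plain lists so that the arithmetic of the two loss terms is easy to follow.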

Published

2017-12-31

Section

Research Articles

How to Cite

[1]
T. Satya Kiranmai, Swetha Koduri, "Data Annotation in Large-Scale Datasets with Supervision", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 6, pp. 295-298, November-December 2017.