Data Classification by KNN using Mapreduce In Hadoop

Authors

  • Aishwarya M R, ISE, New Horizon College of Engineering, Bangalore, Karnataka, India
  • Debaswini Khuntia, ISE, New Horizon College of Engineering, Bangalore, Karnataka, India
  • Preethi J D, ISE, New Horizon College of Engineering, Bangalore, Karnataka, India

Keywords:

KNN, MapReduce

Abstract

Recent work has focused on efficient solutions built on the MapReduce programming model because it is well suited to distributed, large-scale data processing. Although these works address the same problem, each provides a different solution with particular constraints and properties. In this paper, we compute KNN on MapReduce and compare the different approaches through experimental evaluation. We then analyse the impact of data volume and data dimension from several perspectives, including time complexity, space complexity and accuracy. MapReduce, through its Hadoop implementation, is well suited to batch processing of static data.
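
As an illustration of how a KNN classification can be split into map and reduce steps, the following is a minimal single-process sketch in Python. It is not the implementation evaluated in the paper and does not use Hadoop itself: the function names (map_partition, reduce_candidates, knn_classify), the Euclidean distance, the majority vote, and the toy data are all assumptions made for the example. Each "split" stands in for a block of training data that a mapper would process independently.

```python
import heapq
import math
from collections import Counter


def map_partition(train_split, query, k):
    """Map step (hypothetical): return the k nearest (distance, label)
    pairs to the query found within one split of the training data."""
    scored = ((math.dist(point, query), label) for point, label in train_split)
    return heapq.nsmallest(k, scored)


def reduce_candidates(candidate_lists, k):
    """Reduce step (hypothetical): merge the per-split candidates, keep the
    global k nearest, and return the majority label among them."""
    merged = heapq.nsmallest(
        k, (cand for cands in candidate_lists for cand in cands)
    )
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def knn_classify(splits, query, k):
    """Run the map step on every split, then a single reduce step."""
    return reduce_candidates([map_partition(s, query, k) for s in splits], k)


# Toy training data, already divided into two splits (assumed for illustration).
splits = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((5.0, 5.0), "b")],
    [((0.9, 1.1), "a"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")],
]

label_near_a = knn_classify(splits, (1.0, 1.0), k=3)
label_near_b = knn_classify(splits, (5.0, 5.0), k=3)
```

Because every mapper forwards at most k candidates, the reducer merges only k entries per split rather than the full training set, which is one reason the computation fits the batch-oriented MapReduce model.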

Published

2017-06-30

Section

Research Articles

How to Cite

[1]
Aishwarya M R, Debaswini Khuntia, Preethi J D, "Data Classification by KNN using Mapreduce In Hadoop," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 3, pp. 297-300, May-June 2017.