Data Classification by KNN using Mapreduce In Hadoop

Authors (3): Aishwarya M R, Debaswini Khuntia, Preethi J D

Recent works have focused on efficient solutions using the MapReduce programming model because it is suitable for distributed, large-scale data processing. For the same problem, these works provide different solutions with particular constraints and properties. In this paper, we compute KNN on MapReduce to compare the different approaches through experimental evaluation, and we analyse the impact of data volume and data dimension from different perspectives such as time complexity, space complexity, and accuracy. Overall, MapReduce, through its Hadoop implementation, is well suited to batch processing of static data.
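
As a rough illustration of the idea described above, the following Python sketch classifies one query point with KNN in a map/reduce style. It is not the authors' implementation and does not use Hadoop. The function names (map_partition, reduce_neighbors, knn_classify), the Euclidean distance, the majority vote, and the toy data are all assumptions made for this example.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def map_partition(partition, query, k):
    """Map step (hypothetical): k nearest (distance, label) pairs in one data split."""
    candidates = ((dist(point, query), label) for point, label in partition)
    return heapq.nsmallest(k, candidates)


def reduce_neighbors(local_results, k):
    """Reduce step (hypothetical): merge local candidates, keep the global k nearest,
    and return the majority label among them."""
    merged = heapq.nsmallest(k, (pair for result in local_results for pair in result))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def knn_classify(partitions, query, k=3):
    """Run the map step on every split, then a single reduce step."""
    local_results = [map_partition(p, query, k) for p in partitions]
    return reduce_neighbors(local_results, k)


# Toy labelled data, split into two partitions as a distributed store might do.
partitions = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((5.0, 5.0), "b")],
    [((0.9, 1.1), "a"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")],
]

if __name__ == "__main__":
    print(knn_classify(partitions, (1.0, 1.0), k=3))
    print(knn_classify(partitions, (5.0, 5.0), k=3))
```

Each split only needs to emit its own k best candidates, because the global k nearest neighbours must be among them. This keeps the data passed to the reduce step small regardless of the size of each split.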

Authors and Affiliations

Aishwarya M R
ISE, New Horizon College of Engineering, Bangalore, Karnataka, India
Debaswini Khuntia
ISE, New Horizon College of Engineering, Bangalore, Karnataka, India
Preethi J D
ISE, New Horizon College of Engineering, Bangalore, Karnataka, India

Keywords: KNN, MapReduce

Publication Details

Published in : Volume 2 | Issue 3 | May-June 2017
Date of Publication : 2017-06-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 297-300
Manuscript Number : CSEIT172335
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Aishwarya M R, Debaswini Khuntia, Preethi J D, "Data Classification by KNN using Mapreduce In Hadoop", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 3, pp.297-300, May-June-2017.
