Dynamic Job Ordering and Slot Configurations for Mapreduce Workloads Using Heuristic Algorithm

Authors

  • M. Praveen Kumar  Assistant Professor, Department of Information Technology, Rathinam Technical Campus, Coimbatore, Tamil Nadu, India
  • S. P. Santhoshkumar  Assistant Professor, Department of Computer Science and Engineering, Rathinam Technical Campus, Coimbatore, Tamil Nadu, India
  • S. Syed Shajahaan  Head of the Department, Department of Information Technology, Rathinam Technical Campus, Coimbatore, Tamil Nadu, India

Keywords:

MapReduce, Hadoop, Flow-Shops, Scheduling Algorithm, Job Ordering.

Abstract

MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and data centers. A MapReduce workload generally contains a set of jobs, each of which consists of multiple map tasks followed by multiple reduce tasks. Because 1) map tasks can only run in map slots and reduce tasks can only run in reduce slots, and 2) map tasks are generally executed before reduce tasks, different job execution orders and map/reduce slot configurations for a MapReduce workload yield significantly different performance and system utilization. This paper proposes two classes of algorithms to minimize the makespan and the total completion time for an offline MapReduce workload. The first class focuses on job ordering optimization under a given map/reduce slot configuration. In contrast, the second class considers the scenario in which the map/reduce slot configuration itself can also be optimized. We perform simulations as well as experiments on Amazon EC2 and show that our proposed algorithms produce results that are up to 15-80 percent better than currently unoptimized Hadoop, leading to significant reductions in running time in practice.
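
The effect of job ordering on makespan can be illustrated with a small sketch. The abstract does not spell out the proposed algorithms, so the code below is not the authors' method. It swaps in Johnson's rule, the classical ordering heuristic for two-stage flow shops, under simplifying assumptions that are ours: all map slots are treated as one aggregated stage, all reduce slots as a second aggregated stage, and a job's reduce phase starts only after its map phase has finished. The names `makespan` and `johnson_order` and the job durations are hypothetical.

```python
def makespan(jobs):
    """Makespan of a two-stage flow shop.

    Each job is a (map_time, reduce_time) pair. Assumption: the map stage
    and the reduce stage each process one job at a time, in list order.
    """
    map_done = 0      # time at which the map stage becomes free
    reduce_done = 0   # time at which the reduce stage becomes free
    for map_time, reduce_time in jobs:
        map_done += map_time
        # A reduce phase waits for its own map phase and for the previous reduce.
        reduce_done = max(reduce_done, map_done) + reduce_time
    return reduce_done


def johnson_order(jobs):
    """Order jobs by Johnson's rule (a classical technique, assumed here)."""
    # Jobs whose map phase is no longer than their reduce phase go first,
    # sorted by increasing map time.
    front = sorted((j for j in jobs if j[0] <= j[1]), key=lambda j: j[0])
    # The remaining jobs go last, sorted by decreasing reduce time.
    back = sorted((j for j in jobs if j[0] > j[1]), key=lambda j: j[1], reverse=True)
    return front + back


# Hypothetical workload: (map_time, reduce_time) per job, in submission order.
jobs = [(4, 1), (1, 3), (3, 2)]

print("submission order:", makespan(jobs))
print("Johnson's rule:  ", makespan(johnson_order(jobs)))
```

For this toy workload the submission order gives a makespan of 10 time units, while the reordered schedule gives 9, which shows how ordering alone changes the result even when the slot configuration is fixed.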

Published

2017-10-31

Section

Research Articles

How to Cite

[1]
M. Praveen Kumar, S. P. Santhoshkumar, S. Syed Shajahaan, "Dynamic Job Ordering and Slot Configurations for Mapreduce Workloads Using Heuristic Algorithm," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 5, pp. 67-72, September-October 2017.