Improving the performance of Hadoop MapReduce Applications via Optimization of Concurrent Containers Per Node

Htay, Than Than; Phyu, Sabai

URI: https://onlineresource.ucsy.edu.mm/handle/123456789/2562

Date: 2020-02-28

Abstract:

Apache Hadoop is a distributed platform for storing, processing, and analyzing big data on commodity machines. Hadoop exposes many tunable parameters that significantly affect the performance of MapReduce applications, so tuning the Hadoop configuration is an effective way to improve performance. Performance optimization is usually based on memory utilization, disk I/O rate, CPU utilization, and network traffic. In this paper, we measure and analyze how MapReduce performance changes as the number of concurrent containers (cc) per machine is varied in YARN-based pseudo-distributed mode. We also measure the effect of different Hadoop Distributed File System (HDFS) block sizes on performance. Our experiments show that tuning cc per node improves performance compared with the default parameter settings, and that optimizing cc together with the HDFS block size improves performance further.
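
As a rough illustration of how cc per node and HDFS block size interact, the sketch below estimates how many containers one node can run at once and how many "waves" of map tasks a job would need. It is not the procedure used in the paper. It assumes that both memory and vcores limit scheduling, that every container has the same size, and that one map task is launched per HDFS block. The function names and all numeric values are hypothetical; the property names in the comments are standard Hadoop settings.

```python
def concurrent_containers(node_memory_mb, node_vcores,
                          container_memory_mb, container_vcores=1):
    """Estimate how many equal-sized containers fit on one node.

    node_memory_mb      ~ yarn.nodemanager.resource.memory-mb
    node_vcores         ~ yarn.nodemanager.resource.cpu-vcores
    container_memory_mb ~ mapreduce.map.memory.mb
    container_vcores    ~ mapreduce.map.cpu.vcores
    Assumption: the scheduler accounts for both memory and vcores.
    """
    by_memory = node_memory_mb // container_memory_mb
    by_cpu = node_vcores // container_vcores
    return min(by_memory, by_cpu)


def map_task_count(input_bytes, block_size_bytes):
    """Assume one map task per HDFS block (dfs.blocksize), rounding up."""
    return (input_bytes + block_size_bytes - 1) // block_size_bytes


def map_waves(input_bytes, block_size_bytes, cc):
    """Number of rounds needed to run all map tasks with cc slots on one node."""
    tasks = map_task_count(input_bytes, block_size_bytes)
    return (tasks + cc - 1) // cc


if __name__ == "__main__":
    MIB = 1024 ** 2
    GIB = 1024 ** 3
    # Hypothetical single-node (pseudo-distributed) settings.
    cc = concurrent_containers(node_memory_mb=8192, node_vcores=4,
                               container_memory_mb=1024)
    for block in (128 * MIB, 256 * MIB):
        print(block // MIB, "MiB blocks:",
              map_task_count(1 * GIB, block), "map tasks,",
              map_waves(1 * GIB, block, cc), "wave(s) at cc =", cc)
```

Under these assumptions, a larger block size produces fewer map tasks, and a larger cc lets more of them run in parallel. Actual run time also depends on disk, CPU, and memory contention, which is why the optimum has to be measured rather than computed.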
