Abstract:
Replication is widely used in data center storage systems to prevent data
loss. Data popularity is a key factor in data replication because popular files
are accessed most frequently, and their access patterns can be unstable and
unpredictable. Moreover, replica placement is one of the key issues that affect
system performance, including load balancing and data locality. Poor data
locality is a fundamental and frequent problem for data-parallel applications,
and it degrades performance. To address
these challenges, this paper proposes a dynamic replication management scheme
based on data popularity and data locality; it includes replica allocation and
replica placement algorithms. The proposed data placement algorithm considers
data locality, disk bandwidth, CPU processing speed, and storage utilization
in order to improve both data locality and load balancing.
The proposed scheme is intended for large-scale cloud storage systems.
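
To make the two components concrete, the following is a minimal illustrative
sketch and not the algorithm proposed in this paper. It assumes a linear
mapping from a file's share of accesses to its replica count, and a weighted
sum over the four placement factors named above. The names replica_count,
place_replicas, Node and near_readers, the replica bounds, and the weights are
all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    near_readers: bool          # hypothetical locality signal: readers run on or near this node
    disk_bandwidth: float       # e.g. MB/s, assumed positive
    cpu_speed: float            # e.g. GHz, assumed positive
    storage_utilization: float  # fraction of capacity in use, 0.0 to 1.0


def replica_count(access_count, total_accesses, min_replicas=3, max_replicas=10):
    """Replica allocation: grow the replica count with the file's access share.

    The linear mapping and the bounds are assumptions, not values from the paper.
    """
    if total_accesses <= 0:
        return min_replicas
    share = access_count / total_accesses
    extra = int(share * (max_replicas - min_replicas))  # truncates toward zero
    return min(max_replicas, min_replicas + extra)


def score(node, max_bw, max_cpu, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted sum of locality, disk bandwidth, CPU speed and free storage.

    Bandwidth and CPU speed are normalized by the cluster maximum so that
    every term lies in [0, 1]. The weights are placeholders.
    """
    w_loc, w_bw, w_cpu, w_free = weights
    return (
        w_loc * (1.0 if node.near_readers else 0.0)
        + w_bw * node.disk_bandwidth / max_bw
        + w_cpu * node.cpu_speed / max_cpu
        + w_free * (1.0 - node.storage_utilization)
    )


def place_replicas(nodes, k):
    """Replica placement: return the names of the k highest-scoring nodes.

    Assumes a non-empty list of nodes.
    """
    max_bw = max(n.disk_bandwidth for n in nodes)
    max_cpu = max(n.cpu_speed for n in nodes)
    ranked = sorted(nodes, key=lambda n: score(n, max_bw, max_cpu), reverse=True)
    return [n.name for n in ranked[:k]]
```

In this sketch a lightly loaded node with fast disks can still rank below a
slower node that is close to the readers, because locality carries the largest
weight. Changing the weights shifts the balance between data locality and load
balancing.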