Abstract:
The healthcare industry faces an ever-increasing
volume of records covering doctors, patients,
medicines, and medical histories. Although
previous medical records benefit not only the
individual but also society as a whole, maintaining and
analyzing such large amounts of data is a major
problem. Traditional data mining tools are
inadequate for data at this scale. Big data
analysis can be applied in a variety of
domains such as education, security, and health
care. This system uses a MapReduce-based C4.5
decision tree algorithm on health care big data to
manage, analyze, and extract the data most suitable
for the conditions of interest. The classification rules
produced by this system can be used to classify a
particular disease. Tuberculosis is used as a
case study in the system.
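
As a rough illustration of the idea, the sketch below shows how the attribute-selection step of C4.5 could be phrased in a map/reduce style: a map step emits a count for each (attribute, value, class) triple, a reduce step sums the counts, and the gain ratio is computed from the totals. It runs in a single process rather than on a MapReduce cluster, and the function names, the attribute names (cough, fever, tb), and the toy records are all hypothetical, not taken from the system or its data.

```python
from collections import Counter
from math import log2


def map_record(record, label_key):
    """Map step: emit ((attribute, value, label), 1) for each attribute of one record."""
    label = record[label_key]
    for attr, value in record.items():
        if attr != label_key:
            yield (attr, value, label), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per (attribute, value, label) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def entropy(counts):
    """Shannon entropy (in bits) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


def gain_ratio(totals, attr):
    """C4.5 gain ratio of one attribute, computed from the reduced counts."""
    by_value = {}            # attribute value -> Counter of class labels
    label_totals = Counter() # class label -> count over all records
    for (a, value, label), n in totals.items():
        if a == attr:
            by_value.setdefault(value, Counter())[label] += n
            label_totals[label] += n
    total = sum(label_totals.values())
    base = entropy(label_totals.values())
    conditional = sum(
        sum(c.values()) / total * entropy(c.values()) for c in by_value.values()
    )
    split_info = entropy(sum(c.values()) for c in by_value.values())
    return (base - conditional) / split_info if split_info else 0.0


def best_attribute(records, label_key):
    """Pick the attribute with the highest gain ratio as the split attribute."""
    pairs = (pair for r in records for pair in map_record(r, label_key))
    totals = reduce_counts(pairs)
    attrs = sorted({a for a, _, _ in totals})
    return max(attrs, key=lambda a: gain_ratio(totals, a))


# Hypothetical toy records, for illustration only.
records = [
    {"cough": "yes", "fever": "yes", "tb": "pos"},
    {"cough": "yes", "fever": "no", "tb": "pos"},
    {"cough": "no", "fever": "yes", "tb": "neg"},
    {"cough": "no", "fever": "no", "tb": "neg"},
]
split_attribute = best_attribute(records, "tb")
```

In this toy data the cough attribute separates the two classes completely, so it is chosen as the split attribute. A full C4.5 implementation would also handle continuous attributes, missing values, and pruning, none of which are sketched here.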