UCSY's Research Repository

Efficient Data Partitioning for Entity Resolution Systems

dc.contributor.author Mon, Aye Chan
dc.contributor.author Thwin, Mie Mie Su
dc.date.accessioned 2019-07-11T03:23:32Z
dc.date.available 2019-07-11T03:23:32Z
dc.date.issued 2013-02-26
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/672
dc.description.abstract Entity resolution is the task of identifying duplicate records that refer to the same real-world entity. It is a costly process that can take days for large datasets. Various blocking methods have been applied in entity resolution systems to reduce the number of record pairs to compare. Blocking remains a significant challenge, however, because a good blocking key is critical to the success of a blocking method and should ideally produce many small blocks. When large blocks do occur, they hinder the efficiency of the blocking method, since the resulting number of record pairs is dominated by the sizes of these blocks. Researchers are therefore still working on the problem of handling large blocks. To overcome this problem, we propose an efficient data partitioning system that introduces a “Dynamic Block Based Structure” to enhance blocking efficiency. en_US
dc.language.iso en en_US
dc.publisher Eleventh International Conference On Computer Applications (ICCA 2013) en_US
dc.subject entity resolution en_US
dc.subject data matching en_US
dc.subject data linkage en_US
dc.subject indexing en_US
dc.subject pre-processing en_US
dc.title Efficient Data Partitioning for Entity Resolution Systems en_US
dc.type Article en_US
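
The abstract's point that large blocks dominate the comparison cost can be illustrated with a minimal sketch of standard key-based blocking. This is not the "Dynamic Block Based Structure" proposed in the paper, whose details are not given in this record. The sample records, the first-letter blocking key, and all function names below are hypothetical and chosen only for illustration. The sketch relies on the fact that a block of n records produces n(n-1)/2 candidate pairs.

```python
from collections import defaultdict
from itertools import combinations


def block_records(records, key_func):
    """Group records into blocks that share the same blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key_func(record)].append(record)
    return dict(blocks)


def candidate_pairs(blocks):
    """Yield only the record pairs that fall inside the same block."""
    for members in blocks.values():
        yield from combinations(members, 2)


def pair_count(n):
    """Number of unordered pairs among n records: n(n-1)/2."""
    return n * (n - 1) // 2


def first_letter_key(record):
    """Hypothetical blocking key: the first character of the record."""
    return record[0]


# Hypothetical sample data, not taken from the paper.
records = [
    "smith john", "smith jon", "smyth john", "smithe j",
    "brown ann", "braun ann",
    "jones tom",
]

blocks = block_records(records, first_letter_key)

# Comparing every record with every other record.
total_without_blocking = pair_count(len(records))

# Comparing records only within each block.
pairs_per_block = {key: pair_count(len(members)) for key, members in blocks.items()}
total_with_blocking = sum(pairs_per_block.values())

print("pairs without blocking:", total_without_blocking)
print("pairs per block:", pairs_per_block)
print("pairs with blocking:", total_with_blocking)
```

In this toy data the block for key "s" holds four of the seven records and accounts for six of the seven candidate pairs, which is the effect the abstract describes: a single large block determines most of the remaining comparison work.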
